spateo.alignment.methods.utils
Module Contents#
Functions#
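| check_backend | Check the proper backend for the device. |
| check_spatial_coords | Check spatial coordinate information. |
| check_exp | Check expression matrix. |
| filter_common_genes | Filters for the intersection of genes between all samples. |
| normalize_coords | Normalize the spatial coordinates. |
| normalize_exps | Normalize the gene expression. |
| align_preprocess | Data preprocessing before alignment. |
| shape_align_preprocess | |
| _mask_from_label_prior | |
| kl_divergence_backend | Returns pairwise KL divergence (over all pairs of samples) of two matrices X and Y. |
| kl_distance | Calculate the KL distance between two vectors |
| calc_exp_dissimilarity | Calculate expression dissimilarity. |
| cal_dist | Calculate the distance between two vectors |
| cal_dot | Calculate the matrix multiplication of two matrices |
| get_optimal_R | Get the optimal rotation matrix R |
| _dist | |
| PCA_reduction | PCA dimensionality reduction using SVD decomposition |
| PCA_project | |
| PCA_recover | |
| coarse_rigid_alignment | |
| coarse_rigid_alignment_debug | |
| voxel_data | Voxelization of the data. |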
Attributes#
- spateo.alignment.methods.utils.check_backend(device: str = 'cpu', dtype: str = 'float32', verbose: bool = True)[source]#
Check the proper backend for the device.
- Parameters:
- device
Device used to run the program. To run on a specific GPU, pass its index, e.g. ‘0’.
- dtype
The floating-point number type. Only float32 and float64.
- verbose
If True, print progress updates.
- Returns:
backend: The proper backend.
type_as: type_as.device is the device used to run the program and type_as.dtype is the floating-point number type.
- spateo.alignment.methods.utils.check_spatial_coords(sample: anndata.AnnData, spatial_key: str = 'spatial') numpy.ndarray [source]#
Check spatial coordinate information.
- Parameters:
- sample
An anndata object.
- spatial_key
The key in .obsm that corresponds to the raw spatial coordinates.
- Returns:
The spatial coordinates.
- spateo.alignment.methods.utils.check_exp(sample: anndata.AnnData, layer: str = 'X') numpy.ndarray [source]#
Check expression matrix.
- Parameters:
- sample
An anndata object.
- layer
The key in .layers that corresponds to the expression matrix.
- Returns:
The expression matrix.
- spateo.alignment.methods.utils.filter_common_genes(*genes, verbose: bool = True) list [source]#
Filters for the intersection of genes between all samples.
- Parameters:
- genes
List of genes.
- verbose
If True, print progress updates.
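The intersection step can be illustrated with a small pure-Python sketch. The helper name `common_genes_sketch` is hypothetical, and preserving the order of the first list is an assumption; the library function may order or report results differently.

```python
from functools import reduce


def common_genes_sketch(*genes):
    """Return the genes present in every input list, in first-list order."""
    if not genes:
        return []
    # Intersect all gene sets.
    shared = reduce(set.intersection, (set(g) for g in genes))
    # Walk the first list so the output order is deterministic.
    seen = set()
    ordered = []
    for g in genes[0]:
        if g in shared and g not in seen:
            seen.add(g)
            ordered.append(g)
    return ordered


genes_a = ["Actb", "Gapdh", "Sox2", "Pax6"]
genes_b = ["Pax6", "Actb", "Nes"]
genes_c = ["Actb", "Pax6", "Gapdh"]
common = common_genes_sketch(genes_a, genes_b, genes_c)
```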
- spateo.alignment.methods.utils.normalize_coords(coords: Union[List[np.ndarray or torch.Tensor], numpy.ndarray, torch.Tensor], nx: ot.backend.TorchBackend | ot.backend.NumpyBackend = ot.backend.NumpyBackend, verbose: bool = True) Tuple[List[numpy.ndarray], List[numpy.ndarray], List[numpy.ndarray]] [source]#
Normalize the spatial coordinates.
- Parameters:
- coords
Spatial coordinates of the samples.
- nx
The proper backend.
- verbose
If True, print progress updates.
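The page does not state which normalization is applied. As a rough illustration only, the NumPy sketch below centers each sample on its own mean and divides all samples by one shared scale; the helper name `normalize_coords_sketch` and the choice of scale are assumptions, not the library's documented behavior.

```python
import numpy as np


def normalize_coords_sketch(coords_list):
    """Center each sample and rescale all samples by one shared factor."""
    means = [c.mean(axis=0) for c in coords_list]
    centered = [c - m for c, m in zip(coords_list, means)]
    # Assumed scale: root of the average squared distance to the centroid,
    # averaged over samples, so relative sample sizes are preserved.
    scale = np.sqrt(np.mean([np.mean(np.sum(c**2, axis=1)) for c in centered]))
    normalized = [c / scale for c in centered]
    return normalized, scale, means


rng = np.random.default_rng(0)
sample_a = rng.random((30, 2)) * 100.0 + 50.0
sample_b = rng.random((40, 2)) * 80.0 - 10.0
normalized, scale, means = normalize_coords_sketch([sample_a, sample_b])
```

Keeping the scale and the means makes the transformation invertible: `normalized[i] * scale + means[i]` recovers the original coordinates.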
- spateo.alignment.methods.utils.normalize_exps(matrices: List[np.ndarray or torch.Tensor], nx: ot.backend.TorchBackend | ot.backend.NumpyBackend = ot.backend.NumpyBackend, verbose: bool = True) List[numpy.ndarray] [source]#
Normalize the gene expression.
- Parameters:
- matrices
Gene expression of sample.
- nx
The proper backend.
- verbose
If True, print progress updates.
- spateo.alignment.methods.utils.align_preprocess(samples: List[anndata.AnnData], genes: list | numpy.ndarray | None = None, spatial_key: str = 'spatial', layer: str = 'X', normalize_c: bool = False, normalize_g: bool = False, select_high_exp_genes: bool | float | int = False, dtype: str = 'float64', device: str = 'cpu', verbose: bool = True, **kwargs) Tuple[ot.backend.TorchBackend or ot.backend.NumpyBackend, torch.Tensor or np.ndarray, list, list, list, Optional[float], Optional[list]] [source]#
Data preprocessing before alignment.
- Parameters:
- samples
A list of anndata objects.
- genes
Genes used for calculation. If None, use all common genes for calculation.
- spatial_key
The key in .obsm that corresponds to the raw spatial coordinates.
- layer
If ‘X’, uses sample.X to calculate dissimilarity between spots, otherwise uses the representation given by sample.layers[layer].
- normalize_c
Whether to normalize spatial coordinates.
- normalize_g
Whether to normalize gene expression.
- select_high_exp_genes
Whether to select genes with high differences in gene expression.
- dtype
The floating-point number type. Only float32 and float64.
- device
Device used to run the program. To run on a specific GPU, pass its index, e.g. ‘0’.
- verbose
If True, print progress updates.
- spateo.alignment.methods.utils.shape_align_preprocess(coordsA, coordsB, dtype: str = 'float64', device: str = 'cpu', verbose: bool = True, **kwargs)[source]#
- spateo.alignment.methods.utils._mask_from_label_prior(adataA: anndata.AnnData, adataB: anndata.AnnData, label_key: str | None = 'cluster')[source]#
- spateo.alignment.methods.utils.kl_divergence_backend(X, Y, probabilistic=True)[source]#
Returns pairwise KL divergence (over all pairs of samples) of two matrices X and Y. Takes advantage of the POT backend to speed up computation.
- Parameters:
- X
np array with dim (n_samples by n_features)
- Y
np array with dim (m_samples by n_features)
- Returns:
D: np array with dim (n_samples by m_samples). Pairwise KL divergence matrix.
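A pairwise KL divergence matrix can be computed without an explicit double loop. The NumPy sketch below is an illustration, not the library's code: the helper name `pairwise_kl_sketch` and the `eps` stabilizer are assumptions, and rows are normalized to sum to one before the divergence is taken.

```python
import numpy as np


def pairwise_kl_sketch(X, Y, eps=1e-12):
    """D[i, j] = KL(p_i || q_j) for row-normalized X (n x k) and Y (m x k)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    P = X / X.sum(axis=1, keepdims=True)
    Q = Y / Y.sum(axis=1, keepdims=True)
    log_P = np.log(P + eps)
    log_Q = np.log(Q + eps)
    # KL(p_i || q_j) = sum_k p_ik log p_ik - sum_k p_ik log q_jk
    self_term = np.sum(P * log_P, axis=1, keepdims=True)  # shape (n, 1)
    cross_term = P @ log_Q.T                              # shape (n, m)
    return self_term - cross_term


rng = np.random.default_rng(0)
X = rng.random((4, 6)) + 0.1
Y = rng.random((3, 6)) + 0.1
D = pairwise_kl_sketch(X, Y)
```

The cross term is a single matrix product, which is the part a backend such as POT can dispatch to NumPy or PyTorch.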
- spateo.alignment.methods.utils.kl_distance(X_A: numpy.ndarray | torch.Tensor, X_B: numpy.ndarray | torch.Tensor, use_gpu: bool = True, chunk_num: int = 1, symmetry: bool = True) numpy.ndarray | torch.Tensor [source]#
Calculate the pairwise KL distance between the rows of two matrices.
- Parameters:
- X_A Union[np.ndarray, torch.Tensor]
The first input matrix with shape n x d (n vectors of dimension d).
- X_B Union[np.ndarray, torch.Tensor]
The second input matrix with shape m x d (m vectors of dimension d).
- use_gpu bool, optional
Whether to use the GPU for the chunked computation. Defaults to True.
- chunk_num int, optional
The number of chunks. A larger number lowers GPU memory usage but slows down the calculation. Defaults to 1.
- symmetry bool, optional
Whether to use symmetric KL divergence. Defaults to True.
- Returns:
KL distance matrix of two vectors with shape n x m.
- Return type:
Union[np.ndarray, torch.Tensor]
- spateo.alignment.methods.utils.calc_exp_dissimilarity(X_A: numpy.ndarray | torch.Tensor, X_B: numpy.ndarray | torch.Tensor, dissimilarity: str = 'kl', chunk_num: int = 1) numpy.ndarray | torch.Tensor [source]#
Calculate expression dissimilarity.
- Parameters:
- X_A
Gene expression matrix of sample A.
- X_B
Gene expression matrix of sample B.
- dissimilarity
Expression dissimilarity measure: 'kl' or 'euclidean'.
- Returns:
The dissimilarity matrix of two feature samples.
- Return type:
Union[np.ndarray, torch.Tensor]
- spateo.alignment.methods.utils.cal_dist(X_A: numpy.ndarray | torch.Tensor, X_B: numpy.ndarray | torch.Tensor, use_gpu: bool = True, chunk_num: int = 1, return_gpu: bool = True) numpy.ndarray | torch.Tensor [source]#
Calculate the pairwise distance between the rows of two matrices.
- Parameters:
- X_A Union[np.ndarray, torch.Tensor]
The first input matrix with shape n x d (n vectors of dimension d).
- X_B Union[np.ndarray, torch.Tensor]
The second input matrix with shape m x d (m vectors of dimension d).
- use_gpu bool, optional
Whether to use the GPU for the chunked computation. Defaults to True.
- chunk_num int, optional
The number of chunks. The larger the number, the smaller the GPU memory usage, but the slower the calculation speed. Defaults to 1.
- Returns:
Distance matrix of two vectors with shape n x m.
- Return type:
Union[np.ndarray, torch.Tensor]
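The trade-off behind chunk_num can be seen in a NumPy sketch that builds the n x m distance matrix one block of rows at a time. The helper name `pairwise_dist_chunked` is hypothetical, and this sketch returns unsquared Euclidean distances; the page does not say whether the library returns squared or unsquared values.

```python
import numpy as np


def pairwise_dist_chunked(X_A, X_B, chunk_num=1):
    """Euclidean distances between rows of X_A (n x d) and X_B (m x d)."""
    n = X_A.shape[0]
    out = np.empty((n, X_B.shape[0]))
    sq_B = np.sum(X_B**2, axis=1)
    # Only one block of rows is expanded at a time, which bounds peak memory.
    for rows in np.array_split(np.arange(n), chunk_num):
        if rows.size == 0:
            continue
        A = X_A[rows]
        # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
        d2 = np.sum(A**2, axis=1)[:, None] + sq_B[None, :] - 2.0 * A @ X_B.T
        out[rows] = np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives
    return out


rng = np.random.default_rng(1)
X_A = rng.random((10, 3))
X_B = rng.random((7, 3))
dist_one_chunk = pairwise_dist_chunked(X_A, X_B, chunk_num=1)
dist_four_chunks = pairwise_dist_chunked(X_A, X_B, chunk_num=4)
```

More chunks mean smaller intermediate blocks and more loop iterations, which matches the memory versus speed note on chunk_num.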
- spateo.alignment.methods.utils.cal_dot(mat1: numpy.ndarray | torch.Tensor, mat2: numpy.ndarray | torch.Tensor, use_chunk: bool = False, use_gpu: bool = True, chunk_num: int = 20) numpy.ndarray | torch.Tensor [source]#
Calculate the matrix multiplication of two matrices
- Parameters:
- mat1 Union[np.ndarray, torch.Tensor]
The first input matrix with shape n x d
- mat2 Union[np.ndarray, torch.Tensor]
The second input matrix with shape d x m. m is assumed to be much smaller than n, so this matrix is not chunked.
- use_chunk bool, optional
Whether to use chunks to reduce GPU memory usage. Note that setting this to True slows down the calculation. Defaults to False.
- use_gpu bool, optional
Whether to use the GPU for the chunked computation. Defaults to True.
- chunk_num int, optional
The number of chunks. The larger the number, the smaller the GPU memory usage, but the slower the calculation speed. Defaults to 20.
- Returns:
Matrix multiplication result with shape n x m
- Return type:
Union[np.ndarray, torch.Tensor]
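Chunked matrix multiplication amounts to splitting the first matrix along its rows, multiplying each block by the unsplit second matrix, and stacking the results. The NumPy sketch below uses a hypothetical helper name `chunked_matmul` and ignores device placement entirely.

```python
import numpy as np


def chunked_matmul(mat1, mat2, chunk_num=20):
    """Compute mat1 @ mat2 by processing mat1 in row blocks."""
    # mat2 (d x m) stays whole; only the n rows of mat1 are split.
    blocks = [block @ mat2 for block in np.array_split(mat1, chunk_num, axis=0)]
    return np.concatenate(blocks, axis=0)


rng = np.random.default_rng(2)
mat1 = rng.random((103, 8))
mat2 = rng.random((8, 3))
product = chunked_matmul(mat1, mat2, chunk_num=20)
```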
- spateo.alignment.methods.utils.get_optimal_R(coordsA: numpy.ndarray | torch.Tensor, coordsB: numpy.ndarray | torch.Tensor, P: numpy.ndarray | torch.Tensor, R_init: numpy.ndarray | torch.Tensor)[source]#
Get the optimal rotation matrix R
- Parameters:
- coordsA Union[np.ndarray, torch.Tensor]
The first input matrix with shape n x d
- coordsB Union[np.ndarray, torch.Tensor]
The second input matrix with shape n x d
- P Union[np.ndarray, torch.Tensor]
The optimal transport matrix with shape n x n
- Returns:
The optimal rotation matrix R with shape d x d
- Return type:
Union[np.ndarray, torch.Tensor]
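A common way to obtain a rotation from a coupling matrix is a weighted Kabsch (orthogonal Procrustes) step based on an SVD. The sketch below illustrates that general technique under stated assumptions and is not taken from the library: the helper name `optimal_rotation_sketch` is hypothetical, P[i, j] is assumed to couple row i of coordsA with row j of coordsB, and the R_init argument of the real function is not modeled.

```python
import numpy as np


def optimal_rotation_sketch(coordsA, coordsB, P):
    """Rotation R minimizing sum_ij P_ij * ||R (a_i - mu_A) - (b_j - mu_B)||^2."""
    total = P.sum()
    # Coupling-weighted centroids of both point sets.
    mu_A = P.sum(axis=1) @ coordsA / total
    mu_B = P.sum(axis=0) @ coordsB / total
    # Weighted cross-covariance, shape d x d.
    H = (coordsA - mu_A).T @ P @ (coordsB - mu_B)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that det(R) = +1 (no reflection).
    D = np.eye(coordsA.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))
    return Vt.T @ D @ U.T


theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
rng = np.random.default_rng(3)
coordsA = rng.random((50, 2))
coordsB = coordsA @ R_true.T + np.array([2.0, -1.0])  # rotate, then shift
P = np.eye(50) / 50.0                                 # one-to-one coupling
R_est = optimal_rotation_sketch(coordsA, coordsB, P)
```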
- spateo.alignment.methods.utils._dist(mat1: numpy.ndarray | torch.Tensor, mat2: numpy.ndarray | torch.Tensor, metric: str = 'euc') numpy.ndarray | torch.Tensor [source]#
- spateo.alignment.methods.utils.PCA_reduction(data_mat: numpy.ndarray | torch.Tensor, reduced_dim: int = 64, center: bool = True) Tuple[numpy.ndarray | torch.Tensor, numpy.ndarray | torch.Tensor, numpy.ndarray | torch.Tensor] [source]#
PCA dimensionality reduction using SVD decomposition
- Parameters:
- data_mat Union[np.ndarray, torch.Tensor]
Input data matrix with shape n x k, where n is the data point number and k is the feature dimension.
- reduced_dim int, optional
Size of dimension after dimensionality reduction. Defaults to 64.
- center bool, optional
If True, center the input data; otherwise, assume that the input is already centered. Defaults to True.
- Returns:
projected_data (Union[np.ndarray, torch.Tensor]): Data matrix after dimensionality reduction with shape n x r.
V_new_basis (Union[np.ndarray, torch.Tensor]): New basis with shape k x r.
mean_data_mat (Union[np.ndarray, torch.Tensor]): The mean of the input data matrix.
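PCA via SVD can be sketched in a few lines of NumPy. The helper name `pca_reduction_sketch` is hypothetical and the code only mirrors the documented shapes (n x r projection, k x r basis, mean vector); it is not the library's implementation.

```python
import numpy as np


def pca_reduction_sketch(data_mat, reduced_dim=64, center=True):
    """Project an n x k matrix onto its top principal directions."""
    k = data_mat.shape[1]
    mean_data_mat = data_mat.mean(axis=0) if center else np.zeros(k)
    centered = data_mat - mean_data_mat
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    V_new_basis = Vt[:reduced_dim].T           # shape k x r
    projected_data = centered @ V_new_basis    # shape n x r
    return projected_data, V_new_basis, mean_data_mat


rng = np.random.default_rng(4)
latent = rng.random((50, 2))
data_mat = latent @ rng.random((2, 5)) + 3.0   # rank-2 signal plus an offset
projected, basis, mean_vec = pca_reduction_sketch(data_mat, reduced_dim=2)
recovered = projected @ basis.T + mean_vec     # inverse mapping in this sketch
```

Because the example data has rank 2 after centering, two components reconstruct it exactly; with real expression data the reconstruction is only approximate.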
- spateo.alignment.methods.utils.PCA_project(data_mat: numpy.ndarray | torch.Tensor, V_new_basis: numpy.ndarray | torch.Tensor, center: bool = True)[source]#
- spateo.alignment.methods.utils.PCA_recover(projected_data: numpy.ndarray | torch.Tensor, V_new_basis: numpy.ndarray | torch.Tensor, mean_data_mat: numpy.ndarray | torch.Tensor) numpy.ndarray | torch.Tensor [source]#
- spateo.alignment.methods.utils.coarse_rigid_alignment(coordsA: numpy.ndarray | torch.Tensor, coordsB: numpy.ndarray | torch.Tensor, X_A: numpy.ndarray | torch.Tensor, X_B: numpy.ndarray | torch.Tensor, transformed_points: numpy.ndarray | torch.Tensor | None = None, dissimilarity: str = 'kl', top_K: int = 10, verbose: bool = True) Tuple[Any, Any, Any, Any, numpy.ndarray | Any, numpy.ndarray | Any] [source]#
- spateo.alignment.methods.utils.coarse_rigid_alignment_debug(coordsA: numpy.ndarray | torch.Tensor, coordsB: numpy.ndarray | torch.Tensor, DistMat: numpy.ndarray | torch.Tensor, nx: ot.backend.TorchBackend or ot.backend.NumpyBackend, sub_sample_num: int = -1, top_K: int = 10, transformed_points: numpy.ndarray | torch.Tensor | None = None) numpy.ndarray | torch.Tensor [source]#
- spateo.alignment.methods.utils.voxel_data(coords: numpy.ndarray | torch.Tensor, gene_exp: numpy.ndarray | torch.Tensor, voxel_size: float | None = None, voxel_num: int | None = 10000)[source]#
Voxelization of the data.
- Parameters:
- coords np.ndarray or torch.Tensor
The coordinates of the data points.
- gene_exp np.ndarray or torch.Tensor
The gene expression of the data points.
- voxel_size float
The size of the voxel.
- voxel_num int
The number of voxels.
- Returns:
voxel_coords (np.ndarray or torch.Tensor) – The coordinates of the voxels.
voxel_gene_exp (np.ndarray or torch.Tensor) – The gene expression of the voxels.
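Voxelization can be illustrated by binning points on a regular grid and aggregating each occupied cell. In the NumPy sketch below the helper name `voxelize_sketch` is hypothetical, averaging is an assumed aggregation rule, and only the voxel_size path is shown; how the library derives a size from voxel_num is not described on this page.

```python
import numpy as np


def voxelize_sketch(coords, gene_exp, voxel_size):
    """Average coordinates and expression of the points in each grid cell."""
    # Integer grid index of every point.
    idx = np.floor((coords - coords.min(axis=0)) / voxel_size).astype(np.int64)
    _, inverse, counts = np.unique(
        idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.ravel()  # guard against shape differences across NumPy versions
    n_vox = counts.shape[0]
    voxel_coords = np.zeros((n_vox, coords.shape[1]))
    voxel_gene_exp = np.zeros((n_vox, gene_exp.shape[1]))
    # Sum per voxel, then divide by the number of points in that voxel.
    np.add.at(voxel_coords, inverse, coords)
    np.add.at(voxel_gene_exp, inverse, gene_exp)
    return voxel_coords / counts[:, None], voxel_gene_exp / counts[:, None]


coords = np.array([[0.0, 0.0], [0.1, 0.2], [1.5, 1.5], [1.6, 1.9]])
gene_exp = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])
voxel_coords, voxel_gene_exp = voxelize_sketch(coords, gene_exp, voxel_size=1.0)
```

Here the four points fall into two cells, so the result has two voxels whose coordinates and expression are the per-cell means.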