spateo.alignment.methods#

Submodules#

Package Contents#

Functions#

BA_align(→ Tuple[Optional[Tuple[anndata.AnnData, ...)

_summary_

BA_align_sparse(...)

generalized_procrustes_analysis(X, Y, pi)

Finds and applies optimal rotation between spatial coordinates of two layers (may also do a reflection).

paste_center_align(→ Tuple[anndata.AnnData, ...)

Computes center alignment of slices.

paste_pairwise_align(→ Tuple[numpy.ndarray, Optional[int]])

Calculates and returns optimal alignment of two slices.

PCA_project(data_mat, V_new_basis[, center])

PCA_recover(→ Union[numpy.ndarray, torch.Tensor])

PCA_reduction(→ Tuple[Union[numpy.ndarray, ...)

PCA dimensionality reduction using SVD decomposition

align_preprocess(...)

Data preprocessing before alignment.

cal_dist(→ Union[numpy.ndarray, torch.Tensor])

Calculate the distance between two vectors

cal_dot(→ Union[numpy.ndarray, torch.Tensor])

Calculate the matrix multiplication of two matrices

calc_exp_dissimilarity(→ Union[numpy.ndarray, ...)

Calculate expression dissimilarity.

coarse_rigid_alignment(→ Tuple[Any, Any, Any, Any, ...)

empty_cache([device])

Attributes#

spateo.alignment.methods.BA_align(sampleA: anndata.AnnData, sampleB: anndata.AnnData, genes: List | torch.Tensor | None = None, spatial_key: str = 'spatial', key_added: str = 'align_spatial', iter_key_added: str | None = 'iter_spatial', vecfld_key_added: str | None = 'VecFld_morpho', layer: str = 'X', dissimilarity: str = 'kl', keep_size: bool = False, max_iter: int = 200, lambdaVF: int | float = 100.0, beta: int | float = 0.01, K: int | float = 15, normalize_c: bool = True, normalize_g: bool = True, select_high_exp_genes: bool | float | int = False, dtype: str = 'float32', device: str = 'cpu', inplace: bool = True, verbose: bool = True, nn_init: bool = True, SVI_mode: bool = True, batch_size: int = 1000, partial_robust_level: float = 25) Tuple[Tuple[anndata.AnnData, anndata.AnnData] | None, numpy.ndarray, numpy.ndarray][source]#

_summary_

Parameters:
sampleA

Sample A that acts as reference.

sampleB

Sample B that performs alignment.

genes

Genes used for calculation. If None, use all common genes for calculation.

spatial_key

The key in .obsm that corresponds to the raw spatial coordinate.

key_added

.obsm key under which to add the aligned spatial coordinate.

iter_key_added

.uns key under which to add the result of each iteration of the iterative process. If iter_key_added is None, the results are not saved.

vecfld_key_added

The key that will be used for the vector field key in .uns. If vecfld_key_added is None, the results are not saved.

layer

If 'X', uses .X to calculate dissimilarity between spots, otherwise uses the representation given by .layers[layer].

dissimilarity

Expression dissimilarity measure: 'kl' or 'euclidean'.

small_variance

When approximating the assignment matrix, if True, we use small sigma2 (0.001) rather than the infered sigma2

max_iter

Max number of iterations for morpho alignment.

lambdaVF

Hyperparameter that controls the non-rigid distortion degree. Smaller means more flexibility.

beta

The length-scale of the SE kernel. Higher means more flexibility.

K

The number of sparse inducing points used for Nystr ̈om approximation. Smaller means faster but less accurate.

normalize_c

Whether to normalize spatial coordinates.

normalize_g

Whether to normalize gene expression. If dissimilarity == 'kl', normalize_g must be False.

select_high_exp_genes

Whether to select genes with high differences in gene expression.

samples_s

The space size of each sample. Area size for 2D samples and volume size for 3D samples.

dtype

The floating-point number type. Only float32 and float64.

device

Equipment used to run the program. You can also set the specified GPU for running. E.g.: '0'.

inplace

Whether to copy adata or modify it inplace.

verbose

If True, print progress updates.

nn_init

If True, use nearest neighbor matching to initialize the alignment.

SVI_mode

Whether to use stochastic variational inferential (SVI) optimization strategy.

batch_size

The size of the mini-batch of SVI. If set smaller, the calculation will be faster, but it will affect the accuracy, and vice versa. If not set, it is automatically set to one-tenth of the data size.

partial_robust_level

The robust level of partial alignment. The larger the value, the more robust the alignment to partial cases is. Recommended setting from 1 to 50.

spateo.alignment.methods.BA_align_sparse(sampleA: anndata.AnnData, sampleB: anndata.AnnData, genes: List | torch.Tensor | None = None, spatial_key: str = 'spatial', key_added: str = 'align_spatial', iter_key_added: str | None = 'iter_spatial', vecfld_key_added: str | None = 'VecFld_morpho', layer: str = 'X', dissimilarity: str = 'kl', max_iter: int = 200, lambdaVF: int | float = 100.0, beta: int | float = 0.01, K: int | float = 15, beta2: int | float | None = None, beta2_end: int | float | None = None, normalize_c: bool = True, normalize_g: bool = True, dtype: str = 'float32', device: str = 'cpu', inplace: bool = True, verbose: bool = True, nn_init: bool = False, partial_robust_level: float = 25, use_label_prior: bool = False, label_key: str | None = 'cluster', label_transfer_prior: dict | None = None, SVI_mode: bool = True, batch_size: int = 1024, use_sparse: bool = True, pre_compute_dist: bool = False) Tuple[Tuple[anndata.AnnData, anndata.AnnData] | None, numpy.ndarray, numpy.ndarray][source]#
spateo.alignment.methods.generalized_procrustes_analysis(X, Y, pi)[source]#

Finds and applies optimal rotation between spatial coordinates of two layers (may also do a reflection).

Parameters:
X

np array of spatial coordinates.

Y

np array of spatial coordinates.

pi

mapping between the two layers output by PASTE.

Returns:

Aligned spatial coordinates of X, Y and the mapping relations.

spateo.alignment.methods.paste_center_align(init_center_sample: anndata.AnnData, samples: List[anndata.AnnData], layer: str = 'X', genes: list | numpy.ndarray | None = None, spatial_key: str = 'spatial', lmbda: numpy.ndarray | None = None, alpha: float = 0.1, n_components: int = 15, threshold: float = 0.001, max_iter: int = 10, numItermax: int = 200, numItermaxEmd: int = 100000, dissimilarity: str = 'kl', norm: bool = False, random_seed: int | None = None, pis_init: List[numpy.ndarray] | None = None, distributions: List[numpy.ndarray] | None = None, dtype: str = 'float32', device: str = 'cpu', verbose: bool = True) Tuple[anndata.AnnData, List[numpy.ndarray]][source]#

Computes center alignment of slices.

Parameters:
init_center_sample

Sample to use as the initialization for center alignment; Make sure to include gene expression and spatial information.

samples

List of samples to use in the center alignment.

layer

If ‘X’, uses sample.X to calculate dissimilarity between spots, otherwise uses the representation given by sample.layers[layer].

genes

Genes used for calculation. If None, use all common genes for calculation.

spatial_key

The key in .obsm that corresponds to the raw spatial coordinates.

lmbda

List of probability weights assigned to each slice; If None, use uniform weights.

alpha

Alignment tuning parameter. Note: 0 <= alpha <= 1. When α = 0 only the gene expression data is taken into account, while when α =1 only the spatial coordinates are taken into account.

n_components

Number of components in NMF decomposition.

threshold

Threshold for convergence of W and H during NMF decomposition.

max_iter

Maximum number of iterations for our center alignment algorithm.

numItermax

Max number of iterations for cg during FGW-OT.

numItermaxEmd

Max number of iterations for emd during FGW-OT.

dissimilarity

Expression dissimilarity measure: 'kl' or 'euclidean'.

norm

If True, scales spatial distances such that neighboring spots are at distance 1. Otherwise, spatial distances remain unchanged.

random_seed

Set random seed for reproducibility.

pis_init

Initial list of mappings between ‘A’ and ‘slices’ to solver. Otherwise, default will automatically calculate mappings.

distributions

Distributions of spots for each slice. Otherwise, default is uniform.

dtype

The floating-point number type. Only float32 and float64.

device

Equipment used to run the program. You can also set the specified GPU for running. E.g.: ‘0’.

verbose

If True, print progress updates.

Returns:

  • Inferred center sample with full and low dimensional representations (W, H) of the gene expression matrix.

  • List of pairwise alignment mappings of the center sample (rows) to each input sample (columns).

spateo.alignment.methods.paste_pairwise_align(sampleA: anndata.AnnData, sampleB: anndata.AnnData, layer: str = 'X', genes: list | numpy.ndarray | None = None, spatial_key: str = 'spatial', alpha: float = 0.1, dissimilarity: str = 'kl', G_init=None, a_distribution=None, b_distribution=None, norm: bool = False, numItermax: int = 200, numItermaxEmd: int = 100000, dtype: str = 'float32', device: str = 'cpu', verbose: bool = True) Tuple[numpy.ndarray, int | None][source]#

Calculates and returns optimal alignment of two slices.

Parameters:
sampleA

Sample A to align.

sampleB

Sample B to align.

layer

If ‘X’, uses sample.X to calculate dissimilarity between spots, otherwise uses the representation given by sample.layers[layer].

genes

Genes used for calculation. If None, use all common genes for calculation.

spatial_key

The key in .obsm that corresponds to the raw spatial coordinates.

alpha

Alignment tuning parameter. Note: 0 <= alpha <= 1. When α = 0 only the gene expression data is taken into account, while when α =1 only the spatial coordinates are taken into account.

dissimilarity

Expression dissimilarity measure: 'kl' or 'euclidean'.

G_init

Initial mapping to be used in FGW-OT, otherwise default is uniform mapping.

a_distribution

Distribution of sampleA spots, otherwise default is uniform.

b_distribution

Distribution of sampleB spots, otherwise default is uniform.

norm

If True, scales spatial distances such that neighboring spots are at distance 1. Otherwise, spatial distances remain unchanged.

numItermax

Max number of iterations for cg during FGW-OT.

numItermaxEmd

Max number of iterations for emd during FGW-OT.

dtype

The floating-point number type. Only float32 and float64.

device

Equipment used to run the program. You can also set the specified GPU for running. E.g.: ‘0’.

verbose

If True, print progress updates.

Returns:

Alignment of spots. obj: Objective function output of FGW-OT.

Return type:

pi

spateo.alignment.methods.PCA_project(data_mat: numpy.ndarray | torch.Tensor, V_new_basis: numpy.ndarray | torch.Tensor, center: bool = True)[source]#
spateo.alignment.methods.PCA_recover(projected_data: numpy.ndarray | torch.Tensor, V_new_basis: numpy.ndarray | torch.Tensor, mean_data_mat: numpy.ndarray | torch.Tensor) numpy.ndarray | torch.Tensor[source]#
spateo.alignment.methods.PCA_reduction(data_mat: numpy.ndarray | torch.Tensor, reduced_dim: int = 64, center: bool = True) Tuple[numpy.ndarray | torch.Tensor, numpy.ndarray | torch.Tensor, numpy.ndarray | torch.Tensor][source]#

PCA dimensionality reduction using SVD decomposition

Parameters:
data_mat Union[np.ndarray, torch.Tensor]

Input data matrix with shape n x k, where n is the data point number and k is the feature dimension.

reduced_dim int, optional

Size of dimension after dimensionality reduction. Defaults to 64.

center bool, optional

if True, center the input data, otherwise, assume that the input is centered. Defaults to True.

Returns:

Data matrix after dimensionality reduction with shape n x r. V_new_basis (Union[np.ndarray, torch.Tensor]): New basis with shape k x r. mean_data_mat (Union[np.ndarray, torch.Tensor]): The mean of the input data matrix.

Return type:

projected_data (Union[np.ndarray, torch.Tensor])

spateo.alignment.methods._chunk[source]#
spateo.alignment.methods._unsqueeze[source]#
spateo.alignment.methods.align_preprocess(samples: List[anndata.AnnData], genes: list | numpy.ndarray | None = None, spatial_key: str = 'spatial', layer: str = 'X', normalize_c: bool = False, normalize_g: bool = False, select_high_exp_genes: bool | float | int = False, dtype: str = 'float64', device: str = 'cpu', verbose: bool = True, **kwargs) Tuple[ot.backend.TorchBackend or ot.backend.NumpyBackend, torch.Tensor or np.ndarray, list, list, list, Optional[float], Optional[list]][source]#

Data preprocessing before alignment.

Parameters:
samples

A list of anndata object.

genes

Genes used for calculation. If None, use all common genes for calculation.

spatial_key

The key in .obsm that corresponds to the raw spatial coordinates.

layer

If ‘X’, uses sample.X to calculate dissimilarity between spots, otherwise uses the representation given by sample.layers[layer].

normalize_c

Whether to normalize spatial coordinates.

normalize_g

Whether to normalize gene expression.

select_high_exp_genes

Whether to select genes with high differences in gene expression.

dtype

The floating-point number type. Only float32 and float64.

device

Equipment used to run the program. You can also set the specified GPU for running. E.g.: ‘0’.

verbose

If True, print progress updates.

spateo.alignment.methods.cal_dist(X_A: numpy.ndarray | torch.Tensor, X_B: numpy.ndarray | torch.Tensor, use_gpu: bool = True, chunk_num: int = 1, return_gpu: bool = True) numpy.ndarray | torch.Tensor[source]#

Calculate the distance between two vectors

Parameters:
X_A Union[np.ndarray, torch.Tensor]

The first input vector with shape n x d

X_B Union[np.ndarray, torch.Tensor]

The second input vector with shape m x d

use_gpu bool, optional

Whether to use GPU for chunk. Defaults to True.

chunk_num int, optional

The number of chunks. The larger the number, the smaller the GPU memory usage, but the slower the calculation speed. Defaults to 1.

Returns:

Distance matrix of two vectors with shape n x m.

Return type:

Union[np.ndarray, torch.Tensor]

spateo.alignment.methods.cal_dot(mat1: numpy.ndarray | torch.Tensor, mat2: numpy.ndarray | torch.Tensor, use_chunk: bool = False, use_gpu: bool = True, chunk_num: int = 20) numpy.ndarray | torch.Tensor[source]#

Calculate the matrix multiplication of two matrices

Parameters:
mat1 Union[np.ndarray, torch.Tensor]

The first input matrix with shape n x d

mat2 Union[np.ndarray, torch.Tensor]

The second input matrix with shape d x m. We suppose m << n and does not require chunk.

use_chunk bool, optional

Whether to use chunk to reduce the GPU memory usage. Note that if set to ``True’’ it will slow down the calculation. Defaults to False.

use_gpu bool, optional

Whether to use GPU for chunk. Defaults to True.

chunk_num int, optional

The number of chunks. The larger the number, the smaller the GPU memory usage, but the slower the calculation speed. Defaults to 20.

Returns:

Matrix multiplication result with shape n x m

Return type:

Union[np.ndarray, torch.Tensor]

spateo.alignment.methods.calc_exp_dissimilarity(X_A: numpy.ndarray | torch.Tensor, X_B: numpy.ndarray | torch.Tensor, dissimilarity: str = 'kl', chunk_num: int = 1) numpy.ndarray | torch.Tensor[source]#

Calculate expression dissimilarity. :param X_A: Gene expression matrix of sample A. :param X_B: Gene expression matrix of sample B. :param dissimilarity: Expression dissimilarity measure: 'kl' or 'euclidean'.

Returns:

The dissimilarity matrix of two feature samples.

Return type:

Union[np.ndarray, torch.Tensor]

spateo.alignment.methods.coarse_rigid_alignment(coordsA: numpy.ndarray | torch.Tensor, coordsB: numpy.ndarray | torch.Tensor, X_A: numpy.ndarray | torch.Tensor, X_B: numpy.ndarray | torch.Tensor, transformed_points: numpy.ndarray | torch.Tensor | None = None, dissimilarity: str = 'kl', top_K: int = 10, verbose: bool = True) Tuple[Any, Any, Any, Any, numpy.ndarray | Any, numpy.ndarray | Any][source]#
spateo.alignment.methods.empty_cache(device: str = 'cpu')[source]#