spateo.alignment.methods.morpho_class

Classes

Morpho_pairwise

A class to align two spatial transcriptomics AnnData objects using the Spateo alignment algorithm.

Module Contents

class spateo.alignment.methods.morpho_class.Morpho_pairwise(sampleA: anndata.AnnData, sampleB: anndata.AnnData, rep_layer: str | List[str] = 'X', rep_field: str | List[str] = 'layer', genes: List[str] | numpy.ndarray | torch.Tensor | None = None, spatial_key: str = 'spatial', key_added: str = 'align_spatial', iter_key_added: str | None = None, save_concrete_iter: bool = False, vecfld_key_added: str | None = None, dissimilarity: str | List[str] = 'kl', probability_type: str | List[str] = 'gauss', probability_parameters: float | List[float] | None = None, label_transfer_dict: dict | List[dict] | None = None, nn_init: bool = True, init_transform: bool = True, allow_flip: bool = False, init_layer: str = 'X', init_field: str = 'layer', nn_init_top_K: int = 10, nn_init_weight: float = 1.0, max_iter: int = 200, nonrigid_start_iter: int = 80, SVI_mode: bool = True, batch_size: int | None = None, pre_compute_dist: bool = True, sparse_calculation_mode: bool = False, sparse_top_k: int = 1024, lambdaVF: int | float = 100.0, beta: int | float = 0.01, K: int | float = 15, kernel_type: str = 'euc', graph: networkx.Graph | None = None, graph_knn: int = 10, sigma2_init_scale: int | float | None = 0.1, sigma2_end: int | float | None = None, gamma_a: float = 1.0, gamma_b: float = 1.0, kappa: float | numpy.ndarray = 1.0, partial_robust_level: float = 10, normalize_c: bool = True, normalize_g: bool = False, separate_mean: bool = True, separate_scale: bool = False, dtype: str = 'float32', device: str = 'cpu', verbose: bool = True, guidance_pair: List[numpy.ndarray] | numpy.ndarray | None = None, guidance_effect: bool | str | None = False, guidance_weight: float = 1.0, use_chunk: bool = False, chunk_capacity: float = 1.0, return_mapping: bool = False)[source]

A class to align two spatial transcriptomics AnnData objects using the Spateo alignment algorithm.

sampleA[source]

The first AnnData object that acts as the reference.

Type:

AnnData

sampleB[source]

The second AnnData object, which is aligned to the reference.

Type:

AnnData

rep_layer[source]

Representation layer(s) used for alignment. Default is “X”.

Type:

Union[str, List[str]]

rep_field[source]

Representation layer field(s) in AnnData to be used for alignment. “layer” means gene expression, “obsm” means an embedding such as PCA or VAE, and “obs” means a discrete label annotation. Note that Spateo only accepts one label annotation. Defaults to “layer”.

Type:

Union[str, List[str]]

genes[source]

List, array, or tensor of genes to be used for alignment. For example, you can pass the genes you are interested in or spatially variable genes here. Defaults to None.

Type:

Optional[Union[List[str], np.ndarray, torch.Tensor]]

spatial_key[source]

Key in .obsm of AnnData corresponding to the spatial coordinates. Defaults to “spatial”.

Type:

str

key_added[source]

Key under which the aligned spatial coordinates are added in .obsm. Defaults to “align_spatial”.

Type:

str

iter_key_added[source]

Key under which to store intermediate iteration results in .uns. Defaults to None.

Type:

Optional[str]

save_concrete_iter[source]

Whether to save more detailed intermediate iteration results. Default is False.

Type:

bool

vecfld_key_added[source]

Key under which to store vector field results in .uns. Defaults to None.

Type:

Optional[str]

dissimilarity[source]

Measure(s) of pairwise dissimilarity of each observation to be used. Defaults to “kl”.

Type:

Union[str, List[str]]

probability_type[source]

Type(s) of probability distribution used. Defaults to “gauss”.

Type:

Union[str, List[str]]

probability_parameters[source]

Parameters for the probability distribution. Defaults to None.

Type:

Optional[Union[float, List[float]]]

label_transfer_dict[source]

Dictionary that stores the label transfer probability. Defaults to None.

Type:

Optional[Union[dict, List[dict]]]

nn_init[source]

Whether to use nearest neighbor matching to initialize the alignment. Default is True.

Type:

bool

allow_flip[source]

Whether to allow flipping of coordinates. Default is False.

Type:

bool

init_layer[source]

Layer used to initialize the alignment. Defaults to “X”.

Type:

str

init_field[source]

Layer field used to initialize the alignment. Defaults to “layer”.

Type:

str

nn_init_weight[source]

Weight for nn_init guidance. Larger values give the nn_init guidance more impact on the alignment, and vice versa. Default is 1.0.

Type:

float

nn_init_top_K[source]

The number of top K nearest neighbors to consider in the nn_init. Defaults to 10.

Type:

int, optional

guidance_pair[source]

List of guidance pairs for alignment. Default is None.

Type:

Optional[Union[List[np.ndarray], np.ndarray]]

guidance_effect[source]

Effect of guidance on the transformation. Valid values: False, “rigid”, “nonrigid”, and “both”. Default is False.

Type:

Optional[Union[bool, str]]

guidance_weight[source]

Weight for guidance. Larger values give the guidance more impact on the alignment, and vice versa. Default is 1.0.

Type:

float

max_iter[source]

Maximum number of iterations. Defaults to 200.

Type:

int

SVI_mode[source]

Whether to use Stochastic Variational Inference mode. Default is True.

Type:

bool

batch_size[source]

Size of the mini-batch used in SVI mode. Defaults to None in the constructor signature.

Type:

Optional[int]

pre_compute_dist[source]

Whether to precompute the distance matrix when using SVI mode. This will significantly speed up the calculation process but will also take more (GPU) memory. Default is True.

Type:

bool

sparse_calculation_mode[source]

Whether to use sparse matrix calculation. This significantly reduces (GPU) memory usage but is slower. Default is False.

Type:

bool

sparse_top_k[source]

The top k elements to keep in sparse calculation mode. Default is 1024.

Type:

int

use_chunk[source]

Whether to use chunking in calculations. This reduces (GPU) memory usage but is slower. Default is False.

Type:

bool

chunk_capacity[source]

Scale factor applied to chunk_base to set the chunk size. Default is 1.0.

Type:

float

lambdaVF[source]

Regularization parameter for the vector field of the non-rigid transformation. Smaller values impose fewer constraints on the non-rigid deformation, allowing it to be larger and more flexible, and vice versa. Default is 1e2. Recommended range: [1e-1, 1e4].

Type:

Union[int, float]

beta[source]

Length-scale of the SE kernel. Larger means less correlation between points and more flexible non-rigid deformation, and vice versa. Default is 0.01. Recommended setting range [1e-4, 1e0].

Type:

Union[int, float]

K[source]

Number of sparse inducing points used for Nyström approximation for the kernel. Default is 15.

Type:

Union[int, float]

kernel_type[source]

Type of kernel used. Default is “euc”.

Type:

str

sigma2_init_scale[source]

Initial value for the spatial dispersion level. Default is 0.1.

Type:

Optional[Union[int, float]]

partial_robust_level[source]

Robust level of partial alignment. Default is 10.

Type:

float

normalize_c[source]

Whether to normalize spatial coordinates. Default is True.

Type:

bool

normalize_g[source]

Whether to normalize gene expression. Default is False.

Type:

bool

dtype[source]

Data type for computations. Default is “float32”.

Type:

str

device[source]

Device used to run the program. Default is “cpu”.

Type:

str

verbose[source]

Whether to print verbose messages. Default is True.

Type:

bool

verbose[source]
sampleA[source]
sampleB[source]
rep_layer[source]
rep_field[source]
genes[source]
spatial_key[source]
key_added[source]
iter_key_added[source]
save_concrete_iter[source]
vecfld_key_added[source]
dissimilarity[source]
probability_type[source]
probability_parameters[source]
label_transfer_dict[source]
nn_init[source]
init_transform[source]
nn_init_top_K[source]
max_iter[source]
allow_flip[source]
init_layer[source]
init_field[source]
SVI_mode[source]
batch_size[source]
pre_compute_dist[source]
sparse_calculation_mode[source]
sparse_top_k[source]
beta[source]
lambdaVF[source]
K[source]
kernel_type[source]
kernel_bandwidth[source]
graph[source]
graph_knn[source]
sigma2_init_scale[source]
sigma2_end[source]
partial_robust_level[source]
normalize_c[source]
normalize_g[source]
separate_mean[source]
separate_scale[source]
dtype[source]
device[source]
guidance_pair[source]
guidance_effect[source]
guidance_weight[source]
use_chunk[source]
chunk_capacity[source]
nn_init_weight[source]
gamma_a[source]
gamma_b[source]
kappa[source]
nonrigid_start_iter[source]
return_mapping[source]
run()[source]

Run the pairwise alignment process for spatial transcriptomics data.

Steps involved:

1. Perform coarse rigid alignment if nearest neighbor (nn) initialization is enabled.
2. Calculate the pairwise distance matrix for representations if pre-computation is enabled or not in SVI mode.
3. Initialize iteration variables and structures.
4. Perform iterative variational updates for alignment, including assignment P, gamma, alpha, sigma2, rigid and non-rigid updates.
5. Retrieve the full cell-cell assignment after the iterative process and calculate the optimal rigid transformation.

Returns:

The final cell-cell assignment matrix.

Return type:

np.ndarray

_check()[source]

Validate and initialize various attributes for the Morpho_pairwise object.

This method performs several checks and initializations, including:

  • Representation layers and fields in AnnData objects
  • Spatial keys in AnnData objects
  • Label transfer dictionaries
  • Dissimilarity metrics
  • Probability types and parameters
  • Initialization layers and fields
  • Guidance effects

Raises:
  • ValueError – If any of the validations fail or required attributes are missing.

  • KeyError – If the specified spatial key is not found in the AnnData objects.

_align_preprocess(dtype: str = 'float32', device: str = 'cpu')[source]

Preprocess the data for alignment.

This method performs several preprocessing steps, including:

  • Determining the backend (CPU/GPU) for computation.
  • Extracting common genes from the samples.
  • Extracting gene expression or representations from the samples.
  • Checking and generating the label transfer matrix from the dictionary.
  • Extracting and normalizing spatial coordinates.
  • Normalizing gene expression if required.
  • Preprocessing guidance pairs if provided.

Parameters:
dtype str, optional

The data type for computations. Defaults to “float32”.

device str, optional

The device used for computation (e.g., “cpu” or “cuda:0”). Defaults to “cpu”.

Raises:

AssertionError – If the spatial coordinate dimensions of the samples are different.

_guidance_pair_preprocess()[source]

Preprocess the guidance pairs for alignment.

This method converts the guidance pairs to the backend type (e.g., NumPy, Torch) and normalizes them if required.

The normalization is based on the means and scales of the spatial coordinates.

Raises:

ValueError – If self.guidance_pair is not properly formatted.

_normalize_coords()[source]

Normalize the spatial coordinates of the samples.

This method normalizes the spatial coordinates of the samples to have zero mean and unit variance. It can normalize the coordinates separately or globally based on the provided arguments.

Raises:

AssertionError – If the dimensionality of the coordinates does not match.
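The separate-versus-global normalization described above can be sketched in NumPy. This is a minimal illustration, not Spateo's implementation: the helper name `normalize_coords` is hypothetical, its two flags merely mirror the `separate_mean` and `separate_scale` constructor arguments, and the scale used here (root-mean-square distance from the origin after centering) is an assumption.

```python
import numpy as np

def normalize_coords(coords_a, coords_b, separate_mean=True, separate_scale=False):
    """Center and scale two coordinate arrays (hypothetical helper, not Spateo's code)."""
    coords = [np.asarray(coords_a, dtype=float), np.asarray(coords_b, dtype=float)]
    assert coords[0].shape[1] == coords[1].shape[1], "coordinate dimensions differ"

    # Means: one per sample, or a single mean over the pooled points.
    if separate_mean:
        means = [c.mean(axis=0) for c in coords]
    else:
        pooled_mean = np.concatenate(coords).mean(axis=0)
        means = [pooled_mean, pooled_mean]
    centered = [c - m for c, m in zip(coords, means)]

    # Scale (assumed): root-mean-square distance from the origin after centering.
    def rms(c):
        return np.sqrt((c ** 2).sum() / c.shape[0])

    if separate_scale:
        scales = [rms(c) for c in centered]
    else:
        shared = rms(np.concatenate(centered))
        scales = [shared, shared]
    normalized = [c / s for c, s in zip(centered, scales)]
    return normalized, means, scales

rng = np.random.default_rng(0)
A = rng.normal(loc=5.0, scale=3.0, size=(200, 2))
B = rng.normal(loc=-2.0, scale=1.0, size=(150, 2))
(norm_a, norm_b), means, scales = normalize_coords(A, B)

# Denormalizing with the stored mean and scale recovers the original coordinates.
restored_a = norm_a * scales[0] + means[0]
```

Keeping the means and scales around is what makes it possible to map aligned coordinates back to the original units afterwards.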

_normalize_exps()[source]

Normalize the gene expression matrices.

This method normalizes the gene expression matrices for the samples if the representation field is ‘layer’ and the dissimilarity metric is not ‘kl’. The normalization ensures that the matrices have a consistent scale across the samples.

Raises:

ValueError – If the normalization scale cannot be calculated.

_initialize_variational_variables()[source]

Initialize variational variables for the alignment process.

This method sets initial guesses for various parameters, initializes variational variables, and configures the Stochastic Variational Inference (SVI) mode if enabled.

Parameters:
sigma2_init_scale float, optional

Initial scaling factor for sigma2, taken from the sigma2_init_scale attribute (the constructor default is 0.1).

Raises:

ValueError – If any initialization fails.

_init_probability_parameters(subsample: int = 20000)[source]

Initialize probability parameters for the alignment process.

This method calculates initial values for probability parameters based on the provided subsampling size and the specified dissimilarity and probability types.

Parameters:
subsample int, optional

The number of subsamples to use for initialization. Defaults to 20000.

Raises:

ValueError – If an unsupported probability type is encountered.

_construct_kernel(inducing_variables_num, sampling_method)[source]

Construct the kernel matrix for the alignment process.

This method generates inducing variables from the spatial coordinates, constructs the sparse kernel matrix, and handles different kernel types. It raises an error if the kernel type is not implemented.

Parameters:
inducing_variables_num int

Number of inducing variables to sample.

sampling_method str

Method used for sampling the inducing variables.

Raises:

NotImplementedError – If the specified kernel type is not implemented.
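As a rough illustration of inducing-point kernels, the sketch below builds a squared-exponential kernel between all points and a small random subset, then forms the Nyström approximation of the full kernel. The kernel form `exp(-beta * ||x - z||^2)`, uniform random sampling, the jitter term, and the value of `beta` are assumptions chosen for the example; the function `se_kernel` is hypothetical and not part of Spateo.

```python
import numpy as np

def se_kernel(X, Z, beta):
    """Squared-exponential kernel exp(-beta * ||x - z||^2) between rows of X and Z."""
    sq_dist = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-beta * sq_dist)

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(500, 2))

# Pick K inducing points by uniform random sampling (one possible sampling method).
K = 15
inducing_idx = rng.choice(coords.shape[0], size=K, replace=False)
inducing = coords[inducing_idx]

beta = 0.5                                  # example value only
U = se_kernel(coords, inducing, beta)       # (500, K) cross-kernel
Gamma = se_kernel(inducing, inducing, beta) # (K, K) kernel among inducing points

# Nystrom approximation of the full 500 x 500 kernel: U @ inv(Gamma) @ U.T.
# A small jitter on the diagonal keeps the linear solve numerically stable.
jitter = 1e-6 * np.eye(K)
approx_full = U @ np.linalg.solve(Gamma + jitter, U.T)
```

Only the 500 x 15 and 15 x 15 blocks need to be stored, which is the point of using a small number of inducing points.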

_update_batch(iter: int)[source]

Update the batch for Stochastic Variational Inference (SVI).

This method updates the batch indices and step size for each iteration during the SVI process. It ensures that the batch permutation is rolled to provide a new batch for each iteration.

Parameters:
iter int

The current iteration number.

Raises:

ValueError – If batch size exceeds the number of available data points.
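The idea of rolling a batch permutation can be sketched as follows. The variable names are hypothetical and the step-size update is omitted; this only shows how rotating a fixed permutation yields a different mini-batch on each iteration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, batch_size = 10, 4
if batch_size > n_points:
    raise ValueError("batch size exceeds the number of available data points")

perm = rng.permutation(n_points)      # fixed random order of the data points
batches = []
for it in range(3):
    batch_idx = perm[:batch_size]     # current mini-batch
    batches.append(batch_idx.copy())
    perm = np.roll(perm, batch_size)  # rotate so the next slice differs
```

Because the permutation is rotated rather than redrawn, every point is visited before any point repeats when the batch size divides the number of points.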

_coarse_rigid_alignment(n_sampling=20000)[source]

Perform coarse rigid alignment between two sets of spatial coordinates.

This method performs downsampling, voxelization, and matching pairs construction based on brute force mutual K-nearest neighbors (K-NN). It calculates the similarity distance based on gene expression and performs a coarse alignment using inlier estimation. Optionally, it allows flipping the data for better alignment.

Parameters:
n_sampling int, optional

The number of samples to use for downsampling. Defaults to 20000.

Raises:
  • ValueError – If any required representation is not found in the AnnData objects.

  • RuntimeError – If coarse rigid alignment fails after reducing top_K.
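The matching step can be illustrated with a brute-force mutual K-nearest-neighbor search. This sketch covers only pair construction on a feature matrix; downsampling, voxelization, and inlier estimation are left out, and `mutual_knn_pairs` is a hypothetical helper rather than a Spateo function.

```python
import numpy as np

def mutual_knn_pairs(feat_a, feat_b, top_k=10):
    """Return index pairs (i, j) where j is among i's top_k neighbours in B and vice versa."""
    # Brute-force squared Euclidean distances between all rows.
    dist = ((feat_a[:, None, :] - feat_b[None, :, :]) ** 2).sum(axis=-1)
    k_ab = min(top_k, feat_b.shape[0])
    k_ba = min(top_k, feat_a.shape[0])
    nn_ab = np.argsort(dist, axis=1)[:, :k_ab]    # neighbours in B for each row of A
    nn_ba = np.argsort(dist.T, axis=1)[:, :k_ba]  # neighbours in A for each row of B

    a_to_b = np.zeros(dist.shape, dtype=bool)
    b_to_a = np.zeros(dist.shape, dtype=bool)
    a_to_b[np.arange(feat_a.shape[0])[:, None], nn_ab] = True
    b_to_a[nn_ba, np.arange(feat_b.shape[0])[:, None]] = True
    # Keep only the pairs that are neighbours in both directions.
    return np.argwhere(a_to_b & b_to_a)

rng = np.random.default_rng(0)
feat_a = rng.normal(size=(50, 5))
shuffle = rng.permutation(50)
feat_b = feat_a[shuffle] + 1e-3 * rng.normal(size=(50, 5))  # shuffled, slightly noisy copy

pairs = mutual_knn_pairs(feat_a, feat_b, top_k=1)
```

With `top_k=1` the result contains only mutual nearest neighbors; larger values give more candidate pairs at the cost of more false matches.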

_save_iter(iter: int)[source]

Save the current iteration’s alignment results.

This method saves the current transformed coordinates and the sigma2 value for the specified iteration. It normalizes the coordinates if normalization is enabled.

Parameters:
iter int

The current iteration number.

Raises:

KeyError – If key_added or “sigma2” key is not found in iter_added.

_update_assignment_P()[source]

Update the assignment matrix P.

This method calculates the assignment matrix P, which represents the probability that cells in sampleB are generated by cells in sampleA. It considers both spatial and expression / representation distances and updates variational parameters accordingly.

Parameters:
None

Raises:

ValueError – If any required representation is not found in the AnnData objects.
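A simplified picture of a soft assignment that combines spatial and expression distances is given below. It assumes a Gaussian spatial term with variance `sigma2`, an expression term entering as `exp(-distance)`, and normalization over the cells of sample A for each cell of sample B; outlier handling and the variational parameters are omitted. The function `soft_assignment` is hypothetical.

```python
import numpy as np

def soft_assignment(coords_a, coords_b, expr_dist, sigma2):
    """Column-normalised assignment: P[i, j] is the weight of cell i in A for cell j in B."""
    spatial_sq = ((coords_a[:, None, :] - coords_b[None, :, :]) ** 2).sum(axis=-1)
    log_p = -spatial_sq / (2.0 * sigma2) - expr_dist
    log_p = log_p - log_p.max(axis=0, keepdims=True)  # stabilise the exponentials
    P = np.exp(log_p)
    return P / P.sum(axis=0, keepdims=True)

# Sample A on a regular grid; sample B is the same grid shifted slightly.
grid_x, grid_y = np.meshgrid(np.arange(5.0), np.arange(6.0))
coords_a = np.stack([grid_x.ravel(), grid_y.ravel()], axis=1)
coords_b = coords_a + 0.01
expr_dist = np.zeros((coords_a.shape[0], coords_b.shape[0]))  # no expression signal here

P = soft_assignment(coords_a, coords_b, expr_dist, sigma2=0.05)
```

Shrinking `sigma2` sharpens each column of `P` toward a single match, which is why sigma2 is usually decreased as the alignment improves.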

_update_gamma()[source]

Update the gamma parameter.

This method updates the gamma parameter based on the current state of the alignment process. It adjusts gamma using the digamma function (_psi) and ensures that gamma remains within the range [0.01, 0.99].
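One standard way such a digamma-based update arises is from a Beta prior on gamma: the posterior expectation of log(gamma) is a difference of two digamma values. The sketch below assumes that form, with `gamma_a` and `gamma_b` as the prior parameters, and clips the result to [0.01, 0.99]; whether Spateo uses exactly this expression is not confirmed by the text above. The small `digamma` helper keeps the example dependency-free; `scipy.special.digamma` is the usual choice in practice.

```python
import math

def digamma(x):
    """Digamma function for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:  # shift upward using psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic expansion: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def update_gamma(assigned_mass, n_cells, gamma_a=1.0, gamma_b=1.0):
    """exp(E[log gamma]) under a Beta posterior (assumed form), clipped to [0.01, 0.99]."""
    log_gamma = digamma(gamma_a + assigned_mass) - digamma(gamma_a + gamma_b + n_cells)
    return min(max(math.exp(log_gamma), 0.01), 0.99)

gamma_partial = update_gamma(assigned_mass=80.0, n_cells=100)   # most cells matched
gamma_full = update_gamma(assigned_mass=100.0, n_cells=100)     # all matched, hits upper clip
gamma_none = update_gamma(assigned_mass=0.0, n_cells=100000)    # none matched, hits lower clip
```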

_update_alpha()[source]

Update the alpha parameter.

This method updates the alpha parameter based on the current state of the alignment process.

_update_nonrigid()[source]

Update the non-rigid transformation parameters.

This method updates the non-rigid transformation parameters using the current state of the alignment process. It computes the Sigma inverse matrix, the PXB term, and updates the variational parameters for the non-rigid alignment.

_update_rigid()[source]

Update the rigid transformation parameters.

This method updates the rigid transformation parameters using the current state of the alignment process. It solves for rotation and translation using the SVD formula and incorporates guidance and nearest neighbor initialization if applicable.

_update_sigma2(iter: int)[source]

Update the sigma2 parameter.

This method updates the sigma2 parameter based on the current state of the alignment process. It ensures that sigma2 remains above a certain threshold to prevent numerical instability.

Parameters:
iter int

The current iteration number.

Raises:

ValueError – If sigma2 is not properly updated.
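A common form of this kind of update is the assignment-weighted mean squared distance per dimension, with a floor to avoid collapse to zero. The sketch below assumes that form and an arbitrary floor of 1e-3; `update_sigma2` is a hypothetical helper and the threshold Spateo uses is not stated above.

```python
import numpy as np

def update_sigma2(P, coords_a, coords_b, floor=1e-3):
    """Weighted mean squared distance per dimension, kept above a floor (assumed form)."""
    sq = ((coords_a[:, None, :] - coords_b[None, :, :]) ** 2).sum(axis=-1)
    dim = coords_a.shape[1]
    sigma2 = (P * sq).sum() / (dim * P.sum())
    return max(float(sigma2), floor)

coords_a = np.array([[0.0, 0.0], [1.0, 0.0]])
coords_b = np.array([[0.0, 1.0], [1.0, 1.0]])
P = np.eye(2)  # each cell in A is matched to the cell with the same index in B

sigma2_offset = update_sigma2(P, coords_a, coords_b)   # matched points are 1 unit apart
sigma2_exact = update_sigma2(P, coords_a, coords_a)    # perfect overlap, floor applies
```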

_get_optimal_R()[source]

Compute the optimal rotation matrix R and translation vector t.

This method computes the optimal rotation matrix and translation vector for aligning the coordinates of sample A to sample B. It uses the SVD formula to determine the optimal rotation and ensures that the transformation maintains the correct orientation.

Raises:

ValueError – If the SVD decomposition fails or if the determinant check fails.
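The SVD-based solution with an orientation check is the classical weighted Procrustes (Kabsch) fit, sketched below. It assumes one-to-one weighted correspondences between two point sets; `weighted_rigid_fit` is a hypothetical helper, and how Spateo derives the correspondences and weights from the assignment matrix is not shown.

```python
import numpy as np

def weighted_rigid_fit(src, dst, weights):
    """Rotation R and translation t minimising sum_i w_i * ||R @ src_i + t - dst_i||^2."""
    w = weights / weights.sum()
    mu_src = w @ src
    mu_dst = w @ dst
    # Weighted cross-covariance of the centered point sets.
    H = (dst - mu_dst).T @ (w[:, None] * (src - mu_src))
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that det(R) = +1 (a rotation, not a reflection).
    D = np.eye(src.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    R = U @ D @ Vt
    t = mu_dst - R @ mu_src
    return R, t

rng = np.random.default_rng(0)
theta = 0.6
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
t_true = np.array([2.0, -1.0])

src = rng.normal(size=(40, 2))
dst = src @ R_true.T + t_true          # rotate, then translate
weights = rng.uniform(0.5, 1.5, size=40)

R_est, t_est = weighted_rigid_fit(src, dst, weights)
```

Without the determinant correction, the SVD solution can return a reflection when the data are noisy or nearly degenerate, which would mirror the tissue section.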

_wrap_output()[source]

Wrap the output after the alignment process.

This method denormalizes the aligned coordinates, converts them to numpy arrays, and saves them in the instance. It also prepares a dictionary containing the transformation parameters and metadata if vecfld_key_added is not None.