spateo.tdr.morphometrics.morphofield.sparsevfc

Attributes

lm

Functions

get_optimal_mapping_relationship(X, Y, pi[, keep_all])

paste_pairwise_align(→ Tuple[numpy.ndarray, Optional[int]])

Calculates and returns optimal alignment of two slices.

get_X_Y_grid(→ Tuple[numpy.ndarray, numpy.ndarray, ...)

Prepare the X (spatial coordinates), Y (gene expression) and grid points for the kernel or deep model.

cell_directions(→ Tuple[Optional[anndata.AnnData], ...)

Obtain the optimal mapping relationship and developmental direction between cells for samples between continuous developmental stages.

_morphofield_sparsevfc(→ dict)

Calculating and predicting the vector field during development by the Kernel method (sparseVFC).

morphofield_sparsevfc(→ Optional[anndata.AnnData])

Calculating and predicting the vector field during development by the Kernel method (sparseVFC).

Module Contents

spateo.tdr.morphometrics.morphofield.sparsevfc.get_optimal_mapping_relationship(X: numpy.ndarray, Y: numpy.ndarray, pi: numpy.ndarray, keep_all: bool = False)[source]
spateo.tdr.morphometrics.morphofield.sparsevfc.paste_pairwise_align(sampleA: anndata.AnnData, sampleB: anndata.AnnData, layer: str = 'X', genes: list | numpy.ndarray | None = None, spatial_key: str = 'spatial', alpha: float = 0.1, dissimilarity: str = 'kl', G_init=None, a_distribution=None, b_distribution=None, norm: bool = False, numItermax: int = 200, numItermaxEmd: int = 100000, dtype: str = 'float32', device: str = 'cpu', verbose: bool = True) Tuple[numpy.ndarray, int | None][source]

Calculates and returns optimal alignment of two slices.

Parameters:
sampleA

Sample A to align.

sampleB

Sample B to align.

layer

If 'X', uses sample.X to calculate dissimilarity between spots, otherwise uses the representation given by sample.layers[layer].

genes

Genes used for calculation. If None, use all common genes for calculation.

spatial_key

The key in .obsm that corresponds to the raw spatial coordinates.

alpha

Alignment tuning parameter, with 0 <= alpha <= 1. When alpha = 0, only the gene expression data is taken into account; when alpha = 1, only the spatial coordinates are taken into account.
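To make the role of alpha concrete, the following is a minimal NumPy sketch of a fused Gromov-Wasserstein style objective evaluated for a fixed coupling pi. It is not the solver used by paste_pairwise_align; the function name fgw_objective and the inputs M (expression dissimilarity matrix), D_A and D_B (within-sample spatial distance matrices) are assumptions introduced for illustration.

```python
import numpy as np


def fgw_objective(pi, M, D_A, D_B, alpha):
    """Evaluate an illustrative fused objective for a given coupling pi.

    pi  : (n, m) coupling between spots of sample A and sample B
    M   : (n, m) expression dissimilarity between spots (assumed input)
    D_A : (n, n) spatial distances within sample A (assumed input)
    D_B : (m, m) spatial distances within sample B (assumed input)
    """
    # Expression term: dissimilarity weighted by the coupling.
    expression_term = np.sum(M * pi)
    # Spatial term: sum over i, j, k, l of
    # (D_A[i, k] - D_B[j, l])**2 * pi[i, j] * pi[k, l].
    diff = D_A[:, None, :, None] - D_B[None, :, None, :]  # shape (n, m, n, m)
    spatial_term = np.einsum("ijkl,ij,kl->", diff ** 2, pi, pi)
    # alpha = 0 keeps only the expression term, alpha = 1 only the spatial term.
    return (1.0 - alpha) * expression_term + alpha * spatial_term


# Tiny example: two spots per sample, coupling that swaps the spots.
D = np.array([[0.0, 1.0], [1.0, 0.0]])
M = np.array([[0.0, 1.0], [1.0, 0.0]])
pi_swap = np.array([[0.0, 0.5], [0.5, 0.0]])
expression_only = fgw_objective(pi_swap, M, D, D, alpha=0.0)
spatial_only = fgw_objective(pi_swap, M, D, D, alpha=1.0)
```

In this toy case the swapped coupling is penalized by the expression term but not by the spatial term, because swapping two points preserves their pairwise distance.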

dissimilarity

Expression dissimilarity measure: 'kl' or 'euclidean'.
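The two dissimilarity options can be illustrated with a small NumPy sketch. This is only an approximation of what such a cost matrix might look like; the helper names euclidean_cost and kl_cost, and the pseudocount eps used to avoid log(0), are assumptions and are not taken from the library.

```python
import numpy as np


def euclidean_cost(A, B):
    """Pairwise Euclidean distance between rows of A (n, g) and B (m, g)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def kl_cost(A, B, eps=1e-12):
    """Pairwise KL divergence KL(P_i || Q_j) between row-normalized profiles.

    eps is an assumed pseudocount that keeps the logarithms finite.
    """
    P = (A + eps) / (A + eps).sum(axis=1, keepdims=True)
    Q = (B + eps) / (B + eps).sum(axis=1, keepdims=True)
    # KL(P_i || Q_j) = sum_g P_ig * log(P_ig) - sum_g P_ig * log(Q_jg)
    entropy_term = (P * np.log(P)).sum(axis=1)  # shape (n,)
    cross_term = P @ np.log(Q).T                # shape (n, m)
    return entropy_term[:, None] - cross_term


# Two spots with non-overlapping expression profiles.
expr = np.array([[1.0, 0.0], [0.0, 1.0]])
M_euclidean = euclidean_cost(expr, expr)
M_kl = kl_cost(expr, expr)
```

Identical profiles give a cost of zero under both measures, while the KL cost grows much faster than the Euclidean cost when the profiles do not overlap.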

G_init

Initial mapping to be used in FGW-OT, otherwise default is uniform mapping.

a_distribution

Distribution of sampleA spots, otherwise default is uniform.

b_distribution

Distribution of sampleB spots, otherwise default is uniform.

norm

If True, scales spatial distances such that neighboring spots are at distance 1. Otherwise, spatial distances remain unchanged.
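One way to read the norm option is sketched below: the pairwise spatial distance matrix is divided by its smallest nonzero entry, so the closest pair of spots ends up at distance 1. This is an assumption about the scaling, written as a hypothetical helper, not the library's code.

```python
import numpy as np


def scale_to_unit_neighbor_distance(coords):
    """Return pairwise distances rescaled so the closest pair is at distance 1."""
    diff = coords[:, None, :] - coords[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Smallest nonzero distance, i.e. the spacing between nearest neighbors.
    min_nonzero = D[D > 0].min()
    return D / min_nonzero


coords = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]])
D_scaled = scale_to_unit_neighbor_distance(coords)
```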

numItermax

Max number of iterations for cg during FGW-OT.

numItermaxEmd

Max number of iterations for emd during FGW-OT.

dtype

The floating-point number type. Only float32 and float64 are supported.

device

Device used to run the program, 'cpu' by default. To run on a specific GPU, pass its index as a string, e.g. '0'.

verbose

If True, print progress updates.

Returns:

  • pi: Alignment of spots.

  • obj: Objective function output of FGW-OT.

spateo.tdr.morphometrics.morphofield.sparsevfc.lm
spateo.tdr.morphometrics.morphofield.sparsevfc.get_X_Y_grid(adata: anndata.AnnData | None = None, genes: List | None = None, X: numpy.ndarray | None = None, Y: numpy.ndarray | None = None, grid_num: List = [50, 50, 50]) Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray][source]

Prepare the X (spatial coordinates), Y (gene expression) and grid points for the kernel or deep model.

Parameters:
adata

AnnData object that contains 'spatial' (a numpy.ndarray) in the .obsm attribute.

genes

List of genes whose expression is to be interpolated across space. If Y is provided, genes is only used to retrieve the gene annotation info.

X

The spatial coordinates of each data point.

Y

The gene expression of the corresponding data point.

grid_num

Number of grid points to generate along each dimension. Default is 50 for each dimension. Values must be non-negative.
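As a rough illustration of how grid_num could translate into grid points, the sketch below builds a regular grid spanning the bounding box of the input coordinates. The helper make_grid is hypothetical, and the convex hull test that produces grid_in_hull is not shown.

```python
import numpy as np


def make_grid(X, grid_num):
    """Build a regular grid over the bounding box of X.

    X        : (n, d) spatial coordinates
    grid_num : number of grid points along each of the d dimensions
    """
    axes = [
        np.linspace(X[:, dim].min(), X[:, dim].max(), num)
        for dim, num in enumerate(grid_num)
    ]
    mesh = np.meshgrid(*axes, indexing="ij")
    # Flatten the mesh into a (prod(grid_num), d) array of grid points.
    return np.stack([m.ravel() for m in mesh], axis=1)


X_demo = np.array([[0.0, 0.0], [1.0, 2.0]])
grid_demo = make_grid(X_demo, [3, 5])
```

With the default grid_num of [50, 50, 50], a grid built this way would contain 125,000 points, so smaller values may be preferable for quick experiments.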

Returns:

  • X: Spatial coordinates.

  • Y: Gene expression of the associated spatial coordinates.

  • Grid: Grid points formed with the input spatial coordinates.

  • grid_in_hull: A list of booleans indicating whether each grid point is within the convex hull formed by the input data points.

spateo.tdr.morphometrics.morphofield.sparsevfc.cell_directions(adataA: anndata.AnnData, adataB: anndata.AnnData, layer: str = 'X', genes: list | numpy.ndarray | None = None, spatial_key: str = 'align_spatial', key_added: str = 'mapping', alpha: float = 0.001, numItermax: int = 200, numItermaxEmd: int = 100000, dtype: str = 'float32', device: str = 'cpu', keep_all: bool = False, inplace: bool = True, **kwargs) Tuple[anndata.AnnData | None, numpy.ndarray][source]

Obtain the optimal mapping relationship and developmental direction between cells for samples between continuous developmental stages.

Parameters:
adataA

AnnData object of sample A from continuous developmental stages.

adataB

AnnData object of sample B from continuous developmental stages.

layer

If 'X', uses .X to calculate dissimilarity between spots, otherwise uses the representation given by .layers[layer].

genes

Genes used for calculation. If None, use all common genes for calculation.

spatial_key

The key in .obsm that corresponds to the spatial coordinate of each cell.

key_added

The key suffix used for the results stored in .obsm:

  • X_{key_added}: The coordinates of the cell that maps optimally in the next stage.

  • V_{key_added}: The developmental direction of each cell.
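The relationship between the mapping matrix pi and these two .obsm entries can be sketched as follows. This is a simplified reading that takes the row-wise maximum of pi and assumes the direction is the displacement from a cell to its mapped partner; the helper directions_from_coupling is hypothetical, and tie handling by nearest coordinates is not reproduced.

```python
import numpy as np


def directions_from_coupling(X, Y, pi):
    """Derive mapped coordinates and directions from a coupling matrix.

    X  : (n, d) coordinates of cells in sample A
    Y  : (m, d) coordinates of cells in sample B
    pi : (n, m) coupling between cells of A and cells of B
    """
    # For each cell in A, pick the cell in B with the largest coupling weight.
    best = pi.argmax(axis=1)
    X_mapped = Y[best]
    # Assumed definition: direction is the displacement toward the mapped cell.
    V = X_mapped - X
    return X_mapped, V


X_a = np.array([[0.0, 0.0], [1.0, 0.0]])
Y_b = np.array([[0.0, 1.0], [1.0, 2.0], [5.0, 5.0]])
pi_demo = np.array([[0.40, 0.10, 0.0], [0.05, 0.45, 0.0]])
X_mapped_demo, V_demo = directions_from_coupling(X_a, Y_b, pi_demo)
```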

alpha

Alignment tuning parameter, with 0 <= alpha <= 1. When alpha = 0, only the gene expression data is taken into account; when alpha = 1, only the spatial coordinates are taken into account.

numItermax

Max number of iterations for cg during FGW-OT.

numItermaxEmd

Max number of iterations for emd during FGW-OT.

dtype

The floating-point number type. Only float32 and float64 are supported.

device

Device used to run the program, 'cpu' by default. To run on a specific GPU, pass its index as a string, e.g. '0'.

keep_all

Whether to retain all the optimal relationships obtained from the pi matrix alone. If keep_all is False, the optimal relationships are determined from both the pi matrix and the nearest coordinates.

inplace

Whether to modify adataA in place or to operate on a copy.

**kwargs

Additional parameters that will be passed to the paste_pairwise_align function.

Returns:

An AnnData object of sample A, updated or copied, with X_{key_added} and V_{key_added} in the .obsm attribute, and the pi matrix.

spateo.tdr.morphometrics.morphofield.sparsevfc._morphofield_sparsevfc(X: numpy.ndarray, V: numpy.ndarray, NX: numpy.ndarray | None = None, grid_num: List[int] | None = None, M: int = 100, lambda_: float = 0.02, lstsq_method: str = 'scipy', min_vel_corr: float = 0.8, restart_num: int = 10, restart_seed: List[int] | Tuple[int] | numpy.ndarray = (0, 100, 200, 300, 400), **kwargs) dict[source]

Calculating and predicting the vector field during development by the Kernel method (sparseVFC).

Parameters:
X

The spatial coordinates of each cell.

V

The developmental direction of each cell.

NX

The spatial coordinates of new data point (grid). If NX is None, generate grid based on grid_num.

grid_num

The number of grids in each dimension for generating the grid velocity. Default is [50, 50, 50].

M

The number of basis functions to approximate the vector field.

lambda_

Represents the trade-off between the goodness of data fit and regularization. A larger lambda_ puts more weight on regularization.
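The interplay of M, lambda_ and the Gaussian kernel can be sketched with a plain regularized kernel regression. This is a deliberate simplification: it omits the inlier weighting and iterative updates of sparseVFC, and the functions gaussian_kernel, fit_vector_field and predict_vector_field, as well as the fixed beta value, are assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, beta):
    """Gaussian kernel matrix between rows of A (n, d) and B (m, d)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-beta * sq_dist)


def fit_vector_field(X, V, M=100, lambda_=0.02, beta=0.1, seed=0):
    """Fit coefficients C on M randomly sampled control points.

    Solves (U^T U + lambda_ * K) C = U^T V, a simplified form without
    inlier weights. Larger lambda_ shrinks the fit toward smoother fields.
    """
    rng = np.random.default_rng(seed)
    m = min(M, X.shape[0])
    ctrl_idx = rng.choice(X.shape[0], size=m, replace=False)
    X_ctrl = X[ctrl_idx]
    U = gaussian_kernel(X, X_ctrl, beta)       # (n, m) basis evaluations
    K = gaussian_kernel(X_ctrl, X_ctrl, beta)  # (m, m) Gram matrix
    C = np.linalg.lstsq(U.T @ U + lambda_ * K, U.T @ V, rcond=None)[0]
    return X_ctrl, C


def predict_vector_field(NX, X_ctrl, C, beta=0.1):
    """Evaluate the fitted field at new points NX (for example grid points)."""
    return gaussian_kernel(NX, X_ctrl, beta) @ C


# Synthetic rotational field on random 2D points.
rng_demo = np.random.default_rng(0)
X_cells = rng_demo.uniform(0.0, 10.0, size=(50, 2))
V_cells = np.column_stack([-X_cells[:, 1], X_cells[:, 0]])
X_ctrl_demo, C_demo = fit_vector_field(X_cells, V_cells, M=20)
V_pred_demo = predict_vector_field(X_cells, X_ctrl_demo, C_demo)
```

The same predict step applied to grid points is what would yield a grid velocity analogous to grid_V in the returned dictionary.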

lstsq_method

The name of the linear least square solver, can be either 'scipy' or 'douin'.

min_vel_corr

The minimal threshold for the cosine correlation between input velocities and learned velocities for a vector field reconstruction to be considered successful. If the cosine correlation is less than this threshold and restart_num > 1, restart_num trials will be attempted with different seeds to reconstruct the vector field function. This can prevent some reconstructions from being trapped in local optima.

restart_num

The number of retrials for vector field reconstructions.

restart_seed

A list of seeds, one for each retrial. Must contain restart_num seeds, or be None.
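The restart logic described by min_vel_corr, restart_num and restart_seed can be sketched as below. The averaging of per-cell cosine similarity and the early stop once the threshold is met are assumptions about the procedure; fit is a hypothetical callable that takes a seed and returns predicted velocities.

```python
import numpy as np


def mean_cosine_correlation(V_true, V_pred, eps=1e-12):
    """Average cosine similarity between matching rows of two velocity arrays."""
    dot = (V_true * V_pred).sum(axis=1)
    norms = np.linalg.norm(V_true, axis=1) * np.linalg.norm(V_pred, axis=1)
    # eps is an assumed guard against division by zero for zero vectors.
    return float(np.mean(dot / (norms + eps)))


def fit_with_restarts(fit, V, min_vel_corr=0.8, restart_seed=(0, 100, 200, 300, 400)):
    """Try each seed in turn and keep the reconstruction with the best correlation."""
    best_corr, best_V, tried = -np.inf, None, []
    for seed in restart_seed:
        V_pred = fit(seed)  # hypothetical fitting routine
        corr = mean_cosine_correlation(V, V_pred)
        tried.append(seed)
        if corr > best_corr:
            best_corr, best_V = corr, V_pred
        if corr >= min_vel_corr:
            break  # reconstruction considered successful
    return best_V, best_corr, tried


V_input = np.array([[1.0, 0.0], [0.0, 1.0]])


def toy_fit(seed):
    # Toy stand-in: only seed 100 reproduces the input directions.
    return V_input if seed == 100 else -V_input


V_best, corr_best, seeds_tried = fit_with_restarts(toy_fit, V_input)
```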

**kwargs

Additional parameters that will be passed to SparseVFC function.

Returns:

A dictionary which contains:

  • X: Current state.

  • valid_ind: The indices of cells that have finite velocity values.

  • X_ctrl: Sample control points of current state.

  • ctrl_idx: Indices for the sampled control points.

  • Y: Velocity estimates in delta t.

  • beta: Parameter of the Gaussian Kernel for the kernel matrix (Gram matrix).

  • V: Prediction of velocity of X.

  • C: Finite set of the coefficients for the

  • P: Posterior probability matrix of inliers.

  • VFCIndex: Indexes of inliers found by sparseVFC.

  • sigma2: Energy change rate.

  • grid: Grid of current state.

  • grid_V: Prediction of velocity of the grid.

  • iteration: Number of the last iteration.

  • tecr_vec: Vector of relative energy change rates compared to the previous step.

  • E_traj: Vector of energy at each iteration.

  • method: The method of learning vector field. Here method == 'sparsevfc'.

Here the most important results are X, V, grid and grid_V:

  • X: Cell coordinates of the current state.

  • V: Developmental direction of the X.

  • grid: Grid coordinates of current state.

  • grid_V: Prediction of developmental direction of the grid.

spateo.tdr.morphometrics.morphofield.sparsevfc.morphofield_sparsevfc(adata: anndata.AnnData, spatial_key: str = 'align_spatial', V_key: str = 'V_mapping', key_added: str = 'VecFld_morpho', NX: numpy.ndarray | None = None, grid_num: List[int] | None = None, M: int = 100, lambda_: float = 0.02, lstsq_method: str = 'scipy', min_vel_corr: float = 0.8, restart_num: int = 10, restart_seed: List[int] | Tuple[int] | numpy.ndarray = (0, 100, 200, 300, 400), inplace: bool = True, **kwargs) anndata.AnnData | None[source]

Calculating and predicting the vector field during development by the Kernel method (sparseVFC).

Parameters:
adata

AnnData object that contains the cell coordinates of the two states after alignment.

spatial_key

The key from the .obsm that corresponds to the spatial coordinates of each cell.

V_key

The key from the .obsm that corresponds to the developmental direction of each cell.

key_added

The key that will be used for the vector field key in .uns.

NX

The spatial coordinates of new data point. If NX is None, generate new points based on grid_num.

grid_num

The number of grids in each dimension for generating the grid velocity. Default is [50, 50, 50].

M

The number of basis functions to approximate the vector field.

lambda_

Represents the trade-off between the goodness of data fit and regularization. A larger lambda_ puts more weight on regularization.

lstsq_method

The name of the linear least square solver, can be either 'scipy' or 'douin'.

min_vel_corr

The minimal threshold for the cosine correlation between input velocities and learned velocities for a vector field reconstruction to be considered successful. If the cosine correlation is less than this threshold and restart_num > 1, restart_num trials will be attempted with different seeds to reconstruct the vector field function. This can prevent some reconstructions from being trapped in local optima.

restart_num

The number of retrials for vector field reconstructions.

restart_seed

A list of seeds, one for each retrial. Must contain restart_num seeds, or be None.

inplace

Whether to modify adata in place or to operate on a copy.

**kwargs

Additional parameters that will be passed to SparseVFC function.

Returns:

An AnnData object is updated/copied with the key_added dictionary in the .uns attribute.

The key_added dictionary contains:

  • X: Current state.

  • valid_ind: The indices of cells that have finite velocity values.

  • X_ctrl: Sample control points of current state.

  • ctrl_idx: Indices for the sampled control points.

  • Y: Velocity estimates in delta t.

  • beta: Parameter of the Gaussian Kernel for the kernel matrix (Gram matrix).

  • V: Prediction of velocity of X.

  • C: Finite set of the coefficients for the

  • P: Posterior probability matrix of inliers.

  • VFCIndex: Indexes of inliers found by sparseVFC.

  • sigma2: Energy change rate.

  • grid: Grid of current state.

  • grid_V: Prediction of velocity of the grid.

  • iteration: Number of the last iteration.

  • tecr_vec: Vector of relative energy change rates compared to the previous step.

  • E_traj: Vector of energy at each iteration.

  • method: The method of learning vector field. Here method == 'sparsevfc'.

Here the most important results are X, V, grid and grid_V:

  • X: Cell coordinates of the current state.

  • V: Developmental direction of the X.

  • grid: Grid coordinates of current state.

  • grid_V: Prediction of developmental direction of the grid.