spateo.tools.spatially_variable_gene_ot#

The Wasserstein distance is calculated with the POT (ot) Python package; see the following reference:

Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong, Titouan Vayer, POT Python Optimal Transport library, Journal of Machine Learning Research, 22(78):1−8, 2021. Website: https://pythonot.github.io/

Module Contents#

Functions#

_cal_dis(adata, x1)

Compute distance between samples in x1.

_cal_wass_dis(M, a[, b, numItermax])

Compute the Wasserstein distance.

bin_adata(adata[, bin_size])

_cal_geodesic_distance(adata[, n_neighbors, ...])

_cal_wass_dis_on_genes(M, inp)

shuffle_adata(adata[, seed])

Randomly shuffle X in an AnnData object.

cal_wass_dis_bs(→ pandas.DataFrame)

Compute the Wasserstein distance for an AnnData object to identify spatially variable genes.

spateo.tools.spatially_variable_gene_ot._cal_dis(adata, x1)[source]#

Compute distance between samples in x1.

Parameters:
adata

Return type:

adata

spateo.tools.spatially_variable_gene_ot._cal_wass_dis(M, a, b=[], numItermax=1000000)[source]#

Compute the Wasserstein distance.

Parameters:
M

(ns,nt) array-like, float – Loss matrix (c-order array in numpy with type float64)

a

(ns,) array-like, float – Source histogram (uniform weight if empty list)

b

(nt,) array-like, float – Target histogram (uniform weight if empty list)

Returns:

(float, array-like) – Optimal transportation loss for the given parameters

Return type:

W
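
As a rough illustration of what an optimal transportation loss measures, the sketch below computes the 1-D Wasserstein-1 distance between two histograms defined on the same unit-spaced grid, which corresponds to a loss matrix M[i][j] = |i - j|. In that special case the optimum has a closed form (the summed absolute difference of the two cumulative distributions), so no solver is needed. This is not the implementation of _cal_wass_dis: the name wasserstein_1d is hypothetical, and treating an empty b as uniform weights is an assumption that mirrors the parameter description above. For a general loss matrix, POT provides ot.emd2(a, b, M).

```python
from itertools import accumulate


def wasserstein_1d(a, b=()):
    """Wasserstein-1 distance between histograms a and b on the grid 0..n-1.

    Hypothetical helper for illustration only; the cost of moving mass
    from bin i to bin j is assumed to be |i - j|.
    """
    n = len(a)
    if n == 0:
        raise ValueError("a must contain at least one bin")
    # Assumption mirroring the docs above: an empty b means uniform weights.
    b = list(b) if len(b) > 0 else [1.0] * n
    if len(b) != n:
        raise ValueError("a and b must have the same number of bins")
    # Normalise both histograms so each sums to 1.
    total_a, total_b = sum(a), sum(b)
    cdf_a = accumulate(x / total_a for x in a)
    cdf_b = accumulate(x / total_b for x in b)
    # In 1-D with unit spacing, W1 is the summed gap between the two CDFs.
    return sum(abs(x - y) for x, y in zip(cdf_a, cdf_b))


point_mass = [1.0, 0.0, 0.0, 0.0]             # all mass in the first bin
print(wasserstein_1d(point_mass))             # compared against uniform weights
print(wasserstein_1d([1, 0, 0], [0, 0, 1]))   # all mass moved two bins to the right
```

A histogram concentrated in a few bins sits far from the uniform reference, while an evenly spread one gives a distance near zero.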

spateo.tools.spatially_variable_gene_ot.bin_adata(adata, bin_size=1)[source]#

spateo.tools.spatially_variable_gene_ot._cal_geodesic_distance(adata, n_neighbors=30, min_dis_cutoff=2.0, max_dis_cutoff=4.0)[source]#

spateo.tools.spatially_variable_gene_ot._cal_wass_dis_on_genes(M, inp)[source]#

spateo.tools.spatially_variable_gene_ot.shuffle_adata(adata: anndata.AnnData, seed: int = 0)[source]#

Randomly shuffle X in an AnnData object.

Parameters:
adata

AnnData object

seed

Random seed for the shuffle.

Returns:

AnnData object

Return type:

adata
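
The effect of a seeded shuffle can be sketched without AnnData. The example below assumes that shuffling means permuting the rows (observations) of an expression matrix while everything else stays in place; whether shuffle_adata permutes rows, columns, or individual entries is not stated above. The helper shuffle_rows and the toy matrix are hypothetical.

```python
import random


def shuffle_rows(rows, seed=0):
    """Return a copy of rows in a seeded random order (hypothetical helper)."""
    rng = random.Random(seed)   # private generator, global random state untouched
    shuffled = list(rows)       # shallow copy so the input order is preserved
    rng.shuffle(shuffled)
    return shuffled


# Toy expression matrix: one row per cell/bin, one column per gene.
expression = [[5, 0], [0, 3], [1, 1], [0, 0]]

once = shuffle_rows(expression, seed=0)
again = shuffle_rows(expression, seed=0)

print(once == again)                        # same seed, same permutation
print(sorted(once) == sorted(expression))   # same rows, only the order differs
```

Fixing the seed makes a permutation reproducible, which is useful when the same shuffled background has to be regenerated across runs.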

spateo.tools.spatially_variable_gene_ot.cal_wass_dis_bs(adata: anndata.AnnData, bin_size: int = 1, numItermax: int = 1000000, gene_set: List | numpy.ndarray = None, compare_to: typing_extensions.Literal[uniform, allUMI] = 'allUMI', processes: int = 1, bootstrap: int = 100, min_dis_cutoff: float = 2.0, max_dis_cutoff: float = 6.0) → pandas.DataFrame[source]#

Compute the Wasserstein distance for an AnnData object to identify spatially variable genes.

Parameters:
adata

AnnData object

bin_size

Bin size for merging cells.

numItermax

The maximum number of iterations before stopping the optimization algorithm if it has not converged

gene_set

Genes to compute the distance for; by default, all genes are used.

compare_to

Reference distribution against which each gene's distance is computed: the uniform distribution or the allUMI distribution.

processes

Number of processes to run in parallel.

bootstrap

Number of bootstrap permutations used to calculate the p-value.

min_dis_cutoff

Cells/bins whose minimum distance to their 30 neighbors is larger than this cutoff are filtered out.

max_dis_cutoff

Cells/bins whose maximum distance to their 30 neighbors is larger than this cutoff are filtered out.

Returns:

a dataframe; adata0: binned AnnData object

Return type:

w_df
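
The bootstrap parameter above points to a permutation test: a gene's observed distance is compared with distances obtained from shuffled data. A minimal sketch of turning such null values into a p-value follows. The add-one correction and the "greater than or equal" comparison are assumptions, since the exact formula used by cal_wass_dis_bs is not given here; empirical_p_value and the example numbers are hypothetical.

```python
def empirical_p_value(observed, null_values):
    """One-sided permutation p-value with an add-one correction.

    Hypothetical helper: counts how many null statistics are at least as
    large as the observed one. The +1 in numerator and denominator keeps
    the estimate strictly above zero.
    """
    n_extreme = sum(1 for value in null_values if value >= observed)
    return (n_extreme + 1) / (len(null_values) + 1)


# Made-up distances: one observed value and four values from shuffled data.
observed_distance = 0.9
null_distances = [0.2, 0.4, 1.1, 0.3]

p = empirical_p_value(observed_distance, null_distances)
print(p)  # one of four null values reaches 0.9, so p = (1 + 1) / (4 + 1)
```

With this correction the smallest attainable p-value is 1 / (bootstrap + 1), so the default of 100 permutations cannot resolve p-values below roughly 0.01.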