spateo.svg.utils#

Module Contents#

Functions#

bin_adata(→ anndata.AnnData)

Aggregate cell-based adata by bin size. Cells within a bin are aggregated into one cell.

shuffle_adata(adata[, seed, replace])

Shuffle X in anndata object randomly.

filter_adata_by_pos_ratio(adata, pos_ratio)

Filter out cells with a positive ratio lower than a set threshold.

get_genes_by_pos_ratio(→ list)

Get genes that have a positive ratio higher than a set threshold.

add_pos_ratio_to_adata(adata[, layer, var_name])

Calculate positive ratios for all genes and store them in the AnnData object.

cal_geodesic_distance(→ anndata.AnnData)

Calculate geodesic distance between any pair of genes.

cal_euclidean_distance(→ anndata.AnnData)

scale_to(→ anndata.AnnData)

Scale the X array in AnnData.

cal_wass_dis(M, a[, b, numItermax])

Compute the Wasserstein distance.

cal_rank_p(genes, ws, w_df[, bin_num])

loess_reg(→ anndata.AnnData)

spateo.svg.utils.bin_adata(adata: anndata.AnnData, bin_size: int = 1, layer: str = 'spatial') anndata.AnnData[source]#

Aggregate cell-based adata by bin size. Cells within a bin are aggregated into one cell.

Parameters:
adata

The input adata.

bin_size

The side length of the square used to bin adata.

Returns:

Aggregated adata.
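The binning idea can be sketched with plain NumPy. This is an illustration only, not spateo's implementation: the helper `bin_counts`, the `coords` array, and the choice to sum expression values within a bin are all assumptions made for the example.

```python
import numpy as np

def bin_counts(coords, X, bin_size):
    """Sum the expression rows of cells that fall into the same square bin.

    Hypothetical helper: summation is an assumed aggregation rule.
    """
    # Map each (x, y) coordinate to an integer bin index.
    bins = np.floor(np.asarray(coords) / bin_size).astype(int)
    # Find the distinct bins and the bin membership of each cell.
    uniq, inverse = np.unique(bins, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    # Accumulate each cell's expression row into its bin.
    binned = np.zeros((len(uniq), X.shape[1]), dtype=X.dtype)
    np.add.at(binned, inverse, X)
    return uniq, binned

coords = np.array([[0.2, 0.7], [1.5, 0.1], [1.9, 0.9], [3.0, 3.0]])
X = np.array([[1, 0], [2, 1], [0, 4], [5, 5]])
bin_xy, binned_X = bin_counts(coords, X, bin_size=2)
print(bin_xy)
print(binned_X)
```

With `bin_size=2`, the first three cells share one bin and the fourth cell sits alone in another, so four cells collapse into two aggregated "cells".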

spateo.svg.utils.shuffle_adata(adata: anndata.AnnData, seed: int = 0, replace: bool = False)[source]#

Shuffle X in anndata object randomly.

Parameters:
adata

AnnData object

seed

Seed for the random shuffle.

Returns:

adata: AnnData object
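The source does not say how X is shuffled. As one plausible reading, the sketch below permutes each gene's values independently across cells, with `replace=True` switching to sampling with replacement. The helper `shuffle_columns` and this per-gene interpretation are assumptions, not spateo's documented behavior.

```python
import numpy as np

def shuffle_columns(X, seed=0, replace=False):
    """Shuffle every column (gene) of X independently across rows (cells).

    Hypothetical helper illustrating one possible shuffling scheme.
    """
    rng = np.random.default_rng(seed)
    n_cells = X.shape[0]
    shuffled = np.empty_like(X)
    for j in range(X.shape[1]):
        # Draw row indices for this gene; without replacement this is a permutation.
        idx = rng.choice(n_cells, size=n_cells, replace=replace)
        shuffled[:, j] = X[idx, j]
    return shuffled

X = np.arange(12).reshape(4, 3)
S = shuffle_columns(X, seed=0)
print(S)
```

Without replacement, each gene keeps exactly the same set of values, only their assignment to cells changes. Reusing the same seed reproduces the same shuffle.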

spateo.svg.utils.filter_adata_by_pos_ratio(adata, pos_ratio)[source]#

Filter out cells with a positive ratio lower than a set threshold.

Parameters:
adata

AnnData object.

pos_ratio

Cells with a positive ratio lower than this value are discarded.

Returns:

AnnData object.

spateo.svg.utils.get_genes_by_pos_ratio(adata: anndata.AnnData, pos_ratio: float = 0.1) list[source]#

Get genes that have a positive ratio higher than a set threshold.

Parameters:
adata

AnnData object.

pos_ratio

The threshold of positive ratio.

Returns:

Gene list.

spateo.svg.utils.add_pos_ratio_to_adata(adata: anndata.AnnData, layer: str = None, var_name: str = 'raw_pos_rate')[source]#

Calculate positive ratios for all genes and store them in the AnnData object. The positive ratio of a gene is defined as the percentage of cells that express this gene.

Parameters:
adata

AnnData object.

layer

The AnnData layer whose data are used. If not given, the data in X are used.

var_name

The var name for storing positive ratios.

Returns:

None
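The definition of positive ratio can be made concrete with a small NumPy sketch. The toy matrix, the gene names, and the 0.5 threshold are made up for illustration. "Expressed" is taken to mean a value greater than zero, which is an assumption about how the library counts expressing cells.

```python
import numpy as np

# Rows are cells, columns are genes (toy data).
X = np.array([
    [0, 3, 1],
    [0, 0, 2],
    [5, 0, 1],
    [0, 0, 4],
])
genes = ["geneA", "geneB", "geneC"]

# Fraction of cells in which each gene has a value greater than zero.
pos_ratio = (X > 0).mean(axis=0)

# Keep genes whose positive ratio is higher than a chosen threshold.
threshold = 0.5
selected = [g for g, r in zip(genes, pos_ratio) if r > threshold]
print(dict(zip(genes, pos_ratio)), selected)
```

Here `geneA` and `geneB` are each detected in one of four cells (ratio 0.25), while `geneC` is detected in every cell (ratio 1.0), so only `geneC` passes the threshold.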

spateo.svg.utils.cal_geodesic_distance(adata: anndata.AnnData, layer: str = 'spatial', n_neighbors: int = 30, min_dis_cutoff: float = 2.0, max_dis_cutoff: float = 4.0) anndata.AnnData[source]#

Calculate geodesic distance between any pair of genes.

Parameters:
adata

AnnData object.

layer

The AnnData layer whose data are used.

n_neighbors

The number of nearest neighbors each cell is connected to.

min_dis_cutoff

Remove cells whose minimal distance to their neighbors is larger than this value. Such cells are effectively isolated.

max_dis_cutoff

Remove cells whose maximal distance to their neighbors is larger than this value. Such cells lie in sparse regions.

Returns:

AnnData object.
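As a rough illustration of what a geodesic distance on a neighbor graph means, the sketch below builds a symmetric k-nearest-neighbor graph from coordinates and runs Dijkstra's algorithm from every point. The helper `knn_geodesic` and the `coords` array are hypothetical. The distance cutoffs are not modeled, and spateo's actual graph construction may differ.

```python
import heapq
import numpy as np

def knn_geodesic(coords, n_neighbors):
    """All-pairs shortest-path distances on a symmetric kNN graph (sketch)."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Pairwise Euclidean distances between all points.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Connect each point to its n_neighbors nearest points (skipping itself).
    adj = [dict() for _ in range(n)]
    for i in range(n):
        for j in np.argsort(d[i])[1:n_neighbors + 1]:
            adj[i][int(j)] = float(d[i, j])
            adj[int(j)][i] = float(d[i, j])
    # Dijkstra from every source; unreachable pairs stay at infinity.
    geo = np.full((n, n), np.inf)
    for s in range(n):
        geo[s, s] = 0.0
        heap = [(0.0, s)]
        while heap:
            dist, u = heapq.heappop(heap)
            if dist > geo[s, u]:
                continue
            for v, w in adj[u].items():
                nd = dist + w
                if nd < geo[s, v]:
                    geo[s, v] = nd
                    heapq.heappush(heap, (nd, v))
    return geo

# Five points arranged in an L shape.
coords = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
geo = knn_geodesic(coords, n_neighbors=2)
euclid = float(np.linalg.norm(np.subtract(coords[0], coords[4])))
print(geo[0, 4], euclid)
```

The two ends of the L are about 2.83 apart in a straight line, but the path along the graph has length 4, which is the kind of difference a geodesic distance is meant to capture.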

spateo.svg.utils.cal_euclidean_distance(adata: anndata.AnnData, layer: str = 'spatial', min_dis_cutoff: float = np.inf, max_dis_cutoff: float = np.inf) anndata.AnnData[source]#

spateo.svg.utils.scale_to(adata: anndata.AnnData, to_median: bool = True, N: int = 10000) anndata.AnnData[source]#

Scale the X array in AnnData.

Parameters:
adata

AnnData object.

to_median

Whether to scale to the median of the cells' total expression.

N

If to_median is False, scale data to this value.

Returns:

AnnData object.
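A minimal sketch of this kind of scaling, assuming it means rescaling each cell so that its total expression equals a common target (the median of all cell totals, or `N`). The helper `scale_rows` is hypothetical and is not spateo's code.

```python
import numpy as np

def scale_rows(X, to_median=True, N=10000):
    """Rescale each row (cell) so that its total equals a common target."""
    X = np.asarray(X, dtype=float)
    totals = X.sum(axis=1)
    # Target is the median cell total, or the fixed value N.
    target = np.median(totals) if to_median else N
    # Avoid division by zero for cells with no counts.
    safe_totals = np.where(totals > 0, totals, 1.0)
    return X * (target / safe_totals)[:, None]

X = np.array([[1, 1], [2, 6], [3, 1]])   # cell totals: 2, 8, 4
scaled_median = scale_rows(X)             # every cell total becomes 4
scaled_fixed = scale_rows(X, to_median=False, N=10000)
print(scaled_median.sum(axis=1), scaled_fixed.sum(axis=1))
```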

spateo.svg.utils.cal_wass_dis(M, a, b=[], numItermax=1000000)[source]#

Compute the Wasserstein distance.

Parameters:
M

(ns,nt) array-like, float – Loss matrix (c-order array in numpy with type float64)

a

(ns,) array-like, float – Source histogram (uniform weight if empty list)

b

(nt,) array-like, float – Target histogram (uniform weight if empty list)

Returns:

W: (float, array-like) – Optimal transportation loss for the given parameters
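To show what the optimal transportation loss means for a loss matrix M with uniform weights, the sketch below handles the special case of equally sized source and target sets. In that case an optimal plan is a one-to-one matching with mass 1/n per pair, so a brute-force search over permutations gives the exact answer for tiny inputs. The helper `uniform_ot_cost` is hypothetical and is not the solver the library uses.

```python
from itertools import permutations

def uniform_ot_cost(M):
    """Exact transport cost for a square loss matrix with uniform weights.

    Brute force over all matchings, so only suitable for very small n.
    """
    n = len(M)
    best_total = min(
        sum(M[i][p[i]] for i in range(n)) for p in permutations(range(n))
    )
    # Each matched pair carries mass 1/n.
    return best_total / n

# Loss matrix |x_i - y_j| for source points x and target points y.
x = [0, 1, 3]
y = [1, 2, 3]
M = [[abs(xi - yj) for yj in y] for xi in x]
cost = uniform_ot_cost(M)
print(cost)
```

For these points the cheapest matching moves a total distance of 2 across three equally weighted points, giving a cost of 2/3.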

spateo.svg.utils.cal_rank_p(genes, ws, w_df, bin_num=100)[source]#

spateo.svg.utils.loess_reg(adata: anndata.AnnData, layers: str = 'X') anndata.AnnData[source]#