spateo.svg
Submodules
Attributes
Functions
- svg_iden_reg: Identifying SVGs using a spatial uniform distribution as the reference.
- get_std_wasserstein: Calculate the standard deviation of the Wasserstein distance.
- smoothing_and_sampling: Smoothing the gene expression using a graph neural network and downsampling the cells from the adata object.
- smoothing: Smoothing the gene expression using a graph neural network.
- downsampling: Downsampling the cells from the adata object.
- cal_wass_dis_for_genes: Calculate Wasserstein distances for a list of genes.
- cal_wass_dist_bs: Computing Wasserstein distance for an AnnData to identify spatially variable genes.
- cal_wass_dis_nobs: Computing Wasserstein distance for an AnnData to identify spatially variable genes.
- bin_scale_adata_get_distance: Bin (based on spatial information), scale adata object and calculate the distance matrix based on the specified method (either geodesic or euclidean).
- cal_wass_dis_target_on_genes: Find genes in gene_set that have a similar distribution to each of the target_genes.
- bin_adata: Aggregate cell-based adata by bin size. Cells within a bin would be aggregated together as one cell.
- shuffle_adata: Shuffle X in anndata object randomly.
- filter_adata_by_pos_ratio: Filter out cells with a positive ratio lower than a set value.
- get_genes_by_pos_ratio: Get genes that have a positive ratio higher than a set value.
- add_pos_ratio_to_adata: Calculate positive ratios for all genes, and return to AnnData.
- cal_geodesic_distance: Calculate geodesic distance between any pair of genes.
- cal_euclidean_distance
- scale_to: Scale the X array in AnnData.
- cal_wass_dis: Computing Wasserstein distance.
- loess_reg
Package Contents
- spateo.svg.lm
- spateo.svg.svg_iden_reg(adata: spateo.svg.utils.AnnData, bin_layer: str = 'spatial', cell_distance_method: str = 'geodesic', distance_layer: str = 'spatial', n_neighbors: int = 8, numItermax: int = 1000000, gene_set: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray] = None, target: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray, str] = [], min_dis_cutoff: float = 500, max_dis_cutoff: float = 1000, n_neighbors_for_std: int = 30) -> spateo.svg.utils.pd.DataFrame
Identifying SVGs using a spatial uniform distribution as the reference.
- Parameters:
- adata
AnnData object
- bin_layer
Data in this layer will be binned according to the spatial information.
- cell_distance_method
The method for calculating distance between two cells, either geodesic or euclidean.
- distance_layer
Data in this layer will be used to calculate the spatial distance.
- n_neighbors
The number of nearest neighbors that will be considered for calculating spatial distance.
- numItermax
The maximum number of iterations before stopping the optimization algorithm if it has not converged.
- gene_set
Gene set that will be used to identify spatially variable genes, default is for all genes.
- target
The target gene expression distribution or the target gene name.
- min_dis_cutoff
Cells/bins whose minimum distance to their 30th neighbors is larger than this cutoff are filtered out.
- max_dis_cutoff
Cells/bins whose maximum distance to their 30th neighbors is larger than this cutoff are filtered out.
- n_neighbors_for_std
Number of neighbors that will be used to calculate the standard deviation of the Wasserstein distances.
- Returns:
a pandas data frame that stores the results for spatially variable genes. It includes the following columns:
- "raw_pos_rate": The raw positive ratio (the fraction of cells that have non-zero expression) of the gene across all cells.
- "Wasserstein_distance": The computed Wasserstein distance of each gene to the reference uniform distribution.
- "expectation_reg": The predicted Wasserstein distance after fitting a loess regression using the gene positive rate as the predictor.
- "std": Standard deviation of the Wasserstein distance.
- "std_reg": The predicted standard deviation of the Wasserstein distance after fitting a loess regression using the gene positive rate as the predictor.
- "zscore": The z-score of the Wasserstein distance.
- "pvalue": The p-value based on the z-score.
- "adj_pvalue": Adjusted p-value.
In addition, the input adata object is updated with the following information:
- adata.var["raw_pos_rate"]: The positive rate of each gene.
- Return type:
w0
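The "zscore", "pvalue" and "adj_pvalue" columns can be illustrated with a small calculation. This is a minimal sketch and not spateo's implementation: it assumes the z-score is taken against the regression-predicted mean and standard deviation, that the p-value is the upper tail of a standard normal, and that the adjustment is Benjamini-Hochberg. None of these choices is stated above, and the helper names and numbers are hypothetical.

```python
import math


def zscores(values, expected, stds):
    """Standardize each value against its predicted mean and std (assumed)."""
    return [(v - e) / s for v, e, s in zip(values, expected, stds)]


def upper_tail_pvalue(z):
    """P(Z >= z) for a standard normal variable (assumed one-sided test)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed adjustment method)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical values for three genes.
wasserstein_distance = [0.90, 0.52, 0.48]
expectation_reg = [0.50, 0.50, 0.50]
std_reg = [0.10, 0.10, 0.10]

z = zscores(wasserstein_distance, expectation_reg, std_reg)
p = [upper_tail_pvalue(x) for x in z]
adj = benjamini_hochberg(p)
for gene, zi, pi, ai in zip(["g1", "g2", "g3"], z, p, adj):
    print(gene, round(zi, 2), pi, ai)
```

Under these assumptions only the first gene, whose distance lies four predicted standard deviations above its expectation, receives a small adjusted p-value.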
- spateo.svg.get_std_wasserstein(l: spateo.svg.utils.Union[spateo.svg.utils.np.ndarray, spateo.svg.utils.pd.DataFrame], n_neighbors: int = 30) -> spateo.svg.utils.np.ndarray
Calculate the standard deviation of the Wasserstein distance.
- Parameters:
- l
The vector of the Wasserstein distance.
- n_neighbors
Number of nearest neighbors.
- Returns:
The standard deviation of the Wasserstein distance.
- Return type:
std
- spateo.svg.smoothing_and_sampling(adata: spateo.svg.utils.AnnData, smoothing: bool = True, downsampling: int = 400, device: str = 'cpu') -> Tuple[spateo.svg.utils.AnnData, spateo.svg.utils.AnnData]
Smoothing the gene expression using a graph neural network and downsampling the cells from the adata object.
- Parameters:
- adata
The input AnnData object.
- smoothing
Whether to smooth the gene expression.
- downsampling
The number of cells to downsample.
- device
The device to run the deep learning smoothing model. Can be either “cpu” or proper “cuda” related devices, such as: “cuda:0”.
- Returns:
adata: The adata after smoothing and downsampling.
adata_smoothed: The adata after smoothing but not downsampling.
- spateo.svg.smoothing(adata: spateo.svg.utils.AnnData, device: str = 'cpu') -> spateo.svg.utils.AnnData
Smoothing the gene expression using a graph neural network.
- Parameters:
- adata
The input AnnData object.
- device
The device to run the deep learning smoothing model. Can be either “cpu” or proper “cuda” related devices, such as: “cuda:0”.
- Returns:
imputation result
- Return type:
adata_smoothed
- spateo.svg.downsampling(adata: spateo.svg.utils.AnnData, downsampling: int = 400) -> spateo.svg.utils.AnnData
Downsampling the cells from the adata object.
- Parameters:
- adata
The input AnnData object.
- downsampling
The number of cells to downsample.
- Returns:
adata after the downsampling.
- Return type:
adata
- spateo.svg.cal_wass_dis_for_genes(inp0: Tuple[spateo.svg.utils.csr_matrix, spateo.svg.utils.AnnData], inp1: Tuple[int, spateo.svg.utils.List, spateo.svg.utils.np.ndarray, int]) -> Tuple[spateo.svg.utils.List, spateo.svg.utils.np.ndarray, spateo.svg.utils.np.ndarray]
Calculate Wasserstein distances for a list of genes.
- Parameters:
- inp0
A tuple of the sparse matrix of spatial distance between nearest neighbors, and the adata object.
- inp1
A tuple of the seed, the list of genes, the target gene expression vector (needs to be normalized to have a sum of 1), and the maximal number of iterations.
- Returns:
gene_ids: The gene list that is used to calculate the Wasserstein distances.
ws: The Wasserstein distances from each gene to the target gene.
pos_rs: The expression positive rate vector related to the gene list.
- spateo.svg.cal_wass_dist_bs(adata: spateo.svg.utils.AnnData, bin_size: int = 1, bin_layer: str = 'spatial', cell_distance_method: str = 'geodesic', distance_layer: str = 'spatial', n_neighbors: int = 30, numItermax: int = 1000000, gene_set: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray] = None, target: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray, str] = [], processes: int = 1, bootstrap: int = 100, min_dis_cutoff: float = 2.0, max_dis_cutoff: float = 6.0, rank_p: bool = True, bin_num: int = 100, larger_or_small: str = 'larger') -> Tuple[spateo.svg.utils.pd.DataFrame, spateo.svg.utils.AnnData]
Computing Wasserstein distance for an AnnData to identify spatially variable genes.
- Parameters:
- adata
AnnData object.
- bin_size
Bin size for merging cells.
- bin_layer
Data in this layer will be binned according to the spatial information.
- cell_distance_method
The method for calculating distance between two cells, either geodesic or euclidean.
- distance_layer
Data in this layer will be used to calculate the distance.
- n_neighbors
The number of neighbors for calculating spatial distance.
- numItermax
The maximum number of iterations before stopping the optimization algorithm if it has not converged.
- gene_set
Gene set that will be used to compute Wasserstein distances, default is for all genes.
- target
The target gene expression distribution or the target gene name.
- processes
The number of processes for parallel computing.
- bootstrap
The number of bootstrap permutations used to calculate the p-value.
- min_dis_cutoff
Cells/bins whose minimum distance to their 30th neighbors is larger than this cutoff are filtered out.
- max_dis_cutoff
Cells/bins whose maximum distance to their 30th neighbors is larger than this cutoff are filtered out.
- rank_p
Whether to calculate the p-value in a rank-based manner.
- bin_num
Classify genes into bin_num groups according to the mean Wasserstein distance from bootstrap.
- larger_or_small
The direction in which to compute the p-value. 'larger' means the right-tail area of the null distribution.
- Returns:
w_df: A dataframe storing information related to the Wasserstein distances.
bin_scale_adata: Binned AnnData object.
- spateo.svg.cal_wass_dis_nobs(adata: spateo.svg.utils.AnnData, bin_size: int = 1, bin_layer: str = 'spatial', cell_distance_method: str = 'geodesic', distance_layer: str = 'spatial', n_neighbors: int = 30, numItermax: int = 1000000, gene_set: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray] = None, target: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray, str] = [], min_dis_cutoff: float = 2.0, max_dis_cutoff: float = 6.0) -> Tuple[spateo.svg.utils.pd.DataFrame, spateo.svg.utils.AnnData]
Computing Wasserstein distance for an AnnData to identify spatially variable genes.
- Parameters:
- adata
AnnData object.
- bin_size
Bin size for merging cells.
- bin_layer
Data in this layer will be binned according to the spatial information.
- cell_distance_method
The method for calculating distance between two cells, either geodesic or euclidean.
- distance_layer
Data in this layer will be used to calculate the distance.
- n_neighbors
The number of neighbors for calculating geodesic distance.
- numItermax
The maximum number of iterations before stopping the optimization algorithm if it has not converged.
- gene_set
Gene set used for the computation, default is for all genes.
- target
The target distribution or the target gene name.
- min_dis_cutoff
Cells/bins whose minimum distance to their 30 neighbors is larger than this cutoff are filtered out.
- max_dis_cutoff
Cells/bins whose maximum distance to their 30 neighbors is larger than this cutoff are filtered out.
- Returns:
A dataframe storing information related to the Wasserstein distances.
- Return type:
w_df
- spateo.svg.bin_scale_adata_get_distance(adata: spateo.svg.utils.AnnData, bin_size: int = 1, bin_layer: str = 'spatial', distance_layer: str = 'spatial', cell_distance_method: str = 'geodesic', min_dis_cutoff: float = 2.0, max_dis_cutoff: float = 6.0, n_neighbors: int = 30) -> Tuple[spateo.svg.utils.AnnData, spateo.svg.utils.csr_matrix]
Bin (based on spatial information), scale adata object and calculate the distance matrix based on the specified method (either geodesic or euclidean).
- Parameters:
- adata
AnnData object.
- bin_size
Bin size for merging cells.
- bin_layer
Data in this layer will be binned according to the spatial information.
- distance_layer
Data in this layer will be used to calculate the distance.
- cell_distance_method
The method for calculating distance between two cells, either geodesic or euclidean.
- min_dis_cutoff
Cells/bins whose minimum distance to their 30th neighbors is larger than this cutoff are filtered out.
- max_dis_cutoff
Cells/bins whose maximum distance to their 30th neighbors is larger than this cutoff are filtered out.
- n_neighbors
The number of nearest neighbors that will be considered for calculating spatial distance.
- Returns:
bin_scale_adata: Binned, scaled AnnData object.
M: The scipy sparse matrix of the calculated distance of nearest neighbors.
- spateo.svg.cal_wass_dis_target_on_genes(adata: spateo.svg.utils.AnnData, bin_size: int = 1, bin_layer: str = 'spatial', distance_layer: str = 'spatial', cell_distance_method: str = 'geodesic', n_neighbors: int = 30, numItermax: int = 1000000, target_genes: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray] = None, gene_set: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray] = None, processes: int = 1, bootstrap: int = 0, top_n: int = 100, min_dis_cutoff: float = 2.0, max_dis_cutoff: float = 6.0) -> Tuple[dict, spateo.svg.utils.AnnData]
Find genes in gene_set that have a similar distribution to each of the target_genes.
- Parameters:
- adata
AnnData object.
- bin_size
Bin size for merging cells.
- bin_layer
Data in this layer will be binned according to the spatial information.
- distance_layer
Data in this layer will be used to calculate the distance.
- cell_distance_method
The method for calculating distance between two cells, either geodesic or euclidean.
- n_neighbors
The number of neighbors for calculating spatial distance.
- numItermax
The maximum number of iterations before stopping the optimization algorithm if it has not converged.
- target_genes
The list of the target genes.
- gene_set
Gene set that will be used to compute Wasserstein distances, default is for all genes.
- processes
The number of processes for parallel computing.
- bootstrap
Number of bootstraps.
- top_n
Number of top genes to select.
- min_dis_cutoff
Cells/bins whose minimum distance to their 30th neighbors is larger than this cutoff are filtered out.
- max_dis_cutoff
Cells/bins whose maximum distance to their 30th neighbors is larger than this cutoff are filtered out.
- Returns:
w_genes: The dictionary of the Wasserstein distance. Each key corresponds to a gene name, while the corresponding value is the pandas DataFrame of the Wasserstein distance related information.
bin_scale_adata: Binned, scaled AnnData object.
- spateo.svg.bin_adata(adata: anndata.AnnData, bin_size: int = 1, layer: str = 'spatial') -> anndata.AnnData
Aggregate cell-based adata by bin size. Cells within a bin would be aggregated together as one cell.
- Parameters:
- adata
The input adata.
- bin_size
The size of the square used to bin adata.
- Returns:
Aggregated adata.
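The idea of aggregating cells by square bins can be sketched in plain Python. This is an illustration only, not the bin_adata implementation: it assumes that a cell's bin is found by integer-dividing its coordinates by bin_size and that expression values within a bin are summed. The helper name and the toy data are hypothetical.

```python
def bin_cells(coords, counts, bin_size=1):
    """Sum the expression rows of all cells that fall into the same square bin.

    coords: list of (x, y) positions, one per cell.
    counts: list of expression rows, one per cell, all of equal length.
    Returns a dict mapping (bin_x, bin_y) to the aggregated expression row.
    """
    binned = {}
    for (x, y), row in zip(coords, counts):
        # Assumed binning rule: floor division of each coordinate.
        key = (int(x // bin_size), int(y // bin_size))
        if key not in binned:
            binned[key] = list(row)
        else:
            binned[key] = [a + b for a, b in zip(binned[key], row)]
    return binned


# Four cells, two genes, binned into 2 x 2 squares.
coords = [(0.2, 0.4), (1.7, 0.1), (2.5, 3.9), (3.1, 2.2)]
counts = [[1, 0], [2, 1], [0, 4], [3, 3]]
binned = bin_cells(coords, counts, bin_size=2)
print(binned)
```

Each key of the result plays the role of one aggregated "cell" in the binned object.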
- spateo.svg.shuffle_adata(adata: anndata.AnnData, seed: int = 0, replace: bool = False)
Shuffle X in anndata object randomly.
- Parameters:
- adata
AnnData object
- seed
Seed for random shuffling.
- Returns:
AnnData object
- Return type:
adata
- spateo.svg.filter_adata_by_pos_ratio(adata, pos_ratio)
Filter out cells with a positive ratio lower than a set value.
- Parameters:
- adata
AnnData object.
- pos_ratio
Cells with positive ratio lower than this value would be discarded.
- Returns:
AnnData object.
- spateo.svg.get_genes_by_pos_ratio(adata: anndata.AnnData, pos_ratio: float = 0.1) -> list
Get genes that have a positive ratio higher than a set value.
- Parameters:
- adata
AnnData object.
- pos_ratio
The threshold of positive ratio.
- Returns:
Gene list. AnnData object.
- spateo.svg.add_pos_ratio_to_adata(adata: anndata.AnnData, layer: str = None, var_name: str = 'raw_pos_rate')
Calculate positive ratios for all genes and store them in the AnnData object. We define the positive ratio of a gene as the percentage of cells that express this gene.
- Parameters:
- adata
AnnData object.
- layer
The layer of AnnData, in which the data are used. If not given, we use data in X.
- var_name
The var name for storing positive ratios.
- Returns:
None
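The positive ratio defined above, and the threshold used by get_genes_by_pos_ratio, can be illustrated on a small cells-by-genes matrix. The following is a minimal sketch in plain Python with hypothetical helper names and data. It does not touch an AnnData object, and the strict "higher than" comparison is an assumption taken from the wording of the description.

```python
def positive_ratios(matrix):
    """Fraction of cells (rows) with non-zero expression, per gene (column)."""
    n_cells = len(matrix)
    n_genes = len(matrix[0])
    return [
        sum(1 for row in matrix if row[g] != 0) / n_cells
        for g in range(n_genes)
    ]


def genes_above(gene_names, ratios, pos_ratio=0.1):
    """Names of genes whose positive ratio is strictly higher than pos_ratio."""
    return [name for name, r in zip(gene_names, ratios) if r > pos_ratio]


# Four cells (rows) by three genes (columns).
expression = [
    [0, 2, 0],
    [1, 0, 0],
    [3, 1, 0],
    [0, 5, 1],
]
gene_names = ["A", "B", "C"]

ratios = positive_ratios(expression)
kept = genes_above(gene_names, ratios, pos_ratio=0.3)
print(ratios, kept)
```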
- spateo.svg.cal_geodesic_distance(adata: anndata.AnnData, layer: str = 'spatial', n_neighbors: int = 30, min_dis_cutoff: float = 2.0, max_dis_cutoff: float = 4.0) -> anndata.AnnData
Calculate geodesic distance between any pair of genes.
- Parameters:
- adata
AnnData object.
- layer
The layer of AnnData, in which the data are used.
- n_neighbors
The number of nearest neighbors to which each cell is connected.
- min_dis_cutoff
Remove cells whose minimal distance to their neighbors is larger than this value. These cells resemble isolated cells.
- max_dis_cutoff
Remove cells whose maximal distance to their neighbors is larger than this value. These cells resemble sparse cells.
- Returns:
AnnData object.
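A geodesic distance over a nearest-neighbor graph can be sketched as follows. This is a generic illustration rather than the cal_geodesic_distance implementation: it assumes the distance is the shortest-path length over a symmetric k-nearest-neighbor graph built from cell coordinates, and it ignores the min_dis_cutoff and max_dis_cutoff filtering. All names and the toy coordinates are hypothetical.

```python
import heapq
import math


def knn_graph(points, n_neighbors):
    """Connect each point to its n_neighbors nearest points (symmetric edges)."""
    graph = {i: [] for i in range(len(points))}
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for d, j in dists[:n_neighbors]:
            graph[i].append((j, d))
            graph[j].append((i, d))
    return graph


def geodesic_from(graph, source):
    """Shortest-path distances from source to every reachable node (Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Five cells arranged along an L-shaped path.
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
graph = knn_graph(points, n_neighbors=1)
geo = geodesic_from(graph, source=0)
print(geo[4], math.dist(points[0], points[4]))
```

In this toy layout the geodesic distance between the two ends follows the path through the intermediate cells, so it is longer than the straight-line Euclidean distance.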
- spateo.svg.cal_euclidean_distance(adata: anndata.AnnData, layer: str = 'spatial', min_dis_cutoff: float = np.inf, max_dis_cutoff: float = np.inf) -> anndata.AnnData
- spateo.svg.scale_to(adata: anndata.AnnData, to_median: bool = True, N: int = 10000) -> anndata.AnnData
Scale the X array in AnnData.
- Parameters:
- adata
AnnData object.
- to_median
Whether to scale to the median of the cells' total expression.
- N
If to_median is False, scale the data to this value.
- Returns:
AnnData object.
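The two scaling modes can be illustrated on a plain list-of-lists matrix. This is a minimal sketch, not the scale_to implementation: it assumes that each cell (row) is rescaled so that its total expression equals the target, which is either the median of the cells' totals or N. The helper name and the data are hypothetical.

```python
from statistics import median


def scale_rows(matrix, to_median=True, N=10000):
    """Rescale every row so that its sum equals the chosen target."""
    totals = [sum(row) for row in matrix]
    target = median(totals) if to_median else N
    scaled = []
    for row, total in zip(matrix, totals):
        if total == 0:
            # A cell without any counts cannot be rescaled; keep zeros.
            scaled.append([0.0 for _ in row])
        else:
            scaled.append([value * target / total for value in row])
    return scaled


# Three cells with total expression 4, 4 and 20.
expression = [[1, 3], [2, 2], [10, 10]]

by_median = scale_rows(expression, to_median=True)
by_fixed = scale_rows(expression, to_median=False, N=100)
print(by_median)
print(by_fixed)
```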
- spateo.svg.cal_wass_dis(M, a, b=[], numItermax=1000000)
Computing Wasserstein distance.
- Parameters:
- M
(ns,nt) array-like, float – Loss matrix (c-order array in numpy with type float64)
- a
(ns,) array-like, float – Source histogram (uniform weight if empty list)
- b
(nt,) array-like, float – Target histogram (uniform weight if empty list)
- Returns:
(float, array-like) – Optimal transportation loss for the given parameters
- Return type:
W
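For intuition about the quantity being computed, the one-dimensional special case has a closed form. The sketch below is not cal_wass_dis and does not solve a general optimal transport problem: it assumes that both histograms live on the same evenly spaced positions and that the loss matrix is M[i][j] = |i - j|, in which case the transport cost equals the summed absolute difference of the two cumulative distributions. As in the parameter description, an empty target is treated as uniform. The function names are hypothetical.

```python
from itertools import accumulate


def normalize(hist):
    """Scale a non-negative histogram so that it sums to 1."""
    total = sum(hist)
    return [h / total for h in hist]


def wasserstein_1d(a, b=None):
    """1-D Wasserstein distance between histograms on positions 0..n-1.

    Assumes the ground cost between positions i and j is |i - j|.
    """
    n = len(a)
    if not b:
        b = [1.0 / n] * n  # uniform target when none is given
    cdf_a = accumulate(normalize(a))
    cdf_b = accumulate(normalize(b))
    return sum(abs(x - y) for x, y in zip(cdf_a, cdf_b))


# All mass at one end versus all mass at the other end: every unit of mass
# has to travel three positions.
print(wasserstein_1d([1, 0, 0, 0], [0, 0, 0, 1]))
# All mass at one end versus a uniform target.
print(wasserstein_1d([1, 0, 0, 0]))
```

A histogram that is concentrated in one place is far from the uniform target, while a histogram that is already uniform has distance zero, which is the intuition behind using a uniform reference to score spatial variability.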
- spateo.svg.loess_reg(adata: anndata.AnnData, layers: str = 'X') -> anndata.AnnData
- spateo.svg.cal_gro_wass_bs(adata1: spateo.svg.utils.AnnData, adata2: spateo.svg.utils.AnnData, bin_size1: int = 1, bin_size2: int = 1, bin_layer: str = 'spatial', cell_distance_method: str = 'geodesic', distance_layer: str = 'spatial', n_neighbors: int = 30, gene_set: spateo.svg.utils.Union[spateo.svg.utils.List, spateo.svg.utils.np.ndarray] = None, processes: int = 1, bootstrap: int = 100, min_dis_cutoff: float = 2.0, max_dis_cutoff: float = 6.0, larger_or_small: str = 'larger')