spateo.tools.gene_expression_variance
Characterizing cell-to-cell variability within spatial domains
Module Contents
Functions
compute_gene_groups_p_val | Calculate the Mann-Whitney U test p-value for a gene between two groups.
get_highvar_genes | Find highly-variable genes in single-cell data matrices.
get_highvar_genes_sparse | Find highly-variable genes in sparse single-cell data matrices.
compute_variance_decomposition | Computes and then optionally visualizes the variance decomposition for an AnnData object.
genewise_variance_decomposition | For each gene in the chosen subset, computes a variance decomposition by computing the intra-cell type variance and the inter-cell type variance.
plot_variance_decomposition | Visualizes the part-wise contributions of intra-cell type variation and cell type-independent gene variation to the total variation within the data.
- spateo.tools.gene_expression_variance.compute_gene_groups_p_val(gene: str, group1: anndata.AnnData, group2: anndata.AnnData) → Tuple[str, float]
Calculate the Mann-Whitney U test p-value for a gene between two groups.
- Parameters:
- gene
Name of the gene
- group1
AnnData object containing cells from the first group to compare
- group2
AnnData object containing cells from the second group to compare
- Returns:
gene: Name of the gene
p_val: Mann-Whitney U test p-value
- Return type:
Tuple[str, float]
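As a rough illustration of the test itself, the sketch below runs a Mann-Whitney U test on two plain NumPy vectors with `scipy.stats.mannwhitneyu`. It does not call spateo. The helper name `gene_p_val`, the simulated counts, and the two-sided alternative are assumptions made for the example, not details confirmed by this page.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Simulated expression of one gene in two groups of cells (illustrative data only).
rng = np.random.default_rng(0)
group1_expr = rng.poisson(lam=2.0, size=200)
group2_expr = rng.poisson(lam=5.0, size=200)


def gene_p_val(gene, x, y):
    """Hypothetical helper: return (gene name, Mann-Whitney U p-value)."""
    # Two-sided alternative is an assumption; the library may use another setting.
    _, p_val = mannwhitneyu(x, y, alternative="two-sided")
    return gene, float(p_val)


gene, p_val = gene_p_val("GeneA", group1_expr, group2_expr)
print(gene, p_val)
```

Returning the gene name alongside the p-value, as the documented signature does, makes it easy to map such a helper over many genes and collect the results in one pass.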
- spateo.tools.gene_expression_variance.get_highvar_genes(expression: numpy.ndarray | scipy.sparse.csr_matrix | scipy.sparse.csc_matrix | scipy.sparse.coo_matrix, expected_fano_threshold: float | None = None, numgenes: int | None = None, minimal_mean: float = 0.5) → Tuple[pandas.DataFrame, Dict]
Find highly-variable genes in single-cell data matrices.
- Parameters:
- expression
Gene expression matrix
- expected_fano_threshold
Optionally can be used to set a manual dispersion threshold (for definition of “highly-variable”)
- numgenes
Optionally can be used to find the n most variable genes
- minimal_mean
Sets a threshold on the minimum mean expression to consider
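The parameters above suggest a dispersion-based selection: genes below a mean-expression floor are ignored and the rest are ranked by a Fano-type statistic (variance divided by mean). The following is a deliberately simplified sketch of that idea on a dense cells × genes array. The function `top_fano_genes` is hypothetical, and the library's actual procedure (for example, how an expected Fano factor is modelled) may differ.

```python
import numpy as np


def top_fano_genes(expression, numgenes=2, minimal_mean=0.5):
    """Hypothetical helper: rank genes by Fano factor (variance / mean)."""
    expression = np.asarray(expression, dtype=float)
    mean = expression.mean(axis=0)
    var = expression.var(axis=0, ddof=1)

    # Genes below the mean-expression floor get -inf so they sort last.
    fano = np.full(mean.shape, -np.inf)
    expressed = mean >= minimal_mean
    fano[expressed] = var[expressed] / mean[expressed]

    order = np.argsort(fano)[::-1]  # most dispersed first
    return order[:numgenes], fano


# Rows are cells, columns are genes (toy data).
counts = np.array([
    [5, 0, 0, 1],
    [5, 10, 0, 2],
    [5, 0, 0, 3],
    [5, 10, 0, 1],
    [5, 0, 0, 2],
    [5, 10, 1, 3],
])
top, fano = top_fano_genes(counts, numgenes=2, minimal_mean=0.5)
print(top, fano)
```

In this toy matrix the third gene is excluded by `minimal_mean`, and the second gene ranks first because its counts alternate between 0 and 10.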
- spateo.tools.gene_expression_variance.get_highvar_genes_sparse(expression: numpy.ndarray | scipy.sparse.csr_matrix | scipy.sparse.csc_matrix | scipy.sparse.coo_matrix, expected_fano_threshold: float | None = None, numgenes: int | None = None, minimal_mean: float = 0.5) → Tuple[pandas.DataFrame, Dict]
Find highly-variable genes in sparse single-cell data matrices.
- Parameters:
- expression
Gene expression matrix
- expected_fano_threshold
Optionally can be used to set a manual dispersion threshold (for definition of “highly-variable”)
- numgenes
Optionally can be used to find the n most variable genes
- minimal_mean
Sets a threshold on the minimum mean expression to consider
- Returns:
gene_counts_stats: Results dataframe containing pertinent information for each gene
gene_fano_parameters: Additional informative dictionary (with records of dispersion for each gene, threshold, etc.)
- Return type:
Tuple[pandas.DataFrame, Dict]
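For sparse input, per-gene means and variances can be obtained without densifying the matrix by using the identity Var[x] = E[x²] − E[x]². The sketch below shows one way to do this with SciPy. `sparse_mean_var` is a hypothetical helper, and whether the library computes its statistics this way is an assumption.

```python
import numpy as np
from scipy import sparse


def sparse_mean_var(X):
    """Hypothetical helper: per-gene mean and sample variance of a sparse matrix."""
    X = sparse.csr_matrix(X, dtype=np.float64)
    n = X.shape[0]
    mean = np.asarray(X.mean(axis=0)).ravel()
    # Element-wise square stays sparse, so zeros are never materialized.
    mean_sq = np.asarray(X.multiply(X).mean(axis=0)).ravel()
    # Rescale the population variance to the unbiased (ddof=1) estimate.
    var = (mean_sq - mean**2) * n / (n - 1)
    return mean, var


rng = np.random.default_rng(0)
dense = rng.poisson(lam=0.3, size=(50, 4)).astype(float)
mean, var = sparse_mean_var(sparse.csr_matrix(dense))
print(mean, var)
```

The result should agree with `dense.var(axis=0, ddof=1)` up to floating-point error, which is a convenient sanity check on small matrices.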
- spateo.tools.gene_expression_variance.compute_variance_decomposition(adata: anndata.AnnData, spatial_label_id: str, celltype_label_id: str, genes: Union[None, str, List[str]] = None, figsize: Union[None, Tuple[float, float]] = None, save_show_or_return: Literal['save', 'show', 'return', 'both', 'all'] = 'show', save_kwargs: Optional[dict] = {})
Computes and then optionally visualizes the variance decomposition for an AnnData object.
Within spatial regions, determines the proportion of the total variation that occurs within the same cell type, the proportion of the variation that occurs between cell types in the region, and the proportion of the variation that comes from baseline differences in the expression levels of the genes in the data. The within-cell type variation could potentially come from differences in cell-cell communication.
- Parameters:
- adata
AnnData object containing data
- spatial_label_id
Key in .obs containing spatial domain labels
- celltype_label_id
Key in .obs containing cell type labels
- genes
Can be used to filter to chosen subset of genes for variance computation
- figsize
Can be optionally used to set the size of the plotted figure
- save_show_or_return
Whether to save, show or return the figure. Only used if ‘visualize’ is True. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
- save_kwargs
A dictionary that will be passed to the save_fig function. Only used if ‘visualize’ is True. By default it is an empty dictionary, and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise, you can provide a dictionary that modifies those keys according to your needs.
- Returns:
- Dataframe containing four columns, for the category label, celltype variation,
inter-celltype variation and gene-level variation
- Return type:
var_decomposition
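One way to read the three-way split described above is as a sum-of-squares identity: each deviation from the grand mean is the sum of a within-cell type residual, a cell type effect, and a gene baseline effect, and the three squared terms add up exactly to the total. The sketch below checks this on simulated data for a single region. `decompose_variance` and its output keys are hypothetical, the library's exact definitions may differ, and in practice the computation would be repeated per spatial domain.

```python
import numpy as np


def decompose_variance(X, celltypes):
    """Hypothetical helper: split total sum of squares into three proportions."""
    X = np.asarray(X, dtype=float)
    celltypes = np.asarray(celltypes)

    grand_mean = X.mean()        # one number for the whole matrix
    gene_mean = X.mean(axis=0)   # baseline level of each gene

    # Mean of each gene within the cell type of each cell.
    type_mean = np.empty_like(X)
    for t in np.unique(celltypes):
        mask = celltypes == t
        type_mean[mask] = X[mask].mean(axis=0)

    ss_total = ((X - grand_mean) ** 2).sum()
    ss_intra = ((X - type_mean) ** 2).sum()
    ss_inter = ((type_mean - gene_mean) ** 2).sum()
    ss_gene = X.shape[0] * ((gene_mean - grand_mean) ** 2).sum()

    return {
        "intra_celltype": ss_intra / ss_total,
        "inter_celltype": ss_inter / ss_total,
        "gene": ss_gene / ss_total,
    }


rng = np.random.default_rng(1)
celltypes = np.array(["A"] * 30 + ["B"] * 20)
X = rng.normal(size=(50, 5)) + np.arange(5)  # gene-specific baselines
X[celltypes == "B"] += 1.5                   # cell type shift
props = decompose_variance(X, celltypes)
print(props)
```

Because the proportions are normalized by an independently computed total, their summing to one confirms that the three components are orthogonal.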
- spateo.tools.gene_expression_variance.genewise_variance_decomposition(adata: anndata.AnnData, celltype_label_id: str, genes: Union[str, List[str]], figsize: Union[None, Tuple[float, float]] = None, save_show_or_return: Literal['save', 'show', 'return', 'both', 'all'] = 'show', save_kwargs: Optional[dict] = {})
For each gene in the chosen subset, computes a variance decomposition by computing the intra-cell type variance and the inter-cell type variance.
- Parameters:
- adata
AnnData object containing data
- celltype_label_id
Key in .obs containing cell type labels
- genes
Can be used to filter to chosen subset of genes for variance computation
- figsize
Can be used to optionally set the size of the plotted figure
- save_show_or_return
Whether to save, show or return the figure. Only used if ‘visualize’ is True. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
- save_kwargs
A dictionary that will be passed to the save_fig function. Only used if ‘visualize’ is True. By default it is an empty dictionary, and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise, you can provide a dictionary that modifies those keys according to your needs.
- Returns:
- Dataframe containing three columns, for the gene, intra-celltype variation and
inter-celltype variation
- Return type:
var_decomposition
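A per-gene version of the same idea can be illustrated with two extreme toy genes: one whose variation is entirely explained by cell type, and one whose variation lies entirely within cell types. This is a sketch with pandas only. The column names of `var_df` are assumptions and need not match the library's output.

```python
import pandas as pd

# Four cells, two cell types (toy data).
expr = pd.DataFrame({
    "GeneA": [1.0, 1.0, 3.0, 3.0],  # differs only between cell types
    "GeneB": [0.0, 2.0, 0.0, 2.0],  # differs only within cell types
})
celltype = pd.Series(["T", "T", "B", "B"])

# Per-cell mean of each gene within that cell's type.
type_mean = expr.groupby(celltype).transform("mean")

intra = ((expr - type_mean) ** 2).sum(axis=0)
inter = ((type_mean - expr.mean(axis=0)) ** 2).sum(axis=0)
total = intra + inter

var_df = pd.DataFrame({
    "gene": expr.columns,
    "intra_celltype": (intra / total).to_numpy(),
    "inter_celltype": (inter / total).to_numpy(),
})
print(var_df)
```

Here `GeneA` is assigned entirely to the inter-cell type component and `GeneB` entirely to the intra-cell type component, which makes the two columns easy to interpret.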
- spateo.tools.gene_expression_variance.plot_variance_decomposition(var_df: pandas.DataFrame, figsize: Tuple[float, float] = (6, 2), cmap: str = 'Blues_r', multiindex: bool = False, title: Union[None, str] = None, save_show_or_return: Literal['save', 'show', 'return', 'both', 'all'] = 'show', save_kwargs: Optional[dict] = {})
Visualizes the part-wise contributions of intra-cell type variation and cell type-independent gene variation to the total variation within the data.
- Parameters:
- var_df
Output from compute_variance_decomposition()
- figsize
(width, height) of the figure window
- cmap
Name of the matplotlib colormap to use
- multiindex
Specifies whether to set labels to record multi-level index information. Should only be used if var_df has a multi-index.
- title
Optionally, provide custom title to plot. If not given, will use default title.
- save_show_or_return
Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved, displayed and the associated axis and other object will be returned.
- save_kwargs
A dictionary that will be passed to the save_fig function. By default it is an empty dictionary, and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise, you can provide a dictionary that modifies those keys according to your needs.
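A plot of this kind can be approximated with a stacked horizontal bar chart in plain matplotlib, one bar per row and one segment per variance component. The sketch below reuses the documented defaults `figsize=(6, 2)` and `cmap='Blues_r'`. The row labels, component names and values are made up for illustration, and the library's own figure may be laid out differently.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical proportions; each row sums to 1.
labels = ["domain_1", "domain_2"]
components = {
    "intra-cell type": np.array([0.2, 0.3]),
    "inter-cell type": np.array([0.3, 0.3]),
    "gene": np.array([0.5, 0.4]),
}

cmap = plt.get_cmap("Blues_r")
fig, ax = plt.subplots(figsize=(6, 2))

left = np.zeros(len(labels))
for i, (name, values) in enumerate(components.items()):
    # Each component starts where the previous one ended.
    ax.barh(labels, values, left=left, color=cmap(i / len(components)), label=name)
    left = left + values

ax.set_xlim(0, 1)
ax.set_xlabel("Proportion of total variance")
ax.set_title("Variance decomposition")
ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
fig.tight_layout()
```

Calling `fig.savefig(...)` or `plt.show()` afterwards corresponds loosely to the "save" and "show" options of `save_show_or_return`.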