spateo.tools.gene_expression_variance
=====================================

.. py:module:: spateo.tools.gene_expression_variance

.. autoapi-nested-parse::

   Characterizing cell-to-cell variability within spatial domains


Functions
---------

.. autoapisummary::

   spateo.tools.gene_expression_variance.compute_gene_groups_p_val
   spateo.tools.gene_expression_variance.get_highvar_genes
   spateo.tools.gene_expression_variance.get_highvar_genes_sparse
   spateo.tools.gene_expression_variance.compute_variance_decomposition
   spateo.tools.gene_expression_variance.genewise_variance_decomposition
   spateo.tools.gene_expression_variance.plot_variance_decomposition


Module Contents
---------------

.. py:function:: compute_gene_groups_p_val(gene: str, group1: anndata.AnnData, group2: anndata.AnnData) -> Tuple[str, float]

   Calculate the Mann-Whitney U test p-value for a gene between two groups.

   :param gene: Name of the gene
   :param group1: AnnData object containing cells from the first group to compare
   :param group2: AnnData object containing cells from the second group to compare

   :returns: ``gene``, the name of the gene, and ``p_val``, the Mann-Whitney U test p-value
   :rtype: Tuple[str, float]


.. py:function:: get_highvar_genes(expression: Union[numpy.ndarray, scipy.sparse.csr_matrix, scipy.sparse.csc_matrix, scipy.sparse.coo_matrix], expected_fano_threshold: Optional[float] = None, numgenes: Optional[int] = None, minimal_mean: float = 0.5) -> Tuple[pandas.DataFrame, Dict]

   Find highly-variable genes in single-cell data matrices.

   :param expression: Gene expression matrix
   :param expected_fano_threshold: Optional manual dispersion threshold (the cutoff that defines "highly-variable")
   :param numgenes: Optional number of genes to select; if given, finds the n most variable genes
   :param minimal_mean: Minimum mean expression a gene must have to be considered


.. py:function:: get_highvar_genes_sparse(expression: Union[numpy.ndarray, scipy.sparse.csr_matrix, scipy.sparse.csc_matrix, scipy.sparse.coo_matrix], expected_fano_threshold: Optional[float] = None, numgenes: Optional[int] = None, minimal_mean: float = 0.5) -> Tuple[pandas.DataFrame, Dict]

   Find highly-variable genes in sparse single-cell data matrices.

   :param expression: Gene expression matrix
   :param expected_fano_threshold: Optional manual dispersion threshold (the cutoff that defines "highly-variable")
   :param numgenes: Optional number of genes to select; if given, finds the n most variable genes
   :param minimal_mean: Minimum mean expression a gene must have to be considered

   :returns: ``gene_counts_stats``, a results dataframe containing pertinent information for each gene, and
             ``gene_fano_parameters``, a dictionary with additional information (records of the dispersion for each gene, the threshold, etc.)
   :rtype: Tuple[pandas.DataFrame, Dict]


.. py:function:: compute_variance_decomposition(adata: anndata.AnnData, spatial_label_id: str, celltype_label_id: str, genes: Union[None, str, List[str]] = None, figsize: Union[None, Tuple[float, float]] = None, save_show_or_return: Literal['save', 'show', 'return', 'both', 'all'] = 'show', save_kwargs: Optional[dict] = {})

   Computes and then optionally visualizes the variance decomposition for an AnnData object. Within spatial
   regions, determines the proportion of the total variation that occurs within the same cell type, the proportion
   that occurs between cell types in the region, and the proportion that comes from baseline differences in the
   expression levels of the genes in the data. The within-cell type variation could potentially come from
   differences in cell-cell communication.

   :param adata: AnnData object containing data
   :param spatial_label_id: Key in .obs containing spatial domain labels
   :param celltype_label_id: Key in .obs containing cell type labels
   :param genes: Optional subset of genes to which the variance computation is restricted
   :param figsize: Optional size of the plotted figure
   :param save_show_or_return: Whether to save, show or return the figure. Only used if 'visualize' is True.
      If "both", the figure is saved and plotted at the same time. If "all", the figure is saved and displayed,
      and the associated axis and other objects are returned.
   :param save_kwargs: A dictionary that will be passed to the save_fig function. Only used if 'visualize' is True.
      By default it is an empty dictionary and the save_fig function will use
      {"path": None, "prefix": 'scatter', "dpi": None, "ext": 'pdf', "transparent": True, "close": True, "verbose": True}
      as its parameters. Otherwise, provide a dictionary that overrides those keys as needed.

   :returns: ``var_decomposition``, a dataframe containing four columns: the category label, celltype variation,
             inter-celltype variation and gene-level variation
   :rtype: pandas.DataFrame


.. py:function:: genewise_variance_decomposition(adata: anndata.AnnData, celltype_label_id: str, genes: Union[str, List[str]], figsize: Union[None, Tuple[float, float]] = None, save_show_or_return: Literal['save', 'show', 'return', 'both', 'all'] = 'show', save_kwargs: Optional[dict] = {})

   For each gene in the chosen subset, computes a variance decomposition into the intra-cell type variance
   and the inter-cell type variance.

   :param adata: AnnData object containing data
   :param celltype_label_id: Key in .obs containing cell type labels
   :param genes: Subset of genes for which the variance decomposition is computed
   :param figsize: Optional size of the plotted figure
   :param save_show_or_return: Whether to save, show or return the figure. Only used if 'visualize' is True.
      If "both", the figure is saved and plotted at the same time. If "all", the figure is saved and displayed,
      and the associated axis and other objects are returned.
   :param save_kwargs: A dictionary that will be passed to the save_fig function. Only used if 'visualize' is True.
      By default it is an empty dictionary and the save_fig function will use
      {"path": None, "prefix": 'scatter', "dpi": None, "ext": 'pdf', "transparent": True, "close": True, "verbose": True}
      as its parameters. Otherwise, provide a dictionary that overrides those keys as needed.

   :returns: ``var_decomposition``, a dataframe containing three columns: the gene, intra-celltype variation
             and inter-celltype variation
   :rtype: pandas.DataFrame


.. py:function:: plot_variance_decomposition(var_df: pandas.DataFrame, figsize: Tuple[float, float] = (6, 2), cmap: str = 'Blues_r', multiindex: bool = False, title: Union[None, str] = None, save_show_or_return: Literal['save', 'show', 'return', 'both', 'all'] = 'show', save_kwargs: Optional[dict] = {})

   Visualizes the part-wise contributions of intra-cell type variation and cell type-independent gene variation
   to the total variation within the data.

   :param var_df: Output from :func:`compute_variance_decomposition`
   :param figsize: (width, height) of the figure window
   :param cmap: Name of the matplotlib colormap to use
   :param multiindex: Whether to set labels that record multi-level index information. Should only be used if
      var_df has a multi-index.
   :param title: Optional custom title for the plot. If not given, the default title is used.
   :param save_show_or_return: Whether to save, show or return the figure.
      If "both", the figure is saved and plotted at the same time. If "all", the figure is saved and displayed,
      and the associated axis and other objects are returned.
   :param save_kwargs: A dictionary that will be passed to the save_fig function.
      By default it is an empty dictionary and the save_fig function will use
      {"path": None, "prefix": 'scatter', "dpi": None, "ext": 'pdf', "transparent": True, "close": True, "verbose": True}
      as its parameters. Otherwise, provide a dictionary that overrides those keys as needed.
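
The split into intra-cell type and inter-cell type variance that ``genewise_variance_decomposition`` reports can be illustrated with the law of total variance. The following is a minimal sketch for a single gene, not Spateo's implementation: the helper ``decompose_variance`` is hypothetical, it works on plain Python lists rather than an AnnData object, and it assumes population (not sample) variances, which may differ from the estimator or normalization the library uses.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def decompose_variance(values, labels):
    """Split the population variance of one gene into intra- and inter-group parts.

    Hypothetical helper for illustration only; it is not part of Spateo.
    """
    if len(values) != len(labels) or not values:
        raise ValueError("values and labels must be non-empty and equally long")

    # Group the expression values by their cell type label.
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n_total = len(values)
    grand_mean = fmean(values)

    intra = 0.0  # size-weighted mean of the within-group variances
    inter = 0.0  # size-weighted variance of the group means
    for members in groups.values():
        weight = len(members) / n_total
        group_mean = fmean(members)
        intra += weight * fmean([(v - group_mean) ** 2 for v in members])
        inter += weight * (group_mean - grand_mean) ** 2

    return {"intra": intra, "inter": inter, "total": intra + inter}


# Toy data (made up for this example): one gene measured in six cells of two types.
expression = [1.0, 3.0, 2.0, 8.0, 10.0, 9.0]
cell_types = ["A", "A", "A", "B", "B", "B"]

parts = decompose_variance(expression, cell_types)

# Law of total variance: the two parts add up to the overall population variance.
assert abs(parts["total"] - pvariance(expression)) < 1e-9

print({name: round(value, 4) for name, value in parts.items()})
```

In this toy data the two cell types have well-separated means, so most of the total variance falls into the inter-cell type part. A gene whose expression varies mainly within a cell type would show the opposite pattern, which is the case the module description associates with possible differences in cell-cell communication.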