spateo.tools.CCI_effects_modeling.MuSIC_upstream¶
Functionalities to aid in feature selection to characterize signaling patterns from spatial transcriptomics. Given a list of signaling molecules (ligands or receptors) and/or target genes
Classes¶
MuSIC_Molecule_Selector | Various methods to select initial targets or predictors for intercellular analyses.
Module Contents¶
- class spateo.tools.CCI_effects_modeling.MuSIC_upstream.MuSIC_Molecule_Selector(parser: argparse.ArgumentParser, args_list: List[str] | None = None)[source]¶
Bases:
spateo.tools.CCI_effects_modeling.MuSIC.MuSIC
Various methods to select initial targets or predictors for intercellular analyses.
- Parameters:
- parser
ArgumentParser object initialized with argparse, used to parse the command-line arguments pertinent to modeling.
- mod_type[source]¶
The type of model that will be employed for eventual downstream modeling. Will dictate how predictors will be found (if applicable). Options:
- “niche”: Spatially-aware, uses categorical cell type labels as independent variables.
- “lr”: Spatially-aware, essentially uses the combination of receptor expression in the “target” cell and spatially lagged ligand expression in the neighboring cells as independent variables.
- “ligand”: Spatially-aware, essentially uses ligand expression in the neighboring cells as independent variables.
- “receptor”: Uses receptor expression in the “target” cell as independent variables.
- adata_path¶
Path to the AnnData object from which to extract data for modeling
- normalize[source]¶
Set True to perform library size normalization, which sets the total counts in each cell to the same number (adjusts for cell size).
- smooth[source]¶
Set True to correct for dropout effects by leveraging gene expression neighborhoods to smooth expression.
- target_expr_threshold[source]¶
When selecting targets, only genes expressed in more than this threshold proportion of cells are kept, which filters the candidates to a smaller subset of interesting genes. Defaults to 0.1.
- r_squared_threshold¶
When selecting targets, only genes with an R^2 above this threshold will be used as targets
- custom_lig_path¶
Optional path to a .txt file containing a list of ligands for the model, separated by newlines. If provided, will find the targets for which this set of ligands collectively explains the most variance (on a gene-by-gene basis) when taking neighborhood expression into account.
- custom_ligands¶
Optional list of ligands for the model; can be used as an alternative to custom_lig_path. If provided, will find the targets for which this set of ligands collectively explains the most variance (on a gene-by-gene basis) when taking neighborhood expression into account.
- custom_rec_path¶
Optional path to a .txt file containing a list of receptors for the model, separated by newlines. If provided, will find the targets for which this set of receptors collectively explains the most variance.
- custom_receptors¶
Optional list of receptors for the model; can be used as an alternative to custom_rec_path. If provided, will find the targets for which this set of receptors collectively explains the most variance.
- custom_pathways_path¶
Rather than providing a list of receptors, a list of signaling pathways can be provided; all receptors with annotations in these pathways will be included in the model. If provided, will find the targets for which receptors in these pathways collectively explain the most variance.
- custom_pathways¶
Optional list of signaling pathways for the model; can be used as an alternative to custom_pathways_path. If provided, will find the targets for which receptors in these pathways collectively explain the most variance.
- targets_path¶
Optional path to a .txt file containing a list of prediction target genes for the model, separated by newlines. If not provided, targets will be strategically selected from the given receptors.
- custom_targets¶
Optional list of prediction target genes for the model; can be used as an alternative to targets_path.
- cci_dir¶
Full path to the directory containing cell-cell communication databases
- species[source]¶
Selects the cell-cell communication database the relevant ligands will be drawn from. Options: “human”, “mouse”.
- output_path¶
Full path name for the .csv file in which results will be saved
- group_key¶
Key in .obs of the AnnData object that contains the cell type labels, used if targeting molecules that have cell type-specific activity
- coords_key¶
Key in .obsm of the AnnData object that contains the coordinates of the cells
- n_neighbors¶
Number of nearest neighbors to use when ligands are provided or when ligands of interest should be found.
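To make the “lr” option of mod_type more concrete, the sketch below builds a single ligand:receptor feature by multiplying receptor expression in each cell by the ligand expression averaged over that cell's nearest neighbors (a simple spatial lag). The helper names `spatial_lag` and `lr_feature`, the unweighted neighbor average, and the toy data are assumptions made for illustration; this is not taken from Spateo's implementation.

```python
import numpy as np


def spatial_lag(coords, expr, n_neighbors):
    """Average `expr` over each cell's nearest neighbors (the cell itself excluded)."""
    # Pairwise Euclidean distances between all cells
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is never its own neighbor
    nbr_idx = np.argsort(dist, axis=1)[:, :n_neighbors]
    return expr[nbr_idx].mean(axis=1)


def lr_feature(coords, ligand_expr, receptor_expr, n_neighbors=2):
    """Hypothetical 'lr' predictor: receptor in the target cell x lagged ligand."""
    return receptor_expr * spatial_lag(coords, ligand_expr, n_neighbors)


# Four cells on a line; the last one is far from the others
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
ligand = np.array([4.0, 0.0, 2.0, 0.0])
receptor = np.array([0.0, 1.0, 1.0, 2.0])

feature = lr_feature(coords, ligand, receptor, n_neighbors=2)
print(feature)
```

Under the same assumptions, a “ligand”-type predictor would be the output of `spatial_lag` alone, and a “receptor”-type predictor would be `receptor` unchanged.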
- find_targets(save_id: str | None = None, bw_membrane_bound: float | int = 8, bw_secreted: float | int = 25, kernel: Literal['bisquare', 'exponential', 'gaussian', 'quadratic', 'triangular', 'uniform'] = 'bisquare', **kwargs)[source]¶
- Find genes that may serve as interesting targets by computing the intersection over union (IoU) with receptor signal. Will find genes that are highly coexpressed with receptors or ligand:receptor signals.
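As a minimal sketch of the IoU idea, the example below treats a gene and a receptor signal as "on" in every cell where their value is above zero, then divides the number of cells where both are on by the number of cells where either is on. The function name `expression_iou` and the zero cutoff are assumptions for illustration; how Spateo binarizes expression is not stated here.

```python
import numpy as np


def expression_iou(gene_expr, signal_expr):
    """Intersection over union of the cells expressing a gene and a signal."""
    gene_on = np.asarray(gene_expr) > 0      # assumed cutoff: any nonzero value
    signal_on = np.asarray(signal_expr) > 0
    union = np.logical_or(gene_on, signal_on).sum()
    if union == 0:
        return 0.0  # neither is expressed anywhere
    return float(np.logical_and(gene_on, signal_on).sum() / union)


# Five cells: candidate target gene vs. a receptor (or ligand:receptor) signal
gene = np.array([1.0, 0.0, 2.0, 0.0, 3.0])
signal = np.array([0.5, 0.0, 0.0, 1.0, 2.0])

iou = expression_iou(gene, signal)
print(iou)  # 2 shared cells / 4 cells expressing either = 0.5
```

Genes with a high IoU against the receptor signal are expressed in largely the same cells, which is what makes them candidate targets.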
- Parameters:
- save_id
Optional string to append to the end of the saved file name. Will save signaling molecule names as “ligand_{save_id}.txt”, etc.
- bw_membrane_bound
Bandwidth used to compute spatial weights for membrane-bound ligands. If integer, will convert to appropriate distance bandwidth.
- bw_secreted
Bandwidth used to compute spatial weights for secreted ligands. If integer, will convert to appropriate distance bandwidth.
- kernel
Type of kernel function used to weight observations when computing spatial weights; one of “bisquare”, “exponential”, “gaussian”, “quadratic”, “triangular” or “uniform”.
- kwargs
Keyword arguments for any of the Spateo argparse arguments. Should not include ‘output_path’ (which will be determined by the output path used for the main model). Should also not include either ‘ligands’ or ‘receptors’, which will be determined by this function.
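To illustrate how a bandwidth and the default “bisquare” kernel could turn distances into spatial weights, here is a small sketch using the standard bisquare form, (1 - (d / bw)^2)^2 inside the bandwidth and 0 outside. The function name `bisquare_weights` is hypothetical, and the bandwidths 8 and 25 (the defaults in the signature) are treated directly as distances here, whereas the method is documented to convert integer bandwidths to an appropriate distance bandwidth first.

```python
import numpy as np


def bisquare_weights(distances, bandwidth):
    """Bisquare kernel: (1 - (d / bw)^2)^2 for d < bw, and 0 otherwise."""
    scaled = np.asarray(distances, dtype=float) / bandwidth
    weights = (1.0 - scaled ** 2) ** 2
    weights[scaled >= 1.0] = 0.0  # no influence at or beyond the bandwidth
    return weights


# Distances from one cell to five neighbors (arbitrary units)
d = np.array([0.0, 4.0, 8.0, 20.0, 30.0])

w_membrane = bisquare_weights(d, bandwidth=8)   # short range, as for membrane-bound ligands
w_secreted = bisquare_weights(d, bandwidth=25)  # longer range, as for secreted ligands
print(w_membrane)
print(w_secreted)
```

A smaller bandwidth confines influence to nearby cells, while a larger one lets more distant cells contribute with smoothly decaying weight.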