spateo.tools.CCI_effects_modeling.MuSIC_upstream

Functionalities to aid in feature selection to characterize signaling patterns from spatial transcriptomics. Given a list of signaling molecules (ligands or receptors) and/or target genes

Classes

MuSIC_Molecule_Selector

Various methods to select initial targets or predictors for intercellular analyses.

Module Contents

class spateo.tools.CCI_effects_modeling.MuSIC_upstream.MuSIC_Molecule_Selector(parser: argparse.ArgumentParser, args_list: List[str] | None = None)

Bases: spateo.tools.CCI_effects_modeling.MuSIC.MuSIC

Various methods to select initial targets or predictors for intercellular analyses.

Parameters:
parser

ArgumentParser object initialized with argparse, to parse command line arguments for arguments pertinent to modeling.

mod_type

The type of model that will be employed for eventual downstream modeling. Will dictate how predictors will be found (if applicable). Options:

  • “niche”: Spatially-aware, uses categorical cell type labels as independent variables.

  • “lr”: Spatially-aware, essentially uses the combination of receptor expression in the “target” cell and spatially lagged ligand expression in the neighboring cells as independent variables.

  • “ligand”: Spatially-aware, essentially uses ligand expression in the neighboring cells as independent variables.

  • “receptor”: Uses receptor expression in the “target” cell as independent variables.
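As a rough illustration of how the “ligand” and “lr” options differ, the sketch below builds both kinds of predictor for a toy set of four cells. The spatial lag is taken here as a weighted average over neighbors, and the “lr” feature as the product of that lag with receptor expression. Both choices, along with every name and number, are assumptions made for illustration and are not taken from Spateo's internals.

```python
def spatial_lag(weights, values):
    """Weighted average of neighbor values for each cell (hypothetical helper)."""
    lagged = []
    for row in weights:
        total = sum(row)
        # Cells without neighbors receive a lag of zero.
        lagged.append(sum(w * v for w, v in zip(row, values)) / total if total > 0 else 0.0)
    return lagged


# Toy neighbor weights for 4 cells; the zero diagonal means a cell is not its own neighbor.
weights = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
ligand_expr = [2.0, 0.0, 4.0, 6.0]    # ligand expression per cell
receptor_expr = [1.0, 3.0, 0.0, 2.0]  # receptor expression per cell

# "ligand"-style predictor: ligand expression averaged over each cell's neighbors.
ligand_feature = spatial_lag(weights, ligand_expr)

# "lr"-style predictor (assumed form): receptor in the target cell times lagged ligand.
lr_feature = [r * l for r, l in zip(receptor_expr, ligand_feature)]
```

Under this assumed form, a cell contributes a nonzero “lr” signal only when it expresses the receptor and at least one neighbor expresses the ligand.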

distr

Distribution family for the dependent variable; one of “gaussian”, “poisson”, or “nb” (negative binomial).

adata_path

Path to the AnnData object from which to extract data for modeling

normalize

Set True to perform library size normalization, which scales total counts in each cell to the same number (adjusting for cell size).
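A minimal sketch of what library size normalization means in general, assuming a cells-by-genes table of raw counts. The function name and the `target_sum` parameter are hypothetical and do not reflect Spateo's own signature.

```python
def normalize_library_size(counts, target_sum=1e4):
    """Scale each cell (row) so its counts sum to `target_sum` (hypothetical helper)."""
    normalized = []
    for cell in counts:
        total = sum(cell)
        # Leave all-zero cells at zero rather than dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([value * scale for value in cell])
    return normalized


# Rows are cells, columns are genes; the third cell has no counts at all.
counts = [[1, 3, 6], [20, 20, 60], [0, 0, 0]]
normalized = normalize_library_size(counts, target_sum=100.0)
```

After scaling, the first two cells have identical totals even though the second was sequenced ten times deeper.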

smooth

Set True to correct for dropout effects by leveraging gene expression neighborhoods to smooth expression.

log_transform

Set True if log-transformation should be applied to expression.

target_expr_threshold

When selecting targets, only genes expressed in more than this fraction of cells are kept, which filters the candidates down to a smaller subset of interesting genes. Defaults to 0.1.
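The filter described for target_expr_threshold can be sketched as follows. The helper name, the use of a strict “greater than” comparison, and the treatment of any nonzero value as “expressed” are assumptions for illustration.

```python
def filter_by_expression_fraction(expr_by_gene, threshold=0.1):
    """Keep genes detected in more than `threshold` of cells (hypothetical helper)."""
    kept = []
    for gene, values in expr_by_gene.items():
        # Fraction of cells in which the gene has a nonzero value.
        fraction = sum(1 for v in values if v > 0) / len(values)
        if fraction > threshold:
            kept.append(gene)
    return kept


# Toy expression values across 10 cells; gene names are placeholders.
expr_by_gene = {
    "GeneA": [1, 0, 2, 0, 3, 0, 1, 0, 4, 0],  # detected in 5 of 10 cells
    "GeneB": [0, 0, 0, 0, 0, 0, 0, 0, 0, 7],  # detected in 1 of 10 cells
    "GeneC": [0] * 10,                        # never detected
}
kept = filter_by_expression_fraction(expr_by_gene, threshold=0.1)
```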

r_squared_threshold

When selecting targets, only genes with an R^2 above this threshold will be used as targets.

custom_lig_path

Optional path to a .txt file containing a list of ligands for the model, separated by newlines. If provided, targets are selected, on a gene-by-gene basis, as the genes for which this set of ligands collectively explains the most variance when neighborhood expression is taken into account.
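The newline-separated .txt format used by the path arguments is simple to produce and to read back. The sketch below writes a small ligand list to a temporary file and parses it; the helper name, file name, and gene symbols are illustrative assumptions, and skipping blank lines is a choice made here rather than documented behavior.

```python
import tempfile
from pathlib import Path


def read_gene_list(path):
    """Read one gene name per line, skipping blank lines (hypothetical helper)."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    lig_path = Path(tmp) / "ligands.txt"
    # One ligand per line; the stray blank line is ignored on read.
    lig_path.write_text("Tgfb1\nWnt5a\n\nBmp4\n")
    ligands = read_gene_list(lig_path)
```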

custom_ligands

Optional list of ligands for the model; can be used as an alternative to custom_lig_path. If provided, targets are selected, on a gene-by-gene basis, as the genes for which this set of ligands collectively explains the most variance when neighborhood expression is taken into account.

custom_rec_path

Optional path to a .txt file containing a list of receptors for the model, separated by newlines. If provided, targets are selected as the genes for which this set of receptors collectively explains the most variance.

custom_receptors

Optional list of receptors for the model; can be used as an alternative to custom_rec_path. If provided, targets are selected as the genes for which this set of receptors collectively explains the most variance.

custom_pathways_path

Rather than providing a list of receptors, a list of signaling pathways can be provided; all receptors with annotations in these pathways will be included in the model. If provided, targets are selected as the genes for which receptors in these pathways collectively explain the most variance.

custom_pathways

Optional list of signaling pathways for the model; can be used as an alternative to custom_pathways_path. If provided, targets are selected as the genes for which receptors in these pathways collectively explain the most variance.

targets_path

Optional path to a .txt file containing a list of prediction target genes for the model, separated by newlines. If not provided, targets will be strategically selected from the given receptors.

custom_targets

Optional list of prediction target genes for the model; can be used as an alternative to targets_path.

cci_dir

Full path to the directory containing cell-cell communication databases

species

Selects the cell-cell communication database the relevant ligands will be drawn from. Options: “human”, “mouse”.

output_path

Full path name for the .csv file in which results will be saved

group_key

Key in .obs of the AnnData object that contains the cell type labels, used if targeting molecules that have cell type-specific activity

coords_key

Key in .obsm of the AnnData object that contains the coordinates of the cells

n_neighbors

Number of nearest neighbors to use when ligands are provided or when ligands of interest should be found.

find_targets(save_id: str | None = None, bw_membrane_bound: float | int = 8, bw_secreted: float | int = 25, kernel: Literal['bisquare', 'exponential', 'gaussian', 'quadratic', 'triangular', 'uniform'] = 'bisquare', **kwargs)
Find genes that may serve as interesting targets by computing the IoU (intersection over union) with receptor signal. This identifies genes that are highly coexpressed with receptors or ligand:receptor signals.
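To make the IoU criterion concrete, the sketch below scores candidate genes by how closely the set of cells expressing them overlaps the set of cells carrying receptor signal. Binarizing by “greater than zero”, the helper names, and the toy values are assumptions for illustration; the actual method may define the sets differently.

```python
def expressing_cells(values, cutoff=0.0):
    """Indices of cells whose value exceeds `cutoff` (hypothetical helper)."""
    return {i for i, v in enumerate(values) if v > cutoff}


def iou(cells_a, cells_b):
    """Intersection over union of two sets of cell indices."""
    union = cells_a | cells_b
    return len(cells_a & cells_b) / len(union) if union else 0.0


# Toy data for 6 cells; gene names are placeholders.
receptor_signal = [0.0, 1.2, 0.8, 0.0, 2.1, 0.0]
candidates = {
    "GeneA": [0.0, 3.0, 1.0, 0.0, 0.5, 0.0],  # same cells as the receptor signal
    "GeneB": [1.0, 0.0, 0.0, 2.0, 0.0, 1.0],  # complementary cells
    "GeneC": [0.0, 1.0, 0.0, 0.0, 1.0, 1.0],  # partial overlap
}

signal_cells = expressing_cells(receptor_signal)
scores = {gene: iou(expressing_cells(v), signal_cells) for gene, v in candidates.items()}

# Genes with the highest IoU are the most strongly coexpressed with the signal.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because IoU depends only on which cells express a gene, it rewards spatial co-occurrence with the signal regardless of expression magnitude.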

Parameters:
save_id

Optional string to append to the end of the saved file name. Will save signaling molecule names as “ligand_{save_id}.txt”, etc.

bw_membrane_bound

Bandwidth used to compute spatial weights for membrane-bound ligands. If integer, will convert to appropriate distance bandwidth.

bw_secreted

Bandwidth used to compute spatial weights for secreted ligands. If integer, will convert to appropriate distance bandwidth.

kernel

Type of kernel function used to weight observations when computing spatial weights; one of “bisquare”, “exponential”, “gaussian”, “quadratic”, “triangular” or “uniform”.

kwargs

Keyword arguments for any of the Spateo argparse arguments. Should not include ‘output_path’ (which will be determined by the output path used for the main model). Should also not include ‘ligands’ or ‘receptors’, which will be determined by this function.
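For intuition about the kernel and bandwidth parameters, the sketch below implements textbook forms of three of the listed kernels as functions of distance and bandwidth. These are common definitions only; Spateo's own kernels may differ in normalization constants or in how the bandwidth is derived, and the function names and the bandwidth of 10.0 are hypothetical.

```python
import math


def bisquare(distance, bandwidth):
    """Weight decays smoothly to exactly zero at the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def gaussian(distance, bandwidth):
    """Weight decays smoothly and never reaches exactly zero."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def exponential(distance, bandwidth):
    """Weight decays exponentially with distance."""
    return math.exp(-distance / bandwidth)


KERNELS = {"bisquare": bisquare, "gaussian": gaussian, "exponential": exponential}


def spatial_weight(distance, bandwidth, kernel="bisquare"):
    """Look up the named kernel and evaluate it (hypothetical helper)."""
    return KERNELS[kernel](distance, bandwidth)


# Weights for neighbors at increasing distances, with an assumed bandwidth of 10.0.
distances = (0.0, 5.0, 10.0, 20.0)
bisquare_weights = [spatial_weight(d, 10.0) for d in distances]
```

A compactly supported kernel such as bisquare ignores cells beyond the bandwidth entirely, which is one reason a smaller bandwidth suits membrane-bound ligands and a larger one suits secreted ligands.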