`spateo.tools.CCI_effects_modeling.MuSIC_downstream`#

Additional functionalities to characterize signaling patterns from spatial transcriptomics

These include:

- prediction of the effects of spatial perturbation on gene expression; this can include the effect of perturbing known regulators of ligand/receptor expression or the effect of perturbing the ligand/receptor itself.
- following spatially-aware regression (or a sequence of spatially-aware regressions), combine regression results with data such that each cell can be associated with region-specific coefficient(s).
- following spatially-aware regression (or a sequence of spatially-aware regressions), overlay the directionality of the predicted influence of the ligand on downstream expression.

Module Contents#

Classes#

MuSIC_Interpreter

Interpretation and downstream analysis of spatially weighted regression models.

Functions#

`replace_col_with_collagens`(string)
`replace_hla_with_hlas`(string)

class spateo.tools.CCI_effects_modeling.MuSIC_downstream.MuSIC_Interpreter(parser: argparse.ArgumentParser, args_list: List[str] | None = None, keep_column_threshold_proportion_cells: float | None = None)[source]#

Bases: spateo.tools.CCI_effects_modeling.MuSIC.MuSIC

Interpretation and downstream analysis of spatially weighted regression models.

Parameters:

parser: ArgumentParser object initialized with argparse, to parse command line arguments for arguments pertinent to modeling.
args_list: If parser is provided by function call, the arguments to parse must be provided as a separate list. It is recommended to use the return from define_spateo_argparse() for this.
keep_column_threshold_proportion_cells: If provided, will threshold columns to only keep those that are nonzero in a proportion of cells greater than this threshold. For example, if this is set to 0.5, more than half of the cells must have a nonzero value for a given column for it to be retained for further inspection. Intended to be used to filter out likely false positives.
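
The column filter described above can be illustrated with a minimal sketch. The helper name `keep_columns_by_nonzero_proportion` and the example coefficients are hypothetical and are not part of spateo; the sketch only mirrors the documented rule that a column is retained when it is nonzero in a proportion of cells strictly greater than the threshold.

```python
from typing import Dict, List


def keep_columns_by_nonzero_proportion(
    coeffs: Dict[str, List[float]], threshold: float
) -> Dict[str, List[float]]:
    """Keep columns that are nonzero in a proportion of cells > threshold."""
    kept = {}
    for name, values in coeffs.items():
        proportion_nonzero = sum(1 for v in values if v != 0) / len(values)
        # Strict inequality: "more than half" for a threshold of 0.5.
        if proportion_nonzero > threshold:
            kept[name] = values
    return kept


# Hypothetical coefficients for four cells (illustrative values only).
example_coeffs = {
    "Igf1:Igf1r": [0.2, 0.0, 0.4, 0.1],  # nonzero in 3 of 4 cells
    "Fgf1": [0.0, 0.0, 0.3, 0.0],        # nonzero in 1 of 4 cells
    "Col4a1": [0.5, 0.0, 0.0, 0.2],      # nonzero in exactly half the cells
}
filtered = keep_columns_by_nonzero_proportion(example_coeffs, threshold=0.5)
print(sorted(filtered))
```

Under this reading, a column that is nonzero in exactly half of the cells is dropped when the threshold is 0.5.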

compute_coeff_significance(method: str = 'fdr_bh', significance_threshold: float = 0.05)[source]#

Computes local statistical significance for fitted coefficients.

Parameters:

method: Method to use for correction. Available methods can be found in the documentation for statsmodels.stats.multitest.multipletests(), and are also listed below (in correct case) for convenience:

Named methods:
- bonferroni
- sidak
- holm-sidak
- holm
- simes-hochberg
- hommel

Abbreviated methods:
- fdr_bh: Benjamini-Hochberg correction
- fdr_by: Benjamini-Yekutieli correction
- fdr_tsbh: Two-stage Benjamini-Hochberg
- fdr_tsbky: Two-stage Benjamini-Krieger-Yekutieli method

significance_threshold: p-value (or q-value) needed to call a parameter significant.

Returns:

is_significant: Dataframe of identical shape to coeffs, where each element is True or False if it meets the threshold for significance
pvalues: Dataframe of identical shape to coeffs, where each element is a p-value for that instance of that feature
qvalues: Dataframe of identical shape to coeffs, where each element is a q-value for that instance of that feature
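
As an illustration of what the default 'fdr_bh' option refers to, the following is a minimal pure-Python sketch of the Benjamini-Hochberg adjustment. It is not the library's implementation (the method delegates to statsmodels according to the description above); the function name `benjamini_hochberg`, the example p-values, and the use of `<=` when comparing to the threshold are assumptions made for illustration.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical p-values for one coefficient across four cells.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = benjamini_hochberg(pvals)
significance_threshold = 0.05
is_significant = [q <= significance_threshold for q in qvals]
print(qvals, is_significant)
```

Note that a raw p-value below the threshold (0.04 or 0.03 here) need not remain significant after correction.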

filter_adata_spatial(instructions: List[str])[source]#

Based on spatial coordinates, filter the adata object to only include cells that meet the criteria. Criteria provided in the form of a list of instructions of the form “x less than 0.5 and y greater than 0.5”, etc., where each instruction is executed sequentially.

Parameters:

instructions: List of instructions to filter adata object by. Each instruction is a string of the form “x less than 0.5 and y greater than 0.5”, etc., where each instruction is executed sequentially.
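
A minimal sketch of how instructions of this form could be interpreted is given below. The grammar is an assumption inferred from the single documented example (“x less than 0.5 and y greater than 0.5”); the helper names `cell_passes` and `filter_cells` and the coordinate dictionary are hypothetical, and the library may accept other phrasings.

```python
from typing import Dict, List

# Assumed comparison phrases; the library's accepted grammar may differ.
COMPARATORS = {
    "less than": lambda a, b: a < b,
    "greater than": lambda a, b: a > b,
}


def cell_passes(instruction: str, coords: Dict[str, float]) -> bool:
    """Evaluate one instruction, e.g. 'x less than 0.5 and y greater than 0.5'."""
    for clause in instruction.split(" and "):
        axis, rest = clause.strip().split(" ", 1)
        for phrase, compare in COMPARATORS.items():
            if rest.startswith(phrase):
                value = float(rest[len(phrase):])
                if not compare(coords[axis], value):
                    return False
                break
        else:
            raise ValueError(f"Unrecognized clause: {clause!r}")
    return True


def filter_cells(
    cells: Dict[str, Dict[str, float]], instructions: List[str]
) -> Dict[str, Dict[str, float]]:
    """Apply each instruction sequentially, keeping only the cells that pass."""
    kept = dict(cells)
    for instruction in instructions:
        kept = {cid: c for cid, c in kept.items() if cell_passes(instruction, c)}
    return kept


# Hypothetical cells with normalized coordinates.
cells = {
    "c1": {"x": 0.2, "y": 0.8},
    "c2": {"x": 0.7, "y": 0.9},
    "c3": {"x": 0.1, "y": 0.3},
}
print(list(filter_cells(cells, ["x less than 0.5 and y greater than 0.5"])))
```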

filter_adata_custom(cell_ids: List[str])[source]#

Filter AnnData object to only the cells specified by the custom list.

Parameters:

cell_ids: List of cell IDs to keep. Each ID must be found in adata.obs_names

add_interaction_effect_to_adata(targets: str | List[str], interactions: str | List[str], visualize: bool = False) → anndata.AnnData[source]#

For each specified interaction/list of interactions, add the predicted interaction effect to the adata object.

Parameters:

targets: Target(s) to add interaction effect for. Can be a single target or a list of targets.
interactions: Interaction(s) to add interaction effect for. Can be a single interaction or a list of interactions. Should be the name of a gene for ligand models, or an L:R pair for L:R models (for example, “Igf1:Igf1r”).
visualize: Whether to visualize the interaction effect for each target/interaction pair. If True, will generate spatial scatter plot and save to HTML file.

Returns:

adata: AnnData object with interaction effects added to .obs.

compute_and_visualize_diagnostics(type: Literal[correlations, confusion, rmse], n_genes_per_plot: int = 20)[source]#

For true and predicted gene expression, compute and visualize one of: confusion matrices, correlations (Pearson and Spearman), or root mean-squared-error (RMSE).

Parameters:

type: Type of diagnostic to compute and visualize. Options: “correlations” for Pearson & Spearman correlation, “confusion” for confusion matrix, “rmse” for root mean-squared-error.
n_genes_per_plot: Only used if “type” is “confusion”. Number of genes to plot per figure. If there are more than this number of genes, multiple figures will be generated.
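
For reference, the three numeric diagnostics named above can be written out in a few lines. This is a generic sketch of the standard definitions, not the library's code; the function names and the example expression vectors are illustrative assumptions.

```python
import math
from typing import List


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def ranks(values: List[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a: List[float], b: List[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def rmse(true: List[float], pred: List[float]) -> float:
    """Root mean-squared error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))


# Hypothetical true and predicted expression of one gene in four cells.
true_expr = [1.0, 2.0, 3.0, 4.0]
pred_expr = [1.0, 4.0, 9.0, 16.0]
print(pearson(true_expr, pred_expr), spearman(true_expr, pred_expr), rmse(true_expr, pred_expr))
```

The example is chosen so that the prediction is monotone but not linear in the truth: the Spearman correlation is 1 while the Pearson correlation is below 1.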

plot_interaction_effect_3D(target: str, interaction: str, save_path: str, pcutoff: float | None = 99.7, min_value: float | None = 0, zero_opacity: float = 1.0, size: float = 2.0, n_neighbors_smooth: int | None = 0)[source]#

Quick-visualize the magnitude of the predicted effect on target for a given interaction.

Parameters:

target: Target gene to visualize
interaction: Interaction to visualize (e.g. “Igf1:Igf1r” for L:R model, “Igf1” for ligand model)
save_path: Path to save the figure to (will save as HTML file)
pcutoff: Percentile cutoff for the colorbar. Will set all values above this percentile to this value.
min_value: Minimum value to set the colorbar to. Will set all values below this value to this value. Defaults to 0.
zero_opacity: Opacity of points with zero expression. Between 0.0 and 1.0. Default is 1.0.
size: Size of the points in the scatter plot. Default is 2.
n_neighbors_smooth: Number of neighbors to use for smoothing (to make effect patterns more apparent). If 0, no smoothing is applied. Default is 0.
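
The effect of pcutoff and min_value on the plotted values can be sketched as a simple clipping step. The helper names below are hypothetical, and the linear-interpolation percentile is an assumption; the library may compute the percentile differently.

```python
import math
from typing import List, Optional


def percentile(values: List[float], q: float) -> float:
    """Percentile with linear interpolation between closest ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def clip_for_colorbar(
    values: List[float],
    pcutoff: Optional[float] = 99.7,
    min_value: Optional[float] = 0.0,
) -> List[float]:
    """Set values above the percentile cutoff, or below min_value, to those bounds."""
    upper = percentile(values, pcutoff) if pcutoff is not None else max(values)
    lower = min_value if min_value is not None else min(values)
    return [min(max(v, lower), upper) for v in values]


# Hypothetical effect sizes with one negative value and one extreme outlier.
effects = [-1.0, 0.0, 1.0, 2.0, 100.0]
clipped = clip_for_colorbar(effects, pcutoff=50, min_value=0.0)
print(clipped)
```

Clipping at a high percentile keeps a few extreme cells from compressing the rest of the color scale.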

plot_multiple_interaction_effects_3D(effects: List[str], save_path: str, include_combos_of_two: bool = False)[source]#

Quick-visualize the magnitude of the predicted effect on target for a given interaction.

Parameters:

effects: List of effects to visualize (e.g. [“Igf1:Igf1r”, “Igf1:InsR”] for L:R model, [“Igf1”] for ligand model)
save_path: Path to save the figure to (will save as HTML file)
include_combos_of_two: Whether to include paired combinations of effects (e.g. “Igf1:Igf1r and Igf1:InsR”) as separate categories. If False, will include these in the generic “Multiple interactions” category.
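
The role of include_combos_of_two can be sketched as a per-cell labeling rule. The function `categorize_cell` and the “No interactions” label are hypothetical; only the “Multiple interactions” category and the “A and B” combination format come from the parameter description above.

```python
from typing import List


def categorize_cell(active_effects: List[str], include_combos_of_two: bool = False) -> str:
    """Assign one category label to a cell from the effects predicted in it."""
    if not active_effects:
        return "No interactions"  # hypothetical label for cells with no effect
    if len(active_effects) == 1:
        return active_effects[0]
    if len(active_effects) == 2 and include_combos_of_two:
        # Pairs become their own category, e.g. "Igf1:Igf1r and Igf1:InsR".
        return " and ".join(sorted(active_effects))
    return "Multiple interactions"


pair = ["Igf1:InsR", "Igf1:Igf1r"]
print(categorize_cell(pair, include_combos_of_two=True))
print(categorize_cell(pair, include_combos_of_two=False))
```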

plot_tf_effect_3D(target: str, tf: str, save_path: str, ligand_targets: bool = True, receptor_targets: bool = False, target_gene_targets: bool = False, pcutoff: float = 99.7, min_value: float = 0, zero_opacity: float = 1.0, size: float = 2.0)[source]#

Quick-visualize the magnitude of the predicted effect on target for a given TF. The files necessary for this can only be found if CCI_deg_detection() has been run.

Parameters:

target: Target gene of interest
tf: TF of interest (e.g. “Foxo1”)
save_path: Path to save the figure to (will save as HTML file)
ligand_targets: Set True if ligands were used as the target genes for the CCI_deg_detection() model.
receptor_targets: Set True if receptors were used as the target genes for the CCI_deg_detection() model.
target_gene_targets: Set True if target genes were used as the target genes for the CCI_deg_detection() model.
pcutoff: Percentile cutoff for the colorbar. Will set all values above this percentile to this value.
min_value: Minimum value to set the colorbar to. Will set all values below this value to this value.
zero_opacity: Opacity of points with zero expression. Between 0.0 and 1.0. Default is 1.0.
size: Size of the points in the scatter plot. Default is 2.

visualize_overlap_between_interacting_components_3D(target: str, interaction: str, save_path: str, size: float = 2.0)[source]#

Visualize the spatial distribution of signaling features (ligand, receptor, or L:R field) and target gene, as well as the overlapping region. Intended for use with 3D spatial coordinates.

Parameters:

target: Target gene to visualize
interaction: Interaction to visualize (e.g. “Igf1:Igf1r” for L:R model, “Igf1” for ligand model)
save_path: Path to save the figure to (will save as HTML file)
size: Size of the points in the plot. Defaults to 2.

gene_expression_heatmap(use_ligands: bool = False, use_receptors: bool = False, use_target_genes: bool = False, genes: Optional[List[str]] = None, position_key: str = 'spatial', coord_column: Optional[Union[int, str]] = None, reprocess: bool = False, neatly_arrange_y: bool = True, window_size: int = 3, recompute: bool = False, title: Optional[str] = None, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'magma', save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {})[source]#

Visualize the distribution of gene expression across cells in the spatial coordinates of cells; provides an idea of the simultaneous relative positions/patternings of different genes.

Parameters:

use_ligands: Set True to use ligands as the genes to visualize. If True, will ignore “genes” argument. “ligands_expr” file must be present in the model’s directory.
use_receptors: Set True to use receptors as the genes to visualize. If True, will ignore “genes” argument. “receptors_expr” file must be present in the model’s directory.
use_target_genes: Set True to use target genes as the genes to visualize. If True, will ignore “genes” argument. “targets” file must be present in the model’s directory.
genes: Optional list of genes to visualize. If “use_ligands”, “use_receptors”, and “use_target_genes” are all False, this must be given. This can also be used to visualize only a subset of the genes once processing & saving has already completed using e.g. “use_ligands”, “use_receptors”, etc.
position_key: Key in adata.obs or adata.obsm that provides a relative indication of the position of cells, i.e. spatial coordinates. Defaults to “spatial”. For each value in the position array (each coordinate, each category), multiple cells must have the same value.
coord_column: Optional, only used if “position_key” points to an entry in .obsm. In this case, this is the index or name of the column to be used to provide the positional context. Can also provide “xy”, “yz”, “xz”, “-xy”, “-yz”, “-xz” to draw a line between the two coordinate axes. “xy” will extend the new axis in the direction of increasing x and increasing y starting from x=0 and y=0 (or min. x/min. y), “-xy” will extend the new axis in the direction of decreasing x and increasing y starting from x=minimum x and y=maximum y, and so on.
reprocess: Set to True to reprocess the data and overwrite the existing files. Use if the genes to visualize have changed compared to the saved file (if existing), e.g. if “use_ligands” is True when the initial analysis used “use_target_genes”.
neatly_arrange_y: Set True to order the y-axis in terms of how early along the position axis the max z-scores for each row occur in. Used for a more uniform plot where similarly patterned interaction-target pairs are grouped together. If False, will sort this axis by the identity of the interaction (i.e. all “Fgf1” rows will be grouped together).
window_size: Size of window to use for smoothing. Must be an odd integer. If 1, no smoothing is applied.
recompute: Set to True to recompute the data and overwrite the existing files
title: Optional, can be used to provide title for plot
fontsize: Size of font for x and y labels.
figsize: Size of figure.
cmap: Colormap to use. Options: Any divergent matplotlib colormap.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axes and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: “scatter”, “dpi”: None, “ext”: “pdf”, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.
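
The window_size and neatly_arrange_y options can be illustrated with a small sketch: each row is smoothed with a centered moving average, z-scored, and rows are then ordered by where their maximum falls along the position axis. The helper names, the truncated windows at the edges, and the example rows are assumptions, not the library's implementation.

```python
import math
from typing import Dict, List


def smooth(row: List[float], window_size: int = 3) -> List[float]:
    """Centered moving average; window_size must be odd, 1 means no smoothing."""
    if window_size % 2 == 0:
        raise ValueError("window_size must be an odd integer")
    half = window_size // 2
    out = []
    for i in range(len(row)):
        window = row[max(0, i - half): i + half + 1]  # truncated at the edges
        out.append(sum(window) / len(window))
    return out


def zscore(row: List[float]) -> List[float]:
    """Standardize a row; a constant row maps to all zeros."""
    mean = sum(row) / len(row)
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / len(row))
    return [(v - mean) / sd if sd > 0 else 0.0 for v in row]


def order_rows_by_peak_position(rows: Dict[str, List[float]], window_size: int = 3) -> List[str]:
    """Order row names by how early along the position axis the max z-score occurs."""
    def peak_index(name: str) -> int:
        z = zscore(smooth(rows[name], window_size))
        return z.index(max(z))
    return sorted(rows, key=peak_index)


# Hypothetical expression of three genes in five bins along the position axis.
rows = {
    "late_gene": [0.0, 0.0, 0.0, 1.0, 5.0],
    "early_gene": [5.0, 1.0, 0.0, 0.0, 0.0],
    "mid_gene": [0.0, 1.0, 5.0, 1.0, 0.0],
}
print(order_rows_by_peak_position(rows))
```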

effect_distribution_heatmap(target_subset: Optional[List[str]] = None, interaction_subset: Optional[List[str]] = None, position_key: str = 'spatial', coord_column: Optional[Union[int, str]] = None, effect_threshold: Optional[float] = None, check_downstream_ligand_effects: bool = False, check_downstream_receptor_effects: bool = False, check_downstream_target_effects: bool = False, use_significant: bool = False, sort_by_target: bool = False, neatly_arrange_y: bool = True, window_size: int = 3, recompute: bool = False, title: Optional[str] = None, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'magma', save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {})[source]#

Visualize the distribution of interaction effects across cells in the spatial coordinates of cells; provides an idea of the simultaneous relative positions of different interaction effects.

Parameters:

target_subset: List of targets to consider. If None, will use all targets used in model fitting.
interaction_subset: List of interactions to consider. If None, will use all interactions used in model.
position_key: Key in adata.obs or adata.obsm that provides a relative indication of the position of cells, i.e. spatial coordinates. Defaults to “spatial”. For each value in the position array (each coordinate, each category), multiple cells must have the same value.
coord_column: Optional, only used if “position_key” points to an entry in .obsm. In this case, this is the index or name of the column to be used to provide the positional context. Can also provide “xy”, “yz”, “xz”, “-xy”, “-yz”, “-xz” to draw a line between the two coordinate axes. “xy” will extend the new axis in the direction of increasing x and increasing y starting from x=0 and y=0 (or min. x/min. y), “-xy” will extend the new axis in the direction of decreasing x and increasing y starting from x=minimum x and y=maximum y, and so on.
effect_threshold: Optional threshold minimum effect size to consider an effect for further analysis, as an absolute value. Use this to choose only the cells for which an interaction is predicted to have a strong effect. If None, use the median interaction effect.
check_downstream_ligand_effects: Set True to check the coefficients of downstream ligand models instead of coefficients of the upstream CCI model. Note that this may not necessarily look nice because TF-target relationships are not spatially dependent like L:R effects are.
check_downstream_receptor_effects: Set True to check the coefficients of downstream receptor models instead of coefficients of the upstream CCI model. Note that this may not necessarily look nice because TF-target relationships are not spatially dependent like L:R effects are.
check_downstream_target_effects: Set True to check the coefficients of downstream target models instead of coefficients of the upstream CCI model. Note that this may not necessarily look nice because TF-target relationships are not spatially dependent like L:R effects are.
use_significant: Whether to use only significant effects in computing the specificity. If True, will filter to cells + interactions where the interaction is significant for the target. Only valid if compute_coeff_significance() has been run.
sort_by_target: Set True to order the y-axis in terms of the identity of the target gene. Incompatible with “neatly_arrange_y”. If both this and “neatly_arrange_y” are False, will sort this axis by the identity of the interaction (i.e. all “Fgf1” rows will be grouped together).
neatly_arrange_y: Set True to order the y-axis in terms of how early along the position axis the max z-scores for each row occur in. Used for a more uniform plot where similarly patterned interaction-target pairs are grouped together. If False, will sort this axis by the identity of the interaction (i.e. all “Fgf1” rows will be grouped together).
window_size: Size of window to use for smoothing. Must be an odd integer. If 1, no smoothing is applied.
recompute: Set to True to recompute the data and overwrite the existing files
title: Optional, can be used to provide title for plot
fontsize: Size of font for x and y labels.
figsize: Size of figure.
cmap: Colormap to use. Options: Any divergent matplotlib colormap.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axes and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: “scatter”, “dpi”: None, “ext”: “pdf”, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.
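
The effect_threshold behavior can be sketched as follows. Whether the median is taken over all cells or only over cells with a nonzero effect, and whether the comparison is inclusive, is not stated above; this sketch assumes the median of absolute values over all cells and an inclusive comparison. The function name and example values are hypothetical.

```python
import statistics
from typing import List, Optional


def select_strong_effect_cells(
    effects: List[float], effect_threshold: Optional[float] = None
) -> List[int]:
    """Return indices of cells whose absolute effect meets the threshold."""
    magnitudes = [abs(e) for e in effects]
    if effect_threshold is None:
        # Assumed default: median absolute effect across all cells.
        effect_threshold = statistics.median(magnitudes)
    return [i for i, m in enumerate(magnitudes) if m >= effect_threshold]


# Hypothetical predicted effects of one interaction on one target in five cells.
effects = [0.1, -0.5, 0.3, 0.0, 0.9]
print(select_strong_effect_cells(effects))
print(select_strong_effect_cells(effects, effect_threshold=0.6))
```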

effect_distribution_density(effect_names: List[str], position_key: str = 'spatial', coord_column: Optional[Union[int, str]] = None, max_coord_val: float = 1.0, title: Optional[str] = None, x_label: Optional[str] = None, region_lower_bound: Optional[float] = None, region_upper_bound: Optional[float] = None, region_label: Optional[str] = None, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {})[source]#

Visualize the spatial enrichment of cell-cell interaction effects using density plots over spatial coordinates. Uses existing dataframe saved by effect_distribution_heatmap(), which must be run first.

Parameters:

effect_names: List of interaction effects to include in plot, in format “Target-Ligand:Receptor” (for L:R models) or “Target-Ligand” (for ligand models).
position_key: Key in adata.obs or adata.obsm that provides a relative indication of the position of cells, i.e. spatial coordinates. Defaults to “spatial”. For each value in the position array (each coordinate, each category), multiple cells must have the same value.
coord_column: Optional, only used if “position_key” points to an entry in .obsm. In this case, this is the index or name of the column to be used to provide the positional context. Can also provide “xy”, “yz”, “xz”, “-xy”, “-yz”, “-xz” to draw a line between the two coordinate axes. “xy” will extend the new axis in the direction of increasing x and increasing y starting from x=0 and y=0 (or min. x/min. y), “-xy” will extend the new axis in the direction of decreasing x and increasing y starting from x=minimum x and y=maximum y, and so on.
max_coord_val: Optional, can be used to adjust the numbers displayed along the x-axis for the relative position along the coordinate axis. Defaults to 1.0.
title: Optional, can be used to provide title for plot
x_label: Optional, can be used to provide x-axis label for plot
region_lower_bound: Optional, can be used to provide a lower bound for the region of interest to label on the plot- this can correspond to a spatial domain, etc.
region_upper_bound: Optional, can be used to provide an upper bound for the region of interest to label on the plot- this can correspond to a spatial domain, etc.
region_label: Optional, can be used to provide a label for the region of interest to label on the plot
fontsize: Size of font for x and y labels.
figsize: Size of figure.
cmap: Colormap to use. Options: Any divergent matplotlib colormap.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axes and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: “scatter”, “dpi”: None, “ext”: “pdf”, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.

visualize_effect_specificity(agg_method: Literal[mean, percentage] = 'mean', plot_type: Literal[heatmap, volcano] = 'heatmap', target_subset: Optional[List[str]] = None, interaction_subset: Optional[List[str]] = None, ct_subset: Optional[List[str]] = None, group_key: Optional[str] = None, n_anchors: Optional[int] = None, effect_threshold: Optional[float] = None, use_significant: bool = False, target_cooccurrence_threshold: float = 0.1, significance_cutoff: float = 1.3, fold_change_cutoff: float = 1.5, fold_change_cutoff_for_labels: float = 3.0, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'seismic', save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {}, save_df: bool = False)[source]#

Computes and visualizes the specificity of each interaction on each target. This is done by first separating the target-expressing cells (and their neighbors) from the rest of the cells (conditioned on predicted effect and also conditioned on receptor expression if L:R model is used). Then, computing the fold change of the average expression of the ligand in the neighborhood of the first subset vs. the neighborhoods of the second subset.

Parameters:

agg_method: Method to use for aggregating the specificity of each interaction on each target. Options: “mean” for mean ligand expression, “percentage” for the percentage of cells expressing the ligand.
plot_type: Type of plot to use for visualization. Options: “heatmap” for heatmap, “volcano” for volcano plot.
target_subset: List of targets to consider. If None, will use all targets used in model fitting.
interaction_subset: List of interactions to consider. If None, will use all interactions used in model.
ct_subset: Can be used to constrain the first group of cells (the query group) to the target-expressing cells of a particular type (conditioned on any other relevant variables). If given, will search for cell types in “group_key” attribute from model initialization. If not given, will use all cell types.
group_key: Can be used to specify entry in adata.obs that contains cell type groupings. If None, will use the group_key attribute from model initialization.
n_anchors: Optional, number of target gene-expressing cells to use as anchors for analysis. Will be selected randomly from the set of target gene-expressing cells (conditioned on any other relevant values).
effect_threshold: Optional threshold minimum effect size to consider an effect for further analysis, as an absolute value. Use this to choose only the cells for which an interaction is predicted to have a strong effect. If None, use the median interaction effect.
use_significant: Whether to use only significant effects in computing the specificity. If True, will filter to cells + interactions where the interaction is significant for the target. Only valid if compute_coeff_significance() has been run.
significance_cutoff: Cutoff for negative log-10 q-value to consider an interaction/effect significant. Only used if “plot_type” is “volcano”. Defaults to 1.3 (corresponding to an approximate q-value of 0.05).
fold_change_cutoff: Cutoff for fold change to consider an interaction/effect significant. Only used if “plot_type” is “volcano”. Defaults to 1.5.
fold_change_cutoff_for_labels: Cutoff for fold change to include the label for an interaction/effect. Only used if “plot_type” is “volcano”. Defaults to 3.0.
fontsize: Size of font for x and y labels.
figsize: Size of figure.
cmap: Colormap to use. Options: Any divergent matplotlib colormap.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axes and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: “scatter”, “dpi”: None, “ext”: “pdf”, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.
save_df: Set True to save the metric dataframe in the end
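
The specificity metric and the volcano-plot cutoffs can be sketched as below. The two aggregation modes follow the agg_method description, and the default cutoffs match the signature; the pseudocount, the one-sided comparison of the raw fold change, the function names, and the example values are assumptions made for illustration.

```python
import math
from typing import List


def specificity_fold_change(
    query_neighborhood_expr: List[float],
    background_neighborhood_expr: List[float],
    agg_method: str = "mean",
    pseudocount: float = 1e-6,  # assumed guard against division by zero
) -> float:
    """Fold change of aggregated ligand expression: query vs. background neighborhoods."""
    def aggregate(values: List[float]) -> float:
        if agg_method == "mean":
            return sum(values) / len(values)
        if agg_method == "percentage":
            return 100.0 * sum(1 for v in values if v > 0) / len(values)
        raise ValueError(f"Unknown agg_method: {agg_method!r}")

    query = aggregate(query_neighborhood_expr)
    background = aggregate(background_neighborhood_expr)
    return (query + pseudocount) / (background + pseudocount)


def passes_volcano_cutoffs(
    fold_change: float,
    qvalue: float,
    significance_cutoff: float = 1.3,
    fold_change_cutoff: float = 1.5,
) -> bool:
    """Check the negative log-10 q-value and the fold change against the cutoffs."""
    return -math.log10(qvalue) >= significance_cutoff and fold_change >= fold_change_cutoff


# Hypothetical ligand expression in neighborhoods of the two cell subsets.
query_expr = [2.0, 4.0, 0.0, 2.0]
background_expr = [1.0, 0.0, 1.0, 2.0]
fc = specificity_fold_change(query_expr, background_expr, agg_method="mean")
print(fc, passes_volcano_cutoffs(fc, qvalue=0.01))
```

The default significance_cutoff of 1.3 corresponds to a q-value of roughly 0.05, since the negative log-10 of 0.05 is about 1.301.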

visualize_neighborhood(target: str, interaction: str, interaction_type: Literal[secreted, membrane-bound], select_examples_criterion: Literal[positive, negative] = 'positive', effect_threshold: float | None = None, cell_type: str | None = None, group_key: str | None = None, use_significant: bool = False, n_anchors: int = 100, n_neighbors_expressing: int = 20, display_plot: bool = True) → anndata.AnnData[source]#

Sets up AnnData object for visualization of interaction effects- cells will be colored by expression of the target gene, potentially conditioned on receptor expression, and neighboring cells will be colored by ligand expression.

Parameters:

target: Target gene of interest
interaction: Interaction feature to visualize, given in the same form as in the design matrix (if model is a ligand-based model or receptor-based model, this will be of form “Col4a1”. If model is a ligand-receptor based model, this will be of form “Col4a1:Itgb1”, for example).
interaction_type: Specifies whether the chosen interaction is secreted or membrane-bound. Options: “secreted” or “membrane-bound”.
select_examples_criterion: Whether to select cells with positive or negative interaction effects for visualization. Defaults to “positive”, which searches for cells for which the predicted interaction effect is above the given threshold. “negative” will select cells for which the predicted interaction has no effect on the target expression.
effect_threshold: Optional threshold for the effect size of an interaction/effect to be considered for analysis; only used if “to_plot” is “percentage”. If not given, will use the upper quartile value among all interaction effect values to determine the threshold.
cell_type: Optional, can be used to select anchor cells from only a particular cell type. If None, will select from all cells.
group_key: Can be used to specify entry in adata.obs that contains cell type groupings. If None, will use the group_key attribute from model initialization. Only used if “cell_type” is not None.
use_significant: Whether to use only significant effects in computing the specificity. If True, will filter to cells + interactions where the interaction is significant for the target. Only valid if compute_coeff_significance() has been run.
n_anchors: Number of target gene-expressing cells to use as anchors for visualization. Will be selected randomly from the set of target gene-expressing cells.
n_neighbors_expressing: Filters the set of cells that can be selected as anchors based on the number of their neighbors that express the chosen ligand. Only used for models that incorporate ligand expression.
display_plot: Whether to save a plot. If False, will return the AnnData object without doing anything else- this can then be visualized e.g. using spateo-viewer.

Returns:

adata: Modified AnnData object containing the expression information for the target gene and neighboring ligand expression.
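
The interplay of n_anchors and n_neighbors_expressing can be sketched as a two-step selection: restrict to target-expressing cells with enough ligand-expressing neighbors, then sample at random. The function name, the fixed seed, the inclusive comparison, and the example inputs are assumptions; the library's selection logic may differ.

```python
import random
from typing import Dict, List


def select_anchors(
    target_expr: Dict[str, float],
    n_ligand_expressing_neighbors: Dict[str, int],
    n_anchors: int = 100,
    n_neighbors_expressing: int = 20,
    seed: int = 0,  # fixed seed only to make the sketch reproducible
) -> List[str]:
    """Pick anchor cells that express the target and have enough ligand-expressing neighbors."""
    candidates = [
        cell
        for cell, expr in target_expr.items()
        if expr > 0 and n_ligand_expressing_neighbors.get(cell, 0) >= n_neighbors_expressing
    ]
    if len(candidates) <= n_anchors:
        return candidates
    return random.Random(seed).sample(candidates, n_anchors)


# Hypothetical target expression and ligand-expressing neighbor counts per cell.
target_expr = {"a": 1.0, "b": 0.0, "c": 2.0, "d": 3.0}
neighbor_counts = {"a": 25, "b": 30, "c": 5, "d": 22}
print(select_anchors(target_expr, neighbor_counts, n_anchors=5))
```

In the example, cell "b" is excluded because it does not express the target, and cell "c" because too few of its neighbors express the ligand.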

cell_type_specific_interactions(to_plot: Literal[mean, percentage] = 'mean', plot_type: Literal[heatmap, barplot] = 'heatmap', group_key: Optional[str] = None, ct_subset: Optional[List[str]] = None, target_subset: Optional[List[str]] = None, interaction_subset: Optional[List[str]] = None, lower_threshold: float = 0.3, upper_threshold: float = 1.0, effect_threshold: Optional[float] = None, use_significant: bool = False, row_normalize: bool = False, col_normalize: bool = False, normalize_targets: bool = False, hierarchical_cluster_ct: bool = False, group_y_cell_type: bool = False, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, center: Optional[float] = None, cmap: str = 'Reds', save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {}, save_df: bool = False)[source]#

Map interactions and interaction effects that are specific to particular cell type groupings. Returns a heatmap representing the enrichment of the interaction/effect within cells of that grouping (if “to_plot” is effect, this will be enrichment of the effect on cell type-specific expression). Enrichment determined by mean effect size or expression.

Parameters:

to_plot: Whether to plot the mean effect size or the proportion of cells in a cell type w/ effect on target. Options are “mean” or “percentage”.
plot_type: Whether to plot the results as a heatmap or barplot. Options are “heatmap” or “barplot”. If “barplot”, must provide a subset of up to four interactions to visualize.
group_key: Can be used to specify entry in adata.obs that contains cell type groupings. If None, will use the group_key attribute from model initialization.
ct_subset: Can be used to restrict the enrichment analysis to only cells of a particular type. If given, will search for cell types in “group_key” attribute from model initialization. Recommended to use to subset to cell types with sufficient numbers.
target_subset: List of targets to consider. If None, will use all targets used in model fitting.
interaction_subset: List of interactions to consider. If None, will use all interactions used in model. Is necessary if “plot_type” is “barplot”, since the barplot is only designed to accommodate up to three interactions at once.
lower_threshold: Lower threshold for the proportion of cells in a cell type group that must express a particular interaction/effect for it to be colored on the plot, as a proportion of the max value. Threshold will be applied to the non-normalized values (if normalization is applicable). Defaults to 0.3.
upper_threshold: Upper threshold for the proportion of cells in a cell type group that must express a particular interaction/effect for it to be colored on the plot, as a proportion of the max value. Threshold will be applied to the non-normalized values (if normalization is applicable). Defaults to 1.0 (the max value).
effect_threshold: Optional threshold for the effect size of an interaction/effect to be considered for analysis; only used if “to_plot” is “percentage”. If not given, will use the upper quartile value among all interaction effect values to determine the threshold.
use_significant: Whether to use only significant effects in computing the specificity. If True, will filter to cells + interactions where the interaction is significant for the target. Only valid if compute_coeff_significance() has been run.
row_normalize: Whether to minmax scale the metric values by row (i.e. for each interaction/effect). Helps to alleviate visual differences that result from scale rather than differences in mean value across cell types.
col_normalize: Whether to minmax scale the metric values by column (i.e. for each interaction/effect). Helps to alleviate visual differences that result from scale rather than differences in mean value across cell types.
normalize_targets: Whether to minmax scale the metric values by column for each target (i.e. for each interaction/effect), to remove differences that occur as a result of scale of expression. Provides a clearer picture of enrichment for each target.
hierarchical_cluster_ct: Whether to cluster the x-axis (target gene in cell type) using hierarchical clustering. If False, will order the x-axis by the order of the target genes for organization purposes.
group_y_cell_type: Whether to group the y-axis (target gene in cell type) by cell type. If False, will group by target gene instead. Defaults to False.
fontsize: Size of font for x and y labels.
figsize: Size of figure.
center: Optional, determines position of the colormap center. Between 0 and 1.
cmap: Colormap to use for heatmap. If metric is “number”, “proportion”, “specificity”, the bottom end of the range is 0. It is recommended to use a sequential colormap (e.g. “Reds”, “Blues”, “Viridis”, etc.). For metric = “fc”, if a divergent colormap is not provided, “seismic” will automatically be used.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.
save_df: Set True to save the metric dataframe at the end.
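
The difference between the “row_normalize” and “col_normalize” options can be illustrated with a minimal sketch of min-max scaling along each axis. This is not the library's implementation; the helper `minmax_scale` and the example table are hypothetical, and mapping constant rows or columns to 0 is an assumption made here to avoid division by zero.

```python
import numpy as np


def minmax_scale(values: np.ndarray, axis: int) -> np.ndarray:
    """Min-max scale a 2D array to [0, 1] along the given axis.

    axis=1 scales each row independently; axis=0 scales each column.
    Constant rows/columns are mapped to 0 (an assumption of this sketch).
    """
    lo = values.min(axis=axis, keepdims=True)
    hi = values.max(axis=axis, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (values - lo) / span


# Hypothetical metric table; the row and column meanings are illustrative only.
metric = np.array([[1.0, 3.0, 5.0],
                   [10.0, 20.0, 40.0]])

row_scaled = minmax_scale(metric, axis=1)  # each row spans 0..1
col_scaled = minmax_scale(metric, axis=0)  # each column spans 0..1
```

After row scaling, both rows run from 0 to 1 even though the second row holds much larger raw values, which is the scale effect the normalization options are meant to remove.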

cell_type_interaction_fold_change(ref_ct: str, query_ct: str, group_key: Optional[str] = None, target_subset: Optional[List[str]] = None, interaction_subset: Optional[List[str]] = None, to_plot: Literal[mean, percentage] = 'mean', plot_type: Literal[volcano, barplot] = 'barplot', source_data: Literal[interaction, effect, target] = 'effect', top_n_to_plot: Optional[int] = None, significance_cutoff: float = 1.3, fold_change_cutoff: float = 1.5, fold_change_cutoff_for_labels: float = 3.0, plot_query_over_ref: bool = False, plot_ref_over_query: bool = False, plot_only_significant: bool = False, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'seismic', save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {}, save_df: bool = False)[source]#

Computes fold change in predicted interaction effects between two cell types, and visualizes result.

Parameters:

ref_ct: Label of the first cell type to consider. Fold change will be computed with respect to the level in this cell type.
query_ct: Label of the second cell type to consider
group_key: Name of the key in .obs containing cell type information. If not given, will use :attr group_key from model initialization.
target_subset: List of targets to consider. If None, will use all targets used in model fitting.
interaction_subset: List of interactions to consider. If None, will use all interactions used in model.
to_plot: Whether to plot the mean effect size or the proportion of cells in a cell type with an effect on the target. Options are “mean” or “percentage”.
plot_type: Whether to plot the results as a volcano plot or barplot. Options are “volcano” or “barplot”.
source_data: Selects what to use in computing fold changes. Options: “interaction” (uses the design matrix, e.g. neighboring ligand expression or L:R mapping), “effect” (uses the coefficient arrays for each target), or “target” (uses the target gene expression).
top_n_to_plot: If given, will only include the top n features in the visualization. Recommended if “source_data” is “effect”, as all combinations of interaction and target will be considered in this case.
significance_cutoff: Cutoff for negative log-10 q-value to consider an interaction/effect significant. Only used if “plot_type” is “volcano”. Defaults to 1.3 (corresponding to an approximate q-value of 0.05).
fold_change_cutoff: Cutoff for fold change to consider an interaction/effect significant. Only used if “plot_type” is “volcano”. Defaults to 1.5.
fold_change_cutoff_for_labels: Cutoff for fold change to include the label for an interaction/effect. Only used if “plot_type” is “volcano”. Defaults to 3.0.
plot_query_over_ref: Whether to plot/visualize only the portion that corresponds to the fold change of the query cell type over the reference cell type (and the portion that is significant). If False (and “plot_ref_over_query” is False), will plot the entire volcano plot. Only used if “plot_type” is “volcano”.
plot_ref_over_query: Whether to plot/visualize only the portion that corresponds to the fold change of the reference cell type over the query cell type (and the portion that is significant). If False (and “plot_query_over_ref” is False), will plot the entire volcano plot. Only used if “plot_type” is “volcano”.
plot_only_significant: Whether to plot/visualize only the portion that passes the “significance_cutoff” threshold (given as a negative log-10 q-value). Only used if “plot_type” is “volcano”.
fontsize: Size of font for x and y labels.
figsize: Size of figure.
cmap: Colormap to use for heatmap. If metric is “number”, “proportion”, “specificity”, the bottom end of the range is 0. It is recommended to use a sequential colormap (e.g. “Reds”, “Blues”, “Viridis”, etc.). For metric = “fc”, if a divergent colormap is not provided, “seismic” will automatically be used.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.
save_df: Set True to save the metric dataframe at the end.
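
To make the volcano-plot cutoffs concrete, the sketch below shows how “significance_cutoff” and “fold_change_cutoff” could partition features. It is an illustration under stated assumptions, not the library's code: the function `classify_volcano` is hypothetical, fold change is assumed to be the query/reference ratio, the reference-enriched side is assumed to use the reciprocal cutoff, and the comparisons are assumed to be inclusive.

```python
import numpy as np


def classify_volcano(fold_change, q_values,
                     significance_cutoff=1.3, fold_change_cutoff=1.5):
    """Label each feature by the volcano-plot region it falls in.

    Assumes fold_change = query / reference (a choice made for this sketch).
    """
    fold_change = np.asarray(fold_change, dtype=float)
    neg_log10_q = -np.log10(np.asarray(q_values, dtype=float))

    significant = neg_log10_q >= significance_cutoff
    labels = np.full(fold_change.shape, "ns", dtype=object)
    labels[significant & (fold_change >= fold_change_cutoff)] = "query_over_ref"
    labels[significant & (fold_change <= 1.0 / fold_change_cutoff)] = "ref_over_query"
    return neg_log10_q, labels


# Hypothetical fold changes and q-values for four features.
neg_log10_q, labels = classify_volcano(
    fold_change=[2.0, 0.5, 1.1, 3.0],
    q_values=[0.01, 0.001, 0.01, 0.2],
)

# The default cutoff of 1.3 is close to -log10(0.05).
default_cutoff_as_q = 10 ** (-1.3)
```

The last feature has a large fold change but a q-value of 0.2, so it stays unlabeled as significant; this is the case that “plot_only_significant” would hide.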

enriched_interactions_barplot(interactions: Optional[Union[str, List[str]]] = None, targets: Optional[Union[str, List[str]]] = None, plot_type: Literal[average, proportion] = 'average', effect_size_threshold: float = 0.0, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'Reds', top_n: Optional[int] = None, save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {})[source]#

Visualize the top predicted effect sizes for each interaction on particular target gene(s).

Parameters:

interactions: Optional subset of interactions to focus on, given in the form ligand(s):receptor(s), following the formatting in the design matrix. If not given, will consider all interactions that were specified in model fitting.
targets: Can optionally specify a subset of the targets to compute this on. If not given, will use all targets that were specified in model fitting. If multiple targets are given, “save_show_or_return” should be “save” (and provide appropriate keyword arguments for saving using “save_kwargs”), otherwise only the last target will be shown.
plot_type: Options: “average” or “proportion”. Whether to plot the average effect size or the proportion of cells expressing the target predicted to be affected by the interaction.
effect_size_threshold: Lower bound for average effect size to include a particular interaction in the barplot
fontsize: Size of font for x and y labels
figsize: Size of figure
cmap: Colormap to use for barplot. It is recommended to use a sequential colormap (e.g. “Reds”, “Blues”, “Viridis”, etc.).
top_n: If given, will only include the top n features in the visualization. If not given, will include all features that pass the “effect_size_threshold”.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.
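
The two “plot_type” options summarize the same coefficients in different ways. The following sketch shows one way the “average” and “proportion” summaries could be computed from a cells-by-interactions coefficient array. The function `summarize_effects`, the input layout, and treating a nonzero coefficient as "affected" are all assumptions for illustration, not the library's method.

```python
import numpy as np


def summarize_effects(coeffs, target_expr, effect_size_threshold=0.0):
    """Summarize per-interaction effects among cells expressing the target.

    coeffs: array of shape [n_cells, n_interactions] (assumed layout).
    target_expr: array of shape [n_cells,] with target expression.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    expressing = np.asarray(target_expr) > 0
    sub = coeffs[expressing]

    average = sub.mean(axis=0)            # "average" effect size
    proportion = (sub != 0).mean(axis=0)  # "proportion" of affected cells
    keep = average > effect_size_threshold
    return average, proportion, keep


# Hypothetical data: 4 cells, 2 interactions; the last cell lacks the target.
coeffs = [[0.5, 0.0],
          [0.0, 0.0],
          [1.5, 0.2],
          [9.0, 9.0]]
target_expr = [1, 2, 3, 0]

average, proportion, keep = summarize_effects(coeffs, target_expr, 0.1)
```

Here the first interaction passes the threshold of 0.1 while the second does not, so only the first would appear in a barplot built from these values.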

enriched_tfs_barplot(tfs: Optional[Union[str, List[str]]] = None, targets: Optional[Union[str, List[str]]] = None, target_type: Literal[ligand, receptor, target_gene] = 'target_gene', plot_type: Literal[average, proportion] = 'average', effect_size_threshold: float = 0.0, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'Reds', top_n: Optional[int] = None, save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {})[source]#

Visualize the top predicted effect sizes for each transcription factor on particular target gene(s).

Parameters:

tfs: Optional subset of transcription factors to focus on. If not given, will consider all transcription factors that were specified in model fitting.
targets: Can optionally specify a subset of the targets to compute this on. If not given, will use all targets that were specified in model fitting. If multiple targets are given, “save_show_or_return” should be “save” (and provide appropriate keyword arguments for saving using “save_kwargs”), otherwise only the last target will be shown.
target_type: Set whether the given targets are ligands, receptors or target genes. Used to determine which folder to check for outputs.
plot_type: Options: “average” or “proportion”. Whether to plot the average effect size or the proportion of cells expressing the target predicted to be affected by the interaction.
effect_size_threshold: Lower bound for average effect size to include a particular interaction in the barplot
fontsize: Size of font for x and y labels
figsize: Size of figure
cmap: Colormap to use for barplot. It is recommended to use a sequential colormap (e.g. “Reds”, “Blues”, “Viridis”, etc.).
top_n: If given, will only include the top n features in the visualization. If not given, will include all features that pass the “effect_size_threshold”.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.
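
As an illustration of how “save_kwargs” overrides interact with the listed defaults, the sketch below merges a user dictionary over the default values quoted in the parameter description. How save_fig actually combines them is not stated here, so the merge in `resolve_save_kwargs` (a hypothetical helper) is an assumption.

```python
# Default values as listed in the save_kwargs parameter description.
default_save_kwargs = {
    "path": None,
    "prefix": "scatter",
    "dpi": None,
    "ext": "pdf",
    "transparent": True,
    "close": True,
    "verbose": True,
}


def resolve_save_kwargs(save_kwargs=None):
    """Return the defaults with user-supplied keys taking precedence (assumed behavior)."""
    resolved = dict(default_save_kwargs)
    resolved.update(save_kwargs or {})
    return resolved


# Hypothetical overrides: a 300 dpi PNG with a custom file prefix.
custom = resolve_save_kwargs({"prefix": "tf_effects", "ext": "png", "dpi": 300})
```

Only the keys that need to change have to be supplied; under this assumption the remaining keys keep their default values.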

partial_correlation_interactions(interactions: Optional[Union[str, List[str]]] = None, targets: Optional[Union[str, List[str]]] = None, method: Literal[pearson, spearman] = 'pearson', filter_interactions_proportion_threshold: Optional[float] = None, plot_zero_threshold: Optional[float] = None, ignore_outliers: bool = True, alternative: Literal[two-sided, less, greater] = 'two-sided', fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, center: Optional[float] = None, cmap: str = 'Reds', save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {}, save_df: bool = False)[source]#

Repression is more difficult to infer from single-cell data, so this function computes semi-partial correlations to shed light on interactions that may be overall repressive. For a given interaction-target pair, all other interactions are used as covariates in a semi-partial correlation: their effects are removed from the target, but not from the interaction of interest, since interactions should be more independent of each other than they are of the target.

Parameters:

interactions: Optional, given in the form ligand(s):receptor(s), following the formatting in the design matrix. If not given, will use all interactions that were specified in model fitting.
targets: Can optionally specify a subset of the targets to compute this on. If not given, will use all targets that were specified in model fitting.
method: Correlation type. Options: “pearson” (Pearson \(r\) product-moment correlation) or “spearman” (Spearman \(\rho\) rank-order correlation).
filter_interactions_proportion_threshold: Optional. If given, will filter out beforehand any interactions that are predicted to occur in fewer than this proportion of cells (to reduce the number of computations).
plot_zero_threshold: Optional, if given, will mask out values below this threshold in the heatmap (will keep the interactions in the dataframe, just will not color the elements in the plot). Can also be used together with filter_interactions_proportion_threshold.
ignore_outliers: Whether to ignore extremely high values for target gene expression when computing partial correlations
alternative: Defines the alternative hypothesis, or tail of the partial correlation. Must be one of “two-sided” (default), “greater” or “less”. Both “greater” and “less” return a one-sided p-value. “greater” tests against the alternative hypothesis that the partial correlation is positive (greater than zero), “less” tests against the hypothesis that the partial correlation is negative.
fontsize: Size of font for x and y labels
figsize: Size of figure
center: Optional, determines position of the colormap center. Between 0 and 1.
cmap: Colormap to use for heatmap. If metric is “number”, “proportion”, “specificity”, the bottom end of the range is 0. It is recommended to use a sequential colormap (e.g. “Reds”, “Blues”, “Viridis”, etc.). For metric = “fc”, if a divergent colormap is not provided, “seismic” will automatically be used.
save_show_or_return: Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that properly modifies those keys according to your needs.
save_df: Set True to save the metric dataframe at the end.
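
The semi-partial correlation described for this method can be sketched with plain NumPy: the covariates are regressed out of the target only, and the residual is then correlated with the interaction of interest. This is a minimal Pearson-only illustration on simulated data; the function `semipartial_corr` is hypothetical and the library may rely on a different routine.

```python
import numpy as np


def semipartial_corr(x, y, covariates):
    """Pearson correlation between x and the part of y not explained by covariates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with an intercept plus the covariates.
    Z = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    y_resid = y - Z @ beta  # covariate effects removed from the target only
    return float(np.corrcoef(x, y_resid)[0, 1])


# Simulated example: x represses the target, two other interactions activate it.
rng = np.random.default_rng(0)
n = 500
other = rng.normal(size=(n, 2))  # other interactions, used as covariates
x = rng.normal(size=n)           # interaction of interest
y = 2.0 * other[:, 0] + other[:, 1] - 0.8 * x + rng.normal(scale=0.5, size=n)

r_raw = float(np.corrcoef(x, y)[0, 1])
r_semipartial = semipartial_corr(x, y, other)
```

In this simulation the repressive relationship is much clearer after the activating interactions are accounted for, which is the motivation given above for using covariates.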

get_effect_potential(target: str | None = None, ligand: str | None = None, receptor: str | None = None, sender_cell_type: str | None = None, receiver_cell_type: str | None = None, spatial_weights_membrane_bound: numpy.ndarray | scipy.sparse.spmatrix | None = None, spatial_weights_secreted: numpy.ndarray | scipy.sparse.spmatrix | None = None, spatial_weights_niche: numpy.ndarray | scipy.sparse.spmatrix | None = None, store_summed_potential: bool = True) → Tuple[scipy.sparse.spmatrix, numpy.ndarray, numpy.ndarray][source]#

For each cell, computes the ‘signaling effect potential’, interpreted as a quantification of the strength of effect of intercellular communication on downstream expression in a given cell mediated by any given other cell with any combination of ligands and/or cognate receptors, as inferred from the model results. Computations are similar to those of :func ~`.inferred_effect_direction`, but stop short of computing vector fields.

Parameters:

target: Optional string to select target from among the genes used to fit the model to compute signaling effects for. Note that this function takes only one target at a time. If not given, will take the first name from among all targets.
ligand: Needed if :attr mod_type is ‘ligand’; select ligand from among the ligands used to fit the model to compute signaling potential.
receptor: Needed if :attr mod_type is ‘lr’; together with ‘ligand’, used to select ligand-receptor pair from among the ligand-receptor pairs used to fit the model to compute signaling potential.
sender_cell_type: Can optionally be used to select cell type from among the cell types used to fit the model to compute sent potential. Must be given if :attr mod_type is ‘niche’.
receiver_cell_type: Can optionally be used to condition sent potential on receiver cell type.
store_summed_potential: If True, will store both sent and received signaling potential as entries in .obs of the AnnData object.

Returns:

effect_potential: Sparse array of shape [n_samples, n_samples]; proxy for the “signaling effect potential” with respect to a particular target gene between each sender-receiver pair of cells.
normalized_effect_potential_sum_sender: Array of shape [n_samples,]; for each sending cell, the sum of the signaling potential to all receiver cells for a given target gene, normalized between 0 and 1.
normalized_effect_potential_sum_receiver: Array of shape [n_samples,]; for each receiving cell, the sum of the signaling potential from all sender cells for a given target gene, normalized between 0 and 1.
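
The relationship between the pairwise potential array and the two summed outputs can be sketched as follows. This is an illustration only: the helper `summed_potentials` is hypothetical, rows are assumed to index senders and columns receivers, and min-max scaling is assumed as the way values are "normalized between 0 and 1".

```python
import numpy as np


def summed_potentials(effect_potential):
    """Sum a pairwise potential array per sender and per receiver, scaled to [0, 1].

    Works for dense arrays and for scipy sparse matrices, since both expose .sum(axis=...).
    """
    sent = np.asarray(effect_potential.sum(axis=1)).ravel()      # row sums (assumed senders)
    received = np.asarray(effect_potential.sum(axis=0)).ravel()  # column sums (assumed receivers)

    def to_unit(v):
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)

    return to_unit(sent), to_unit(received)


# Hypothetical 3-cell example; entry [i, j] is the potential from cell i to cell j.
potential = np.array([[0.0, 2.0, 0.0],
                      [0.0, 0.0, 1.0],
                      [3.0, 0.0, 0.0]])

sent, received = summed_potentials(potential)
```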

For each cell, computes the ‘pathway effect potential’, which is an aggregation of the effect potentials of all pathway member ligand-receptor pairs (or all pathway member ligands, for ligand-only models).

Parameters:

pathway: Name of pathway to compute pathway effect potential for.
target: Optional string to select target from among the genes used to fit the model to compute signaling effects for. Note that this function takes only one target at a time. If not given, will take the first name from among all targets.
spatial_weights_secreted: Optional pairwise spatial weights matrix for secreted factors
spatial_weights_membrane_bound: Optional pairwise spatial weights matrix for membrane-bound factors
store_summed_potential: If True, will store both sent and received signaling potential as entries in .obs of the AnnData object.

Returns:

pathway_sum_potential: Array of shape [n_samples, n_samples]; proxy for the combined “signaling effect potential” with respect to a particular target gene for ligand-receptor pairs in a pathway.
normalized_pathway_effect_potential_sum_sender: Array of shape [n_samples,]; for each sending cell, the sum of the pathway sum potential to all receiver cells for a given target gene, normalized between 0 and 1.
normalized_pathway_effect_potential_sum_receiver: Array of shape [n_samples,]; for each receiving cell, the sum of the pathway sum potential from all sender cells for a given target gene, normalized between 0 and 1.

inferred_effect_direction(targets: str | List[str] | None = None, compute_pathway_effect: bool = False)[source]#

For visualization purposes, used for models that consider ligand expression (:attr mod_type is ‘ligand’ or ‘lr’). For receptor models, assigning directionality is impossible, and for niche models it makes much less sense to draw/compute a vector field. Constructs spatial vector fields to infer the directionality of observed effects (the “sources” of the downstream expression).

Parts of this function are inspired by ‘communication_direction’ from COMMOT: https://github.com/zcang/COMMOT

Parameters:

targets: Optional string or list of strings to select targets from among the genes used to fit the model to compute signaling effects for. If not given, will use all targets.
compute_pathway_effect: Whether to compute the effect potential for each pathway in the model. If True, will collectively take the effect potential of all pathway components. If False, will compute effect potential for each individual signal.

define_effect_vf(effect_potential: scipy.sparse.spmatrix, normalized_effect_potential_sum_sender: numpy.ndarray, normalized_effect_potential_sum_receiver: numpy.ndarray, sig: str, target: str, max_val: float = 0.05)[source]#

Given the pairwise effect potential array, computes the effect vector field.

Parameters:

effect_potential: Sparse array containing computed effect potentials. Output from get_effect_potential().
normalized_effect_potential_sum_sender: Array containing the sum of the effect potentials sent by each cell. Output from get_effect_potential().
normalized_effect_potential_sum_receiver: Array containing the sum of the effect potentials received by each cell. Output from get_effect_potential().
max_val: Constrains the size of the vector field vectors. Recommended to set within the order of magnitude of 1/100 of the desired plot dimensions.
sig: Label for the mediating interaction (e.g. name of a ligand, name of a ligand-receptor pair, etc.)
target: Name of the target that the vector field describes the effect for
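
One way to turn pairwise potentials into a per-cell vector field is sketched below. It is a simplified illustration and not the library's algorithm: each sender's vector is taken as the potential-weighted sum of unit directions toward its receivers, and “max_val” is assumed to set the length of the longest vector. The function `sender_vector_field` and the dense inputs are hypothetical; the pairwise arrays grow quadratically with the number of cells, so this is suitable only for small examples.

```python
import numpy as np


def sender_vector_field(coords, effect_potential, max_val=0.05):
    """Potential-weighted sum of unit directions from each sender to its receivers."""
    coords = np.asarray(coords, dtype=float)
    P = np.asarray(effect_potential, dtype=float)  # P[i, j]: sender i -> receiver j (assumed)

    diff = coords[None, :, :] - coords[:, None, :]  # diff[i, j] = coords[j] - coords[i]
    dist = np.linalg.norm(diff, axis=2, keepdims=True)
    unit = np.divide(diff, dist, out=np.zeros_like(diff), where=dist > 0)

    vf = (P[:, :, None] * unit).sum(axis=1)

    # Rescale so the longest vector has length max_val (assumed meaning of max_val).
    longest = np.linalg.norm(vf, axis=1).max()
    if longest > 0:
        vf = vf * (max_val / longest)
    return vf


# Hypothetical 2D example: only cell 0 sends, and only to cell 1.
coords = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
potential = np.zeros((3, 3))
potential[0, 1] = 1.0

vf = sender_vector_field(coords, potential, max_val=0.05)
```

The single nonzero vector points from cell 0 toward cell 1 along the x-axis with length 0.05.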

visualize_effect_vf_3D(interaction: str, target: str, vf_key: str | None = None, vector_magnitude_lower_bound: float = 0.0, manual_vector_scale_factor: float | None = None, bin_size: float | Tuple[float] | None = None, plot_cells: bool = True, cell_size: float = 1.0, alpha: float = 0.3, no_color_coding: bool = False, only_view_effect_region: bool = False, add_group_label: str | None = None, group_label_obs_key: str | None = None, title_position: Tuple[float, float] = (0.5, 0.9), save_path: str | None = None, **kwargs)[source]#

Visualize the directionality of the effect on target for a given interaction, overlaid onto the 3D spatial plot. Can only be used for models that use ligand expression (:attr mod_type is ‘ligand’ or ‘lr’).

Parameters:

interaction

Interaction to incorporate into the visualization (e.g. “Igf1:Igf1r” for L:R model, “Igf1” for ligand model)

target

Name of the target gene of interest. Will search key “spatial_effect_sender_vf_{interaction}_{target}” to create vector field plot.

vf_key

Optional key in .obsm to specify which vector field to use. If not given, will use the provided “interaction” and “target” to find the key specifying the vector field.

vector_magnitude_lower_bound

Lower bound for the magnitude of the vector field vectors to be plotted, as a fraction of the maximum vector magnitude. Defaults to 0.0.

manual_vector_scale_factor

If not None, will manually scale the vector field by this factor (multiplicatively). Used for visualization purposes, not recommended to set above 2.0 (otherwise likely to get misleading results with vectors that are too long).

bin_size

Optional, can be used to de-clutter plotting space by splitting the space into 3D bins and displaying one vector per bin. Can be given as a floating point number to create cubic bins, or as a tuple of floats to specify different bin sizes for each dimension. If not given, will plot one vector per cell. Defaults to None.

plot_cells

If False, will not plot any of the cells (unless a group label is given), so will only visualize vector field. Defaults to True.

cell_size

Size of the cells in the 3D plot. Defaults to 1.0.

alpha

If visualizing cells not affected by the interaction, this argument specifies the transparency of those cells.

no_color_coding

If True, will color all cells the same color (except cells of given category, if given).

only_view_effect_region

If True, will only plot the region where the effect is predicted to be found, rather than the entire 3D object

add_group_label

This optional argument represents a cell type category. Will color the cells belonging to this particular category orange. If given, it is recommended to also provide group_label_obs_key (which will be :attr group_key if not given).

group_label_obs_key

If add_group_label is given, this argument represents the observation key in the AnnData object that contains the group label. If not given, will default to :attr group_key.

title_position

Position of the title in the plot, given as a tuple of floats (i.e. (x, y)). Defaults to (0.5, 0.9).

save_path

Path to save the figure to (will save as HTML file)

kwargs

Additional arguments that can be passed to :func plotly.graph_objects.Cone. Common arguments:

- “colorscale”: Sets the colorscale. The colorscale must be an array containing arrays mapping a normalized value to an rgb, rgba, hex, hsl, hsv, or named color string.
- “sizemode”: Determines whether sizeref is set as a “scaled” (i.e. unitless) scalar (normalized by the max u/v/w norm in the vector field) or as an “absolute” value (in the same units as the vector field). Defaults to “scaled”.
- “sizeref”: The scalar reference for the cone size. The cone size is determined by its u/v/w norm multiplied by sizeref. Defaults to 2.0.
- “showscale”: Determines whether or not a colorbar is displayed for this trace.
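
The de-cluttering controlled by “bin_size” and “vector_magnitude_lower_bound” can be illustrated with a small NumPy sketch. The function `bin_vectors` is hypothetical, and averaging positions and vectors within each bin is an assumption about how one vector per bin might be chosen, not a description of the library's behavior.

```python
import numpy as np


def bin_vectors(coords, vectors, bin_size, magnitude_lower_bound=0.0):
    """Average positions and vectors within spatial bins (assumed aggregation)."""
    coords = np.asarray(coords, dtype=float)
    vectors = np.asarray(vectors, dtype=float)

    # Drop vectors shorter than the given fraction of the longest vector.
    norms = np.linalg.norm(vectors, axis=1)
    keep = norms >= magnitude_lower_bound * norms.max()
    coords, vectors = coords[keep], vectors[keep]

    # A scalar bin_size gives cubic bins; a tuple gives one size per dimension.
    sizes = np.broadcast_to(np.asarray(bin_size, dtype=float), (coords.shape[1],))
    bin_ids = np.floor((coords - coords.min(axis=0)) / sizes).astype(int)
    _, inverse = np.unique(bin_ids, axis=0, return_inverse=True)
    inverse = inverse.ravel()

    n_bins = inverse.max() + 1
    counts = np.bincount(inverse, minlength=n_bins)[:, None]
    binned_pos = np.zeros((n_bins, coords.shape[1]))
    binned_vec = np.zeros((n_bins, vectors.shape[1]))
    np.add.at(binned_pos, inverse, coords)   # sum positions per bin
    np.add.at(binned_vec, inverse, vectors)  # sum vectors per bin
    return binned_pos / counts, binned_vec / counts


# Hypothetical 3D example: two nearby cells share a bin, one cell is far away.
coords = [[0.0, 0.0, 0.0], [0.1, 0.1, 0.1], [2.0, 2.0, 2.0]]
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]

pos_all, vec_all = bin_vectors(coords, vectors, bin_size=1.0)
pos_big, vec_big = bin_vectors(coords, vectors, bin_size=1.0, magnitude_lower_bound=0.75)
```

With a lower bound of 0.75, only the vector whose magnitude is at least 75% of the maximum survives, so a single bin remains.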

CCI_deg_detection_setup(group_key: str | None = None, custom_tfs: List[str] | None = None, sender_receiver_or_target_degs: Literal[sender, receiver, target] = 'sender', use_ligands: bool = True, use_receptors: bool = False, use_pathways: bool = False, use_targets: bool = False, use_cell_types: bool = False, compute_dim_reduction: bool = False)[source]#

Computes differential expression signatures of cells with various levels of ligand expression.

Parameters:

group_key: Key to add to .obs of the AnnData object created by this function, containing cell type labels for each cell. If not given, will use :attr group_key.
custom_tfs: Optional list of transcription factors to make sure to be included in analysis. If given, these TFs will be included among the regulators regardless of the expression-based thresholding done in preprocessing.
sender_receiver_or_target_degs: Only makes a difference if ‘use_pathways’ or ‘use_cell_types’ is specified. Determines whether to compute DEGs for ligands, receptors or target genes. If ‘use_pathways’ is True, the value of this argument will determine whether ligands or receptors are used to define the model. Note that in either case, differential expression of TFs, binding factors, etc. will be computed in association with ligands/receptors/target genes (only valid if ‘use_cell_types’ and not ‘use_pathways’ is specified).
use_ligands: Use ligand array for differential expression analysis. Will take precedence over sender/receiver cell type if also provided.
use_receptors: Use receptor array for differential expression analysis. Will take precedence over sender/receiver cell type if also provided.
use_pathways: Use pathway array for differential expression analysis. Will use ligands in these pathways to collectively compute signaling potential score. Will take precedence over sender cell types if also provided.
use_targets: Use target array for differential expression analysis.
use_cell_types: Use cell types for differential expression analysis. If given, will preprocess/construct the necessary components to initialize cell type-specific models. Note: should be used alongside ‘use_ligands’, ‘use_receptors’, ‘use_pathways’ or ‘use_targets’ to select which molecules to investigate in each cell type.
compute_dim_reduction: Whether to compute PCA representation of the data subsetted to targets.

CCI_deg_detection(group_key: str, cci_dir_path: str, sender_receiver_or_target_degs: Literal[sender, receiver, target] = 'sender', use_ligands: bool = True, use_receptors: bool = False, use_pathways: bool = False, use_targets: bool = False, ligand_subset: List[str] | None = None, receptor_subset: List[str] | None = None, target_subset: List[str] | None = None, cell_type: str | None = None, use_dim_reduction: bool = False, **kwargs)[source]#

Downstream method that, when called, creates a separate instance of :class MuSIC specifically designed for the downstream task of detecting differentially expressed genes associated with ligand expression.

Parameters:

group_key: Key in adata.obs that corresponds to the cell type (or other grouping) labels
cci_dir_path: Path to directory containing all Spateo databases
sender_receiver_or_target_degs: Only makes a difference if ‘use_pathways’ or ‘use_cell_types’ is specified. Determines whether to compute DEGs for ligands, receptors or target genes. If ‘use_pathways’ is True, the value of this argument will determine whether ligands or receptors are used to define the model. Note that in either case, differential expression of TFs, binding factors, etc. will be computed in association with ligands/receptors/target genes (only valid if ‘use_cell_types’ and not ‘use_pathways’ is specified).
use_ligands: Use ligand array for differential expression analysis. Will take precedence over receptors and sender/receiver cell types if also provided. Should match the input to :func CCI_deg_detection_setup.
use_receptors: Use receptor array for differential expression analysis.
use_pathways: Use pathway array for differential expression analysis. Will use ligands in these pathways to collectively compute signaling potential score. Will take precedence over sender cell types if also provided. Should match the input to :func CCI_deg_detection_setup.
use_targets: Use target genes array for differential expression analysis.
ligand_subset: Subset of ligands to use for differential expression analysis. If not given, will use all ligands from the upstream model.
receptor_subset: Subset of receptors to use for differential expression analysis. If not given, will use all receptors from the upstream model.
target_subset: Subset of target genes to use for differential expression analysis. If not given, will use all target genes from the upstream model.
cell_type: Cell type to use for differential expression analysis. If given, will use the ligand/receptor subset obtained from :func ~`CCI_deg_detection_setup` and cells of the chosen cell type in the model.
use_dim_reduction: Whether to use PCA representation of the data to find nearest neighbors. If False, will instead use the Jaccard distance. Defaults to False. Note that this will ultimately fail if dimensionality reduction was not performed in :func ~`CCI_deg_detection_setup`.
kwargs: Keyword arguments for any of the Spateo argparse arguments. Should not include ‘adata_path’, ‘custom_lig_path’ & ‘ligand’ or ‘custom_pathways_path’ & ‘pathway’ (depending on whether ligands or pathways are being used for the analysis), and should not include ‘output_path’ (which will be determined by the output path used for the main model). Should also not include any of the other arguments for this function

Returns:

downstream_model: Fitted model instance that can be used for further downstream applications
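
When “use_dim_reduction” is False, neighbors are found with the Jaccard distance rather than a PCA representation. The sketch below shows a pairwise Jaccard distance on binarized expression profiles. It is an illustration only: the function `jaccard_distance_matrix` is hypothetical, and binarizing on expression greater than zero is an assumption about the input.

```python
import numpy as np


def jaccard_distance_matrix(expr):
    """Pairwise Jaccard distance between cells, based on which genes are detected."""
    present = (np.asarray(expr) > 0).astype(int)  # cells x genes, 1 if detected
    inter = present @ present.T                   # shared detected genes
    counts = present.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    # Two cells with no detected genes are treated as identical (similarity 1).
    sim = np.divide(inter, union, out=np.ones(inter.shape, dtype=float), where=union > 0)
    return 1.0 - sim


# Hypothetical expression for 3 cells and 4 genes.
expr = [[1, 0, 2, 0],
        [3, 0, 0, 0],
        [0, 4, 0, 5]]

dist = jaccard_distance_matrix(expr)

# Nearest neighbor of cell 0, excluding itself.
nearest_to_cell0 = int(np.argsort(dist[0])[1])
```

Cells 0 and 1 share one of two detected genes, giving a distance of 0.5, while cells 0 and 2 share none and are at the maximum distance of 1.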

deg_effect_barplot(target: str, interaction_subset: Optional[List[str]] = None, top_n_interactions: Optional[int] = None, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'Blues', save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {})[source]#

Visualize the proportion of cells expressing a particular target (ligand, receptor, or target gene involved in an upstream CCI model) that are predicted to be affected by each transcription factor, or that are predicted to be affected by each L:R pair/ligand.

Parameters:

target: Target gene
interaction_subset: Optional, can be used to specify subset of interactions (transcription factors, L:R pairs, etc.) to visualize, e.g. [“Sox2”, “Irx3”]. If not given, will default to all TFs, L:R pairs, etc.
top_n_interactions: Optional, can be used to specify the top n interactions (transcription factors, L:R pair, ligand, etc.) to visualize. If not given, will default to all TFs, L:R pairs, etc.
fontsize: Font size to determine size of the axis labels, ticks, title, etc.
figsize: Width and height of plotting window
cmap: Name of matplotlib colormap specifying colormap to use. Must be a sequential colormap.
save_show_or_return: Whether to save, show or return the figure. If “both”, the figure will be saved and shown. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary, and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise, you can provide a dictionary that modifies those keys as needed.
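
As a rough illustration of how “interaction_subset” and “top_n_interactions” narrow what is plotted, the sketch below ranks interactions by the proportion of target-expressing cells they are predicted to affect. The helper `select_interactions` and the example proportions are hypothetical and not part of Spateo; the actual method derives these proportions from the fitted model.

```python
from typing import Dict, List, Optional


def select_interactions(
    proportions: Dict[str, float],
    interaction_subset: Optional[List[str]] = None,
    top_n_interactions: Optional[int] = None,
) -> Dict[str, float]:
    """Hypothetical helper: choose which interactions would appear as bars."""
    # Restrict to the requested subset, if one was given
    if interaction_subset is not None:
        proportions = {k: v for k, v in proportions.items() if k in interaction_subset}
    # Rank by proportion of affected cells, largest first
    ranked = sorted(proportions.items(), key=lambda kv: kv[1], reverse=True)
    # Keep only the top n, if requested
    if top_n_interactions is not None:
        ranked = ranked[:top_n_interactions]
    return dict(ranked)


# Made-up proportions of target-expressing cells affected by each TF
example = {"Sox2": 0.42, "Irx3": 0.15, "Pax6": 0.61, "Nkx2-2": 0.08}
top_two = select_interactions(example, top_n_interactions=2)
subset_only = select_interactions(example, interaction_subset=["Sox2", "Irx3"])
print(top_two)
print(subset_only)
```

When neither argument is given, every interaction is kept, which matches the documented default of plotting all TFs, L:R pairs, etc.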

deg_effect_heatmap(target_subset: Optional[List[str]] = None, target_type: Literal[ligand, receptor, target_gene, tf_target] = 'target_gene', to_plot: Literal[proportion, specificity] = 'proportion', interaction_subset: Optional[List[str]] = None, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'magma', lower_proportion_threshold: float = 0.1, order_interactions: bool = False, order_targets: bool = False, remove_rows_and_cols_threshold: Optional[int] = None, save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {}, save_df: bool = False)[source]#

Visualize the proportion of cells expressing any target (ligand, receptor, or target gene involved in an upstream CCI model) that are predicted to be affected by each transcription factor, or that are predicted to be affected by each L:R pair/ligand, using a heatmap for visualization.

Parameters:

target_subset: Optional, can be used to specify subset of targets (ligands, receptors, target genes, or “TF_target” for target genes where the interaction to plot is TF effect) to visualize, e.g. [“Tubb1a”, “Tubb1b”]. If not given, will default to all targets.
target_type: Type of target gene to visualize. Must be one of “ligand”, “receptor”, “target_gene”, or “tf_target”. Defaults to “target_gene”. Used to specify where to search for the target genes to process.
to_plot: One of “proportion” or “specificity”. For “proportion”, plot the proportion of cells expressing the target that are affected by each interaction. For “specificity”, plot, out of the cells affected by each interaction, the proportion in which the interaction is predicted to affect the given target.
interaction_subset: Optional, can be used to specify subset of interactions (transcription factors, L:R pairs, etc.) to visualize, e.g. [“Sox2”, “Irx3”]. If not given, will default to all TFs, L:R pairs, etc.
fontsize: Font size to determine size of the axis labels, ticks, title, etc.
figsize: Width and height of plotting window
cmap: Name of matplotlib colormap specifying colormap to use. Must be a sequential colormap.
lower_proportion_threshold: Proportion threshold below which to set the proportion to 0 in the display. Defaults to 0.1.
order_interactions: Whether to hierarchically sort the y-axis/interactions (transcription factors, L:R pairs, etc.).
order_targets: Whether to hierarchically sort the x-axis/targets (ligands, receptors, target genes)
remove_rows_and_cols_threshold: Optional, can be used to specify the threshold for the number of nonzero interactions/TFs a row/column needs to be displayed. If not given, all rows and columns will be displayed.
save_show_or_return: Whether to save, show or return the figure. If “both”, the figure will be saved and shown. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary, and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise, you can provide a dictionary that modifies those keys as needed.
save_df: Set to True to save the metric dataframe at the end.
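
The two “to_plot” metrics and the effect of “lower_proportion_threshold” can be sketched with plain Python sets. This is one reading of the parameter descriptions, not Spateo's implementation: the data, the function names and the exact denominators are assumptions made for illustration.

```python
from typing import Dict, Set

# Made-up data: cells expressing each target, and cells in which each
# interaction is predicted to affect each target
expressing: Dict[str, Set[str]] = {
    "Tubb1a": {"c1", "c2", "c3", "c4"},
    "Tubb1b": {f"c{i}" for i in range(3, 23)},  # 20 cells: c3 .. c22
}
affected: Dict[str, Dict[str, Set[str]]] = {
    "Sox2": {"Tubb1a": {"c1", "c2", "c3"}, "Tubb1b": {"c3"}},
    "Irx3": {"Tubb1a": {"c4"}, "Tubb1b": {"c4", "c5", "c6", "c7"}},
}


def proportion(interaction: str, target: str) -> float:
    """Fraction of target-expressing cells affected by the interaction."""
    cells = expressing[target]
    return len(affected[interaction][target] & cells) / len(cells)


def specificity(interaction: str, target: str) -> float:
    """Of all cells affected by the interaction, the fraction where it affects this target."""
    any_target = set().union(*affected[interaction].values())
    if not any_target:
        return 0.0
    return len(affected[interaction][target]) / len(any_target)


def apply_lower_threshold(value: float, lower_proportion_threshold: float = 0.1) -> float:
    """Values below the threshold are displayed as 0."""
    return value if value >= lower_proportion_threshold else 0.0


heatmap = {
    i: {t: apply_lower_threshold(proportion(i, t)) for t in expressing}
    for i in affected
}
print(heatmap)
```

Here Sox2 affects only 1 of the 20 cells expressing Tubb1b (0.05), so that entry is zeroed by the default threshold of 0.1, while its specificity for Tubb1a is 1.0 because every Sox2-affected cell is affected through Tubb1a.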

top_target_barplot(interaction: str, target_subset: Optional[List[str]] = None, use_ligand_targets: bool = False, use_receptor_targets: bool = False, use_target_gene_targets: bool = True, top_n_targets: Optional[int] = None, fontsize: Union[None, int] = None, figsize: Union[None, Tuple[float, float]] = None, cmap: str = 'Blues', save_show_or_return: Literal[save, show, return, both, all] = 'show', save_kwargs: Optional[dict] = {})[source]#

Visualize the proportion of cells expressing each target (ligand, receptor, or target gene involved in an upstream CCI model) that are predicted to be affected by a given interaction, i.e. transcription factor, L:R pair/ligand.

Parameters:

interaction: The interaction to investigate, in the form specified in the design matrix, e.g. “Sox9” or “Igf1:Igf1r”.
target_subset: Optional, specify subset of target genes to visualize. If not given, defaults to all targets.
use_ligand_targets: Whether ligands should be used as targets, i.e. if “interaction” is a TF and the target genes being influenced by the TF are ligands. If True, will ignore “use_receptor_targets” and “use_target_gene_targets”.
use_receptor_targets: Whether receptors should be used as targets, i.e. if “interaction” is a TF and the target genes being influenced by the TF are receptors. If True, will ignore “use_target_gene_targets”.
use_target_gene_targets: Whether target genes should be used as targets, i.e. if “interaction” is a TF and the target genes being influenced by the TF are target genes (that are not ligands or receptors).
top_n_targets: Number of top targets to visualize. Defaults to 10.
fontsize: Font size to determine size of the axis labels, ticks, title, etc.
figsize: Width and height of plotting window
cmap: Name of matplotlib colormap specifying colormap to use. Must be a sequential colormap.
save_show_or_return: Whether to save, show or return the figure. If “both”, the figure will be saved and shown. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.
save_kwargs: A dictionary that will be passed to the save_fig function. By default it is an empty dictionary, and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise, you can provide a dictionary that modifies those keys as needed.
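
The three “use_*_targets” flags are described as having a fixed precedence: ligands override receptors, which override target genes. A minimal sketch of that precedence follows. The function `resolve_target_type` is hypothetical, and raising an error when all flags are False is an assumption, since the source does not say what happens in that case.

```python
def resolve_target_type(
    use_ligand_targets: bool = False,
    use_receptor_targets: bool = False,
    use_target_gene_targets: bool = True,
) -> str:
    """Hypothetical helper mirroring the documented flag precedence."""
    if use_ligand_targets:
        # Ignores the other two flags
        return "ligand"
    if use_receptor_targets:
        # Ignores use_target_gene_targets
        return "receptor"
    if use_target_gene_targets:
        return "target_gene"
    # Assumption: the source does not document the all-False case
    raise ValueError("At least one target type must be selected.")


print(resolve_target_type())
print(resolve_target_type(use_ligand_targets=True, use_receptor_targets=True))
print(resolve_target_type(use_receptor_targets=True))
```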

visualize_intercellular_network(lr_model_output_dir: str, target_subset: List[str] | str | None = None, top_n_targets: int | None = 3, ligand_subset: List[str] | str | None = None, receptor_subset: List[str] | str | None = None, regulator_subset: List[str] | str | None = None, include_tf_ligand: bool = False, include_tf_target: bool = True, cell_subset: List[str] | str | None = None, select_n_lr: int = 5, select_n_tf: int = 3, effect_size_threshold: float = 0.2, coexpression_threshold: float = 0.2, aggregate_method: Literal[mean, median, sum] = 'mean', cmap_neighbors: str = 'autumn', cmap_default: str = 'winter', scale_factor: float = 3, layout: Literal[random, circular, kamada, planar, spring, spectral, spiral] = 'planar', node_fontsize: int = 8, edge_fontsize: int = 8, arrow_size: int = 1, node_label_position: str = 'middle center', edge_label_position: str = 'middle center', upper_margin: float = 40, lower_margin: float = 20, left_margin: float = 50, right_margin: float = 50, title: str | None = None, save_path: str | None = None, save_id: str | None = None, save_ext: str = 'png', dpi: int = 300)[source]#

After fitting the model, construct and visualize the inferred intercellular regulatory network. Effect sizes (edge values) will be averaged over the cells specified by “cell_subset”; if not given, all cells will be used.

Parameters:

lr_model_output_dir

Path to directory containing the outputs of the L:R model. This function will assume `output_path` is the output path for the downstream model, i.e. connecting regulatory factors/TFs to ligands/receptors/targets.

target_subset

Optional, can be used to specify target genes downstream of signaling interactions of interest. If not given, will use all targets used for the model.

top_n_targets

Optional, can be used to specify the number of top targets to include in the network instead of providing full list of custom targets (“top” judged by fraction of the chosen subset of cells each target is expressed in).

ligand_subset

Optional, can be used to specify subset of ligands. If not given, will use all ligands present in any of the interactions for the model.

receptor_subset

Optional, can be used to specify subset of receptors. If not given, will use all receptors present in any of the interactions for the model.

regulator_subset

Optional, can be used to specify subset of regulators (transcription factors, etc.). If not given, will use all regulatory molecules used in fitting the downstream model(s).

include_tf_ligand

Whether to include TF-ligand interactions in the network. While providing more information, this can make it more difficult to interpret the plot. Defaults to False.

include_tf_target

Whether to include TF-target interactions in the network. While providing more information, this can make it more difficult to interpret the plot. Defaults to True.

cell_subset

Optional, can be used to specify subset of cells to use for averaging effect sizes. If not given, will use all cells. Can be either:

A list of cell IDs (must be in the same format as the cell IDs in the adata object)

Cell type label(s)

select_n_lr

Threshold for filtering out edges with low effect sizes, by selecting up to the top n L:R interactions per target (fewer can be selected if the top n are all zero). Default is 5.

select_n_tf

Threshold for filtering out edges with low effect sizes, by selecting up to the top n TFs. For TF-ligand edges, will select the top n for each receptor (with a theoretical maximum of n * number of receptors in the graph).

coexpression_threshold

For receptor-target, TF-ligand and TF-receptor links, only draw an edge if the two molecules in question are coexpressed in more than this threshold of cells.

aggregate_method

Only used when “include_tf_ligand” is True. For the TF-ligand array, each row will be replaced by the mean, median or sum of the neighboring rows. Defaults to “mean”.

cmap_neighbors

Colormap to use for nodes belonging to “source”/receiver cells. Defaults to yellow-orange-red.

cmap_default

Colormap to use for nodes belonging to “neighbor”/sender cells. Defaults to purple-blue-green.

scale_factor

Adjust to modify the size of the nodes

layout

Used for positioning nodes on the plot. Options:

“random”: Randomly positions nodes in the unit square.

“circular”: Positions nodes on a circle.

“kamada”: Positions nodes using the Kamada-Kawai path-length cost function.

“planar”: Positions nodes without edge intersections, if possible.

“spring”: Positions nodes using the Fruchterman-Reingold force-directed algorithm.

“spectral”: Positions nodes using eigenvectors of the graph Laplacian.

“spiral”: Positions nodes in a spiral layout.

node_fontsize

Font size for node labels

edge_fontsize

Font size for edge labels

arrow_size

Size of the arrow for directed graphs, by default 1

node_label_position

Position of node labels. Options: ‘top left’, ‘top center’, ‘top right’, ‘middle left’, ‘middle center’, ‘middle right’, ‘bottom left’, ‘bottom center’, ‘bottom right’

edge_label_position

Position of edge labels. Options: ‘top left’, ‘top center’, ‘top right’, ‘middle left’, ‘middle center’, ‘middle right’, ‘bottom left’, ‘bottom center’, ‘bottom right’

title

Optional, title for the plot. If not given, will use the AnnData object path to derive this.

upper_margin

Margin between top of the plot and top of the figure

lower_margin

Margin between bottom of the plot and bottom of the figure

left_margin

Margin between left of the plot and left of the figure

right_margin

Margin between right of the plot and right of the figure

save_path

Optional, directory to save figure to. If not given, will save to the parent folder of the path provided for `output_path` in the argument specification.

save_id

Optional unique identifier that can be used in saving. If not given, will use the AnnData object path to derive this.

save_ext

File extension to save figure as. Default is “png”.

dpi

Resolution to save figure at. Default is 300.
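
The edge filters described by “select_n_lr” and “coexpression_threshold” can be sketched as follows. Both helpers are hypothetical. Ranking by absolute effect size, and treating the coexpression threshold as a fraction of cells (suggested by its float default of 0.2), are assumptions rather than documented behavior.

```python
from typing import Dict, List


def select_top_edges(effect_sizes: Dict[str, float], n: int) -> List[str]:
    """Return up to n interactions with the largest nonzero effect sizes."""
    # Zero-valued effects are never selected, so fewer than n may be returned
    nonzero = [(name, size) for name, size in effect_sizes.items() if size != 0]
    # Assumption: rank by magnitude of the effect
    nonzero.sort(key=lambda kv: abs(kv[1]), reverse=True)
    return [name for name, _ in nonzero[:n]]


def coexpression_fraction(expr_a: List[float], expr_b: List[float]) -> float:
    """Fraction of cells in which both molecules are expressed (value > 0)."""
    both = sum(1 for a, b in zip(expr_a, expr_b) if a > 0 and b > 0)
    return both / len(expr_a)


# Made-up mean effect sizes of L:R interactions on one target
effects = {"Igf1:Igf1r": 0.8, "Fgf2:Fgfr1": 0.3, "Wnt5a:Fzd2": 0.0}
kept = select_top_edges(effects, n=5)

# Made-up expression of a receptor and a target across five cells
receptor = [1.0, 0.0, 2.0, 0.0, 3.0]
target = [1.0, 1.0, 0.0, 0.0, 2.0]
draw_edge = coexpression_fraction(receptor, target) > 0.2
print(kept, draw_edge)
```

Although n is 5, only two interactions are kept because the third has a zero effect size, which is the "fewer can be selected" case noted for “select_n_lr”.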

Returns:

Graph object, which can be plotted separately in an interactive window.
sizing_list: List of node sizes, for use in an interactive window.
color_list: List of node colors, for use in an interactive window.

permutation_test(gene: str, n_permutations: int = 100, permute_nonzeros_only: bool = False, **kwargs)[source]#

Sets up permutation test for determination of statistical significance of model diagnostics. Can be used to identify true/the strongest signal-responsive expression patterns.

Parameters:

gene: Target gene to perform permutation test on.
n_permutations: Number of permutations of the gene expression to perform. Default is 100.
permute_nonzeros_only: Whether to only perform the permutation over the gene-expressing cells
kwargs: Keyword arguments for any of the Spateo argparse arguments. Should not include ‘adata_path’, ‘target_path’, or ‘output_path’ (which will be determined by the output path used for the main model). Also should not include ‘custom_lig_path’, ‘custom_rec_path’, ‘mod_type’, ‘bw_fixed’ or ‘kernel’ (which will be determined by the initial model instantiation).
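
To make the difference between permuting all cells and permuting only gene-expressing cells concrete, here is a small standard-library sketch. The function `permute_expression` and its `seed` argument are hypothetical. Spateo operates on AnnData objects and refits the model on each permutation, which this sketch does not attempt.

```python
import random
from typing import List


def permute_expression(
    expression: List[float],
    permute_nonzeros_only: bool = False,
    seed: int = 0,
) -> List[float]:
    """Shuffle expression values across cells, optionally only among expressing cells."""
    rng = random.Random(seed)
    permuted = list(expression)
    if permute_nonzeros_only:
        # Only cells with nonzero expression take part in the shuffle
        idx = [i for i, v in enumerate(expression) if v != 0]
    else:
        idx = list(range(len(expression)))
    values = [expression[i] for i in idx]
    rng.shuffle(values)
    for i, v in zip(idx, values):
        permuted[i] = v
    return permuted


expr = [0.0, 3.0, 0.0, 1.0, 2.0, 0.0]
n_permutations = 3
# One permuted vector per permutation, each with its own seed
all_cells = [permute_expression(expr, seed=s) for s in range(n_permutations)]
nonzero_only = [
    permute_expression(expr, permute_nonzeros_only=True, seed=s)
    for s in range(n_permutations)
]
print(nonzero_only[0])
```

With “permute_nonzeros_only” set, the zero-expression cells stay fixed, so the permutation disrupts only the spatial arrangement of expression levels among expressing cells.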

eval_permutation_test(gene: str)[source]#

Evaluation function for permutation tests. Will compute multiple metrics (correlation coefficients, F1 scores, AUROC in the case that all cells were permuted, etc.) to compare true and model-predicted gene expression vectors.

Parameters:

gene: Target gene for which to evaluate permutation test
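
Two of the metrics named above, a correlation coefficient and an F1 score, can be computed between true and predicted expression vectors as sketched below. The helper names, the example vectors, and the choice to binarize expression at a cutoff of 0 for the F1 score are assumptions. The source does not specify how Spateo computes these.

```python
import math
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def f1_expressed(true: List[float], pred: List[float], cutoff: float = 0.0) -> float:
    """F1 score for classifying cells as expressing (value > cutoff) or not."""
    tp = sum(1 for t, p in zip(true, pred) if t > cutoff and p > cutoff)
    fp = sum(1 for t, p in zip(true, pred) if t <= cutoff and p > cutoff)
    fn = sum(1 for t, p in zip(true, pred) if t > cutoff and p <= cutoff)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Made-up true expression and model predictions for five cells
true_expr = [0.0, 2.0, 0.0, 4.0, 1.0]
pred_expr = [0.0, 1.8, 0.3, 3.5, 0.0]

r = pearson(true_expr, pred_expr)
f1 = f1_expressed(true_expr, pred_expr)
print(round(r, 3), round(f1, 3))
```

In a permutation test, the same metrics would be computed for models fitted on permuted expression, and the unpermuted scores compared against that null distribution.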

spateo.tools.CCI_effects_modeling.MuSIC_downstream.replace_col_with_collagens(string)[source]#

spateo.tools.CCI_effects_modeling.MuSIC_downstream.replace_hla_with_hlas(string)[source]#
