`spateo.tools.ST_regression.regression_utils`

Auxiliary functions to aid in the interpretation functions for the spatial and spatially-lagged regression models.

Module Contents

Functions

`softplus`(z)	Numerically stable version of log(1 + exp(z)).
`L1_penalty`(→ float)	Implementation of the L1 penalty that penalizes based on absolute value of coefficient magnitude.
`L2_penalty`(→ float)	Implementation of the L2 penalty that penalizes based on the square of coefficient magnitudes.
`L1_L2_penalty`(→ float)	Combination of the L1 and L2 penalties.
`get_fisher_inverse`(→ numpy.ndarray)	Computes the inverse of the Fisher information matrix, which measures the amount of information each feature in x provides about y.
`wald_test`(→ numpy.ndarray)	Perform single-coefficient Wald test, informing whether a given coefficient deviates significantly from the supplied reference value (theta0).
`multitesting_correction`(→ numpy.ndarray)	In the case of testing multiple hypotheses from the same experiment, perform multiple test correction to adjust p-values.
`_get_p_value`(→ numpy.ndarray)	Computes p-values for differential expression for each feature.
`compute_wald_test`(→ Tuple[numpy.ndarray, ...)	Computes significance calls, p-values and q-values from the parameter estimates and the inverse Fisher information matrix.
`mae`(→ float)	Mean absolute error; in this context, actually log1p mean absolute error.
`mse`(→ float)	Mean squared error; in this context, actually log1p mean squared error.
`plot_prior_vs_data`(reconst, adata, kind, target_name, ...)	Plots distribution of observed vs. predicted counts in the form of a comparative density barplot.

spateo.tools.ST_regression.regression_utils.softplus(z)

Numerically stable version of log(1 + exp(z)).
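To illustrate what "numerically stable" means here, the following is a minimal sketch of one common way to evaluate log(1 + exp(z)) without overflow. The function name `softplus_stable` is hypothetical, and this is not necessarily how the library implements `softplus`.

```python
import numpy as np


def softplus_stable(z):
    """Evaluate log(1 + exp(z)) without overflowing for large |z|.

    Hypothetical helper for illustration; not the library's own code.
    """
    z = np.asarray(z, dtype=float)
    # Identity: log(1 + exp(z)) = max(z, 0) + log(1 + exp(-|z|)).
    # The argument of exp is never positive, so it cannot overflow.
    return np.maximum(z, 0.0) + np.log1p(np.exp(-np.abs(z)))


values = softplus_stable(np.array([-1000.0, 0.0, 1000.0]))
print(values)
```

A naive `np.log(1 + np.exp(1000.0))` overflows to infinity, whereas the rewritten form returns 1000.0 for that input.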

spateo.tools.ST_regression.regression_utils.L1_penalty(beta: numpy.ndarray) → float

Implementation of the L1 penalty that penalizes based on absolute value of coefficient magnitude.

Parameters

beta: Array of shape [n_features,]; learned model coefficients

Returns

L1penalty: Value of the L1 regularization term, i.e. the quantity that the regularization strength (typically written lambda) multiplies in the loss

Return type

float

spateo.tools.ST_regression.regression_utils.L2_penalty(beta: numpy.ndarray, Tau: Union[None, numpy.ndarray] = None) → float

Implementation of the L2 penalty that penalizes based on the square of coefficient magnitudes.

Parameters

beta: Array of shape [n_features,]; learned model coefficients
Tau: optional array of shape [n_features, n_features]; the Tikhonov matrix for ridge regression. If not provided, Tau will default to the identity matrix.

spateo.tools.ST_regression.regression_utils.L1_L2_penalty(alpha: float, beta: numpy.ndarray, Tau: Union[None, numpy.ndarray] = None) → float

Combination of the L1 and L2 penalties.

Parameters

alpha: The weighting between the L1 penalty (alpha=1.) and L2 penalty (alpha=0.) terms of the loss function.
beta: Array of shape [n_features,]; learned model coefficients
Tau: optional array of shape [n_features, n_features]; the Tikhonov matrix for ridge regression. If not provided, Tau will default to the identity matrix.

Returns

Value of the combined L1 and L2 regularization term

Return type

float
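The relationship between the three penalty functions can be sketched as below. This is an illustration under stated assumptions, not the library's code: the lowercase function names are hypothetical, and scaling conventions (for example a factor of 1/2 on the L2 term) differ between implementations and are not specified on this page.

```python
import numpy as np


def l1_penalty(beta):
    # Sum of absolute coefficient values (L1 norm).
    return float(np.sum(np.abs(beta)))


def l2_penalty(beta, Tau=None):
    # Squared L2 norm of Tau @ beta; Tau defaults to the identity,
    # which reduces this to the ordinary ridge penalty.
    if Tau is None:
        Tau = np.eye(beta.shape[0])
    return float(np.sum((Tau @ beta) ** 2))


def l1_l2_penalty(alpha, beta, Tau=None):
    # alpha = 1 gives a pure L1 penalty, alpha = 0 a pure L2 penalty.
    # ASSUMPTION: plain convex combination with no extra scaling factor.
    return alpha * l1_penalty(beta) + (1.0 - alpha) * l2_penalty(beta, Tau)


beta = np.array([1.0, -2.0, 0.0])
print(l1_penalty(beta), l2_penalty(beta), l1_l2_penalty(0.5, beta))
```

With these coefficients the L1 term is 3.0, the L2 term is 5.0, and the equally weighted combination is 4.0.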

spateo.tools.ST_regression.regression_utils.get_fisher_inverse(x: numpy.ndarray, y: numpy.ndarray) → numpy.ndarray

Computes the inverse of the Fisher information matrix. The Fisher matrix measures the amount of information each feature in x provides about y; that is, how sensitive the log-likelihood is to a change in the corresponding parameter.

Parameters

x: Independent variable array
y: Dependent variable array

Returns

inverse_fisher: Inverse of the Fisher information matrix

Return type

np.ndarray
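As a point of reference, for a Gaussian linear model with noise variance sigma^2 the Fisher information for the coefficients is X^T X / sigma^2, and its inverse approximates the covariance of the coefficient estimates. The sketch below works under that assumption for a single one-dimensional response; the function name `fisher_inverse_gaussian` and the residual-based variance estimate are illustrative choices, and the library's `get_fisher_inverse` may use a different likelihood or variance estimate.

```python
import numpy as np


def fisher_inverse_gaussian(x, y):
    """Inverse Fisher information for a Gaussian linear model (illustrative).

    x: array of shape [n_samples, n_features]
    y: array of shape [n_samples,]
    """
    # Least-squares fit, used only to estimate the noise variance.
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta
    dof = x.shape[0] - x.shape[1]
    sigma2 = float(resid @ resid) / dof
    # Fisher information for the coefficients: X^T X / sigma^2.
    fisher = x.T @ x / sigma2
    # The pseudo-inverse tolerates a rank-deficient design matrix.
    return np.linalg.pinv(fisher)


rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
y = x @ np.array([1.0, 0.0, -2.0]) + rng.normal(scale=0.5, size=50)
inv_fisher = fisher_inverse_gaussian(x, y)
# Square roots of the diagonal give approximate standard errors.
std_errors = np.sqrt(np.diag(inv_fisher))
print(std_errors)
```

The standard errors obtained this way are the kind of quantity a Wald test takes as its `theta_sd` argument.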

spateo.tools.ST_regression.regression_utils.wald_test(theta_mle: numpy.ndarray, theta_sd: numpy.ndarray, theta0: Union[float, numpy.ndarray] = 0) → numpy.ndarray

Perform single-coefficient Wald test, informing whether a given coefficient deviates significantly from the supplied reference value (theta0), based on the standard deviation of the posterior of the parameter estimate.

Parameters

theta_mle: Maximum likelihood estimation of given parameter by feature
theta_sd: Standard deviation of the maximum likelihood estimation
theta0: Value(s) to test theta_mle against. Must be either a single number or an array with the same number of entries as theta_mle.

Returns

pvals: p-values from the Wald test

Return type

np.ndarray
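A single-coefficient Wald test is commonly computed by standardizing the estimate and comparing it to a standard normal distribution. The following is a minimal sketch under that assumption (two-sided test, normal reference distribution); the name `wald_pvalues` is hypothetical and the library's `wald_test` may differ in detail.

```python
import math

import numpy as np


def wald_pvalues(theta_mle, theta_sd, theta0=0.0):
    """Two-sided Wald test p-values (illustrative sketch)."""
    theta_mle = np.asarray(theta_mle, dtype=float)
    theta_sd = np.asarray(theta_sd, dtype=float)
    # Wald statistic: distance from the reference value in units of
    # standard deviations.
    z = np.abs(theta_mle - theta0) / theta_sd
    # For a standard normal, P(|Z| > z) = erfc(z / sqrt(2)).
    erfc = np.vectorize(math.erfc, otypes=[float])
    return erfc(z / math.sqrt(2.0))


pvals = wald_pvalues([0.0, 1.96, 5.0], [1.0, 1.0, 1.0])
print(pvals)
```

An estimate equal to the reference value gives a p-value of 1, and an estimate 1.96 standard deviations away gives a p-value close to 0.05.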

spateo.tools.ST_regression.regression_utils.multitesting_correction(pvals: numpy.ndarray, method: str = 'fdr_bh', alpha: float = 0.05) → numpy.ndarray

In the case of testing multiple hypotheses from the same experiment, perform multiple test correction to adjust the p-values, yielding q-values.

Parameters

pvals: Uncorrected p-values; must be given as a one-dimensional array
method: Method to use for correction. Available methods can be found in the documentation for statsmodels.stats.multitest.multipletests(), and are also listed below (in correct case) for convenience:

Named methods:

bonferroni
sidak
holm-sidak
holm
simes-hochberg
hommel

Abbreviated methods:

fdr_bh: Benjamini-Hochberg correction
fdr_by: Benjamini-Yekutieli correction
fdr_tsbh: Two-stage Benjamini-Hochberg
fdr_tsbky: Two-stage Benjamini-Krieger-Yekutieli method

alpha: Family-wise error rate (FWER)

Returns

qval: p-values post-correction

Return type

np.ndarray
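To make the default `fdr_bh` method concrete, here is a small NumPy-only sketch of the Benjamini-Hochberg adjustment. It is written for illustration, with the hypothetical name `benjamini_hochberg`; the function documented above refers to statsmodels for the actual correction.

```python
import numpy as np


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (illustrative sketch)."""
    pvals = np.asarray(pvals, dtype=float)
    n = pvals.size
    order = np.argsort(pvals)
    # Scale the i-th smallest p-value by n / i.
    scaled = pvals[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity: each adjusted value is the minimum of the
    # scaled values at its rank and all higher ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    qvals = np.empty(n)
    # Undo the sort so q-values line up with the input order.
    qvals[order] = np.clip(scaled, 0.0, 1.0)
    return qvals


qvals = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(qvals)
```

For these four p-values the adjusted values are 0.02, 0.04, 0.04 and 0.02, in the original input order.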

spateo.tools.ST_regression.regression_utils._get_p_value(variables: numpy.array, fisher_inv: numpy.array, coef_loc_totest: int) → numpy.ndarray

Computes p-values for differential expression for each feature.

Parameters

variables: Array where each column corresponds to a feature
fisher_inv: Inverse Fisher information matrix
coef_loc_totest: Numerical column of the array corresponding to the coefficient to test

Returns

pvalues: Array of identical shape to variables, where each element is a p-value for that instance of that feature

Return type

np.ndarray

spateo.tools.ST_regression.regression_utils.compute_wald_test(params: numpy.ndarray, fisher_inv: numpy.ndarray, significance_threshold: float = 0.01) → Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]

Parameters

params: Array of shape [n_features, n_params]
fisher_inv: Inverse Fisher information matrix
significance_threshold: Upper threshold to be considered significant

Returns

significance: Array of identical shape to variables, where each element is True or False depending on whether it meets the threshold for significance
pvalues: Array of identical shape to variables, where each element is a p-value for that instance of that feature
qvalues: Array of identical shape to variables, where each element is a q-value for that instance of that feature

Return type

Tuple[np.ndarray, np.ndarray, np.ndarray]

spateo.tools.ST_regression.regression_utils.mae(y_true, y_pred) → float

Mean absolute error; in this context, actually log1p mean absolute error.

Parameters

y_true: Observed values for the dependent variable
y_pred: Regression model output

Returns

mae: Mean absolute error value across all samples

Return type

float

spateo.tools.ST_regression.regression_utils.mse(y_true, y_pred) → float

Mean squared error; in this context, actually log1p mean squared error.

Parameters

y_true: Observed values for the dependent variable
y_pred: Regression model output

Returns

mse: Mean squared error value across all samples

Return type

float
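One way to read "log1p mean absolute error" and "log1p mean squared error" is as the usual error metrics computed on log1p-transformed values. The sketch below takes that reading and applies the transform inside the helpers, which is an assumption; if the inputs are already log1p-transformed, the plain mean absolute or squared difference gives the same quantity. The names `log1p_mae` and `log1p_mse` are hypothetical.

```python
import numpy as np


def log1p_mae(y_true, y_pred):
    # ASSUMPTION: inputs are non-negative raw values, transformed here.
    diff = np.log1p(y_true) - np.log1p(y_pred)
    return float(np.mean(np.abs(diff)))


def log1p_mse(y_true, y_pred):
    # Same transform, squared differences instead of absolute ones.
    diff = np.log1p(y_true) - np.log1p(y_pred)
    return float(np.mean(diff ** 2))


y_true = np.array([0.0, np.e - 1.0])  # log1p gives [0, 1]
y_pred = np.array([0.0, 0.0])         # log1p gives [0, 0]
print(log1p_mae(y_true, y_pred), log1p_mse(y_true, y_pred))
```

Working on the log1p scale keeps a few very large counts from dominating the error.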

spateo.tools.ST_regression.regression_utils.plot_prior_vs_data(reconst: pandas.DataFrame, adata: anndata.AnnData, kind: str = 'barplot', target_name: Union[None, str] = None, title: Union[None, str] = None, figsize: Union[None, Tuple[float, float]] = None, save_show_or_return: Literal['save', 'show', 'return', 'both', 'all'] = 'save', save_kwargs: dict = {})

Plots distribution of observed vs. predicted counts in the form of a comparative density barplot.

Parameters

reconst

DataFrame containing values for reconstruction/prediction of targets of a regression model

adata

AnnData object containing observed counts

kind

Kind of plot to generate. Options: “barplot”, “scatterplot”. Case sensitive, defaults to “barplot”.

target_name

Optional, can be:

Column name in DataFrame/AnnData object: name of gene to subset to
“sum”: computes sum over all features present in ‘reconst’ to compare to the corresponding subset of ‘adata’.
“mean”: computes mean over all features present in ‘reconst’ to compare to the corresponding subset of ‘adata’.

If not given, will subset AnnData to features in ‘reconst’ and flatten both arrays to compare all values.

If not given, will compute the sum over all features present in ‘reconst’ and compare to the corresponding subset of ‘adata’.

save_show_or_return

Whether to save, show or return the figure. If “both”, it will save and plot the figure at the same time. If “all”, the figure will be saved and displayed, and the associated axis and other objects will be returned.

save_kwargs

A dictionary that will be passed to the save_fig function. By default it is an empty dictionary, and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise, you can provide a dictionary that modifies those keys according to your needs.
