spateo.tools.glm

Module Contents

Functions

glm_degs(→ Optional[anndata.AnnData])

Differential gene expression tests using generalized linear regressions.

glm_test(data[, fullModelFormulaStr, ...])

lrt(full, restr)

spateo.tools.glm.glm_degs(adata: anndata.AnnData, X_data: numpy.ndarray | None = None, genes: list | None = None, layer: str | None = None, key_added: str = 'glm_degs', fullModelFormulaStr: str = '~cr(time, df=3)', reducedModelFormulaStr: str = '~1', qval_threshold: float | None = 0.05, llf_threshold: float | None = -2000, ci_alpha: float = 0.05, inplace: bool = True) → anndata.AnnData | None

Differential gene expression tests using generalized linear regressions. Only a size factor normalized gene expression matrix can be used here; SCT/Pearson residual transformed gene expression cannot be used.

Tests each gene for differential expression as a function of integral time (the time estimated via the reconstructed vector field function) or pseudotime using generalized additive models with a natural spline basis. The function can also use other covariates, specified in the full (e.g. ~clusters) and reduced model formulas, to identify differentially expressed genes across categories, groups, etc. glm_degs relies on the statsmodels package and is adapted from the differentialGeneTest function in Monocle. Note that glm_degs supports DEG analysis for any layer or normalized data in your adata object; that is, you can use the total, new, unspliced, or velocity layer, etc. for the differential expression analysis.

Parameters:
adata

An AnnData object. It must contain a size factor normalized gene expression matrix.

X_data

The user supplied data that will be used for differential expression analysis directly.

genes

The list of genes that will be used to subset the data for differential expression analysis. If genes = None, all genes will be used.

layer

The layer from which data is retrieved for differential expression analysis. If layer = None, .X is used.

key_added

The key that will be used for the glm_degs key in .uns.

fullModelFormulaStr

A formula string specifying the full model in differential expression tests (i.e. likelihood ratio tests) for each gene/feature.

reducedModelFormulaStr

A formula string specifying the reduced model in differential expression tests (i.e. likelihood ratio tests) for each gene/feature.

qval_threshold

Only keep the glm test results whose qval is less than the qval_threshold.

llf_threshold

Only keep the glm test results whose log-likelihood is less than the llf_threshold.

ci_alpha

The significance level for the confidence interval. The default ci_alpha = .05 returns a 95% confidence interval.

inplace

Whether to modify adata in place or to work on a copy.

Returns:

The AnnData object, updated in place or copied, with the differential expression test results from the GLM test stored in the .uns attribute under key_added.
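
To make the roles of qval_threshold, llf_threshold and ci_alpha concrete, here is a minimal standard-library sketch. It rests on assumptions the documentation above does not confirm: that q-values are Benjamini-Hochberg adjusted p-values, and that confidence intervals are normal-approximation (Wald) intervals. The helper names bh_qvalues and wald_interval and the result tuples are hypothetical and are not part of spateo.

```python
from statistics import NormalDist


def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


def wald_interval(estimate, std_err, ci_alpha=0.05):
    """Two-sided normal-approximation interval at level 1 - ci_alpha (assumed form)."""
    z = NormalDist().inv_cdf(1 - ci_alpha / 2)
    return estimate - z * std_err, estimate + z * std_err


# Hypothetical per-gene results: (gene, p-value, log-likelihood).
results = [
    ("geneA", 0.001, -2500.0),
    ("geneB", 0.020, -1500.0),
    ("geneC", 0.030, -3000.0),
    ("geneD", 0.600, -2200.0),
]
qvals = bh_qvalues([p for _, p, _ in results])

# Keep rows whose q-value AND log-likelihood are below the thresholds,
# mirroring the "less than" wording of the parameter descriptions.
qval_threshold, llf_threshold = 0.05, -2000
kept = [
    gene
    for (gene, _, llf), q in zip(results, qvals)
    if q < qval_threshold and llf < llf_threshold
]
print(kept)
print(wald_interval(1.0, 0.5))  # interval for ci_alpha = 0.05, i.e. 95%
```

In this toy input, geneB passes the q-value cut but is dropped by the log-likelihood cut, which shows that the two thresholds act as independent filters.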

spateo.tools.glm.glm_test(data, fullModelFormulaStr='~cr(time, df=3)', reducedModelFormulaStr='~1')
spateo.tools.glm.lrt(full, restr)
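
No description is given for lrt. Judging by its name and by the parameter descriptions above, it presumably performs a likelihood ratio test between the full and the restricted (reduced) model. The sketch below shows the generic computation using only the standard library. It is not spateo's implementation: the function names, the log-likelihood inputs, and the assumption that the full model has three more parameters than the reduced one are illustrative only.

```python
import math


def chi2_sf(x, df):
    """Survival function of a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: start from the closed form for df = 2.
        k, sf = 2, math.exp(-half)
    else:
        # Odd df: start from the closed form for df = 1.
        k, sf = 1, math.erfc(math.sqrt(half))
    # Recurrence: sf(x; k + 2) = sf(x; k) + half^(k/2) * exp(-half) / Gamma(k/2 + 1)
    while k < df:
        sf += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return sf


def likelihood_ratio_test(llf_full, llf_reduced, df_diff):
    """Return the LRT statistic and its chi-square p-value.

    llf_full / llf_reduced are the maximized log-likelihoods of the nested
    models; df_diff is the number of extra parameters in the full model.
    """
    stat = 2.0 * (llf_full - llf_reduced)
    return stat, chi2_sf(stat, df_diff)


# Hypothetical log-likelihoods for one gene; df_diff = 3 is an assumption.
stat, pval = likelihood_ratio_test(llf_full=-1200.0, llf_reduced=-1210.0, df_diff=3)
print(stat, pval)
```

A small p-value indicates that the extra terms in the full model improve the fit by more than chance alone would explain.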