spateo.tools.glm
Functions
Module Contents
- spateo.tools.glm.glm_degs(adata: anndata.AnnData, X_data: numpy.ndarray | None = None, genes: list | None = None, layer: str | None = None, key_added: str = 'glm_degs', fullModelFormulaStr: str = '~cr(time, df=3)', reducedModelFormulaStr: str = '~1', qval_threshold: float | None = 0.05, llf_threshold: float | None = -2000, ci_alpha: float = 0.05, inplace: bool = True) -> anndata.AnnData | None
Differential gene expression tests using generalized linear regressions. Only a size-factor-normalized gene expression matrix can be used here; SCT/Pearson-residual-transformed gene expression cannot be used.
Tests each gene for differential expression as a function of integral time (the time estimated via the reconstructed vector field function) or pseudo-time, using generalized additive models with a natural spline basis. This function can also use other covariates, as specified in the full (e.g. ~clusters) and reduced model formulas, to identify differentially expressed genes across different categories, groups, etc. glm_degs relies on the statsmodels package and is adapted from the differentialGeneTest function in Monocle. Note that glm_degs supports DEG analysis for any layer or normalized data in your adata object. That is, you can use the total, new, unspliced, or velocity layers, among others, for the differential expression analysis.
- Parameters:
- adata
An AnnData object. It must contain a size-factor-normalized gene expression matrix.
- X_data
The user supplied data that will be used for differential expression analysis directly.
- genes
The list of genes that will be used to subset the data for differential expression analysis. If genes = None, all genes will be used.
- layer
The layer that will be used to retrieve data for differential expression analysis. If layer = None, .X is used.
- key_added
The key under which the glm_degs results are stored in .uns.
- fullModelFormulaStr
A formula string specifying the full model in differential expression tests (i.e. likelihood ratio tests) for each gene/feature.
- reducedModelFormulaStr
A formula string specifying the reduced model in differential expression tests (i.e. likelihood ratio tests) for each gene/feature.
- qval_threshold
Only keep the GLM test results whose qval is less than qval_threshold.
- llf_threshold
Only keep the GLM test results whose log-likelihood is less than llf_threshold.
- ci_alpha
The significance level for the confidence interval. The default ci_alpha = .05 returns a 95% confidence interval.
- inplace
Whether to modify adata in place or to operate on a copy.
- Returns:
An AnnData object is updated/copied with the key_added dictionary in the .uns attribute, storing the differential expression test results after the GLM test.
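Filtering test results by q-value, as qval_threshold does, can be sketched as follows. The example assumes that q-values are Benjamini-Hochberg adjusted p-values, which this page does not state. The p-values, the gene names, and the column names (pval, qval) are hypothetical and do not describe the actual layout of the stored results.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.multitest import multipletests

# Hypothetical per-gene p-values from likelihood ratio tests.
pvals = np.array([1e-6, 0.0004, 0.02, 0.04, 0.3, 0.8])
genes = [f"gene_{i}" for i in range(len(pvals))]

# Assumption: q-values are Benjamini-Hochberg adjusted p-values.
_, qvals, _, _ = multipletests(pvals, method="fdr_bh")

# Hypothetical result table; the column names are illustrative only.
results = pd.DataFrame({"pval": pvals, "qval": qvals}, index=genes)

# Keep only the genes whose q-value is below the threshold.
qval_threshold = 0.05
significant = results[results["qval"] < qval_threshold]
print(significant)
```

Under these assumptions, a gene with a raw p-value below 0.05 can still be dropped once the multiple-testing correction is applied, as happens to gene_3 here.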