spateo.tools.cluster.spagcn_utils
Classes
GraphConvolution | Simple GCN layer, similar to https://arxiv.org/abs/1609.02907 |
simple_GC_DEC | Simple NN model constructed with a GraphConvolution layer followed by a DeepEmbeddingClustering layer. |
SpaGCN | Implementation for spagcn algorithm, see https://doi.org/10.1038/s41592-021-01255-8 |
Functions
calculate_adj_matrix | (Part of spagcn algorithm) Function to calculate the adjacency matrix from spatial coordinates and image pixels. |
search_l | Function to search for a proper l value for spagcn algorithm. |
get_cluster_num | Get the initial number of clusters corresponding to a given louvain resolution. |
search_res | Function to search for a proper initial louvain resolution to get the desired number of clusters in spagcn algorithm. |
refine | Refine (smooth) the boundaries of spatial domains (clusters). |
Module Contents
- spateo.tools.cluster.spagcn_utils.calculate_adj_matrix(x, y, x_pixel=None, y_pixel=None, image=None, beta=49, alpha=1, histology=True)
(Part of spagcn algorithm) Function to calculate the adjacency matrix from spatial coordinates and image pixels.
- Parameters:
- x list
a list of the spatial x-coordinates of the spots.
- y list
a list of the spatial y-coordinates of the spots.
- x_pixel list, optional
a list which contains corresponding x-pixels for the spots, in histology image. Defaults to None.
- y_pixel list, optional
a list which contains corresponding y-pixels for the spots, in histology image. Defaults to None.
- image numpy.ndarray, optional
the image (typically a histology image) in numpy.ndarray format (can be obtained by cv2.imread). Defaults to None.
- beta int, optional
controls the size of the neighbourhood used when calculating the grey value for a spot. Defaults to 49.
- alpha int, optional
controls the color scale. Defaults to 1.
- histology bool, optional
whether the image is a histology image. Defaults to True.
- Returns:
the calculated adjacency matrix.
- Return type:
numpy.ndarray
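As a rough illustration of what such a matrix can look like, the sketch below builds a pairwise Euclidean distance matrix from x and y alone. It assumes that the histology=False case reduces to plain spatial distance, as in the original SpaGCN method; the histology path, which per the parameter descriptions above also uses image pixels, is not shown. The helper name pairwise_distance_matrix is hypothetical and not part of spateo.

```python
import math


def pairwise_distance_matrix(x, y):
    """Return an n x n list of lists of Euclidean distances between spots.

    Hypothetical helper for illustration only; not spateo's implementation.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    points = list(zip(x, y))
    return [[math.dist(a, b) for b in points] for a in points]


# Three spots: the first two are a 3-4-5 triangle apart.
adj = pairwise_distance_matrix([0, 3, 0], [0, 4, 1])
```

The result is symmetric with zeros on the diagonal, which is the shape of input the search and refinement helpers below appear to expect.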
- spateo.tools.cluster.spagcn_utils.search_l(p, adj, start=0.01, end=1000, tol=0.01, max_run=100)
Function to search for a proper l value for spagcn algorithm.
- Parameters:
- p float
parameter p in spagcn algorithm. See SpaGCN for details.
- adj numpy.ndarray
the calculated adjacency matrix in spagcn algorithm.
- start float, optional
lower boundary of search. Defaults to 0.01.
- end int, optional
upper boundary of search. Defaults to 1000.
- tol float, optional
step length for search. Defaults to 0.01.
- max_run int, optional
maximum number of searching iteration. Defaults to 100.
- Returns:
the l value
- Return type:
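The search can be pictured as a bisection over l. The sketch below is an assumption based on the original SpaGCN method, where p is the average total weight a spot receives from all other spots under a Gaussian kernel exp(-d^2 / (2 l^2)); it is not taken from spateo's source. The names calculate_p and bisect_l are hypothetical, and tol is treated here as the acceptable gap between the achieved and the requested p.

```python
import math


def calculate_p(dist, l):
    """Average neighbour weight per spot under a Gaussian kernel (assumed form)."""
    n = len(dist)
    total = sum(math.exp(-(d * d) / (2 * l * l)) for row in dist for d in row)
    # Each spot contributes exp(0) = 1 for itself; remove that self-weight.
    return total / n - 1.0


def bisect_l(p, dist, start=0.01, end=1000, tol=0.01, max_run=100):
    """Bisect on l until calculate_p is within tol of p; None if not found."""
    lo, hi = start, end
    for _ in range(max_run):
        mid = (lo + hi) / 2
        p_mid = calculate_p(dist, mid)
        if abs(p_mid - p) <= tol:
            return mid
        if p_mid < p:
            lo = mid  # kernel too narrow, widen it
        else:
            hi = mid  # kernel too wide, narrow it
    return None


# Four spots on the corners of a unit square.
coords = [(0, 0), (0, 1), (1, 0), (1, 1)]
dist = [[math.dist(a, b) for b in coords] for a in coords]
l_found = bisect_l(0.5, dist)
```

Bisection works here because the assumed p grows monotonically with l: a wider kernel gives every neighbour more weight.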
- spateo.tools.cluster.spagcn_utils.get_cluster_num(adata, adj, res, tol, lr, max_epochs, l, r_seed=100, t_seed=100, n_seed=100)
Get the initial number of clusters corresponding to a given louvain resolution.
- Parameters:
- adata
further passed to SpaGCN.train(), see SpaGCN.train.
- adj
further passed to SpaGCN.train(), see SpaGCN.train.
- res
further passed to SpaGCN.train(), see SpaGCN.train.
- tol
further passed to SpaGCN.train(), see SpaGCN.train.
- lr
further passed to SpaGCN.train(), see SpaGCN.train.
- max_epochs
further passed to SpaGCN.train(), see SpaGCN.train.
- l float
parameter l in spagcn algorithm, see SpaGCN for details.
- r_seed int, optional
Global seed for random. Defaults to 100.
- t_seed int, optional
Global seed for torch. Defaults to 100.
- n_seed int, optional
Global seed for numpy. Defaults to 100.
- Returns:
number of clusters
- Return type:
- spateo.tools.cluster.spagcn_utils.search_res(adata, adj, l, target_num, start=0.4, step=0.1, tol=0.005, lr=0.05, max_epochs=10, r_seed=100, t_seed=100, n_seed=100, max_run=10)
Function to search for a proper initial louvain resolution to get the desired number of clusters in spagcn algorithm.
- Parameters:
- adata anndata.AnnData
an AnnData object.
- adj numpy.ndarray
the calculated adjacency matrix in spagcn algorithm.
- l float
parameter l in spagcn algorithm, see SpaGCN for details.
- target_num int
desired number of clusters.
- start float, optional
the lower boundary of search for resolution. Defaults to 0.4.
- step float, optional
search step length. Defaults to 0.1.
- tol
further passed to SpaGCN.train(), see SpaGCN.train.
- lr
further passed to SpaGCN.train(), see SpaGCN.train.
- max_epochs
further passed to SpaGCN.train(), see SpaGCN.train.
- r_seed int, optional
Global seed for random. Defaults to 100.
- t_seed int, optional
Global seed for torch. Defaults to 100.
- n_seed int, optional
Global seed for numpy. Defaults to 100.
- max_run int, optional
max number of iteration. Defaults to 10.
- Returns:
calculated initial louvain resolution.
- Return type:
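One plausible way to run such a search is a step-halving walk: move the resolution toward the target cluster count, and halve the step whenever a move would overshoot. The sketch below follows the approach of the original SpaGCN method as an assumption; it is not spateo's code. The function search_resolution and the count_clusters callable (standing in for a short training run such as get_cluster_num) are hypothetical.

```python
def search_resolution(count_clusters, target_num, start=0.4, step=0.1, max_run=10):
    """Walk the resolution toward target_num, halving the step on overshoot.

    count_clusters is a hypothetical callable mapping a resolution to the
    number of clusters it produces.
    """
    res = start
    current = count_clusters(res)
    for _ in range(max_run):
        if current == target_num:
            break
        direction = 1 if current < target_num else -1
        candidate = res + direction * step
        candidate_num = count_clusters(candidate)
        if candidate_num == target_num:
            return candidate
        candidate_dir = 1 if candidate_num < target_num else -1
        if candidate_dir == direction:
            # Still on the same side of the target: accept the move.
            res, current = candidate, candidate_num
        else:
            # Overshot the target: stay put and try a smaller step.
            step /= 2
    return res


# Toy stand-in: the cluster count grows linearly with the resolution.
def toy_count(res):
    return round(res * 10)


found = search_resolution(toy_count, target_num=7)
```

If no resolution hits the target within max_run iterations, this sketch returns the last accepted resolution rather than raising.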
- spateo.tools.cluster.spagcn_utils.refine(sample_id, pred, dis, shape='square')
Refine (smooth) the boundaries of spatial domains (clusters).
- Parameters:
- sample_id list
list of sample(cell, spot or bin) names.
- pred list
list of spatial domains corresponding to the sample_id list.
- dis numpy.ndarray
the calculated adjacency matrix in spagcn algorithm.
- shape str, optional
Smooth the spatial domains with given spatial topology, “hexagon” for Visium data, “square” for ST data. Defaults to “square”.
- Returns:
list of refined spatial domains corresponding to the sample_id list.
- Return type:
list
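The smoothing idea can be sketched as a majority vote among each sample's nearest neighbours. The version below is an assumption modelled on the original SpaGCN method (4 neighbours for "square", 6 for "hexagon", relabel only when the sample's own label is in the minority and another label holds a clear majority); it is not copied from spateo. The name refine_sketch is hypothetical.

```python
import math
from collections import Counter


def refine_sketch(sample_id, pred, dis, shape="square"):
    """Relabel samples whose label disagrees with most of their neighbours."""
    num_nbs = {"hexagon": 6, "square": 4}[shape]
    refined = []
    for i in range(len(sample_id)):
        # Indices sorted by distance; the sample itself (distance 0) comes first.
        order = sorted(range(len(sample_id)), key=lambda j: dis[i][j])
        nearest = order[: num_nbs + 1]
        counts = Counter(pred[j] for j in nearest)
        top_label, top_count = counts.most_common(1)[0]
        if counts[pred[i]] < num_nbs / 2 and top_count > num_nbs / 2:
            refined.append(top_label)
        else:
            refined.append(pred[i])
    return refined


# 3 x 3 grid where only the centre spot carries label 1.
coords = [(i, j) for i in range(3) for j in range(3)]
dis = [[math.dist(a, b) for b in coords] for a in coords]
pred = [0, 0, 0, 0, 1, 0, 0, 0, 0]
refined = refine_sketch(list(range(9)), pred, dis, shape="square")
```

In this toy grid the isolated centre label is absorbed by its four neighbours, while every other spot keeps its label.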
- class spateo.tools.cluster.spagcn_utils.GraphConvolution(in_features, out_features, bias=True)
Bases:
torch.nn.Module
Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
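For orientation, the layer in the referenced paper computes output = A (X W) + b, where A is the (normalized) adjacency matrix, X the node features, W a weight matrix and b an optional bias. The pure-Python sketch below shows only that arithmetic; it assumes GraphConvolution follows the same form and does not reproduce the torch.nn.Module itself. The names matmul and gcn_forward are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_forward(adj, x, weight, bias=None):
    """Compute adj @ (x @ weight) + bias for list-of-lists inputs."""
    support = matmul(x, weight)   # project features: shape n x out_features
    out = matmul(adj, support)    # aggregate over neighbours
    if bias is not None:
        out = [[v + b for v, b in zip(row, bias)] for row in out]
    return out


# Two nodes that average each other's features, one output feature.
adj = [[0.5, 0.5], [0.5, 0.5]]
x = [[1, 0], [0, 1]]
weight = [[2], [4]]
out = gcn_forward(adj, x, weight, bias=[1])
```

Because both nodes share the same adjacency row, they end up with the same output value.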
- class spateo.tools.cluster.spagcn_utils.simple_GC_DEC(nfeat, nhid, alpha=0.2)
Bases:
torch.nn.Module
Simple NN model constructed with a GraphConvolution layer followed by a DeepEmbeddingClustering layer. For DEC, see https://arxiv.org/abs/1511.06335v2
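The two distributions at the heart of DEC, as described in the referenced paper, are a Student's t soft assignment q of embeddings to cluster centres and a sharpened target distribution p derived from q. The sketch below implements those two formulas in plain Python. Whether simple_GC_DEC's alpha parameter plays the role of the t-distribution's degrees of freedom, and how the class implements these steps, are assumptions; soft_assign and target_distribution are hypothetical names.

```python
def soft_assign(z, mu, alpha=1.0):
    """q[i][j]: Student's t similarity of embedding z[i] to centre mu[j], row-normalized."""
    q = []
    for zi in z:
        row = []
        for mj in mu:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mj))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """p[i][j] proportional to q[i][j]^2 / f[j], with f[j] the soft cluster frequency."""
    f = [sum(col) for col in zip(*q)]
    p = []
    for row in q:
        w = [v * v / fj for v, fj in zip(row, f)]
        total = sum(w)
        p.append([v / total for v in w])
    return p


# Two embeddings sitting exactly on two well-separated centres.
z = [[0.0, 0.0], [4.0, 0.0]]
mu = [[0.0, 0.0], [4.0, 0.0]]
q = soft_assign(z, mu)
p = target_distribution(q)
```

Training in DEC then minimizes the KL divergence between p and q, which pushes each embedding toward the centre it already favours.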
- class spateo.tools.cluster.spagcn_utils.SpaGCN
Bases:
object
Implementation for spagcn algorithm, see https://doi.org/10.1038/s41592-021-01255-8
- train(adata, adj, num_pcs=50, lr=0.005, max_epochs=2000, weight_decay=0, opt='adam', init_spa=True, init='louvain', n_neighbors=10, n_clusters=None, res=0.4, tol=0.001)
Train the model for spagcn.
- Parameters:
- adata anndata.AnnData
an AnnData object.
- adj numpy.ndarray
the calculated adjacency matrix in spagcn algorithm.
- num_pcs int, optional
number of pcs(out dimension of PCA) to use. Defaults to 50.
- lr float, optional
learning rate in neural network. Defaults to 0.005.
- max_epochs int, optional
max epochs to train in neural network. Defaults to 2000.
- weight_decay int, optional
weight decay (L2 penalty) applied by the optimizer during training. Defaults to 0.
- opt str, optional
the optimizer to use. Defaults to “adam”.
- init_spa bool, optional
make initial clusters with louvain or kmeans. Defaults to True.
- init str, optional
algorithm to use for initial clustering. Supports “louvain” and “kmeans”. Defaults to “louvain”.