spateo.tools.cluster.spagcn_utils

Classes

`GraphConvolution`	Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
`simple_GC_DEC`	Simple NN model constructed with a GraphConvolution layer followed by a DeepEmbeddingClustering layer.
`SpaGCN`	Implementation of the SpaGCN algorithm; see https://doi.org/10.1038/s41592-021-01255-8

Functions

`calculate_adj_matrix`(x, y[, x_pixel, y_pixel, image, ...])	(Part of the SpaGCN algorithm) Calculate the adjacency matrix from spatial coordinates and image pixels.
`calculate_p`(adj, l)
`search_l`(p, adj[, start, end, tol, max_run])	Search for a suitable l value for the SpaGCN algorithm.
`get_cluster_num`(adata, adj, res, tol, lr, max_epochs, l)	Get the initial number of clusters corresponding to a given Louvain resolution.
`search_res`(adata, adj, l, target_num[, start, step, ...])	Search for an initial Louvain resolution that yields the desired number of clusters in the SpaGCN algorithm.
`refine`(sample_id, pred, dis[, shape])	Refine (smooth) the boundaries of spatial domains (clusters).

Module Contents

spateo.tools.cluster.spagcn_utils.calculate_adj_matrix(x, y, x_pixel=None, y_pixel=None, image=None, beta=49, alpha=1, histology=True)

(Part of the SpaGCN algorithm) Calculate the adjacency matrix from spatial coordinates and image pixels.

Parameters:

x list: the spatial x-coordinates of the spots.
y list: the spatial y-coordinates of the spots.
x_pixel list, optional: the x pixel coordinates of the spots in the histology image. Defaults to None.
y_pixel list, optional: the y pixel coordinates of the spots in the histology image. Defaults to None.
image numpy.ndarray, optional: the image (typically a histology image) as a numpy.ndarray (can be obtained with cv2.imread). Defaults to None.
beta int, optional: controls the size of the neighbourhood used when calculating the grey value for a spot. Defaults to 49.
alpha int, optional: controls the color scale. Defaults to 1.
histology bool, optional: whether the image is histological. Defaults to True.

Returns:

the calculated adjacency matrix.

Return type:

numpy.ndarray
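
For intuition, the sketch below builds a pairwise Euclidean distance matrix from spot coordinates, which is what the reference SpaGCN implementation computes when no histology image is used. Whether calculate_adj_matrix in this module does exactly the same is an assumption; pairwise_distance_sketch is a hypothetical name, and plain Python lists stand in for numpy arrays to keep the example dependency-free.

```python
import math

def pairwise_distance_sketch(x, y):
    """Hypothetical helper: Euclidean distance between every pair of spots."""
    points = list(zip(x, y))
    return [[math.dist(a, b) for b in points] for a in points]

# Three spots given by their spatial coordinates.
x = [0.0, 3.0, 0.0]
y = [0.0, 4.0, 1.0]
adj = pairwise_distance_sketch(x, y)

# The matrix is symmetric with a zero diagonal; entry [i][j] is the
# distance between spot i and spot j.
for row in adj:
    print([round(v, 3) for v in row])
```

When a histology image is supplied, the reference implementation is understood to add a colour-derived third coordinate before taking distances; that step is omitted here.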

spateo.tools.cluster.spagcn_utils.calculate_p(adj, l)
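
This page gives no description for calculate_p. In the reference SpaGCN implementation, the function of the same name converts distances d into Gaussian-kernel weights exp(-d²/(2l²)) and returns the average total weight that a spot receives from all other spots. The sketch below reproduces that computation with plain Python lists; that this module's version matches it is an assumption, and calculate_p_sketch is a hypothetical name.

```python
import math

def calculate_p_sketch(adj, l):
    """Hypothetical stand-in: mean summed Gaussian weight per spot, minus the self-weight."""
    row_sums = [
        sum(math.exp(-(d ** 2) / (2 * l ** 2)) for d in row)
        for row in adj
    ]
    # Each spot is at distance 0 from itself and contributes exp(0) = 1,
    # so subtracting 1 leaves only the contribution of the other spots.
    return sum(row_sums) / len(row_sums) - 1

# Toy distance matrix for three spots on a line.
adj = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]
p_small = calculate_p_sketch(adj, l=0.5)  # narrow kernel: neighbours contribute little
p_large = calculate_p_sketch(adj, l=5.0)  # wide kernel: neighbours contribute almost fully
```

Under this reading, a larger l gives a larger p, which is what makes a search over l for a target p possible.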

spateo.tools.cluster.spagcn_utils.search_l(p, adj, start=0.01, end=1000, tol=0.01, max_run=100)

Search for a suitable l value for the SpaGCN algorithm.

Parameters:

p float: parameter p in the SpaGCN algorithm. See SpaGCN for details.
adj numpy.ndarray: the calculated adjacency matrix in the SpaGCN algorithm.
start float, optional: lower boundary of the search. Defaults to 0.01.
end int, optional: upper boundary of the search. Defaults to 1000.
tol float, optional: step length for search. Defaults to 0.01.
max_run int, optional: maximum number of search iterations. Defaults to 100.

Returns:

the l value

Return type:

float
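
One way to picture the search is as a bisection over the interval [start, end]. The sketch below assumes that p is defined through Gaussian-kernel weights as in the reference SpaGCN code, so that p grows with l. The names p_of_l and search_l_sketch and the stopping tolerance p_tol are hypothetical; whether search_l in this module proceeds the same way is not stated on this page.

```python
import math

def p_of_l(adj, l):
    """Assumed definition of p (Gaussian-kernel weights, as in the reference SpaGCN code)."""
    rows = [sum(math.exp(-(d ** 2) / (2 * l ** 2)) for d in row) for row in adj]
    return sum(rows) / len(rows) - 1

def search_l_sketch(p, adj, start=0.01, end=1000, p_tol=0.01, max_run=100):
    """Hypothetical bisection: find l whose p value is within p_tol of the target p."""
    if p_of_l(adj, start) > p + p_tol or p_of_l(adj, end) < p - p_tol:
        return None  # the target p is not bracketed by [start, end]
    for _ in range(max_run):
        mid = (start + end) / 2
        p_mid = p_of_l(adj, mid)
        if abs(p_mid - p) <= p_tol:
            return mid
        # p grows with l, so move the boundary that lies on the wrong side.
        if p_mid < p:
            start = mid
        else:
            end = mid
    return None  # no suitable l found within max_run iterations

adj = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]
l = search_l_sketch(p=0.5, adj=adj)
```

Returning None when the target is out of range is a design choice of this sketch; it lets the caller widen start or end and retry.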

spateo.tools.cluster.spagcn_utils.get_cluster_num(adata, adj, res, tol, lr, max_epochs, l, r_seed=100, t_seed=100, n_seed=100)

Get the initial number of clusters corresponding to a given Louvain resolution.

Parameters:

adata: passed on to SpaGCN.train(); see SpaGCN.train.
adj: passed on to SpaGCN.train(); see SpaGCN.train.
res: passed on to SpaGCN.train(); see SpaGCN.train.
tol: passed on to SpaGCN.train(); see SpaGCN.train.
lr: passed on to SpaGCN.train(); see SpaGCN.train.
max_epochs: passed on to SpaGCN.train(); see SpaGCN.train.
l float: parameter l in the SpaGCN algorithm; see SpaGCN for details.
r_seed int, optional: global seed for Python's random module. Defaults to 100.
t_seed int, optional: global seed for torch. Defaults to 100.
n_seed int, optional: global seed for numpy. Defaults to 100.

Returns:

number of clusters

Return type:

int

spateo.tools.cluster.spagcn_utils.search_res(adata, adj, l, target_num, start=0.4, step=0.1, tol=0.005, lr=0.05, max_epochs=10, r_seed=100, t_seed=100, n_seed=100, max_run=10)

Search for an initial Louvain resolution that yields the desired number of clusters in the SpaGCN algorithm.

Parameters:

adata anndata.AnnData: an AnnData object.
adj numpy.ndarray: the calculated adjacency matrix in the SpaGCN algorithm.
l float: parameter l in the SpaGCN algorithm; see SpaGCN for details.
target_num int: desired number of clusters.
start float, optional: the lower boundary of the resolution search. Defaults to 0.4.
step float, optional: search step length. Defaults to 0.1.
tol: passed on to SpaGCN.train(); see SpaGCN.train.
lr: passed on to SpaGCN.train(); see SpaGCN.train.
max_epochs: passed on to SpaGCN.train(); see SpaGCN.train.
r_seed int, optional: global seed for Python's random module. Defaults to 100.
t_seed int, optional: global seed for torch. Defaults to 100.
n_seed int, optional: global seed for numpy. Defaults to 100.
max_run int, optional: maximum number of iterations. Defaults to 10.

Returns:

the calculated initial Louvain resolution.

Return type:

float
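
The resolution search can be pictured as stepping towards the target cluster count and halving the step whenever the count overshoots. The sketch below follows the approach of the reference SpaGCN code as an assumption; search_res_sketch, count_clusters and toy_count are hypothetical names. Training is replaced by a toy callable so that the example runs on its own.

```python
def search_res_sketch(count_clusters, target_num, start=0.4, step=0.1, max_run=10):
    """Hypothetical step search over the resolution.

    count_clusters is a stand-in callable mapping a resolution to a cluster
    count (the role get_cluster_num plays on this page).
    """
    res = start
    num = count_clusters(res)
    for _ in range(max_run):
        if num == target_num:
            break
        direction = 1 if num < target_num else -1
        candidate = res + step * direction
        new_num = count_clusters(candidate)
        if new_num == target_num:
            return candidate
        if (new_num < target_num) == (num < target_num):
            # Still on the same side of the target: accept the step.
            res, num = candidate, new_num
        else:
            # Overshot the target: stay put and halve the step.
            step /= 2
    return res

# Toy stand-in: pretend the cluster count grows with the resolution.
def toy_count(res):
    return int(res * 10 + 0.5)

res = search_res_sketch(toy_count, target_num=7)
```

In real use each call to count_clusters would involve a short training run, which is presumably why max_epochs defaults to a small value in the signature above.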

spateo.tools.cluster.spagcn_utils.refine(sample_id, pred, dis, shape='square')

Refine (smooth) the boundaries of spatial domains (clusters).

Parameters:

sample_id list: list of sample (cell, spot or bin) names.
pred list: list of spatial domains corresponding to the sample_id list.
dis numpy.ndarray: the calculated adjacency matrix in the SpaGCN algorithm.
shape str, optional: the spatial topology used to smooth the spatial domains: “hexagon” for Visium data, “square” for ST data. Defaults to “square”.

Returns:

list of refined spatial domains corresponding to the sample_id list.

Return type:

list
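
The smoothing can be understood as a majority vote among each sample's nearest neighbours. The sketch below assumes the rule used in the reference SpaGCN code: 6 neighbours for “hexagon” and 4 for “square”, and a sample is relabelled only when its own label is in the minority nearby while another label holds a majority. refine_sketch is a hypothetical name, and this is not necessarily the code behind refine.

```python
import math
from collections import Counter

def refine_sketch(sample_id, pred, dis, shape="square"):
    """Hypothetical majority-vote smoothing over each sample's nearest neighbours."""
    num_nbs = {"hexagon": 6, "square": 4}[shape]
    refined = []
    for i in range(len(sample_id)):
        # The num_nbs + 1 closest samples: the sample itself plus its neighbours.
        order = sorted(range(len(sample_id)), key=lambda j: dis[i][j])
        votes = Counter(pred[j] for j in order[: num_nbs + 1])
        top_label, top_count = votes.most_common(1)[0]
        # Relabel only when the current label is rare nearby and another dominates.
        if votes[pred[i]] < num_nbs / 2 and top_count > num_nbs / 2:
            refined.append(top_label)
        else:
            refined.append(pred[i])
    return refined

# A centre sample labelled 1, surrounded by four neighbours labelled 0.
coords = [(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0)]
sample_id = ["c", "n", "e", "s", "w"]
pred = [1, 0, 0, 0, 0]
dis = [[math.dist(a, b) for b in coords] for a in coords]
refined = refine_sketch(sample_id, pred, dis, shape="square")
# refined == [0, 0, 0, 0, 0]: the isolated centre label is replaced.
```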

class spateo.tools.cluster.spagcn_utils.GraphConvolution(in_features, out_features, bias=True)

Bases: torch.nn.Module

Simple GCN layer, similar to https://arxiv.org/abs/1609.02907

in_features

out_features

weight

reset_parameters()

forward(input, adj)

__repr__()
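
In the layer described in the paper linked above, node features are first projected by a weight matrix and then aggregated across the graph by multiplying with the adjacency matrix. The sketch below shows that forward computation with plain Python lists and fixed weights. It is an illustration under that assumption, not the code of this class, which is a torch.nn.Module with learnable parameters; graph_convolution_sketch and matmul are hypothetical names.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def graph_convolution_sketch(features, adj, weight, bias=None):
    """Hypothetical forward pass: adj @ (features @ weight), plus an optional bias."""
    support = matmul(features, weight)  # project in_features -> out_features
    output = matmul(adj, support)       # aggregate over neighbouring nodes
    if bias is not None:
        output = [[v + b for v, b in zip(row, bias)] for row in output]
    return output

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 nodes, in_features = 2
weight = [[2.0], [3.0]]                          # out_features = 1
adj = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],  # node 1 averages itself with node 0
    [0.0, 0.0, 1.0],
]
out = graph_convolution_sketch(features, adj, weight, bias=[1.0])
# out == [[3.0], [3.5], [6.0]]
```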

class spateo.tools.cluster.spagcn_utils.simple_GC_DEC(nfeat, nhid, alpha=0.2)

Bases: torch.nn.Module

Simple NN model constructed with a GraphConvolution layer followed by a DeepEmbeddingClustering layer. For DEC, see https://arxiv.org/abs/1511.06335v2

gc

nhid

alpha = 0.2

forward(x, adj)

loss_function(p, q)

target_distribution(q)

fit(X, adj, lr=0.001, max_epochs=5000, update_interval=3, trajectory_interval=50, weight_decay=0.0005, opt='sgd', init='louvain', n_neighbors=10, res=0.4, n_clusters=10, init_spa=True, tol=0.001)

predict(X, adj)
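
The DEC paper linked above defines a soft cluster assignment q based on a Student's t kernel and a sharpened target distribution p derived from q; training minimises the KL divergence between them. The sketch below writes out those two formulas in plain Python as given in the paper. Whether forward and target_distribution in this class follow them exactly is an assumption, and soft_assignment_sketch and target_distribution_sketch are hypothetical names.

```python
import math

def soft_assignment_sketch(z, centers, alpha=0.2):
    """Student's t soft assignment q_ij of embedding z_i to cluster centre mu_j."""
    q = []
    for zi in z:
        row = [
            (1 + math.dist(zi, mu) ** 2 / alpha) ** (-(alpha + 1) / 2)
            for mu in centers
        ]
        total = sum(row)
        q.append([v / total for v in row])  # normalise over clusters
    return q

def target_distribution_sketch(q):
    """Targets p_ij proportional to q_ij^2 / f_j, where f_j = sum_i q_ij."""
    f = [sum(col) for col in zip(*q)]  # soft cluster frequencies
    p = []
    for row in q:
        weighted = [v ** 2 / fj for v, fj in zip(row, f)]
        total = sum(weighted)
        p.append([v / total for v in weighted])  # normalise over clusters
    return p

z = [[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]]  # three embedded samples
centers = [[0.0, 0.0], [2.0, 2.0]]        # two cluster centres
q = soft_assignment_sketch(z, centers)
p = target_distribution_sketch(q)
```

In this toy example the targets in p are more confident than the assignments in q for the samples sitting on a cluster centre, which is the sharpening effect the paper describes.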

class spateo.tools.cluster.spagcn_utils.SpaGCN

Bases: object

Implementation of the SpaGCN algorithm; see https://doi.org/10.1038/s41592-021-01255-8

l = None

set_l(l)

train(adata, adj, num_pcs=50, lr=0.005, max_epochs=2000, weight_decay=0, opt='adam', init_spa=True, init='louvain', n_neighbors=10, n_clusters=None, res=0.4, tol=0.001)

Train the SpaGCN model.

Parameters:

adata anndata.AnnData: an AnnData object.
adj numpy.ndarray: the calculated adjacency matrix in the SpaGCN algorithm.
num_pcs int, optional: number of principal components (output dimension of PCA) to use. Defaults to 50.
lr float, optional: learning rate of the neural network. Defaults to 0.005.
max_epochs int, optional: maximum number of training epochs for the neural network. Defaults to 2000.
weight_decay int, optional: weight decay (L2 penalty) applied by the optimizer during training. Defaults to 0.
opt str, optional: the optimizer to use. Defaults to “adam”.
init_spa bool, optional: make initial clusters with Louvain or k-means. Defaults to True.
init str, optional: algorithm to use for the initial clustering. Supports “louvain” and “kmeans”. Defaults to “louvain”.

predict()