spateo.tools.cluster.spagcn_utils#

Module Contents#

Classes#

GraphConvolution

Simple GCN layer, similar to https://arxiv.org/abs/1609.02907

simple_GC_DEC

Simple NN model constructed with a GraphConvolution layer followed by a DeepEmbeddingClustering layer.

SpaGCN

Implementation of the SpaGCN algorithm, see https://doi.org/10.1038/s41592-021-01255-8

Functions#

calculate_adj_matrix(x, y[, x_pixel, y_pixel, image, ...])

(Part of the SpaGCN algorithm) Calculate the adjacency matrix from spatial coordinates and image pixels.

calculate_p(adj, l)

search_l(p, adj[, start, end, tol, max_run])

Search for a suitable l value for the SpaGCN algorithm.

get_cluster_num(adata, adj, res, tol, lr, max_epochs, l)

Get the initial number of clusters corresponding to a given Louvain resolution.

search_res(adata, adj, l, target_num[, start, step, ...])

Search for an initial Louvain resolution that yields the desired number of clusters in the SpaGCN algorithm.

refine(sample_id, pred, dis[, shape])

Refine (smooth) the boundaries of spatial domains (clusters).

spateo.tools.cluster.spagcn_utils.calculate_adj_matrix(x, y, x_pixel=None, y_pixel=None, image=None, beta=49, alpha=1, histology=True)[source]#

(Part of the SpaGCN algorithm) Calculate the adjacency matrix from spatial coordinates and image pixels.

Parameters:
x list

a list containing the spatial x-coordinates of the spots.

y list

a list containing the spatial y-coordinates of the spots.

x_pixel list, optional

a list containing the x-pixel positions of the spots in the histology image. Defaults to None.

y_pixel list, optional

a list containing the y-pixel positions of the spots in the histology image. Defaults to None.

image numpy.ndarray, optional

the image (typically a histology image) in numpy.ndarray format (can be obtained with cv2.imread). Defaults to None.

beta int, optional

controls the size of the neighbourhood used when calculating the grey value for a spot. Defaults to 49.

alpha int, optional

to control the color scale. Defaults to 1.

histology bool, optional

if the image is histological. Defaults to True.

Returns:

the calculated adjacency matrix.

Return type:

numpy.ndarray
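
As a rough illustration of a coordinate-based matrix of this kind, the sketch below computes pairwise Euclidean distances between spots with NumPy. It covers only the coordinate-only case (no image) and assumes the matrix stores distances between spots; `pairwise_distance_matrix` is a hypothetical helper, not part of spateo, and the actual function may differ.

```python
import numpy as np

def pairwise_distance_matrix(x, y):
    """Hypothetical helper: Euclidean distance between every pair of spots."""
    coords = np.column_stack([x, y]).astype(float)   # shape (n_spots, 2)
    diff = coords[:, None, :] - coords[None, :, :]   # shape (n_spots, n_spots, 2)
    return np.sqrt((diff ** 2).sum(axis=-1))

# Three toy spots; spots 0 and 1 form a 3-4-5 triangle with the axes.
x = [0, 3, 0]
y = [0, 4, 1]
adj = pairwise_distance_matrix(x, y)
```

The result is symmetric with zeros on the diagonal, which is the shape of input the later sketches on this page assume.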

spateo.tools.cluster.spagcn_utils.calculate_p(adj, l)[source]#
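
No description is given for calculate_p here. In the original SpaGCN package, the function of the same name turns distances into Gaussian-kernel weights with bandwidth l and returns the average total weight each spot receives from the other spots. The sketch below assumes spateo's version behaves the same way, which this page does not confirm; `calculate_p_sketch` is a hypothetical name.

```python
import numpy as np

def calculate_p_sketch(adj, l):
    """Average neighbour weight per spot under a Gaussian kernel (assumed behaviour)."""
    weights = np.exp(-(adj ** 2) / (2 * l ** 2))  # diagonal entries are 1 (distance 0)
    return weights.sum(axis=1).mean() - 1         # subtract each spot's self-weight

# Two spots at distance 1: each receives weight exp(-0.5) from the other.
adj = np.array([[0.0, 1.0], [1.0, 0.0]])
p = calculate_p_sketch(adj, l=1.0)
```
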
spateo.tools.cluster.spagcn_utils.search_l(p, adj, start=0.01, end=1000, tol=0.01, max_run=100)[source]#

Search for a suitable l value for the SpaGCN algorithm.

Parameters:
p float

parameter p in the SpaGCN algorithm. See SpaGCN for details.

adj numpy.ndarray

the calculated adjacency matrix in the SpaGCN algorithm.

start float, optional

lower boundary of search. Defaults to 0.01.

end int, optional

upper boundary of search. Defaults to 1000.

tol float, optional

step length for search. Defaults to 0.01.

max_run int, optional

maximum number of search iterations. Defaults to 100.

Returns:

the l value

Return type:

float
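
One plausible way to carry out such a search is bisection between start and end, sketched below. This is an assumption, not the confirmed implementation: it treats tol as the acceptable gap between the achieved and requested p, and it relies on a Gaussian-kernel form of calculate_p (taken from the original SpaGCN package) under which p grows monotonically with l. `neighbour_weight` and `search_l_sketch` are hypothetical names.

```python
import numpy as np

def neighbour_weight(adj, l):
    # Assumed form of calculate_p: mean Gaussian-kernel weight from other spots.
    return np.exp(-(adj ** 2) / (2 * l ** 2)).sum(axis=1).mean() - 1

def search_l_sketch(p, adj, start=0.01, end=1000, tol=0.01, max_run=100):
    """Bisection on l; returns None if no l within tol is found."""
    low, high = start, end
    for _ in range(max_run):
        mid = (low + high) / 2
        p_mid = neighbour_weight(adj, mid)
        if abs(p_mid - p) <= tol:
            return mid
        if p_mid < p:
            low = mid    # weights too small: widen the kernel
        else:
            high = mid   # weights too large: narrow the kernel
    return None

# Toy data: 50 random spots in a 100 x 100 square.
rng = np.random.default_rng(0)
pts = rng.uniform(0, 100, size=(50, 2))
adj = np.sqrt(((pts[:, None] - pts[None]) ** 2).sum(-1))
l = search_l_sketch(0.5, adj)
```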

spateo.tools.cluster.spagcn_utils.get_cluster_num(adata, adj, res, tol, lr, max_epochs, l, r_seed=100, t_seed=100, n_seed=100)[source]#

Get the initial number of clusters corresponding to a given Louvain resolution.

Parameters:
adata

further passed to SpaGCN.train(), see SpaGCN.train.

adj

further passed to SpaGCN.train(), see SpaGCN.train.

res

further passed to SpaGCN.train(), see SpaGCN.train.

tol

further passed to SpaGCN.train(), see SpaGCN.train.

lr

further passed to SpaGCN.train(), see SpaGCN.train.

max_epochs

further passed to SpaGCN.train(), see SpaGCN.train.

l float

parameter l in spagcn algorithm, see SpaGCN for details.

r_seed int, optional

Global seed for random, torch, numpy. Defaults to 100.

t_seed int, optional

Global seed for random, torch, numpy. Defaults to 100.

n_seed int, optional

Global seed for random, torch, numpy. Defaults to 100.

Returns:

number of clusters

Return type:

int

spateo.tools.cluster.spagcn_utils.search_res(adata, adj, l, target_num, start=0.4, step=0.1, tol=0.005, lr=0.05, max_epochs=10, r_seed=100, t_seed=100, n_seed=100, max_run=10)[source]#

Search for an initial Louvain resolution that yields the desired number of clusters in the SpaGCN algorithm.

Parameters:
adata anndata.AnnData

an AnnData object.

adj numpy.ndarray

the calculated adjacency matrix in the SpaGCN algorithm.

l float

parameter l in spagcn algorithm, see SpaGCN for details.

target_num int

desired number of clusters.

start float, optional

the lower boundary of search for resolution. Defaults to 0.4.

step float, optional

search step length. Defaults to 0.1.

tol

further passed to SpaGCN.train(), see SpaGCN.train.

lr

further passed to SpaGCN.train(), see SpaGCN.train.

max_epochs

further passed to SpaGCN.train(), see SpaGCN.train.

r_seed int, optional

Global seed for random, torch, numpy. Defaults to 100.

t_seed int, optional

Global seed for random, torch, numpy. Defaults to 100.

n_seed int, optional

Global seed for random, torch, numpy. Defaults to 100.

max_run int, optional

maximum number of iterations. Defaults to 10.

Returns:

calculated initial louvain resolution.

Return type:

float
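
The search can be pictured as a simple step search: move the resolution by step toward the target cluster count and shrink the step after an overshoot. The sketch below is one assumed strategy, not the confirmed spateo logic; `count_clusters` is a hypothetical callable standing in for get_cluster_num, and the toy counter replaces actual model training.

```python
def search_res_sketch(count_clusters, target_num, start=0.4, step=0.1, max_run=10):
    """Step the resolution toward target_num; halve the step on overshoot.

    count_clusters is a hypothetical stand-in for get_cluster_num.
    """
    res = start
    n = count_clusters(res)
    for _ in range(max_run):
        if n == target_num:
            return res
        direction = 1 if n < target_num else -1
        candidate = res + direction * step
        n_new = count_clusters(candidate)
        if n_new == target_num:
            return candidate
        if (n_new < target_num) == (n < target_num):
            res, n = candidate, n_new   # still on the same side: accept the move
        else:
            step /= 2                   # overshot the target: retry with a smaller step
    return res

# Toy stand-in: the cluster count grows with the resolution.
toy_count = lambda res: int(round(res * 10))
best = search_res_sketch(toy_count, target_num=7)
```

In real use each call to the counter would train a model, so max_run bounds the cost of the search.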

spateo.tools.cluster.spagcn_utils.refine(sample_id, pred, dis, shape='square')[source]#

Refine (smooth) the boundaries of spatial domains (clusters).

Parameters:
sample_id list

list of sample(cell, spot or bin) names.

pred list

list of spatial domains corresponding to the sample_id list.

dis numpy.ndarray

the calculated adjacency matrix in the SpaGCN algorithm.

shape str, optional

Smooth the spatial domains with given spatial topology, “hexagon” for Visium data, “square” for ST data. Defaults to “square”.

Returns:

list of refined spatial domains corresponding to the sample_id list.

Return type:

list
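
Boundary smoothing of this kind is commonly done by majority vote among each sample's nearest neighbours. The sketch below assumes that rule, with 6 neighbours for "hexagon" and 4 for "square" (values taken from the original SpaGCN package, not confirmed for spateo). `refine_sketch` is a hypothetical name, and it omits sample_id for brevity by working on positional indices.

```python
import numpy as np
from collections import Counter

def refine_sketch(pred, dis, shape="square"):
    """Majority-vote smoothing over the nearest spots (assumed rule)."""
    num_nbs = 6 if shape == "hexagon" else 4   # assumed neighbour counts
    pred = list(pred)
    refined = []
    for i, label in enumerate(pred):
        order = np.argsort(dis[i], kind="stable")
        nbs = [j for j in order if j != i][:num_nbs]
        counts = Counter(pred[j] for j in nbs)
        top_label, top_count = counts.most_common(1)[0]
        # Relabel only if the spot disagrees with a clear majority of its neighbours.
        if counts[label] < num_nbs / 2 and top_count > num_nbs / 2:
            refined.append(top_label)
        else:
            refined.append(label)
    return refined

# 3 x 3 grid; the centre spot is an isolated "1" surrounded by "0".
pts = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)
dis = np.sqrt(((pts[:, None] - pts[None]) ** 2).sum(-1))
pred = [0, 0, 0, 0, 1, 0, 0, 0, 0]
refined = refine_sketch(pred, dis)
```

Here the isolated centre label is replaced by the label of its four neighbours, while every other spot keeps its label.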

class spateo.tools.cluster.spagcn_utils.GraphConvolution(in_features, out_features, bias=True)[source]#

Bases: torch.nn.Module

Simple GCN layer, similar to https://arxiv.org/abs/1609.02907

reset_parameters()[source]#
forward(input, adj)[source]#
__repr__()[source]#

Return repr(self).
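
The layer described in the linked paper multiplies the node features by a weight matrix and then aggregates over the graph with the adjacency matrix. The NumPy sketch below shows that forward computation only; it is not the torch.nn.Module implementation, and `gcn_layer` is a hypothetical name.

```python
import numpy as np

def gcn_layer(features, adj, weight, bias=None):
    """One graph-convolution step: adj @ (features @ weight) + bias."""
    support = features @ weight   # per-node linear transform
    output = adj @ support        # aggregate over the graph
    return output if bias is None else output + bias

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))   # 5 nodes, in_features=3
weight = rng.normal(size=(3, 2))     # out_features=2
adj = np.eye(5)                      # identity graph: no mixing between nodes
out = gcn_layer(features, adj, weight)
```

With an identity adjacency matrix the layer reduces to a plain linear map, which makes the role of adj easy to see.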

class spateo.tools.cluster.spagcn_utils.simple_GC_DEC(nfeat, nhid, alpha=0.2)[source]#

Bases: torch.nn.Module

Simple NN model constructed with a GraphConvolution layer followed by a DeepEmbeddingClustering layer. For DEC, see https://arxiv.org/abs/1511.06335v2

forward(x, adj)[source]#
loss_function(p, q)[source]#
target_distribution(q)[source]#
fit(X, adj, lr=0.001, max_epochs=5000, update_interval=3, trajectory_interval=50, weight_decay=0.0005, opt='sgd', init='louvain', n_neighbors=10, res=0.4, n_clusters=10, init_spa=True, tol=0.001)[source]#
predict(X, adj)[source]#
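
For orientation, the clustering quantities defined in the DEC paper linked above can be sketched in NumPy as follows. This follows the paper's formulas rather than spateo's code, and it assumes that alpha plays the role of the Student's t degrees of freedom; `soft_assignment` and `kl_loss` are hypothetical names.

```python
import numpy as np

def soft_assignment(z, mu, alpha=0.2):
    """Student's t similarity between embeddings z and cluster centres mu."""
    sq_dist = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
    q = (1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpen q: square it, normalise by cluster frequency, renormalise rows."""
    p = q ** 2 / q.sum(axis=0)
    return p / p.sum(axis=1, keepdims=True)

def kl_loss(p, q):
    """KL(P || Q), averaged over samples."""
    return float((p * np.log(p / q)).sum(axis=1).mean())

rng = np.random.default_rng(0)
z = rng.normal(size=(6, 2))    # 6 embedded samples
mu = rng.normal(size=(2, 2))   # 2 cluster centres
q = soft_assignment(z, mu)
p = target_distribution(q)
loss = kl_loss(p, q)
```

Training alternates between recomputing the target p from the current q and minimising the KL divergence between them.
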
class spateo.tools.cluster.spagcn_utils.SpaGCN[source]#

Bases: object

Implementation of the SpaGCN algorithm, see https://doi.org/10.1038/s41592-021-01255-8

set_l(l)[source]#
train(adata, adj, num_pcs=50, lr=0.005, max_epochs=2000, weight_decay=0, opt='adam', init_spa=True, init='louvain', n_neighbors=10, n_clusters=None, res=0.4, tol=0.001)[source]#

Train the SpaGCN model.

Parameters:
adata anndata.AnnData

an AnnData object.

adj numpy.ndarray

the calculated adjacency matrix in the SpaGCN algorithm.

num_pcs int, optional

number of principal components (output dimension of PCA) to use. Defaults to 50.

lr float, optional

learning rate in neural network. Defaults to 0.005.

max_epochs int, optional

max epochs to train in neural network. Defaults to 2000.

weight_decay int, optional

weight decay (L2 penalty) applied by the optimizer while training. Defaults to 0.

opt str, optional

the optimizer to use. Defaults to “adam”.

init_spa bool, optional

make initial clusters with louvain or kmeans. Defaults to True.

init str, optional

algorithm to use for the initial clustering. Supports “louvain” and “kmeans”. Defaults to “louvain”.

predict()[source]#