Note

This page was generated from 2_spaco_demo.ipynb. Interactive online version: Colab badge. Some tutorial content may look better in light mode.

2.Enhancing visualization clarity for categorical spatial datasets using Spaco

Spaco is a colorization enhancing algorithm derived from our Spateo Project.

Visualizing spatially resolved biological data with appropriate color mapping can significantly facilitate the exploration of underlying patterns and heterogeneity. Spaco (spatial colorization) provides a spatially constrained approach that generates discriminate color assignments for visualizing single-cell spatial data in various scenarios.

Installation

[ ]:
# Latest source from github (Recommended)
# pip install git+https://github.com/BrainStOrmics/Spaco.git

# PyPI
!pip install spaco-release

Load packages

[1]:
import numpy as np
import pandas as pd

import spaco

import scanpy as sc # For general visualization
import squidpy as sq # For loading example dataset

import matplotlib
import seaborn as sns

Load the pre-processed dataset

[2]:
# seqFISH mouse embryo dataset
adata_cellbin_bkp = sq.datasets.seqfish()

# make a copy (optional)
adata_cellbin = adata_cellbin_bkp.copy()

# extract the pre-annotated label from anndata
adata_cellbin.obs['annotation'] = adata_cellbin.obs['celltype_mapped_refined'].copy()
adata_cellbin.obs['annotation'] = adata_cellbin.obs['annotation'].astype(str).astype('category')

Filter cell types (optional)

[3]:
# filter out cell types with less than 10 cells
min_cells=10
unique_tmp = np.unique(adata_cellbin.obs['annotation'],return_counts=True)
adata_cellbin = adata_cellbin[adata_cellbin.obs['annotation'].isin(unique_tmp[0][unique_tmp[1]>min_cells])].copy()

# filter out unannotated cell types
adata_cellbin = adata_cellbin[adata_cellbin.obs['annotation']!="Unannotated"].copy()

# save filtered data
# adata_cellbin.write("./data/seqFish.h5ad")
[4]:
# read filtered data
# adata_cellbin = sc.read_h5ad("./data/seqFish.h5ad")

adata_cellbin
[4]:
AnnData object with n_obs × n_vars = 19416 × 351
    obs: 'Area', 'celltype_mapped_refined', 'annotation'
    uns: 'celltype_mapped_refined_colors'
    obsm: 'X_umap', 'spatial'

Color assignment optimization with a given palette

[5]:
# Get a default palette via matplotlib visualization
sc.set_figure_params(figsize=(3,6), facecolor="white", dpi_save=300)
sc.pl.spatial(adata_cellbin, color="annotation", spot_size=0.035)

palette_default = adata_cellbin.uns['annotation_colors'].copy()
sns.palplot(palette_default)
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_11_0.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_11_1.png
[6]:
# Get optimized color-cluster assignment with Spaco
color_mapping = spaco.colorize(
    cell_coordinates=adata_cellbin.obsm['spatial'],
    cell_labels=adata_cellbin.obs['annotation'],
    colorblind_type="none",
    radius=0.05, # IMPORTANT: `radius` is related to the physical scaling of .obsm['spatial'],
                 # please set `radius` to define how far you would define 'neighboring' in .obsm['spatial']
    n_neighbors=30,
    palette=palette_default, # if `palette` is specified, the `colorize` function only refines the assignment.
)

color_mapping
|-----> Calculating cluster distance graph...
|-----------> Calculating cell neighborhood...
|-----------> Filtering out neighborhood outliers...
|-----------> Calculating cluster interlacement score...
|-----------> Constructing cluster interlacement graph...
|-----> Calculating color distance graph...
|-----------> Calculating color perceptual distance...
|-----------> Constructing color distance graph...
|-----> Optimizing color mapping...
[6]:
{'Allantois': '#b5bbe3',
 'Anterior somitic tissues': '#7d87b9',
 'Cardiomyocytes': '#d6bcc0',
 'Cranial mesoderm': '#8dd593',
 'Definitive endoderm': '#e6afb9',
 'Dermomyotome': '#d33f6a',
 'Endothelium': '#11c638',
 'Erythroid': '#8595e1',
 'Forebrain/Midbrain/Hindbrain': '#0fcfc0',
 'Gut tube': '#e07b91',
 'Haematoendothelial progenitors': '#f3e1eb',
 'Intermediate mesoderm': '#d5eae7',
 'Lateral plate mesoderm': '#8e063b',
 'Low quality': '#ef9708',
 'Mixed mesenchymal mesoderm': '#bb7784',
 'NMP': '#f0b98d',
 'Neural crest': '#ead3c6',
 'Presomitic mesoderm': '#c6dec7',
 'Sclerotome': '#bec1d4',
 'Spinal cord': '#023fa5',
 'Splanchnic mesoderm': '#9cded6',
 'Surface ectoderm': '#4a6fe3'}
[7]:
# Get visualization with optimized color assignment
color_mapping = {k: color_mapping[k] for k in adata_cellbin.obs['annotation'].cat.categories}

# Set new colors for adata
palette_spaco = list(color_mapping.values())

# Spaco colorization
sc.pl.spatial(adata_cellbin, color="annotation", spot_size=0.035, palette=palette_spaco)
sns.palplot(palette_spaco)
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_13_0.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_13_1.png

Automatic colorization (CI-graph guided)

[8]:
# Get optimized color palette and assignment with Spaco
color_mapping = spaco.colorize(
    cell_coordinates=adata_cellbin.obsm['spatial'],
    cell_labels=adata_cellbin.obs['annotation'],
    colorblind_type="none",
    radius=0.3,
    n_neighbors=30,
    # palette=None, # when `palette` is not available, Spaco applies an automatic color selection
)

#color_mapping

# Get visualization with optimized colorization
color_mapping = {k: color_mapping[k] for k in adata_cellbin.obs['annotation'].cat.categories}

# Set new colors for adata
palette_umap = list(color_mapping.values())

# Spaco colorization
sc.pl.spatial(adata_cellbin, color="annotation", spot_size=0.035, palette=palette_umap)
sns.palplot(palette_umap)
|-----> Calculating cluster distance graph...
|-----------> Calculating cell neighborhood...
|-----------> Filtering out neighborhood outliers...
|-----------> Calculating cluster interlacement score...
|-----------> Constructing cluster interlacement graph...
|-----> `palette` not provided.
|-----------> Auto-generating colors from CIE Lab colorspace...
|-----------------> Calculating cluster embedding...
/home/jingzh/.conda/envs/spaco_dev/lib/python3.8/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
/home/jingzh/.conda/envs/spaco_dev/lib/python3.8/site-packages/umap/umap_.py:1780: UserWarning: using precomputed metric; inverse_transform will be unavailable
  warn("using precomputed metric; inverse_transform will be unavailable")
|-----------------> Rescaling embedding to CIE Lab colorspace...
|-----> Optimizing cluster color mapping...
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_15_3.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_15_4.png

Automatic colorization (Image guided)

[9]:
from PIL import Image

img = Image.open("./data/colorful-2468874_1280.jpg").convert("RGB")
matplotlib.pyplot.imshow(img)
[9]:
<matplotlib.image.AxesImage at 0x7fc6a421e9d0>
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_17_1.png
[10]:
# Get optimized color palette and assignment from an image
# this step can be time consuming according to the resolution of the image
color_mapping = spaco.colorize(
    cell_coordinates=adata_cellbin.obsm['spatial'],
    cell_labels=adata_cellbin.obs['annotation'],
    colorblind_type="none",
    radius=0.3,
    n_neighbors=30,
    # palette=None, # when `palette` is not available, Spaco applies an automatic color selection
    image_palette=img, # when `img_palette` is available, Spaco applies image-guided palette extraction
)

# Get visualization with optimized colorization
color_mapping = {k: color_mapping[k] for k in adata_cellbin.obs['annotation'].cat.categories}

# Set new colors for adata
palette_img = list(color_mapping.values())

# Spaco colorization
sc.pl.spatial(adata_cellbin, color="annotation", spot_size=0.035, palette=palette_img)
sns.palplot(palette_img)
|-----> Calculating cluster distance graph...
|-----------> Calculating cell neighborhood...
|-----------> Filtering out neighborhood outliers...
|-----------> Calculating cluster interlacement score...
|-----------> Constructing cluster interlacement graph...
|-----> `palette` not provided.
|-----------> Using `image palette`...
|-----------> Drawing appropriate colors from provided image...
|-----------------> Extracting color bins...
|-----------------> Initiating palette...
|-----------------> Optimizing extracted palette...
|-----> Calculating color distance graph...
|-----------> Calculating color perceptual distance...
|-----------> Constructing color distance graph...
|-----> Optimizing color mapping...
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_18_1.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_18_2.png

Separated usage of automatic color extraction

[11]:
# Spaco also supports separated usage of its functions (e.g. the theme-extraction function)

# We welcome users to selectively use Spaco's features rather than solely use our wrapped functions

extracted_palette = spaco.utils.extract_palette(img, n_colors=10, colorblind_type="none")
print(extracted_palette)

sns.palplot(extracted_palette)
|-----------------> Extracting color bins...
|-----------------> Initiating palette...
|-----------------> Optimizing extracted palette...
['#0f1fa1', '#06654c', '#0081f6', '#b80050', '#683300', '#dfaa00', '#eaa3ff', '#5adfef', '#ff7256', '#6de674']
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_20_1.png

Colorization on multiple plots (between tissue slices)

[12]:
# Load the pre-processed dataset
adata = sc.read_h5ad("./data/10DPI_1_left.h5ad")
radata = sc.read_h5ad("./data/15DPI_1_left.h5ad")
[13]:
# get a pre-defined color palette
from pylab import *

cmap = cm.get_cmap('tab20', 20)
palette_default = [matplotlib.colors.rgb2hex(cmap(i)) for i in range(20)]
sns.palplot(palette_default)
/tmp/ipykernel_2725620/525643131.py:4: MatplotlibDeprecationWarning: The get_cmap function was deprecated in Matplotlib 3.7 and will be removed two minor releases later. Use ``matplotlib.colormaps[name]`` or ``matplotlib.colormaps.get_cmap(obj)`` instead.
  cmap = cm.get_cmap('tab20', 20)
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_23_1.png
[14]:
# Get optimized color assignment between two slices

cluster_key = "Annotation" # clusters are pre-annotated in this dataset

color_mapping_adata = spaco.colorize_mutiple_slices(
    adatas=[adata, radata],
    cluster_key=cluster_key, # use a pre-annotation as identical labeling between the two slices, if not provided, will perform auto-alignment
    colorblind_type="none",
    radius=90,
    n_neighbors=16,
    palette=palette_default,
)
color_mapping_adata
|-----> Calculating cluster distance graph for slice 0...
|-----------> Calculating cell neighborhood...
|-----------> Filtering out neighborhood outliers...
|-----------> Calculating cluster interlacement score...
|-----------> Constructing cluster interlacement graph...
|-----> Calculating cluster distance graph for slice 1...
|-----------> Calculating cell neighborhood...
|-----------> Filtering out neighborhood outliers...
|-----------> Calculating cluster interlacement score...
|-----------> Constructing cluster interlacement graph...
|-----> Merging cluster distance graph...
|-----> Calculating color distance graph...
|-----------> Calculating color perceptual distance...
|-----------> Constructing color distance graph...
|-----> Optimizing color mapping...
[14]:
{'CCKIN': '#dbdb8d',
 'CMPN': '#1f77b4',
 'CP': '#c7c7c7',
 'IMN': '#7f7f7f',
 'MCG': '#9467bd',
 'MSN': '#d62728',
 'NPYIN': '#bcbd22',
 'NTNG1EX': '#e377c2',
 'OBNBL': '#98df8a',
 'OLIGO': '#ff9896',
 'SCGNIN': '#9edae5',
 'SSTIN': '#ffbb78',
 'TLNBL': '#17becf',
 'VLMC': '#ff7f0e',
 'dpEX': '#2ca02c',
 'mpEX': '#aec7e8',
 'nptxEX': '#c5b0d5',
 'reaEGC': '#c49c94',
 'sfrpEGC': '#f7b6d2',
 'wntEGC': '#8c564b'}
[15]:
# Get visualization for `adata` with optimized color assignment
color_mapping = {k: color_mapping_adata[k] for k in adata.obs[cluster_key].cat.categories}

# Set new colors for adata
palette_spaco = list(color_mapping.values())

# Spaco colorization
sc.pl.spatial(adata, color=cluster_key , spot_size=30, palette=palette_spaco)
sns.palplot(palette_spaco)

# Get visualization for `radata` with optimized color assignment
color_mapping2 = {k: color_mapping_adata[k] for k in radata.obs[cluster_key].cat.categories}

# Set new colors for adata
palette_spaco2 = list(color_mapping2.values())

# Spaco colorization
sc.pl.spatial(radata, color=cluster_key, spot_size=30, palette=palette_spaco2)
sns.palplot(palette_spaco2)
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_25_0.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_25_1.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_25_2.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_25_3.png

Colorization on multiple plots (between clustering results)

[16]:
# get a pre-defined color palette from matplotlib default
sc.pl.spatial(adata, spot_size=30, color='seurat_clusters')
palette_default = adata.uns['seurat_clusters_colors'].copy()
sns.palplot(palette_default)
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_27_0.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_27_1.png
[17]:
# Get optimized color assignment between two clustering results
# Spaco applies an automatic label alignment which yields a new label in .obs and suffix `_spaco2`
cluster_keys = ['seurat_clusters','spatial_leiden_e30_s8']
color_mapping_adata = spaco.colorize_mutiple_runs(
    adata=adata,
    cluster_keys=cluster_keys,
    colorblind_type="none",
    radius=90,
    n_neighbors=16,
    palette=palette_default,
)

cluster_keys = [cluster_key + "_spaco2" for cluster_key in cluster_keys]

adata
|-----> Mapping clusters between runs...
|-----------------> <insert> 'seurat_clusters_spaco2' to obs in AnnData Object.
|-----------> Mapping run 1 to run 0...
|-----------------> <insert> 'spatial_leiden_e30_s8_spaco2' to obs in AnnData Object.
|-----------> Mapped cluster name added to `adata.obs['***_spaco2']`. Result color mapping will base on new cluster name.
|-----> Calculating cluster distance graph for run 0...
|-----------> Calculating cell neighborhood...
|-----------> Filtering out neighborhood outliers...
|-----------> Calculating cluster interlacement score...
|-----------> Constructing cluster interlacement graph...
|-----> Calculating cluster distance graph for run 1...
|-----------> Calculating cell neighborhood...
|-----------> Filtering out neighborhood outliers...
|-----------> Calculating cluster interlacement score...
|-----------> Constructing cluster interlacement graph...
|-----> Merging cluster distance graph...
|-----> Calculating color distance graph...
|-----------> Calculating color perceptual distance...
|-----------> Constructing color distance graph...
|-----> Optimizing color mapping...
[17]:
AnnData object with n_obs × n_vars = 4811 × 27600
    obs: 'CellID', 'spatial_leiden_e30_s8', 'Batch', 'cell_id', 'seurat_clusters', 'inj_uninj', 'D_V', 'inj_M_L', 'Annotation', 'seurat_clusters_spaco2', 'spatial_leiden_e30_s8_spaco2'
    var: 'Gene'
    uns: 'Injury_10DPI_rep1_SS200000147BL_B5', '__type', 'angle_dict', 'Annotation_colors', 'seurat_clusters_colors'
    obsm: 'X_pca', 'X_spatial', 'spatial'
    layers: 'counts'
[18]:
# Get visualization for `seurat_clusters` with optimized color assignment
color_mapping = {k: color_mapping_adata[k] for k in adata.obs[cluster_keys[0]].cat.categories}

# Set new colors for adata
palette_spaco = list(color_mapping.values())

# Spaco colorization
sc.pl.spatial(adata, color=cluster_keys[0] , spot_size=30, palette=palette_spaco)
sns.palplot(palette_spaco)


# Get visualization for `spatial_leiden_e30_s8` with optimized color assignment
color_mapping = {k: color_mapping_adata[k] for k in adata.obs[cluster_keys[1]].cat.categories}

# Set new colors for adata
palette_spaco = list(color_mapping.values())

# Spaco colorization
sc.pl.spatial(adata, color=cluster_keys[1] , spot_size=30, palette=palette_spaco)
sns.palplot(palette_spaco)
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_29_0.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_29_1.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_29_2.png
../../../_images/tutorials_notebooks_8_interactive_tools_2_spaco_demo_29_3.png
[19]:
import session_info

session_info.show(excludes=['base'])
[19]:
Click to view session information
-----
PIL                 9.4.0
anndata             0.8.0
cycler              0.10.0
dateutil            2.8.2
matplotlib          3.7.1
numpy               1.22.4
pandas              1.5.3
scanpy              1.9.3
seaborn             0.12.2
session_info        1.0.0
spaco               NA
squidpy             1.2.3
-----
Click to view modules imported as dependencies
asciitree           NA
asttokens           NA
backcall            0.2.0
cffi                1.15.1
cloudpickle         2.2.1
colormath           3.0.0
comm                0.1.3
cython_runtime      NA
dask                2023.5.0
dask_image          2023.03.0
debugpy             1.6.7
decorator           5.1.1
docrep              0.3.2
entrypoints         0.4
executing           1.2.0
fasteners           0.18
h5py                3.8.0
igraph              0.10.6
imageio             2.25.1
importlib_metadata  NA
importlib_resources NA
ipykernel           6.22.0
jedi                0.18.2
jinja2              3.1.2
joblib              1.2.0
kiwisolver          1.4.4
lack                0.0.4
leidenalg           0.10.1
llvmlite            0.39.1
markupsafe          2.1.2
matplotlib_inline   0.1.6
matplotlib_scalebar 0.8.1
mpl_toolkits        NA
natsort             8.2.0
networkx            2.8.8
numba               0.56.4
numcodecs           0.11.0
packaging           23.0
parso               0.8.3
patsy               0.5.3
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
platformdirs        3.5.0
prompt_toolkit      3.0.38
psutil              5.9.5
ptyprocess          0.7.0
pure_eval           0.2.2
pyciede2000         NA
pycparser           2.21
pydev_ipython       NA
pydevconsole        NA
pydevd              2.9.5
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.15.1
pylab               NA
pynndescent         0.5.8
pyparsing           3.0.9
pytz                2022.7.1
pywt                1.4.1
scipy               1.10.1
setuptools          65.6.3
six                 1.16.0
skimage             0.19.3
sklearn             1.2.1
stack_data          0.6.2
statsmodels         0.14.0rc0
stdlib_list         v0.8.0
texttable           1.6.7
threadpoolctl       3.1.0
tifffile            2023.2.3
tlz                 0.12.0
toolz               0.12.0
tornado             6.3.1
tqdm                4.64.1
traitlets           5.9.0
typing_extensions   NA
umap                0.5.3
validators          0.20.0
wcwidth             0.2.6
xarray              2023.1.0
yaml                6.0.1
zarr                2.16.0
zipp                NA
zmq                 25.0.2
-----
IPython             8.12.1
jupyter_client      8.2.0
jupyter_core        5.3.0
-----
Python 3.8.12 | packaged by conda-forge | (default, Sep 29 2021, 19:50:30) [GCC 9.4.0]
Linux-5.19.0-46-generic-x86_64-with-glibc2.10
-----
Session information updated at 2023-08-15 15:27
[ ]: