spateo.io

Submodules

Package Contents

Functions

alpha_shape(...)

Compute the alpha shape (concave hull) of a set of points.

get_concave_hull(→ Tuple[shapely.geometry.Polygon, List])

Return the concave hull of all nanoballs that have non-zero UMI (or more than min_agg_umi UMI).

read_bgi(→ anndata.AnnData)

Read BGI read file as AnnData.

read_bgi_agg(→ anndata.AnnData)

Read BGI read file to calculate total number of UMIs observed per coordinate.

read_image(→ anndata.AnnData)

Load an image into the AnnData object.

read_nanostring(→ anndata.AnnData)

Read NanoString CosMx data as AnnData.

read_slideseq(→ anndata.AnnData)

Read Slide-seq data as AnnData.

read_10x(→ anndata.AnnData)

Read 10x Visium data as AnnData.

read_10x_as_anndata(→ anndata.AnnData)

Read 10x Visium matrix directory as AnnData.

spateo.io.alpha_shape(x: numpy.ndarray, y: numpy.ndarray, alpha: float = 1, buffer: float = 1, vectorize: bool = True) → Tuple[shapely.geometry.MultiPolygon | shapely.geometry.Polygon, List]

Compute the alpha shape (concave hull) of a set of points. Code adapted from: https://gist.github.com/dwyerk/10561690

Parameters:
x

x-coordinates of the DNA nanoballs or buckets, etc.

y

y-coordinates of the DNA nanoballs or buckets, etc.

alpha

Alpha value controlling how tightly the border follows the points. Smaller values pull the border inward less than larger values. If alpha is too large, the hull disappears entirely.

buffer

The buffer used to smooth and clean up the concave hull polygon identified by Shapely.

vectorize

Whether to vectorize the alpha-shape calculation instead of looping through.

Returns:

  • alpha_hull – The computed concave hull.

  • edge_points – The coordinates of the edge of the resultant concave hull.
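The effect of alpha can be illustrated with a minimal sketch. It assumes the criterion commonly used in alpha-shape code of this kind: a triangle of the triangulation is kept only when its circumradius is below 1 / alpha. The helper names circumradius and keep_triangle are hypothetical and are not part of spateo.

```python
import math


def circumradius(a, b, c):
    """Circumradius of the triangle with 2-D vertices a, b and c."""
    ab = math.dist(a, b)
    bc = math.dist(b, c)
    ca = math.dist(c, a)
    s = (ab + bc + ca) / 2.0
    # Heron's formula gives the squared area from the side lengths.
    area_sq = s * (s - ab) * (s - bc) * (s - ca)
    if area_sq <= 0.0:
        return math.inf  # degenerate (collinear) triangle
    return ab * bc * ca / (4.0 * math.sqrt(area_sq))


def keep_triangle(a, b, c, alpha):
    """Assumed criterion: keep triangles whose circumradius is below 1 / alpha."""
    return circumradius(a, b, c) < 1.0 / alpha


tri = ((0.0, 0.0), (3.0, 0.0), (0.0, 4.0))  # 3-4-5 right triangle
radius = circumradius(*tri)
kept_small_alpha = keep_triangle(*tri, alpha=0.3)  # threshold 1 / 0.3 is about 3.33
kept_large_alpha = keep_triangle(*tri, alpha=1.0)  # threshold 1 / 1.0 is 1.0
```

Under this assumption, raising alpha lowers the threshold, so more triangles are discarded and the border falls further inward. With a very large alpha no triangle passes and nothing is left.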

spateo.io.get_concave_hull(path: str, binsize: int = 20, min_agg_umi: int | None = None, alpha: float = 1.0, buffer: float | None = None) → Tuple[shapely.geometry.Polygon, List]

Return the concave hull of all nanoballs that have non-zero UMI (or more than min_agg_umi UMI).

Parameters:
path

Path to read file.

binsize

Size of the spatial bins used to aggregate RNAs captured by DNBs. The default of 20 is close to the size of a single cell. If the Stereo-seq chip used is larger than 1 x 1 mm, you may need to increase binsize.

min_agg_umi

The minimum aggregated UMI count for a bucket.

alpha

Alpha value controlling how tightly the border follows the points. Smaller values pull the border inward less than larger values. If alpha is too large, the hull disappears entirely.

buffer

The buffer used to smooth and clean up the concave hull polygon identified by Shapely.

Returns:

  • alpha_hull – The computed concave hull.

  • edge_points – The coordinates of the edge of the resultant concave hull.
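How binsize and min_agg_umi interact can be sketched as follows. This is an illustration under stated assumptions, not spateo's implementation: reads are taken to be (x, y, umi) tuples, bins are formed by integer division of the coordinates, and the helper name aggregate_bins is hypothetical.

```python
from collections import defaultdict


def aggregate_bins(records, binsize=20, min_agg_umi=None):
    """Sum UMI counts per (x_bin, y_bin) bucket and drop sparse buckets."""
    totals = defaultdict(int)
    for x, y, umi in records:
        totals[(x // binsize, y // binsize)] += umi
    # Without a threshold, keep every bucket with non-zero UMI.
    threshold = 0 if min_agg_umi is None else min_agg_umi
    return {bucket: n for bucket, n in totals.items() if n > threshold}


# Hypothetical reads: (x, y, umi)
records = [(0, 0, 1), (5, 19, 2), (20, 0, 1), (45, 41, 7)]
all_buckets = aggregate_bins(records, binsize=20)
dense_buckets = aggregate_bins(records, binsize=20, min_agg_umi=2)
```

The coordinates of the surviving buckets are the kind of points that would then be passed to the hull computation.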

spateo.io.read_bgi(path: str, binsize: int | None = None, segmentation_adata: anndata.AnnData | None = None, labels_layer: str | None = None, labels: numpy.ndarray | str | None = None, seg_binsize: int = 1, label_column: str | None = None, add_props: bool = True, version: typing_extensions.Literal[stereo] = 'stereo') → anndata.AnnData

Read BGI read file as AnnData.

Parameters:
path

Path to read file.

binsize

Size of pixel bins. Should only be provided when labels (i.e. the segmentation_adata and labels arguments) are not used.

segmentation_adata

AnnData containing segmentation results.

labels_layer

Layer name in segmentation_adata containing labels.

labels

Numpy array or path to numpy array saved with np.save that contains labels.

seg_binsize

The bin size used in cell segmentation. Used in conjunction with labels, and overwritten when labels_layer and segmentation_adata are not None.

label_column

Column that contains already-segmented cell labels. If this column is present, it takes precedence.

add_props

Whether or not to compute label properties, such as area, bounding box, centroid, etc.

version

BGI technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

Returns:

Bins x genes or labels x genes AnnData.
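The "labels x genes" output can be pictured with a small sketch. It assumes reads are (x, y, gene, umi) tuples, that the label array is indexed as labels[x][y] at the resolution given by seg_binsize, and that label 0 marks background. The helper name counts_by_label is hypothetical, and the real function returns an AnnData rather than nested dictionaries.

```python
from collections import defaultdict


def counts_by_label(records, labels, seg_binsize=1):
    """Aggregate UMIs into a label -> gene -> count table."""
    table = defaultdict(lambda: defaultdict(int))
    for x, y, gene, umi in records:
        # Map the read coordinate onto the (possibly binned) label grid.
        label = labels[x // seg_binsize][y // seg_binsize]
        if label > 0:  # assumption: 0 means "not inside any cell"
            table[label][gene] += umi
    return {label: dict(genes) for label, genes in table.items()}


labels = [[1, 1], [0, 2]]  # hypothetical 2 x 2 segmentation mask
records = [(0, 0, "A", 1), (0, 1, "A", 2), (1, 0, "B", 5), (1, 1, "B", 3)]
per_cell = counts_by_label(records, labels)
```

Here the read at (1, 0) falls on background and is dropped, while the remaining reads are summed per cell label.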

spateo.io.read_bgi_agg(path: str, stain_path: str | None = None, binsize: int = 1, gene_agg: Dict[str, List[str] | Callable[[str], bool]] | None = None, prealigned: bool = False, label_column: str | None = None, version: typing_extensions.Literal[stereo] = 'stereo') → anndata.AnnData

Read BGI read file to calculate total number of UMIs observed per coordinate.

Parameters:
path

Path to read file.

stain_path

Path to nuclei staining image. Must have the same coordinate system as the read file.

binsize

Size of pixel bins.

gene_agg

Dictionary of layer keys to gene names to aggregate. For example, {'mito': ['list', 'of', 'mitochondrial', 'genes']} will yield an AnnData with a layer named "mito" with the aggregate total UMIs of the provided gene list.

prealigned

Whether the stain image is already aligned with the minimum x and y RNA coordinates.

label_column

Column that contains already-segmented cell labels.

version

BGI technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

Returns:

An AnnData object containing the UMIs per coordinate and the nucleus staining image, if provided. The total UMIs are stored as a sparse matrix in .X, and spliced and unspliced counts (if present) are stored in .layers['spliced'] and .layers['unspliced'] respectively. The nuclei image is stored as a Numpy array in .layers['nuclei'].
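The gene_agg type, Dict[str, List[str] | Callable[[str], bool]], allows each layer to be defined either by an explicit gene list or by a predicate on the gene name. A minimal sketch of how such a dictionary might be resolved at a single coordinate is shown here. The helper aggregate_genes and the gene names are hypothetical, and the real function writes the totals into AnnData layers.

```python
def aggregate_genes(counts, gene_agg):
    """counts maps gene name -> UMIs at one coordinate; returns layer totals."""
    layers = {}
    for key, selector in gene_agg.items():
        if callable(selector):
            matches = selector  # predicate: gene name -> bool
        else:
            names = set(selector)  # explicit list of gene names
            matches = names.__contains__
        layers[key] = sum(n for gene, n in counts.items() if matches(gene))
    return layers


counts = {"mt-Nd1": 4, "mt-Co1": 6, "Actb": 10, "Gapdh": 3}
gene_agg = {
    "mito": lambda gene: gene.startswith("mt-"),
    "housekeeping": ["Actb", "Gapdh"],
}
layers = aggregate_genes(counts, gene_agg)
```

A predicate is convenient when the genes share a naming pattern, while a list is the safer choice when the set is curated by hand.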

spateo.io.read_image(adata: anndata.AnnData, filename: str, scale_factor: float, slice: str | None = None, img_layer: str | None = None) → anndata.AnnData

Load an image into the AnnData object.

Parameters:
adata

AnnData object.

filename

The path of the image.

scale_factor

The scale factor of the image, defined as pixels per DNB.

slice

Name of the slice. Will be used when displaying multiple slices.

img_layer

Name of the image layer.

Returns:

  • uns['spatial'][slice]['images'][img_layer] – The stored image.

  • uns['spatial'][slice]['scalefactors'][img_layer] – The scale factor for the spots.
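The nested layout described above, and the meaning of scale_factor as pixels per DNB, can be sketched with plain dictionaries. The helpers store_image and dnb_to_pixel and the names "slice1" and "stain" are hypothetical; a real call would operate on adata.uns and a loaded image array.

```python
def store_image(uns, image, scale_factor, slice_name, img_layer):
    """Place an image and its scale factor under uns['spatial'][slice_name]."""
    spatial = uns.setdefault("spatial", {}).setdefault(slice_name, {})
    spatial.setdefault("images", {})[img_layer] = image
    spatial.setdefault("scalefactors", {})[img_layer] = scale_factor
    return uns


def dnb_to_pixel(coord, scale_factor):
    """Convert a DNB coordinate to image pixels (scale_factor = pixels / DNB)."""
    return coord * scale_factor


image = [[0, 1], [1, 0]]  # stand-in for a real image array
uns = store_image({}, image, 0.5, "slice1", "stain")
pixel_x = dnb_to_pixel(200, uns["spatial"]["slice1"]["scalefactors"]["stain"])
```

Keying both entries by slice and img_layer lets several slices and several images per slice coexist in the same object.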

spateo.io.read_nanostring(path: str, meta_path: str | None = None, binsize: int | None = None, label_columns: str | List[str] | None = None, add_props: bool = True, version: typing_extensions.Literal[cosmx] = 'cosmx') → anndata.AnnData

Read NanoString CosMx data as AnnData.

Parameters:
path

Path to transcript detection CSV file.

meta_path

Path to cell metadata CSV file.

scale

Physical length per coordinate. For visualization only.

scale_unit

Scale unit.

binsize

Size of pixel bins.

label_columns

Columns that contain already-segmented cell labels. Each unique combination is considered a unique cell.

add_props

Whether or not to compute label properties, such as area, bounding box, centroid, etc.

version

NanoString technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

Returns:

Bins x genes or labels x genes AnnData.
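The statement that each unique combination of label_columns is considered a unique cell can be illustrated with a short sketch. It assumes rows are dictionaries keyed by column name. The column names "fov" and "cell_ID" and the helper assign_cell_labels are used for illustration only.

```python
def assign_cell_labels(rows, label_columns):
    """Give each distinct combination of label column values its own integer ID."""
    if isinstance(label_columns, str):
        label_columns = [label_columns]
    ids = {}
    labels = []
    for row in rows:
        key = tuple(row[column] for column in label_columns)
        # New combinations get the next unused ID; known ones reuse theirs.
        labels.append(ids.setdefault(key, len(ids) + 1))
    return labels, ids


rows = [
    {"fov": 1, "cell_ID": 1},
    {"fov": 1, "cell_ID": 2},
    {"fov": 2, "cell_ID": 1},
    {"fov": 1, "cell_ID": 1},
]
labels, ids = assign_cell_labels(rows, ["fov", "cell_ID"])
```

Combining columns matters when a per-field cell identifier restarts in every field of view, so that the identifier alone would merge unrelated cells.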

spateo.io.read_slideseq(path: str, beads_path: str, binsize: int | None = None, version: typing_extensions.Literal[slide2] = 'slide2') → anndata.AnnData

Read Slide-seq data as AnnData.

Parameters:
path

Path to Slide-seq digital expression matrix CSV.

beads_path

Path to CSV file containing bead locations.

binsize

Size of pixel bins.

version

Slide-seq technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

spateo.io.read_10x(matrix_dir: str, positions_path: str, version: typing_extensions.Literal[visium] = 'visium') → anndata.AnnData

Read 10x Visium data as AnnData.

Parameters:
matrix_dir

Directory containing matrix files (barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz).

positions_path

Path to CSV containing spatial coordinates.

version

10x technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

spateo.io.read_10x_as_anndata(matrix_dir: str) → anndata.AnnData

Read 10x Visium matrix directory as AnnData.

Parameters:
matrix_dir

Path to directory containing matrix files.

Returns:

AnnData of barcodes x genes.