spateo.io

Functions

`alpha_shape`(...)	Compute the alpha shape (concave hull) of a set of points.
`get_concave_hull`(→ Tuple[shapely.geometry.Polygon, List])	Return the concave hull of all nanoballs that have non-zero UMI (or at least > min_agg_umi UMI).
`read_bgi`(→ anndata.AnnData)	Read BGI read file as AnnData.
`read_bgi_agg`(→ anndata.AnnData)	Read BGI read file to calculate total number of UMIs observed per coordinate.
`read_image`(→ anndata.AnnData)	Load an image into the AnnData object.
`read_nanostring`(→ anndata.AnnData)	Read NanoString CosMx data as AnnData.
`read_slideseq`(→ anndata.AnnData)	Read Slide-seq data as AnnData.
`read_10x`(→ anndata.AnnData)	Read 10x Visium data as AnnData.
`read_10x_as_anndata`(→ anndata.AnnData)	Read 10x Visium matrix directory as AnnData.

Package Contents

spateo.io.alpha_shape(x: numpy.ndarray, y: numpy.ndarray, alpha: float = 1, buffer: float = 1, vectorize: bool = True) → Tuple[shapely.geometry.MultiPolygon | shapely.geometry.Polygon, List]

Compute the alpha shape (concave hull) of a set of points. Code adapted from: https://gist.github.com/dwyerk/10561690

Parameters:

x: x-coordinates of the DNA nanoballs or buckets, etc.
y: y-coordinates of the DNA nanoballs or buckets, etc.
alpha: Alpha value controlling how tightly the border hugs the points. Smaller values pull the border inward less than larger values. If alpha is too large, the hull can vanish entirely.
buffer: The buffer used to smooth and clean up the concave hull polygon identified by shapely.
vectorize: Whether to vectorize the alpha-shape calculation instead of looping through.

Returns:

alpha_hull: The computed concave hull.
edge_points: The coordinates of the edge of the resultant concave hull.
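The effect of alpha can be illustrated with the triangle filter used in the gist referenced above. This is a minimal sketch, assuming the usual rule that a Delaunay triangle is kept only when its circumradius is below 1 / alpha. The helper names `circumradius` and `keep_triangle` are hypothetical and are not part of spateo.

```python
import math


def circumradius(p1, p2, p3):
    """Radius of the circle passing through three 2-D points."""
    a = math.dist(p1, p2)
    b = math.dist(p2, p3)
    c = math.dist(p3, p1)
    s = (a + b + c) / 2.0
    # Heron's formula; clamp at zero to guard against rounding error.
    area = math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))
    if area == 0.0:
        return math.inf  # degenerate (collinear) triangle
    return a * b * c / (4.0 * area)


def keep_triangle(p1, p2, p3, alpha):
    # Assumed rule: keep the triangle when its circumradius is below 1 / alpha.
    return circumradius(p1, p2, p3) < 1.0 / alpha


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
kept_small_alpha = keep_triangle(*triangle, alpha=1.0)  # threshold 1.0
kept_large_alpha = keep_triangle(*triangle, alpha=2.0)  # threshold 0.5
```

Under this rule a larger alpha lowers the threshold, so more triangles are discarded, which is why a very large alpha can leave nothing.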

spateo.io.get_concave_hull(path: str, binsize: int = 20, min_agg_umi: int | None = None, alpha: float = 1.0, buffer: float | None = None) → Tuple[shapely.geometry.Polygon, List]

Return the concave hull of all nanoballs that have non-zero UMI (or at least > min_agg_umi UMI).

Parameters:

path: Path to read file.
binsize: The size of the spatial bins used to aggregate RNAs captured by DNBs. The default is 20, which is close to the size of a single cell. If the Stereo-seq chip used is larger than 1 x 1 mm, you may need to increase binsize.
min_agg_umi: The minimum aggregated UMI count for a bucket.
alpha: Alpha value controlling how tightly the border hugs the points. Smaller values pull the border inward less than larger values. If alpha is too large, the hull can vanish entirely.
buffer: The buffer used to smooth and clean up the concave hull polygon identified by shapely.

Returns:

alpha_hull: The computed concave hull.
edge_points: The coordinates of the edge of the resultant concave hull.
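The interaction between binsize and min_agg_umi can be sketched in plain Python. This is an illustration of the behavior described above, not spateo's implementation: the helper `aggregate_bins` and the `(x, y, umi)` record layout are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_bins(records, binsize=20, min_agg_umi=None):
    """Sum UMIs per spatial bin and keep bins above the threshold.

    records: iterable of (x, y, umi) tuples with integer coordinates.
    Returns a dict mapping (bin_x, bin_y) to the total UMI count.
    """
    totals = defaultdict(int)
    for x, y, umi in records:
        totals[(x // binsize, y // binsize)] += umi
    # With no threshold given, keep bins with a non-zero total.
    threshold = 0 if min_agg_umi is None else min_agg_umi
    return {b: t for b, t in totals.items() if t > threshold}


records = [(0, 0, 3), (5, 19, 2), (20, 0, 1), (45, 45, 0)]
bins_default = aggregate_bins(records)
bins_filtered = aggregate_bins(records, min_agg_umi=2)
```

Only the coordinates of the surviving bins would then be passed on to the hull computation.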

spateo.io.read_bgi(path: str, binsize: int | None = None, segmentation_adata: anndata.AnnData | None = None, labels_layer: str | None = None, labels: numpy.ndarray | str | None = None, seg_binsize: int = 1, label_column: str | None = None, add_props: bool = True, version: typing_extensions.Literal['stereo'] = 'stereo') → anndata.AnnData

Read BGI read file as AnnData.

Parameters:

path: Path to read file.
binsize: Size of pixel bins. Should only be provided when labels (i.e. the segmentation_adata and labels arguments) are not used.
segmentation_adata: AnnData containing segmentation results.
labels_layer: Layer name in segmentation_adata containing labels.
labels: Numpy array or path to numpy array saved with np.save that contains labels.
seg_binsize: The bin size used in cell segmentation. Used in conjunction with labels; it is overwritten when labels_layer and segmentation_adata are both provided.
label_column: Column that contains already-segmented cell labels. If this column is present, it takes precedence.
add_props: Whether or not to compute label properties, such as area, bounding box, centroid, etc.
version: BGI technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

Returns:

Bins x genes or labels x genes AnnData.
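Because binsize should only be given when labels are not used, it can help to validate the arguments before calling read_bgi. The sketch below uses a hypothetical helper, `read_bgi_kwargs`, and placeholder file names. It only assembles keyword arguments and does not call spateo, and spateo's own validation may differ.

```python
def read_bgi_kwargs(path, binsize=None, segmentation_adata=None,
                    labels_layer=None, labels=None):
    """Hypothetical helper: validate and collect read_bgi keyword arguments."""
    uses_labels = segmentation_adata is not None or labels is not None
    if binsize is not None and uses_labels:
        raise ValueError("Pass either binsize or labels, not both.")
    candidates = {
        "path": path,
        "binsize": binsize,
        "segmentation_adata": segmentation_adata,
        "labels_layer": labels_layer,
        "labels": labels,
    }
    # Drop unset options so the function's own defaults apply.
    return {k: v for k, v in candidates.items() if v is not None}


binned = read_bgi_kwargs("reads.tsv.gz", binsize=50)
# adata = spateo.io.read_bgi(**binned)  # needs spateo and a real read file

try:
    read_bgi_kwargs("reads.tsv.gz", binsize=50, labels="labels.npy")
    conflict_rejected = False
except ValueError:
    conflict_rejected = True
```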

spateo.io.read_bgi_agg(path: str, stain_path: str | None = None, binsize: int = 1, gene_agg: Dict[str, List[str] | Callable[[str], bool]] | None = None, prealigned: bool = False, label_column: str | None = None, version: typing_extensions.Literal['stereo'] = 'stereo') → anndata.AnnData

Read BGI read file to calculate total number of UMIs observed per coordinate.

Parameters:

path: Path to read file.
stain_path: Path to nuclei staining image. Must have the same coordinate system as the read file.
binsize: Size of pixel bins.
gene_agg: Dictionary of layer keys to gene names to aggregate. For example, {‘mito’: [‘list’, ‘of’, ‘mitochondrial’, ‘genes’]} will yield an AnnData with a layer named “mito” with the aggregate total UMIs of the provided gene list.
prealigned: Whether the stain image is already aligned with the minimum x and y RNA coordinates.
label_column: Column that contains already-segmented cell labels.
version: BGI technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

Returns:

An AnnData object containing the UMIs per coordinate and the nucleus staining image, if provided. The total UMIs are stored as a sparse matrix in .X, and spliced and unspliced counts (if present) are stored in .layers[‘spliced’] and .layers[‘unspliced’] respectively. The nuclei image is stored as a Numpy array in .layers[‘nuclei’].
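The gene_agg type allows each layer key to map either to a list of gene names or to a callable. The sketch below shows one plausible reading, assuming the callable is a predicate applied to each gene name. The helper `aggregate_genes`, the gene names, and the counts are made up for illustration, and the real function aggregates per coordinate rather than over a single dictionary.

```python
def aggregate_genes(umi_by_gene, gene_agg):
    """Sum UMI counts per layer key, selecting genes by list or predicate."""
    layers = {}
    for layer, selector in gene_agg.items():
        if callable(selector):
            # Predicate form: keep genes for which selector(name) is True.
            genes = [g for g in umi_by_gene if selector(g)]
        else:
            # List form: keep the listed genes that are actually present.
            genes = [g for g in selector if g in umi_by_gene]
        layers[layer] = sum(umi_by_gene[g] for g in genes)
    return layers


umi_by_gene = {"mt-Co1": 4, "mt-Nd1": 2, "Actb": 7, "Gapdh": 5}
gene_agg = {
    "mito": lambda gene: gene.startswith("mt-"),
    "housekeeping": ["Actb", "Gapdh", "NotPresent"],
}
layers = aggregate_genes(umi_by_gene, gene_agg)
```

The predicate form is convenient when a gene family shares a naming prefix and listing every member would be tedious.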

spateo.io.read_image(adata: anndata.AnnData, filename: str, scale_factor: float, slice: str | None = None, img_layer: str | None = None) → anndata.AnnData

Load an image into the AnnData object.

Parameters:

adata: AnnData object.
filename: The path of the image.
scale_factor: The scale factor of the image, defined as pixels per DNB.
slice: Name of the slice. Will be used when displaying multiple slices.
img_layer: Name of the image layer.

Returns:

uns[‘spatial’][slice][‘images’][img_layer] – The stored image
uns[‘spatial’][slice][‘scalefactors’][img_layer] – The scale factor for the spots
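The storage layout and the meaning of scale_factor can be sketched with plain dictionaries. This is an illustrative sketch only: `store_image` and `dnb_to_pixel` are hypothetical helpers, a nested list stands in for the image array, and the slice and layer names are placeholders.

```python
def dnb_to_pixel(x, y, scale_factor):
    """Convert DNB coordinates to image pixels (scale_factor = pixels per DNB)."""
    return x * scale_factor, y * scale_factor


def store_image(uns, image, scale_factor, slice_name, img_layer):
    """Mimic the nested uns['spatial'][slice] layout described above."""
    entry = uns.setdefault("spatial", {}).setdefault(slice_name, {})
    entry.setdefault("images", {})[img_layer] = image
    entry.setdefault("scalefactors", {})[img_layer] = scale_factor
    return uns


uns = store_image({}, image=[[0, 1], [1, 0]], scale_factor=0.5,
                  slice_name="slice_1", img_layer="stain")
stored_scale = uns["spatial"]["slice_1"]["scalefactors"]["stain"]
pixel_xy = dnb_to_pixel(100, 40, stored_scale)
```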

spateo.io.read_nanostring(path: str, meta_path: str | None = None, binsize: int | None = None, label_columns: str | List[str] | None = None, add_props: bool = True, version: typing_extensions.Literal['cosmx'] = 'cosmx') → anndata.AnnData

Read NanoString CosMx data as AnnData.

Parameters:

path: Path to transcript detection CSV file.
meta_path: Path to cell metadata CSV file.
scale: Physical length per coordinate. For visualization only.
scale_unit: Scale unit.
binsize: Size of pixel bins.
label_columns: Columns that contain already-segmented cell labels. Each unique combination is considered a unique cell.
add_props: Whether or not to compute label properties, such as area, bounding box, centroid, etc.
version: NanoString technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

Returns:

Bins x genes or labels x genes AnnData.
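The rule that each unique combination of label_columns defines one cell can be sketched as follows. The helper `assign_cell_ids` is hypothetical, and the column names `fov` and `cell_ID` are assumed for illustration; the actual columns depend on the transcript file.

```python
def assign_cell_ids(rows, label_columns):
    """Give each unique combination of label column values its own cell id."""
    if isinstance(label_columns, str):
        label_columns = [label_columns]
    ids = {}
    labels = []
    for row in rows:
        key = tuple(row[col] for col in label_columns)
        if key not in ids:
            ids[key] = len(ids) + 1  # ids start at 1 in this sketch
        labels.append(ids[key])
    return labels, ids


rows = [
    {"fov": 1, "cell_ID": 1},
    {"fov": 1, "cell_ID": 2},
    {"fov": 2, "cell_ID": 1},
    {"fov": 1, "cell_ID": 1},
]
labels_combined, mapping = assign_cell_ids(rows, ["fov", "cell_ID"])
labels_single, _ = assign_cell_ids(rows, "cell_ID")
```

Using both columns keeps cell 1 in field of view 1 distinct from cell 1 in field of view 2, whereas a single column would merge them.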

spateo.io.read_slideseq(path: str, beads_path: str, binsize: int | None = None, version: typing_extensions.Literal['slide2'] = 'slide2') → anndata.AnnData

Read Slide-seq data as AnnData.

Parameters:

path: Path to Slide-seq digital expression matrix CSV.
beads_path: Path to CSV file containing bead locations.
binsize: Size of pixel bins.
version: Slide-seq technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.

spateo.io.read_10x(matrix_dir: str, positions_path: str, version: typing_extensions.Literal['visium'] = 'visium') → anndata.AnnData

Read 10x Visium data as AnnData.

Parameters:

matrix_dir: Directory containing matrix files (barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz).
positions_path: Path to CSV containing spatial coordinates.
version: 10x technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.
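Since matrix_dir must contain the three files named above, a quick pre-flight check can catch an incomplete directory before loading. The helper `missing_matrix_files` below is a hypothetical sketch and not part of spateo. It is exercised here against a temporary directory.

```python
import tempfile
from pathlib import Path

# File names taken from the matrix_dir description above.
EXPECTED_FILES = ("barcodes.tsv.gz", "features.tsv.gz", "matrix.mtx.gz")


def missing_matrix_files(matrix_dir):
    """Return the expected matrix files that are absent from matrix_dir."""
    root = Path(matrix_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    # Create two of the three files to simulate an incomplete directory.
    (Path(tmp) / "barcodes.tsv.gz").touch()
    (Path(tmp) / "features.tsv.gz").touch()
    missing = missing_matrix_files(tmp)
```

An empty result means all three files are present and the directory can be passed on as matrix_dir.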

spateo.io.read_10x_as_anndata(matrix_dir: str) → anndata.AnnData

Read 10x Visium matrix directory as AnnData.

Parameters:

matrix_dir: Path to directory containing matrix files.

Returns:

AnnData of barcodes x genes.