spateo.io.bgi
IO functions for BGI stereo technology.
Attributes
Functions
- read_bgi_as_dataframe(): Read a BGI read file as a pandas DataFrame.
- dataframe_to_labels(): Convert a BGI dataframe that contains cell labels to a labels matrix.
- dataframe_to_filled_labels(): Convert a BGI dataframe that contains cell labels to a (filled) labels matrix.
- read_bgi_agg(): Read BGI read file to calculate total number of UMIs observed per coordinate.
- read_bgi(): Read BGI read file as AnnData.
Module Contents
- spateo.io.bgi.read_bgi_as_dataframe(path: str, label_column: str | None = None) -> pandas.DataFrame
Read a BGI read file as a pandas DataFrame.
- Parameters:
- path
Path to read file.
- label_column
Column name containing positive cell labels.
- Returns:
Pandas DataFrame with the following standardized column names:
- gene: Gene name/ID (whatever was used in the original file).
- x, y: X and Y coordinates.
- total, spliced, unspliced: Counts for each RNA species. The latter two are only present if they are in the original file.
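The column standardization described above can be sketched as follows. This is a minimal illustration, not Spateo's implementation: the raw column names (`geneID`, `MIDCounts`, `EXONIC`, `INTRONIC`), the `COLUMN_MAP` dictionary, and the `standardize` helper are assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical tab-separated read file content; raw column names are assumed.
RAW = (
    "geneID\tx\ty\tMIDCounts\n"
    "Gapdh\t10\t20\t3\n"
    "Actb\t10\t21\t1\n"
    "Gapdh\t11\t20\t2\n"
)

# Assumed mapping from raw column names to the standardized names.
COLUMN_MAP = {
    "geneID": "gene",
    "MIDCounts": "total",
    "EXONIC": "spliced",
    "INTRONIC": "unspliced",
}


def standardize(raw_text: str) -> pd.DataFrame:
    """Parse tab-separated text and rename columns to standardized names."""
    df = pd.read_csv(io.StringIO(raw_text), sep="\t", comment="#")
    # Keys missing from the file (e.g. EXONIC) are ignored by rename().
    return df.rename(columns=COLUMN_MAP)


df = standardize(RAW)
print(df.columns.tolist())
```

Because the sample text has no spliced or unspliced columns, the resulting frame only carries gene, x, y and total, mirroring the note that the latter two columns are optional.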
- spateo.io.bgi.dataframe_to_labels(df: pandas.DataFrame, column: str, shape: Tuple[int, int] | None = None) -> numpy.ndarray
Convert a BGI dataframe that contains cell labels to a labels matrix.
- Parameters:
- df
Read dataframe, as returned by read_bgi_as_dataframe().
- column
Column that contains cell labels as positive integers. Any labels that are non-positive are ignored.
- Returns:
Labels matrix
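To make the conversion concrete, here is a minimal sketch of building a labels matrix from a dataframe with a label column. It is not the library's code: the `labels_matrix` helper, the choice of x as the row index and y as the column index, and the default shape derived from the maximum coordinates are all assumptions.

```python
import numpy as np
import pandas as pd


def labels_matrix(df: pd.DataFrame, column: str, shape=None) -> np.ndarray:
    """Place each positive label at its (x, y) position; 0 means unlabeled."""
    positive = df[df[column] > 0]  # non-positive labels are ignored
    if shape is None:
        # Assumed default: just large enough to hold every coordinate.
        shape = (int(df["x"].max()) + 1, int(df["y"].max()) + 1)
    labels = np.zeros(shape, dtype=int)
    labels[positive["x"].to_numpy(), positive["y"].to_numpy()] = positive[column].to_numpy()
    return labels


df = pd.DataFrame(
    {
        "gene": ["Gapdh", "Actb", "Gapdh", "Actb"],
        "x": [0, 0, 1, 2],
        "y": [0, 1, 1, 2],
        "total": [3, 1, 2, 5],
        "cell": [1, 1, 2, 0],  # the last row has no cell assigned
    }
)
matrix = labels_matrix(df, "cell")
print(matrix)
```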
- spateo.io.bgi.dataframe_to_filled_labels(df: pandas.DataFrame, column: str, shape: Tuple[int, int] | None = None) -> numpy.ndarray
Convert a BGI dataframe that contains cell labels to a (filled) labels matrix.
- Parameters:
- df
Read dataframe, as returned by read_bgi_as_dataframe().
- column
Column that contains cell labels as positive integers. Any labels that are non-positive are ignored.
- Returns:
Labels matrix
- spateo.io.bgi.read_bgi_agg(path: str, stain_path: str | None = None, binsize: int = 1, gene_agg: Dict[str, List[str] | Callable[[str], bool]] | None = None, prealigned: bool = False, label_column: str | None = None, version: typing_extensions.Literal['stereo'] = 'stereo') -> anndata.AnnData
Read BGI read file to calculate total number of UMIs observed per coordinate.
- Parameters:
- path
Path to read file.
- stain_path
Path to nuclei staining image. Must have the same coordinate system as the read file.
- binsize
Size of pixel bins.
- gene_agg
Dictionary of layer keys to gene names to aggregate. For example, {'mito': ['list', 'of', 'mitochondrial', 'genes']} will yield an AnnData with a layer named "mito" with the aggregate total UMIs of the provided gene list.
- prealigned
Whether the stain image is already aligned with the minimum x and y RNA coordinates.
- label_column
Column that contains already-segmented cell labels.
- version
BGI technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.
- Returns:
An AnnData object containing the UMIs per coordinate and the nucleus staining image, if provided. The total UMIs are stored as a sparse matrix in .X, and spliced and unspliced counts (if present) are stored in .layers['spliced'] and .layers['unspliced'] respectively. The nuclei image is stored as a NumPy array in .layers['nuclei'].
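The interplay of binsize and gene_agg can be illustrated with a small sketch. It uses dense NumPy arrays and a plain dictionary instead of a sparse matrix and AnnData, and the `aggregate` helper is hypothetical; using the coordinates as-is (without shifting by the minimum x and y) is also an assumption.

```python
import numpy as np
import pandas as pd


def aggregate(df: pd.DataFrame, binsize: int = 1, gene_agg=None) -> dict:
    """Sum total UMIs per binned coordinate, plus one layer per gene_agg key."""
    x = df["x"].to_numpy() // binsize
    y = df["y"].to_numpy() // binsize
    total = df["total"].to_numpy()
    shape = (int(x.max()) + 1, int(y.max()) + 1)

    out = {"X": np.zeros(shape, dtype=int)}
    np.add.at(out["X"], (x, y), total)  # accumulates repeated coordinates

    for key, spec in (gene_agg or {}).items():
        if callable(spec):
            mask = df["gene"].map(spec).to_numpy(dtype=bool)  # predicate on gene name
        else:
            mask = df["gene"].isin(spec).to_numpy()  # explicit gene list
        layer = np.zeros(shape, dtype=int)
        np.add.at(layer, (x[mask], y[mask]), total[mask])
        out[key] = layer
    return out


df = pd.DataFrame(
    {
        "gene": ["mt-Co1", "Gapdh", "mt-Nd1", "Actb"],
        "x": [0, 1, 2, 3],
        "y": [0, 1, 0, 3],
        "total": [2, 3, 1, 4],
    }
)
result = aggregate(
    df,
    binsize=2,
    gene_agg={"mito": lambda g: g.startswith("mt-"), "co1": ["mt-Co1"]},
)
print(result["X"])
print(result["mito"])
```

With binsize=2, the first two rows fall into the same bin, so their totals are summed there, while the "mito" layer only counts the genes whose names start with "mt-".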
- spateo.io.bgi.read_bgi(path: str, binsize: int | None = None, segmentation_adata: anndata.AnnData | None = None, labels_layer: str | None = None, labels: numpy.ndarray | str | None = None, seg_binsize: int = 1, label_column: str | None = None, add_props: bool = True, version: typing_extensions.Literal['stereo'] = 'stereo') -> anndata.AnnData
Read BGI read file as AnnData.
- Parameters:
- path
Path to read file.
- binsize
Size of pixel bins. Should only be provided when labels (i.e. the segmentation_adata and labels arguments) are not used.
- segmentation_adata
AnnData containing segmentation results.
- labels_layer
Layer name in segmentation_adata containing labels.
- labels
NumPy array, or path to a NumPy array saved with np.save, that contains labels.
- seg_binsize
Bin size used in cell segmentation. Used in conjunction with labels, and overwritten when labels_layer and segmentation_adata are not None.
- label_column
Column that contains already-segmented cell labels. If this column is present, it takes precedence.
- add_props
Whether or not to compute label properties, such as area, bounding box, centroid, etc.
- version
BGI technology version. Currently only used to set the scale and scale units of each unit coordinate. This may change in the future.
- Returns:
Bins x genes or labels x genes AnnData.
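As a rough illustration of the "labels x genes" result, the sketch below assigns each read to a label via a labels matrix computed at a coarser seg_binsize and then sums counts per label and gene. The `counts_by_label` helper, the indexing convention (x as row, y as column), and the use of a pandas pivot table in place of AnnData are assumptions made for the example, not the library's behavior.

```python
import numpy as np
import pandas as pd


def counts_by_label(df: pd.DataFrame, labels: np.ndarray, seg_binsize: int = 1) -> pd.DataFrame:
    """Return a label-by-gene table of summed total counts."""
    # Map each read coordinate to the bin used during segmentation.
    row = df["x"].to_numpy() // seg_binsize
    col = df["y"].to_numpy() // seg_binsize
    labeled = df.assign(label=labels[row, col])
    labeled = labeled[labeled["label"] > 0]  # drop reads outside any cell
    return labeled.pivot_table(
        index="label", columns="gene", values="total", aggfunc="sum", fill_value=0
    )


# Labels matrix from a segmentation done with 2x2 bins (hypothetical data).
labels = np.array([[1, 0], [0, 2]])
df = pd.DataFrame(
    {
        "gene": ["A", "B", "A", "B", "A"],
        "x": [0, 1, 2, 3, 0],
        "y": [0, 1, 3, 2, 3],
        "total": [2, 1, 3, 4, 5],
    }
)
table = counts_by_label(df, labels, seg_binsize=2)
print(table)
```

The last read lands in a bin labeled 0 and is therefore excluded, so the table has one row per cell label and one column per gene.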