spateo.io.bgi
=============

.. py:module:: spateo.io.bgi

.. autoapi-nested-parse::

   IO functions for BGI stereo technology.


Attributes
----------

.. autoapisummary::

   spateo.io.bgi.VERSIONS
   spateo.io.bgi.COUNT_COLUMN_MAPPING


Functions
---------

.. autoapisummary::

   spateo.io.bgi.read_bgi_as_dataframe
   spateo.io.bgi.dataframe_to_labels
   spateo.io.bgi.dataframe_to_filled_labels
   spateo.io.bgi.read_bgi_agg
   spateo.io.bgi.read_bgi


Module Contents
---------------

.. py:data:: VERSIONS

.. py:data:: COUNT_COLUMN_MAPPING

.. py:function:: read_bgi_as_dataframe(path: str, label_column: Optional[str] = None) -> pandas.DataFrame

   Read a BGI read file as a pandas DataFrame.

   :param path: Path to read file.
   :param label_column: Column name containing positive cell labels.

   :returns: Pandas DataFrame with the following standardized column names.

             * `gene`: Gene name/ID (whatever was used in the original file)
             * `x`, `y`: X and Y coordinates
             * `total`, `spliced`, `unspliced`: Counts for each RNA species.
               The latter two are present only if they are in the original file.


.. py:function:: dataframe_to_labels(df: pandas.DataFrame, column: str, shape: Optional[Tuple[int, int]] = None) -> numpy.ndarray

   Convert a BGI dataframe that contains cell labels to a labels matrix.

   :param df: Read dataframe, as returned by :func:`read_bgi_as_dataframe`.
   :param column: Column that contains cell labels as positive integers. Any labels
                  that are non-positive are ignored.

   :returns: Labels matrix


.. py:function:: dataframe_to_filled_labels(df: pandas.DataFrame, column: str, shape: Optional[Tuple[int, int]] = None) -> numpy.ndarray

   Convert a BGI dataframe that contains cell labels to a (filled) labels matrix.

   :param df: Read dataframe, as returned by :func:`read_bgi_as_dataframe`.
   :param column: Column that contains cell labels as positive integers. Any labels
                  that are non-positive are ignored.

   :returns: Labels matrix


.. py:function:: read_bgi_agg(path: str, stain_path: Optional[str] = None, binsize: int = 1, gene_agg: Optional[Dict[str, Union[List[str], Callable[[str], bool]]]] = None, prealigned: bool = False, label_column: Optional[str] = None, version: typing_extensions.Literal['stereo'] = 'stereo') -> anndata.AnnData

   Read BGI read file to calculate total number of UMIs observed per coordinate.

   :param path: Path to read file.
   :param stain_path: Path to nuclei staining image. Must have the same coordinate
                      system as the read file.
   :param binsize: Size of pixel bins.
   :param gene_agg: Dictionary of layer keys to gene names to aggregate. For
                    example, `{'mito': ['list', 'of', 'mitochondrial', 'genes']}` will yield
                    an AnnData with a layer named "mito" with the aggregate total UMIs of the
                    provided gene list.
   :param prealigned: Whether the stain image is already aligned with the minimum
                      x and y RNA coordinates.
   :param label_column: Column that contains already-segmented cell labels.
   :param version: BGI technology version. Currently only used to set the scale and
                   scale units of each unit coordinate. This may change in the future.

   :returns: An AnnData object containing the UMIs per coordinate and the nucleus
             staining image, if provided. The total UMIs are stored as a sparse matrix in
             `.X`, and spliced and unspliced counts (if present) are stored in
             `.layers['spliced']` and `.layers['unspliced']` respectively.
             The nuclei image is stored as a NumPy array in `.layers['nuclei']`.


.. py:function:: read_bgi(path: str, binsize: Optional[int] = None, segmentation_adata: Optional[anndata.AnnData] = None, labels_layer: Optional[str] = None, labels: Optional[Union[numpy.ndarray, str]] = None, seg_binsize: int = 1, label_column: Optional[str] = None, add_props: bool = True, version: typing_extensions.Literal['stereo'] = 'stereo') -> anndata.AnnData

   Read BGI read file as AnnData.

   :param path: Path to read file.
   :param binsize: Size of pixel bins. Should only be provided when labels
                   (i.e. the `segmentation_adata` and `labels` arguments) are not used.
   :param segmentation_adata: AnnData containing segmentation results.
   :param labels_layer: Layer name in `segmentation_adata` containing labels.
   :param labels: NumPy array, or path to a NumPy array saved with `np.save`, that
                  contains labels.
   :param seg_binsize: Bin size used in cell segmentation. Used in conjunction with
                       `labels`; overwritten when `labels_layer` and `segmentation_adata`
                       are not None.
   :param label_column: Column that contains already-segmented cell labels. If this
                        column is present, it takes precedence.
   :param add_props: Whether or not to compute label properties, such as area,
                     bounding box, centroid, etc.
   :param version: BGI technology version. Currently only used to set the scale and
                   scale units of each unit coordinate. This may change in the future.

   :returns: Bins x genes or labels x genes AnnData.
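
The difference between the "bins x genes" and "labels x genes" results of ``read_bgi`` can be illustrated with a small, dependency-free sketch. It does not call spateo and is not the library's implementation: the helper names ``aggregate_by_bin`` and ``aggregate_by_label``, the floor-division binning rule, and the ``labels[x][y]`` indexing order are all assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# One read record: (gene, x, y, total), mirroring the standardized
# `gene`, `x`, `y`, `total` columns. Hypothetical toy data.
Record = Tuple[str, int, int, int]


def aggregate_by_bin(
    records: Sequence[Record], binsize: int = 1
) -> Dict[Tuple[int, int], Dict[str, int]]:
    """Sum counts per (bin, gene).

    Assumption: a coordinate falls in bin (x // binsize, y // binsize).
    """
    counts: Dict[Tuple[int, int], Dict[str, int]] = defaultdict(
        lambda: defaultdict(int)
    )
    for gene, x, y, total in records:
        counts[(x // binsize, y // binsize)][gene] += total
    return {key: dict(genes) for key, genes in counts.items()}


def aggregate_by_label(
    records: Sequence[Record], labels: List[List[int]]
) -> Dict[int, Dict[str, int]]:
    """Sum counts per (label, gene).

    Assumption: `labels[x][y]` holds the cell label at that coordinate.
    Non-positive labels are treated as background and skipped.
    """
    counts: Dict[int, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for gene, x, y, total in records:
        label = labels[x][y]
        if label > 0:
            counts[label][gene] += total
    return {key: dict(genes) for key, genes in counts.items()}


records: List[Record] = [
    ("A", 0, 0, 1),
    ("A", 1, 1, 2),
    ("B", 2, 3, 4),
    ("A", 3, 3, 1),
]

# 4 x 4 labels matrix: two cells (1 and 2), everything else background (0).
labels = [[0] * 4 for _ in range(4)]
labels[0][0] = 1
labels[1][1] = 1
labels[2][3] = 2

binned = aggregate_by_bin(records, binsize=2)
labeled = aggregate_by_label(records, labels)
```

In this toy input, binning with ``binsize=2`` keeps every read, while label-based aggregation drops the read at ``(3, 3)`` because it lies on background. This mirrors why ``binsize`` and the label arguments are documented as alternatives rather than being combined.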