spateo.io.bgi
=============

.. py:module:: spateo.io.bgi

.. autoapi-nested-parse::

   IO functions for BGI stereo technology.


Attributes
----------

.. autoapisummary::

   spateo.io.bgi.VERSIONS
   spateo.io.bgi.COUNT_COLUMN_MAPPING


Functions
---------

.. autoapisummary::

   spateo.io.bgi.read_bgi_as_dataframe
   spateo.io.bgi.dataframe_to_labels
   spateo.io.bgi.dataframe_to_filled_labels
   spateo.io.bgi.read_bgi_agg
   spateo.io.bgi.read_bgi


Module Contents
---------------

.. py:data:: VERSIONS

.. py:data:: COUNT_COLUMN_MAPPING

.. py:function:: read_bgi_as_dataframe(path: str, label_column: Optional[str] = None) -> pandas.DataFrame

   Read a BGI read file as a pandas DataFrame.

   :param path: Path to read file.
   :param label_column: Name of the column that contains cell labels as positive integers.

   :returns:

             pandas DataFrame with the following standardized column names:
                 * `gene`: Gene name/ID (whatever was used in the original file)
                 * `x`, `y`: X and Y coordinates
                 * `total`, `spliced`, `unspliced`: Counts for each RNA species.
                     The latter two are only present if they are in the original file.


.. py:function:: dataframe_to_labels(df: pandas.DataFrame, column: str, shape: Optional[Tuple[int, int]] = None) -> numpy.ndarray

   Convert a BGI dataframe that contains cell labels to a labels matrix.

   :param df: Read dataframe, as returned by :func:`read_bgi_as_dataframe`.
   :param column: Column that contains cell labels as positive integers.
                  Non-positive labels are ignored.

   :returns: Labels matrix
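
   The conversion described above can be pictured with a small NumPy/pandas
   sketch. This is illustrative only and is not spateo's implementation: the
   helper name ``labels_matrix_sketch``, the ``label`` column name, the
   choice of ``x`` as the row index and ``y`` as the column index, and the
   default shape are all assumptions made for the example.

   .. code-block:: python

      import numpy as np
      import pandas as pd

      # Toy read table: one row per (gene, x, y) observation. The "label"
      # column is a hypothetical name; 0 marks a pixel without a cell.
      df = pd.DataFrame(
          {
              "gene": ["A", "B", "A", "C"],
              "x": [0, 0, 1, 2],
              "y": [0, 1, 1, 2],
              "label": [1, 1, 2, 0],
          }
      )


      def labels_matrix_sketch(df, column, shape=None):
          """Illustrative helper; not part of spateo."""
          # Keep one row per pixel and drop non-positive labels.
          pixels = df.loc[df[column] > 0, ["x", "y", column]].drop_duplicates()
          if shape is None:
              # Assumption: default to the smallest matrix covering all coordinates.
              shape = (int(df["x"].max()) + 1, int(df["y"].max()) + 1)
          labels = np.zeros(shape, dtype=int)
          # Assumption: x indexes rows and y indexes columns.
          labels[pixels["x"].to_numpy(), pixels["y"].to_numpy()] = pixels[column].to_numpy()
          return labels


      labels = labels_matrix_sketch(df, "label")

   Pixels whose label is non-positive stay at 0 in the resulting matrix, which
   matches the "ignored" behaviour described for ``column``.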


.. py:function:: dataframe_to_filled_labels(df: pandas.DataFrame, column: str, shape: Optional[Tuple[int, int]] = None) -> numpy.ndarray

   Convert a BGI dataframe that contains cell labels to a (filled) labels matrix.

   :param df: Read dataframe, as returned by :func:`read_bgi_as_dataframe`.
   :param column: Column that contains cell labels as positive integers.
                  Non-positive labels are ignored.

   :returns: Labels matrix


.. py:function:: read_bgi_agg(path: str, stain_path: Optional[str] = None, binsize: int = 1, gene_agg: Optional[Dict[str, Union[List[str], Callable[[str], bool]]]] = None, prealigned: bool = False, label_column: Optional[str] = None, version: typing_extensions.Literal['stereo'] = 'stereo') -> anndata.AnnData

   Read a BGI read file and calculate the total number of UMIs observed per
   coordinate.

   :param path: Path to read file.
   :param stain_path: Path to nuclei staining image. Must have the same coordinate
                      system as the read file.
   :param binsize: Size of pixel bins.
   :param gene_agg: Dictionary of layer keys to gene names to aggregate. For
                    example, `{'mito': ['list', 'of', 'mitochondrial', 'genes']}` will
                    yield an AnnData with a layer named "mito" with the aggregate total
                    UMIs of the provided gene list.
   :param prealigned: Whether the stain image is already aligned with the minimum
                      x and y RNA coordinates.
   :param label_column: Column that contains already-segmented cell labels.
   :param version: BGI technology version. Currently only used to set the scale and
                   scale units of each unit coordinate. This may change in the future.

   :returns: An AnnData object containing the UMIs per coordinate and the nuclei
             staining image, if provided. The total UMIs are stored as a sparse matrix in
             `.X`, and spliced and unspliced counts (if present) are stored in
             `.layers['spliced']` and `.layers['unspliced']` respectively.
             The nuclei image is stored as a NumPy array in `.layers['nuclei']`.
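
   As the signature indicates, each ``gene_agg`` value may be either a list of
   gene names or a callable that takes a gene name and returns a bool. The
   pure-Python sketch below shows how both forms can select genes and sum
   their UMIs per coordinate. It is a simplified illustration, not spateo's
   implementation; the helper ``aggregate_layers``, the toy reads, and the
   gene names are hypothetical.

   .. code-block:: python

      from collections import defaultdict

      # Toy reads as (gene, x, y, count) tuples; values are made up.
      reads = [
          ("mt-Co1", 0, 0, 3),
          ("mt-Nd1", 0, 0, 2),
          ("Actb", 0, 0, 5),
          ("mt-Co1", 1, 0, 1),
          ("Gapdh", 1, 0, 4),
      ]

      gene_agg = {
          "mito": lambda gene: gene.startswith("mt-"),  # predicate form
          "housekeeping": ["Actb", "Gapdh"],  # explicit list form
      }


      def aggregate_layers(reads, gene_agg):
          """Sum counts per (x, y) for every layer key (illustrative helper)."""
          layers = {key: defaultdict(int) for key in gene_agg}
          for gene, x, y, count in reads:
              for key, selector in gene_agg.items():
                  # A selector is either a predicate or a collection of names.
                  selected = selector(gene) if callable(selector) else gene in selector
                  if selected:
                      layers[key][(x, y)] += count
          return {key: dict(totals) for key, totals in layers.items()}


      layers = aggregate_layers(reads, gene_agg)

   Each key of the returned dictionary plays the role of a layer name, and
   each value holds the aggregated UMIs per coordinate for that layer.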


.. py:function:: read_bgi(path: str, binsize: Optional[int] = None, segmentation_adata: Optional[anndata.AnnData] = None, labels_layer: Optional[str] = None, labels: Optional[Union[numpy.ndarray, str]] = None, seg_binsize: int = 1, label_column: Optional[str] = None, add_props: bool = True, version: typing_extensions.Literal['stereo'] = 'stereo') -> anndata.AnnData

   Read a BGI read file as an AnnData object.

   :param path: Path to read file.
   :param binsize: Size of pixel bins. Should only be provided when labels
                   (i.e. the `segmentation_adata` and `labels` arguments) are not used.
   :param segmentation_adata: AnnData containing segmentation results.
   :param labels_layer: Layer name in `segmentation_adata` containing labels.
   :param labels: NumPy array, or path to a NumPy array saved with `np.save`, that
                  contains labels.
   :param seg_binsize: Bin size used in cell segmentation. Used in conjunction
                       with `labels`, and overwritten when `labels_layer` and
                       `segmentation_adata` are not None.
   :param label_column: Column that contains already-segmented cell labels. If this
                        column is present, it takes precedence.
   :param add_props: Whether or not to compute label properties, such as area,
                     bounding box, centroid, etc.
   :param version: BGI technology version. Currently only used to set the scale and
                   scale units of each unit coordinate. This may change in the future.

   :returns: Bins x genes or labels x genes AnnData.
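
   To make the "bins x genes" output concrete, the sketch below groups toy
   reads into square bins and sums counts per bin and gene. It assumes that a
   pixel's bin is obtained by integer division of its raw coordinates by
   ``binsize``; spateo may shift or index coordinates differently, so treat
   this as an illustration rather than the library's implementation. The
   helper ``bin_counts`` and the toy reads are hypothetical.

   .. code-block:: python

      from collections import Counter

      # Toy reads as (gene, x, y, count) tuples; values are made up.
      reads = [
          ("A", 0, 0, 1),
          ("A", 1, 1, 2),
          ("B", 1, 0, 1),
          ("A", 2, 3, 4),
      ]


      def bin_counts(reads, binsize):
          """Sum counts per (bin, gene) pair (illustrative helper)."""
          counts = Counter()
          for gene, x, y, count in reads:
              # Assumption: bins come from integer division of raw coordinates.
              bin_xy = (x // binsize, y // binsize)
              counts[(bin_xy, gene)] += count
          return counts


      binned = bin_counts(reads, binsize=2)

   With ``binsize=2``, the first three reads fall into bin ``(0, 0)`` and the
   last one into bin ``(1, 1)``, so the result has one entry per observed
   bin and gene combination.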