spateo.segmentation.density
===========================

.. py:module:: spateo.segmentation.density

.. autoapi-nested-parse::

   Functions to segment regions of a slice by UMI density.


Functions
---------

.. autoapisummary::

   spateo.segmentation.density._create_spatial_adjacency
   spateo.segmentation.density._schc
   spateo.segmentation.density._segment_densities
   spateo.segmentation.density.segment_densities
   spateo.segmentation.density.merge_densities


Module Contents
---------------

.. py:function:: _create_spatial_adjacency(shape: Tuple[int, int]) -> scipy.sparse.csr_matrix

   Create a sparse adjacency matrix for a 2D grid graph of the specified shape.
   See https://stackoverflow.com/a/16342639.

   :param shape: Shape of grid

   :returns: A sparse adjacency matrix
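
   As a rough illustration of the idea (not necessarily how this function is
   implemented), a 4-connected grid adjacency matrix can be assembled from two
   path graphs with Kronecker products. The name `grid_adjacency` is
   hypothetical, and the row-major node ordering is an assumption.

   .. code-block:: python

      from scipy import sparse


      def grid_adjacency(shape):
          """Hypothetical sketch: 4-connected adjacency for an (n_rows, n_cols) grid.

          Assumes both dimensions are at least 2 and that pixel (r, c) maps to
          node index r * n_cols + c (row-major order).
          """
          n_rows, n_cols = shape
          # Adjacency of a path graph: ones on the first off-diagonals.
          row_path = sparse.diags([1, 1], [-1, 1], shape=(n_rows, n_rows))
          col_path = sparse.diags([1, 1], [-1, 1], shape=(n_cols, n_cols))
          # Vertical edges link (r, c) to (r +/- 1, c); horizontal edges link
          # (r, c) to (r, c +/- 1).
          vertical = sparse.kron(row_path, sparse.eye(n_cols))
          horizontal = sparse.kron(sparse.eye(n_rows), col_path)
          return sparse.csr_matrix(vertical + horizontal)

   For a grid of shape `(2, 3)`, this yields a symmetric 6 x 6 matrix in which
   each pixel is connected only to the pixels directly above, below, left, and
   right of it.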


.. py:function:: _schc(X: numpy.ndarray, distance_threshold: Optional[float] = None) -> numpy.ndarray

   Spatially-constrained hierarchical clustering.

   Perform hierarchical clustering with Ward linkage on an array
   containing UMI counts per pixel. Spatial constraints are
   imposed by limiting the neighbors of each node to its 4
   immediate pixel neighbors.

   This function runs in two steps. First, it computes a Ward linkage tree
   by calling :func:`sklearn.cluster.ward_tree` with `return_distance=True`,
   which yields the distances between clusters. Then, if `distance_threshold`
   is not provided, a dynamic threshold is calculated by finding the inflection
   point (knee) of the distance (x) vs. number of clusters (y) curve using the
   top 1000 distances, on the assumption that in the vast majority of cases
   there will be fewer than 1000 density clusters.

   :param X: UMI counts per pixel
   :param distance_threshold: Distance threshold for the Ward linkage. Clusters
                              are not merged if the linkage distance between
                              them is greater than this value.

   :returns: Clustering result as a Numpy array of same shape, where clusters are
             indicated by integers.
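
   The following is a minimal sketch of spatially-constrained Ward clustering
   with an explicit threshold. It is not this function's implementation: it
   swaps in :class:`sklearn.cluster.AgglomerativeClustering` and
   :func:`sklearn.feature_extraction.image.grid_to_graph` in place of a direct
   :func:`sklearn.cluster.ward_tree` call, and it does not reproduce the
   dynamic (knee-based) threshold. The name `schc_sketch` is hypothetical.

   .. code-block:: python

      import numpy as np
      from sklearn.cluster import AgglomerativeClustering
      from sklearn.feature_extraction.image import grid_to_graph


      def schc_sketch(X, distance_threshold):
          """Hypothetical sketch: cluster a 2D count array under grid constraints."""
          # Connectivity graph that only links each pixel to its grid neighbors.
          connectivity = grid_to_graph(*X.shape)
          model = AgglomerativeClustering(
              n_clusters=None,
              distance_threshold=distance_threshold,
              linkage="ward",
              connectivity=connectivity,
          )
          # Each pixel is one sample with a single feature (its count).
          labels = model.fit_predict(X.reshape(-1, 1).astype(float))
          return labels.reshape(X.shape)


      # Toy input: a low-density left half and a high-density right half.
      X = np.zeros((4, 6))
      X[:, 3:] = 100
      labels = schc_sketch(X, distance_threshold=50.0)

   With this toy input, each half should end up as its own cluster, because
   merging across the boundary would exceed the threshold.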


.. py:function:: _segment_densities(X: Union[scipy.sparse.spmatrix, numpy.ndarray], k: int, dk: int, distance_threshold: Optional[float] = None) -> numpy.ndarray

   Segment a matrix containing UMI counts into regions by UMI density.

   :param X: UMI counts per pixel
   :param k: Kernel size for Gaussian blur
   :param dk: Kernel size for final dilation
   :param distance_threshold: Distance threshold for the Ward linkage. Clusters
                              are not merged if the linkage distance between
                              them is greater than this value.

   :returns: Clustering result as a Numpy array of same shape, where clusters are
             indicated by positive integers.
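
   One possible reading of the final dilation step is sketched below, under
   the assumption that labels are dilated in order of increasing mean UMI so
   that denser regions overwrite sparser ones where they overlap. The function
   name `dilate_by_density`, the square structuring element, and the use of
   :func:`scipy.ndimage.binary_dilation` are all assumptions made for
   illustration, not the library's code.

   .. code-block:: python

      import numpy as np
      from scipy import ndimage


      def dilate_by_density(labels, X, dk):
          """Hypothetical sketch: dilate each label, least dense first."""
          # Mean UMI count of the pixels belonging to each label.
          means = {label: X[labels == label].mean() for label in np.unique(labels)}
          structure = np.ones((dk, dk), dtype=bool)
          out = labels.copy()
          # Later (denser) labels overwrite earlier (sparser) ones.
          for label in sorted(means, key=means.get):
              mask = ndimage.binary_dilation(labels == label, structure=structure)
              out[mask] = label
          return out


      # Toy input: a single dense pixel (label 2) inside a sparse region (label 1).
      labels = np.ones((5, 5), dtype=int)
      labels[2, 2] = 2
      X = np.ones((5, 5))
      X[2, 2] = 10
      dilated = dilate_by_density(labels, X, dk=3)

   In this toy case the dense center pixel grows into a 3 x 3 block, while the
   rest of the array keeps the sparse label.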


.. py:function:: segment_densities(adata: anndata.AnnData, layer: str, binsize: int, k: int, dk: int, distance_threshold: Optional[float] = None, background: Optional[Union[Tuple[int, int], typing_extensions.Literal[False]]] = None, out_layer: Optional[str] = None)

   Segment into regions by UMI density.

   The tissue is segmented into UMI density bins according to the following
   procedure.

   1. The UMI matrix is binned according to `binsize` (recommended >= 20).
   2. The binned UMI matrix (from the previous step) is Gaussian blurred with
      kernel size `k`. Note that `k` is in terms of bins, not pixels.
   3. The elements of the blurred, binned UMI matrix are hierarchically
      clustered with Ward linkage, distance threshold `distance_threshold`,
      and spatial constraints (immediate neighbors). This yields pixel density
      bins (a.k.a. labels) with the same shape as the binned matrix.
   4. Each density bin is dilated with kernel size `dk`, starting from the
      bin with the smallest mean UMI (a.k.a. least dense) and going to
      the bin with the largest mean UMI (a.k.a. most dense). This is done in
      an effort to mitigate RNA diffusion and "choppy" borders in subsequent
      steps.
   5. If `background` is not provided, the density bin that is most common in
      the perimeter of the matrix is selected to be background, and thus its
      label is changed to take a value of 0. A pixel can be manually selected
      to be background by providing an `(x, y)` tuple instead. This feature
      can be turned off by providing `False`.
   6. The density bin matrix is resized to be the same size as the original UMI
      matrix.

   :param adata: Input AnnData
   :param layer: Layer containing the UMI counts to segment on.
   :param binsize: Size of bins to use. For density segmentation, pixels are binned
                   to reduce runtime. 20 is usually a good starting point. Note that this
                   value is relative to the original binsize used to read in the
                   AnnData.
   :param k: Kernel size for Gaussian blur, in bins
   :param dk: Kernel size for final dilation, in bins
   :param distance_threshold: Distance threshold for the Ward linkage. Clusters
                              are not merged if the linkage distance between
                              them is greater than this value.
   :param background: Pixel that should be categorized as background. By
                      default, the bin most often assigned to the outermost
                      pixels is categorized as background. Set to False to
                      turn off background detection.
   :param out_layer: Layer to put resulting bins. Defaults to `{layer}_bins`.
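
   The binning, background-selection, and resizing steps of the procedure can
   be illustrated with plain NumPy. This is a hedged sketch under simplifying
   assumptions (zero padding at the edges, nearest-neighbor upsampling); the
   helper names `bin_counts`, `perimeter_background`, and `upscale` are
   hypothetical and are not part of this module.

   .. code-block:: python

      import numpy as np


      def bin_counts(X, binsize):
          """Sum counts in non-overlapping binsize x binsize blocks."""
          # Zero-pad so both dimensions are divisible by binsize.
          pad_rows = -X.shape[0] % binsize
          pad_cols = -X.shape[1] % binsize
          padded = np.pad(X, ((0, pad_rows), (0, pad_cols)))
          n_rows = padded.shape[0] // binsize
          n_cols = padded.shape[1] // binsize
          return padded.reshape(n_rows, binsize, n_cols, binsize).sum(axis=(1, 3))


      def perimeter_background(labels):
          """Set the label most common on the array perimeter to 0."""
          border = np.concatenate(
              [labels[0], labels[-1], labels[1:-1, 0], labels[1:-1, -1]]
          )
          values, counts = np.unique(border, return_counts=True)
          out = labels.copy()
          out[labels == values[np.argmax(counts)]] = 0
          return out


      def upscale(labels, binsize, shape):
          """Repeat each bin label binsize times per axis, then crop to shape."""
          full = np.repeat(np.repeat(labels, binsize, axis=0), binsize, axis=1)
          return full[: shape[0], : shape[1]]


      X = np.ones((5, 5))
      binned = bin_counts(X, binsize=2)
      bin_labels = np.array([[1, 1, 1], [1, 2, 1], [1, 1, 1]])
      with_background = perimeter_background(bin_labels)
      resized = upscale(with_background, binsize=2, shape=X.shape)

   Here the clustering and dilation steps are replaced by a hand-written
   `bin_labels` array so that the remaining steps can be followed in isolation.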


.. py:function:: merge_densities(adata: anndata.AnnData, layer: str, mapping: Optional[Dict[int, int]] = None, out_layer: Optional[str] = None)

   Merge density bins either using an explicit mapping or in a semi-supervised
   way.

   :param adata: Input AnnData
   :param layer: Layer that was used to generate density bins. The bins are
                 read from `{layer}_bins` by default; if that layer is not
                 present, `layer` is taken as a literal layer name.
   :param mapping: Mapping to use to transform bins
   :param out_layer: Layer to store results. Defaults to same layer as input.
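
   To illustrate what an explicit `mapping` does conceptually, the sketch
   below relabels a small array of density bins. The helper `apply_mapping` is
   hypothetical; it assumes that labels absent from the mapping are left
   unchanged, which may differ from the library's behavior.

   .. code-block:: python

      import numpy as np


      def apply_mapping(labels, mapping):
          """Hypothetical sketch: relabel bins according to an old -> new dict."""
          out = labels.copy()
          for old, new in mapping.items():
              # Compare against the original labels so that swaps such as
              # {1: 2, 2: 1} are not applied twice.
              out[labels == old] = new
          return out


      bins = np.array([[1, 2], [3, 0]])
      merged = apply_mapping(bins, {1: 2, 2: 1, 3: 1})

   In this example bins 2 and 3 are merged into bin 1, bin 1 becomes bin 2,
   and the background label 0 is untouched.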