spateo.segmentation.em
======================

.. py:module:: spateo.segmentation.em

.. autoapi-nested-parse::

   Implementation of the EM algorithm to identify parameter estimates for a Negative Binomial mixture model.
   https://iopscience.iop.org/article/10.1088/1742-6596/1324/1/012093/meta

   Written by @HailinPan, optimized by @Lioscro.


Attributes
----------

.. autoapisummary::

   spateo.segmentation.em.progress


Functions
---------

.. autoapisummary::

   spateo.segmentation.em.lamtheta_to_r
   spateo.segmentation.em.muvar_to_lamtheta
   spateo.segmentation.em.lamtheta_to_muvar
   spateo.segmentation.em.nbn_pmf
   spateo.segmentation.em.nbn_em
   spateo.segmentation.em.conditionals
   spateo.segmentation.em.confidence
   spateo.segmentation.em.run_em


Module Contents
---------------

.. py:data:: progress

.. py:function:: lamtheta_to_r(lam: float, theta: float) -> float

   Convert lambda and theta to r.


.. py:function:: muvar_to_lamtheta(mu: float, var: float) -> Tuple[float, float]

   Convert the mean and variance to lambda and theta.


.. py:function:: lamtheta_to_muvar(lam: float, theta: float) -> Tuple[float, float]

   Convert lambda and theta to the mean and variance.


.. py:function:: nbn_pmf(n, p, X)

   Helper function to compute the PMF of the negative binomial distribution.

   This function is used instead of calling :func:`stats.nbinom` directly
   because the latter behaves unexpectedly when float32 is used. This function
   essentially casts the `n` and `p` parameters to floats.


.. py:function:: nbn_em(X: numpy.ndarray, w: Tuple[float, float] = (0.99, 0.01), mu: Tuple[float, float] = (10.0, 300.0), var: Tuple[float, float] = (20.0, 400.0), max_iter: int = 2000, precision: float = 0.001) -> Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]

   Run the EM algorithm to estimate the parameters for background and cell UMIs.

   :param X: Numpy array containing mixture counts.
   :param w: Initial proportions of cell and background as a tuple.
   :param mu: Initial means of cell and background negative binomial distributions.
   :param var: Initial variances of cell and background negative binomial distributions.
   :param max_iter: Maximum number of iterations.
   :param precision: Desired precision. The algorithm stops once this is reached.

   :returns: Estimated `w`, `r`, `p`.


.. py:function:: conditionals(X: numpy.ndarray, em_results: Union[Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]], Dict[int, Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]]]], bins: Optional[numpy.ndarray] = None) -> Tuple[numpy.ndarray, numpy.ndarray]

   Compute the conditional probabilities, for each pixel, of observing the
   observed number of UMIs given that the pixel is background/foreground.

   :param X: UMI counts per pixel.
   :param em_results: Return value of :func:`run_em`.
   :param bins: Pixel bins, as was passed to :func:`run_em`.

   :returns: Two Numpy arrays, the first corresponding to the background conditional
             probabilities, and the second to the foreground conditional probabilities.

   :raises SegmentationError: If `em_results` is a dictionary but `bins` was not provided.


.. py:function:: confidence(X: numpy.ndarray, em_results: Union[Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]], Dict[int, Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]]]], bins: Optional[numpy.ndarray] = None) -> numpy.ndarray

   Compute the confidence of each pixel being a cell, using the parameters
   estimated by the EM algorithm.

   :param X: Numpy array containing mixture counts.
   :param em_results: Return value of :func:`run_em`.
   :param bins: Pixel bins, as was passed to :func:`run_em`.

   :returns: Numpy array of confidence scores within the range [0, 1].


.. py:function:: run_em(X: numpy.ndarray, downsample: Union[int, float] = 0.001, params: Union[Dict[str, Tuple[float, float]], Dict[int, Dict[str, Tuple[float, float]]]] = dict(w=(0.5, 0.5), mu=(10.0, 300.0), var=(20.0, 400.0)), max_iter: int = 2000, precision: float = 1e-06, bins: Optional[numpy.ndarray] = None, seed: Optional[int] = None) -> Union[Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]], Dict[int, Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]]]]

   Run the EM algorithm.

   :param X: UMI counts per pixel.
   :param use_peaks: Whether to use peaks of convolved image as samples for the EM
                     algorithm.
   :param min_distance: Minimum distance between peaks when `use_peaks=True`.
   :param downsample: Use at most this many samples. If `use_peaks` is False, samples
                      are chosen randomly with probability proportional to the log UMI
                      counts. When `bins` is provided, the size of each bin is used as
                      a scaling factor. If this is a float, then samples are
                      downsampled by this fraction.
   :param params: Initial parameters. This is a dictionary that contains `w`, `mu`,
                  `var` as its keys, each corresponding to initial proportions, means
                  and variances of background and foreground pixels. Each value must
                  be a 2-element tuple containing the values for background and
                  foreground. This may also be a nested dictionary, where the
                  outermost keys are the bin labels provided in the `bins` argument.
                  In this case, each of the inner dictionaries is used as the initial
                  parameters for the corresponding bin.
   :param max_iter: Maximum number of EM iterations.
   :param precision: Stop the EM algorithm once the desired precision has been reached.
   :param bins: Bins of pixels to estimate separately, such as those obtained by
                density segmentation. Zeros are ignored.
   :param seed: Random seed.

   :returns: Tuple of parameters estimated by the EM algorithm if `bins` is not
             provided. Otherwise, a dictionary of parameter tuples, with bin labels
             as keys.
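The initial `mu` and `var` values and the estimated proportions `w` described above can be related to a two-component negative binomial mixture with a small, standard-library-only sketch. It is not Spateo's implementation: the helper names `moments_to_nb`, `nb_pmf` and `posterior_foreground` are hypothetical, and the sketch assumes the SciPy-style `(n, p)` parameterization (mean `n(1-p)/p`, variance `n(1-p)/p^2`) and a (background, foreground) ordering of every tuple.

```python
import math


def moments_to_nb(mu, var):
    """Hypothetical helper: method-of-moments conversion of a mean and
    variance to negative binomial (n, p), SciPy-style parameterization."""
    if var <= mu:
        raise ValueError("a negative binomial requires var > mu")
    p = mu / var
    n = mu * mu / (var - mu)
    return n, p


def nb_pmf(k, n, p):
    """Negative binomial PMF at count k, computed in log space for stability."""
    log_pmf = (
        math.lgamma(k + n) - math.lgamma(k + 1) - math.lgamma(n)
        + n * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def posterior_foreground(k, w, nb_params):
    """Posterior probability that a pixel with k UMIs is foreground.

    w and nb_params are (background, foreground) pairs; this ordering is an
    assumption of the sketch.
    """
    (n_b, p_b), (n_f, p_f) = nb_params
    back = w[0] * nb_pmf(k, n_b, p_b)
    fore = w[1] * nb_pmf(k, n_f, p_f)
    total = back + fore
    # If both likelihoods underflow to zero, fall back to an uninformative 0.5.
    return fore / total if total > 0 else 0.5


# Default-like initial values from the documentation: mu=(10, 300), var=(20, 400).
nb_params = (moments_to_nb(10.0, 20.0), moments_to_nb(300.0, 400.0))
w = (0.5, 0.5)
low = posterior_foreground(5, w, nb_params)     # count near the background mean
high = posterior_foreground(300, w, nb_params)  # count near the foreground mean
```

Under these assumptions, a count near the background mean yields a posterior close to 0 and a count near the foreground mean yields one close to 1, which is the kind of score in [0, 1] that `confidence` is documented to return.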