spateo.digitization
===================

.. py:module:: spateo.digitization

.. autoapi-nested-parse::

   Spatiotemporal modeling of spatial transcriptomics


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/spateo/digitization/borderline/index
   /autoapi/spateo/digitization/boundary_old/index
   /autoapi/spateo/digitization/contour/index
   /autoapi/spateo/digitization/grid/index
   /autoapi/spateo/digitization/utils/index
   /autoapi/spateo/digitization/utils_old/index


Attributes
----------

.. autoapisummary::

   spateo.digitization.SKM
   spateo.digitization.lm


Functions
---------

.. autoapisummary::

   spateo.digitization.get_borderline
   spateo.digitization.grid_borderline
   spateo.digitization.identify_boundary
   spateo.digitization.boundary_gridding
   spateo.digitization.gen_cluster_image
   spateo.digitization.extract_cluster_contours
   spateo.digitization.set_domains
   spateo.digitization.fill_grid_label
   spateo.digitization.format_boundary_line
   spateo.digitization.draw_seg_grid
   spateo.digitization.euclidean_dist
   spateo.digitization.segment_bd_line
   spateo.digitization.extend_layer
   spateo.digitization.field_contour_line
   spateo.digitization.field_contours
   spateo.digitization.add_ep_boundary
   spateo.digitization.add_gp_boundary
   spateo.digitization.effective_L2_error
   spateo.digitization.calc_op_field
   spateo.digitization.digitize
   spateo.digitization.gridit
   spateo.digitization.digitize_general
   spateo.digitization.order_borderline


Package Contents
----------------

.. py:function:: get_borderline(adata: spateo.digitization.utils.AnnData, cluster_key: str, source_clusters: int, target_clusters: int, bin_size: int = 1, spatial_key: str = 'spatial', borderline_key: str = 'borderline', k_size: int = 8, min_area: int = 30, dilate_k_size: int = 3) -> spateo.digitization.utils.np.ndarray

   Identify the borderline at the interface of the source and target cell clusters.
   The borderline is identified by first retrieving the outline/contour formed by the source clusters; this outline is then masked with the expanded contours formed by the target clusters so that only the borderline remains.

   :param adata: The adata object to be used for identifying the borderline.
   :param cluster_key: The key name of the spatial cluster in `adata.obs`.
   :param source_clusters: The source cluster(s) that will interface with the target clusters.
   :param target_clusters: The target cluster(s) that will interface with the source clusters.
   :param bin_size: The size of the binning.
   :param spatial_key: The key name of the spatial coordinates in `adata.obsm`.
   :param borderline_key: The key name in `adata.obs` that will be used to store the borderline.
   :param k_size: Kernel size of the elliptic structuring element.
   :param min_area: Minimal area threshold corresponding to the resulting contour(s).
   :param dilate_k_size: Kernel size of the cv2.dilate function.

   :returns: `borderline_img`, the matrix that stores the image information of the borderline between the source and target cluster(s). Note that the adata object is also updated with the `borderline_key` key in `.obs`, which records whether each bucket is on the borderline.
   :rtype: numpy.ndarray

.. py:function:: grid_borderline(adata: spateo.digitization.utils.AnnData, borderline_img: spateo.digitization.utils.np.ndarray, borderline_list: spateo.digitization.utils.List, layer_num: int = 3, column_num: int = 25, layer_width: int = 10, spatial_key: str = 'spatial', init: bool = False) -> None

   Extend the borderline to both the interior and the exterior side, creating `layer_num` layers on each side, and segment these layers into `column_num` columns.

   :param adata: The adata object to be used for identifying the interior/exterior layers and columns.
   :param borderline_img: The matrix that stores the image information of the borderline between the source and target cluster(s).
   :param borderline_list: An ordered list of np.arrays of coordinates of the borderlines.
   :param layer_num: Number of layers to extend on each of the interior and exterior sides.
   :param column_num: Number of columns to segment for each layer.
   :param layer_width: Layer/column boundary width. This only affects grid_label.
   :param spatial_key: The key name in `adata.obsm` of the spatial coordinates. Defaults to "spatial". Passed to the `fill_grid_label` function.
   :param init: Whether to generate (and potentially overwrite) the `layer_label_key` and `column_label_key` in the `fill_grid_label` function.

   :returns: Nothing, but the adata object is updated with the following keys in `.obs`:

             1. layer_label_key: this key points to layer labels.
             2. column_label_key: this key points to column labels.
   :rtype: None

.. py:data:: SKM

.. py:data:: lm

.. py:function:: identify_boundary(adata: spateo.digitization.contour.AnnData, cluster_key, source_id, target_id, bin_size: int = 1, spatial_key: str = 'spatial', boundary_key: str = 'boundary_line', k_size=8, min_area=30, dilate_k_size: int = 3)

.. py:function:: boundary_gridding(adata: spateo.digitization.contour.AnnData, boundary_line_img, boundary_line_list, n_layer=3, n_column=25, layer_width=10, spatial_key: str = 'spatial', init: bool = False)

.. py:function:: gen_cluster_image(adata: anndata.AnnData, bin_size: Optional[int] = None, spatial_key: str = 'spatial', cluster_key: str = 'scc', label_mapping_key: str = 'cluster_img_label', cmap: str = 'tab20', show: bool = True) -> numpy.ndarray

   Generate matrix/image of spatial clusters with distinct labels/colors.

   :param adata: The adata object used to create the matrix/image for clusters.
   :param bin_size: The size of the binning.
   :param spatial_key: The key name of the spatial coordinates in `adata.obsm`.
   :param cluster_key: The key name of the spatial cluster in `adata.obs`.
   :param label_mapping_key: The key name to store the label index values, mapped from the cluster names in `adata.obs`.
      Note that background is 0, so `label_mapping_key` starts from 1.
   :param cmap: The colormap that will be used to draw colors for the resultant cluster image.
   :param show: Whether to visualize the cluster image.

   :returns: `cluster_label_image`, a numpy array that stores the image of clusters, each with a distinct color. When `show` is True, `plt.imshow(cluster_rgb_image)` will be used to plot the clusters, each with distinct labels prepared from the designated cmap.
   :rtype: numpy.ndarray

.. py:function:: extract_cluster_contours(cluster_label_image: numpy.ndarray, cluster_labels: Union[int, List], bin_size: int, k_size: float = 2, min_area: float = 9, close_kernel: int = cv2.MORPH_ELLIPSE, show: bool = True) -> Tuple[Tuple, numpy.ndarray, numpy.ndarray]

   Extract contour(s) for area(s) formed by buckets of the same spatial cluster.

   :param cluster_label_image: The image that sets the pixels of the clusters of interest as the foreground color (background is 0).
   :param cluster_labels: The label value(s) of the clusters of interest.
   :param bin_size: The size of the binning.
   :param k_size: Kernel size of the elliptic structuring element.
   :param min_area: Minimal area threshold corresponding to the resulting contour(s).
   :param close_kernel: The value to indicate the structuring element. By default, we use a circular structuring element.
   :param show: Visualize the result.

   :returns: A tuple of three items:

             - `contours`: The tuple of coordinates of the identified contours.
             - `cluster_image_close`: The resultant image of the area of interest with small areas removed.
             - `cluster_image_contour`: The resultant image of the contour, generated from `cluster_image_close`.
   :rtype: Tuple[Tuple, numpy.ndarray, numpy.ndarray]

.. py:function:: set_domains(adata_high_res: anndata.AnnData, adata_low_res: Optional[anndata.AnnData] = None, spatial_key: str = 'spatial', cluster_key: str = 'scc', domain_key_prefix: str = 'domain', bin_size_high: Optional[int] = None, bin_size_low: Optional[int] = None, k_size: float = 2, min_area: float = 9) -> None

   Set the domains for each bucket based on spatial clusters.
   Use the adata object of low resolution for contour identification but the adata object of high resolution for domain assignment.

   :param adata_high_res: The anndata object in high spatial resolution. The adata with smaller binning (or single cell segmentation) is more suitable to define more fine-grained spatial domains.
   :param adata_low_res: The anndata object in low spatial resolution. Data with large binning often produces better spatial domain clustering results with the `scc` method, and thus better domain/domain contour identification.
   :param spatial_key: The key in `.obsm` of the spatial coordinate for each bucket. Should be the same key in both `adata_high_res` and `adata_low_res`.
   :param cluster_key: The key in `.obs` (`adata_low_res`) to the spatial cluster.
   :param domain_key_prefix: The key prefix in `.obs` (in `adata_high_res`) that will be used to store the spatial domain for each bucket. The full key name will be set as: `domain_key_prefix` + "_" + `cluster_key`.
   :param bin_size_high: The binning size of the `adata_high_res` object.
   :param bin_size_low: The binning size of the `adata_low_res` object (only works when `adata_low_res` is provided).
   :param k_size: Kernel size of the elliptic structuring element.
   :param min_area: Minimal area threshold corresponding to the resulting contour(s).

   :returns: Nothing, but `adata_high_res` is updated with the domain stored under the key `domain_key_prefix` + "_" + `cluster_key`.

.. py:function:: fill_grid_label(adata, spatial_key, seg_grid_img, bdl_seg_coor_x, bdl_seg_coor_y, curr_layer, curr_sign, layer_label_key: str = 'layer_label', column_label_key: str = 'column_label', init: bool = False)

.. py:function:: format_boundary_line(boundary_line_img, pt_start, pt_end)

.. py:function:: draw_seg_grid(boundary_line_img, bdl_seg_coor_x, bdl_seg_coor_y, gridline_width=1, mode='grid')

.. py:function:: euclidean_dist(point_x: Tuple, point_y: Tuple)

.. py:function:: segment_bd_line(boundary_line_list, n_column)

.. py:function:: extend_layer(boundary_line_img, boundary_line_list, extend_width=10)

.. py:function:: field_contour_line(ctr_seq, pnt_pos, min_pnt, max_pnt)

.. py:function:: field_contours(contour, pnt_xy, pnt_Xy, pnt_xY, pnt_XY)

   Identify four boundary lines according to given corner points.

   :param contour: _description_
   :type contour: _type_
   :param pnt_xy: _description_
   :type pnt_xy: _type_
   :param pnt_Xy: _description_
   :type pnt_Xy: _type_
   :param pnt_xY: _description_
   :type pnt_xY: _type_
   :param pnt_XY: _description_
   :type pnt_XY: _type_

   :returns: _description_
   :rtype: _type_

.. py:function:: add_ep_boundary(op_field, op_line, value)

   Add equal weight boundary to op_field.

   :param op_field: _description_
   :type op_field: _type_
   :param op_line: _description_
   :type op_line: _type_
   :param value: _description_
   :type value: _type_

   :returns: _description_
   :rtype: _type_

.. py:function:: add_gp_boundary(op_field, op_line, value_s, value_e)

   Add growing weight boundary to op_field.

   :param op_field: _description_
   :type op_field: _type_
   :param op_line: _description_
   :type op_line: _type_
   :param value_s: _description_
   :type value_s: _type_
   :param value_e: _description_
   :type value_e: _type_

   :returns: _description_
   :rtype: _type_

.. py:function:: effective_L2_error(op_field_i, op_field_j, field_mask)

   Calculate effective L2 error between two fields.

   :param op_field_i: _description_
   :type op_field_i: _type_
   :param op_field_j: _description_
   :type op_field_j: _type_
   :param field_mask: _description_
   :type field_mask: _type_

   :returns: _description_
   :rtype: _type_

.. py:function:: calc_op_field(op_field, min_line, max_line, edge_line_a, edge_line_b, field_border, field_mask, max_err=1e-05, max_itr=100000.0, lp=1, hp=100)

   Calculate op_field (weights) for given boundary weights.

   :param op_field: _description_
   :type op_field: _type_
   :param min_line: _description_
   :type min_line: _type_
   :param max_line: _description_
   :type max_line: _type_
   :param edge_line_a: _description_
   :type edge_line_a: _type_
   :param edge_line_b: _description_
   :type edge_line_b: _type_
   :param field_border: _description_
   :type field_border: _type_
   :param field_mask: _description_
   :type field_mask: _type_
   :param max_err: _description_. Defaults to 1e-5.
   :type max_err: _type_, optional
   :param max_itr: _description_. Defaults to 1e5.
   :type max_itr: _type_, optional
   :param lp: _description_. Defaults to 1.
   :type lp: int, optional
   :param hp: _description_. Defaults to 100.
   :type hp: int, optional

   :returns: _description_
   :rtype: _type_

.. py:function:: digitize(adata: spateo.digitization.utils.AnnData, ctrs: spateo.digitization.utils.Tuple, ctr_idx: int, pnt_xy: spateo.digitization.utils.Tuple[int, int], pnt_Xy: spateo.digitization.utils.Tuple[int, int], pnt_xY: spateo.digitization.utils.Tuple[int, int], pnt_XY: spateo.digitization.utils.Tuple[int, int], spatial_key: str = 'spatial', dgl_layer_key: str = 'digital_layer', dgl_column_key: str = 'digital_column', max_itr: int = 1000000.0, lh: float = 1, hh: float = 100) -> None

   Calculate the "heat" for a closed area of interest by solving a partial differential equation (PDE), the heat equation. Boundary conditions are defined by four user-provided coordinates that set the direction of heat diffusion. The value of "heat" is used to define different spatial layers, domains and grids.

   :param adata: The adata object to digitize.
   :param ctrs: Contours generated by `cv2.findContours`.
   :param ctr_idx: The index of the contour of interest.
   :param pnt_xy: Corner point to define an area of interest. pnt_xy corresponds to the point with minimal layer and minimal column value.
   :param pnt_Xy: Corner point corresponding to the point with maximal column value but minimal layer value.
   :param pnt_xY: Corner point corresponding to the point with minimal column value but maximal layer value.
   :param pnt_XY: Corner point corresponding to the point with maximal layer and maximal column value.
   :param spatial_key: The key name in `adata.obsm` of the spatial coordinates. Defaults to "spatial".
   :param dgl_layer_key: The key name in `adata.obs` to store layer digital-heat (temperature). Defaults to "digital_layer".
   :param dgl_column_key: The key name in `adata.obs` to store column digital-heat (temperature). Defaults to "digital_column".
   :param max_itr: Maximum number of iterations dedicated to solving the heat equation.
   :param lh: Lowest digital-heat (temperature). Defaults to 1.
   :param hh: Highest digital-heat (temperature). Defaults to 100.

   :returns: Nothing, but the `adata` object is updated with the following keys in `.obs`:

             1. dgl_layer_key: The key in `adata.obs` that points to the values of layer digital-heat (temperature).
             2. dgl_column_key: The key in `adata.obs` that points to the values of column digital-heat (temperature).
   :rtype: None

.. py:function:: gridit(adata: spateo.digitization.utils.AnnData, layer_num: int, column_num: int, lh: float = 1, hh: float = 100, dgl_layer_key: str = 'digital_layer', dgl_column_key: str = 'digital_column', layer_border_width: int = 2, column_border_width: int = 2, layer_label_key: str = 'layer_label', column_label_key: str = 'column_label', grid_label_key: str = 'grid_label') -> None

   Segment the area of interest into the specified number of layers and columns, according to the precomputed digitization heat values.

   :param adata: The adata object to do layer/column/grid segmentation.
   :param layer_num: Number of layers to segment.
   :param column_num: Number of columns to segment.
   :param lh: Lowest digital-heat.
      Defaults to 1.
   :param hh: Highest digital-heat. Defaults to 100.
   :param layer_border_width: Layer boundary width. Only affects grid_label.
   :param column_border_width: Column boundary width. Only affects grid_label.
   :param dgl_layer_key: The key name of layer digitization heat in `adata.obs`. Defaults to "digital_layer", precomputed.
   :param dgl_column_key: The key name of column digitization heat in `adata.obs`. Defaults to "digital_column", precomputed.
   :param layer_label_key: The key name to store layer labels in `adata.obs`. Defaults to "layer_label", will be added.
   :param column_label_key: The key name to store column labels in `adata.obs`. Defaults to "column_label", will be added.
   :param grid_label_key: The key name to store grid labels in `adata.obs`. Defaults to "grid_label", will be added.

   :returns: Nothing, but the adata object is updated with the following keys in `.obs`:

             1. layer_label_key: this key points to layer labels.
             2. column_label_key: this key points to column labels.
             3. grid_label_key: this key points to grid labels.
   :rtype: None

.. py:function:: digitize_general(pc: numpy.ndarray, adj_mtx: numpy.ndarray, boundary_lower: numpy.ndarray, boundary_upper: numpy.ndarray, max_itr: int = 1000000.0, lh: float = 1, hh: float = 100) -> numpy.ndarray

   Calculate the "heat" for a general point cloud of interest by solving a partial differential equation (PDE), the heat equation. The two polar boundaries are given by their indices within the point cloud. The neighbor network of the point cloud is given as an adjacency matrix.

   :param pc: An array of 3-D coordinates, representing the point cloud.
   :param adj_mtx: A 2-D adjacency matrix of the neighbor network.
   :param boundary_lower: The indices of points selected as the lower boundary in the point cloud.
   :param boundary_upper: The indices of points selected as the upper boundary in the point cloud.
   :param max_itr: Maximum number of iterations dedicated to solving the heat equation.
   :param lh: Lowest digital-heat (temperature). Defaults to 1.
   :param hh: Highest digital-heat (temperature). Defaults to 100.

   :returns: An array of "heat" values of each point in the point cloud.

.. py:function:: order_borderline(borderline_img: numpy.ndarray, pt_start: Tuple[int, int], pt_end: Tuple[int, int]) -> Tuple[List, numpy.ndarray]

   Retrieve the borderline segment between the given start and end points, with the coordinates ordered.

   :param borderline_img: The matrix that stores the image of the borderline.
   :param pt_start: The coordinate tuple of the start point.
   :param pt_end: The coordinate tuple of the end point.

   :returns: A tuple of two items:

             - `ordered_bdl_list`: List of points along the borderline segment.
             - `ordered_bdl_img`: A numpy array that stores the image of the borderline segment.
   :rtype: Tuple[List, numpy.ndarray]
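
The heat-based digitization described for `digitize_general` and `gridit` can be illustrated with a small, self-contained sketch. This is not spateo's implementation: the helper names `heat_sketch` and `bin_heat`, the Jacobi-style relaxation, the `tol` stopping criterion and the equal-width binning are all assumptions made for illustration. The sketch fixes the lower boundary points at `lh` and the upper boundary points at `hh`, repeatedly replaces every other point's value with the mean of its neighbors, and then bins the resulting heat values into layers. Point coordinates (`pc`) are not needed here because only the adjacency matrix drives the relaxation.

```python
import numpy as np


def heat_sketch(adj_mtx, boundary_lower, boundary_upper,
                max_itr=10000, lh=1.0, hh=100.0, tol=1e-8):
    """Hypothetical helper: relax the steady-state heat equation on a graph.

    Boundary points are held at lh / hh; every other point is repeatedly
    replaced by the mean of its neighbors (Jacobi iteration).
    """
    adj = np.asarray(adj_mtx, dtype=float)
    degree = adj.sum(axis=1)
    if np.any(degree == 0):
        raise ValueError("every point needs at least one neighbor")

    # Start all points at the midpoint, then impose the boundary values.
    heat = np.full(adj.shape[0], (lh + hh) / 2.0)
    heat[boundary_lower] = lh
    heat[boundary_upper] = hh

    for _ in range(int(max_itr)):
        new_heat = adj @ heat / degree      # neighbor average
        new_heat[boundary_lower] = lh       # re-impose boundary conditions
        new_heat[boundary_upper] = hh
        converged = np.max(np.abs(new_heat - heat)) < tol
        heat = new_heat
        if converged:
            break
    return heat


def bin_heat(heat, n_bins, lh=1.0, hh=100.0):
    """Hypothetical helper: map heat values to labels 1..n_bins (equal width)."""
    edges = np.linspace(lh, hh, n_bins + 1)
    # Use interior edges only, so lh maps to 1 and hh maps to n_bins.
    return np.digitize(heat, edges[1:-1]) + 1


# Toy neighbor network: a path graph 0-1-2-3-4.
n = 5
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1

heat = heat_sketch(adj, np.array([0]), np.array([4]))
layers = bin_heat(heat, n_bins=3)
print(heat)    # approaches the linear profile 1, 25.75, 50.5, 75.25, 100
print(layers)
```

On this toy path graph the steady state is a linear ramp between the two boundaries, which is why thresholding the heat values yields contiguous layers ordered from the lower to the upper boundary.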