spateo.tools.spatial_smooth
===========================

.. py:module:: spateo.tools.spatial_smooth


Functions
---------

.. autoapisummary::

   spateo.tools.spatial_smooth.smooth
   spateo.tools.spatial_smooth.compute_jaccard_similarity_matrix
   spateo.tools.spatial_smooth.sparse_matrix_median
   spateo.tools.spatial_smooth.smooth_process_column
   spateo.tools.spatial_smooth.get_eligible_rows
   spateo.tools.spatial_smooth.sample_from_eligible_neighbors
   spateo.tools.spatial_smooth.subsample_neighbors_dense
   spateo.tools.spatial_smooth.subsample_neighbors_sparse


Module Contents
---------------

.. py:function:: smooth(X: Union[numpy.ndarray, scipy.sparse.csr_matrix], W: Union[numpy.ndarray, scipy.sparse.csr_matrix], ct: Optional[numpy.ndarray] = None, gene_expr_subset: Optional[Union[numpy.ndarray, scipy.sparse.csr_matrix]] = None, min_jaccard: Optional[float] = 0.05, manual_mask: Optional[numpy.ndarray] = None, normalize_W: bool = True, return_discrete: bool = False, smoothing_threshold: Optional[float] = None, n_subsample: Optional[int] = None, return_W: bool = False) -> Tuple[scipy.sparse.csr_matrix, Optional[Union[numpy.ndarray, scipy.sparse.csr_matrix]], Optional[numpy.ndarray]]

   Leverages neighborhood information to smooth gene expression.

   :param X: Gene expression array or sparse matrix (shape n x m, where n is the number of cells and m is the number of genes)
   :param W: Spatial weights matrix (shape n x n)
   :param ct: Optional, indicates the cell type label for each cell (shape n x 1). If given, will smooth only within each cell type.
   :param gene_expr_subset: Optional, array corresponding to the expression of select genes (shape n x k, where k is the number of genes in the subset). If given, will smooth only over cells that largely match the expression patterns over these genes (assessed using a Jaccard index threshold that is greater than the median score).
   :param min_jaccard: Optional, and only used if 'gene_expr_subset' is also given. Minimum Jaccard similarity score to be considered "nonzero".
   :param manual_mask: Optional, binary array of shape n x n. For each cell (row), manually indicate which neighbors (if any) to use for smoothing.
   :param normalize_W: Set True to scale the rows of the weights matrix to sum to 1. Use this to smooth by taking an average over the entire neighborhood, including zeros. Set False to take the average over only the nonzero elements in the neighborhood.
   :param return_discrete: Set True to return
   :param smoothing_threshold: Optional, sets the threshold for smoothing in terms of the number of neighboring cells that must express each gene for a cell to be smoothed for that gene. The more gene-expressing neighbors, the more confidence in the biological signal. Can be given as a float between 0 and 1, in which case it will be interpreted as a proportion of the total number of neighbors.
   :param n_subsample: Optional, sets the number of random neighbor samples to use in the smoothing. If not given, will use all neighbors (nonzero weights) for each cell.
   :param return_W: Set True to return the weights matrix post-processing

   :returns: Smoothed gene expression array or sparse matrix

             - W: If return_W is True, returns the weights matrix post-processing
             - d: Only if normalize_W is True, returns the row sums of the weights matrix
   :rtype: x_new


.. py:function:: compute_jaccard_similarity_matrix(data: Union[numpy.ndarray, scipy.sparse.csr_matrix], chunk_size: int = 1000, min_jaccard: float = 0.1) -> numpy.ndarray

   Compute the Jaccard similarity matrix for input data with rows corresponding to samples and columns corresponding to features, processing in chunks for memory efficiency.

   :param data: A dense numpy array or a sparse matrix in CSR format, with rows as samples and columns as features
   :param chunk_size: The number of rows to process in a single chunk
   :param min_jaccard: Minimum Jaccard similarity to be considered "nonzero"

   :returns: A square matrix of Jaccard similarity coefficients
   :rtype: jaccard_matrix


.. py:function:: sparse_matrix_median(spmat: scipy.sparse.csr_matrix, nonzero_only: bool = False) -> scipy.sparse.csr_matrix

   Computes the median value of a sparse matrix, used here for determining a threshold value for Jaccard similarity.

   :param spmat: The sparse matrix to compute the median value of
   :param nonzero_only: If True, only consider nonzero values in the sparse matrix

   :returns: The median value of the sparse matrix
   :rtype: median_value


.. py:function:: smooth_process_column(i: int, X: Union[numpy.ndarray, scipy.sparse.csr_matrix], W: Union[numpy.ndarray, scipy.sparse.csr_matrix], threshold: float) -> scipy.sparse.csr_matrix

   Helper function for parallelization of smoothing via probabilistic selection of expression values.

   :param i: Index of the column to be processed
   :param X: Dense or sparse array input data matrix
   :param W: Dense or sparse array pairwise spatial weights matrix
   :param threshold: Threshold value for the number of feature-expressing neighbors for a given row to be included in the smoothing.
   :param random_state: Optional, set a random seed for reproducibility

   :returns: Processed column after probabilistic smoothing
   :rtype: smoothed_column


.. py:function:: get_eligible_rows(W: Union[numpy.ndarray, scipy.sparse.csr_matrix], feat: Union[numpy.ndarray, scipy.sparse.csr_matrix], threshold: float) -> numpy.ndarray

   Helper function for parallelization of smoothing via probabilistic selection of expression values.

   :param W: Dense or sparse array pairwise spatial weights matrix
   :param feat: 1D array of feature expression values
   :param threshold: Threshold value for the number of feature-expressing neighbors for a given row to be included in the smoothing.

   :returns: Array of row indices that meet the threshold criterion
   :rtype: eligible_rows


.. py:function:: sample_from_eligible_neighbors(W: Union[numpy.ndarray, scipy.sparse.csr_matrix], feat: Union[numpy.ndarray, scipy.sparse.csr_matrix], eligible_rows: numpy.ndarray)

   Sample feature values probabilistically based on weights matrix W.

   :param W: Dense or sparse array pairwise spatial weights matrix
   :param feat: 1D array of feature expression values
   :param eligible_rows: Array of row indices that meet a prior-determined threshold criterion

   :returns: Array of sampled values
   :rtype: sampled_values


.. py:function:: subsample_neighbors_dense(W: numpy.ndarray, n: int, verbose: bool = False) -> numpy.ndarray

   Given dense spatial weights matrix W and number of random neighbors n to take, perform subsampling.

   :param W: Spatial weights matrix
   :param n: Number of neighbors to keep for each row
   :param verbose: Set True to print warnings for cells with fewer than n neighbors

   :returns: Subsampled spatial weights matrix
   :rtype: W_new


.. py:function:: subsample_neighbors_sparse(W: scipy.sparse.csr_matrix, n: int, verbose: bool = False) -> scipy.sparse.csr_matrix

   Given sparse spatial weights matrix W and number of random neighbors n to take, perform subsampling.

   :param W: Spatial weights matrix
   :param n: Number of neighbors to keep for each row
   :param verbose: Set True to print warnings for cells with fewer than n neighbors

   :returns: Subsampled spatial weights matrix
   :rtype: W_new
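The neighbor subsampling described for ``subsample_neighbors_sparse`` can be pictured with a minimal sketch. This is not the spateo implementation: the function name ``subsample_rows_sketch`` and its ``seed`` argument are hypothetical. It assumes only that each row of ``W`` should keep at most ``n`` randomly chosen nonzero entries, and that rows with fewer than ``n`` neighbors are left unchanged.

```python
import numpy as np
import scipy.sparse as sp


def subsample_rows_sketch(W: sp.csr_matrix, n: int, seed: int = 0) -> sp.csr_matrix:
    """Keep at most n randomly chosen stored entries per row (illustrative only)."""
    rng = np.random.default_rng(seed)
    W = sp.csr_matrix(W)
    rows, cols, vals = [], [], []
    for i in range(W.shape[0]):
        # Positions of row i's stored entries inside W.indices / W.data
        idx = np.arange(W.indptr[i], W.indptr[i + 1])
        if idx.size > n:
            # Draw n distinct neighbors uniformly at random
            idx = rng.choice(idx, size=n, replace=False)
        rows.extend([i] * idx.size)
        cols.extend(W.indices[idx])
        vals.extend(W.data[idx])
    return sp.csr_matrix((vals, (rows, cols)), shape=W.shape)


W_example = sp.csr_matrix(
    np.array(
        [[0, 1, 1, 1], [1, 0, 0, 0], [1, 1, 0, 1], [0, 0, 0, 0]],
        dtype=float,
    )
)
W_sub = subsample_rows_sketch(W_example, n=2)
```

Every kept weight retains its original value, so any row normalization would need to be applied after the subsampling step in this sketch.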
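The two averaging modes described for the ``normalize_W`` parameter of ``smooth`` can be compared in a short sketch. The helper ``smooth_sketch`` is hypothetical and is not the library's code. It assumes that ``normalize_W=True`` means dividing each row of ``W`` by its row sum before multiplying by ``X``, and that ``normalize_W=False`` means dividing the weighted sum by the number of neighbors with nonzero expression of each gene.

```python
import numpy as np
import scipy.sparse as sp


def smooth_sketch(X, W, normalize_W: bool = True) -> sp.csr_matrix:
    """Neighborhood averaging under the two assumed normalization modes."""
    X = sp.csr_matrix(X, dtype=float)
    W = sp.csr_matrix(W, dtype=float)
    if normalize_W:
        # Average over the entire neighborhood, zeros included
        d = np.asarray(W.sum(axis=1)).ravel()
        d[d == 0] = 1.0  # leave rows without neighbors at zero
        return sp.csr_matrix(sp.diags(1.0 / d) @ W @ X)
    # Average only over neighbors that express the gene
    num = (W @ X).toarray()
    cnt = ((W != 0).astype(float) @ (X != 0).astype(float)).toarray()
    out = np.divide(num, cnt, out=np.zeros_like(num), where=cnt > 0)
    return sp.csr_matrix(out)


X_example = np.array([[2, 0], [0, 4], [6, 0]], dtype=float)
W_example2 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
avg_all = smooth_sketch(X_example, W_example2, normalize_W=True).toarray()
avg_nonzero = smooth_sketch(X_example, W_example2, normalize_W=False).toarray()
```

In this toy input the first cell has two neighbors, only one of which expresses the first gene, so the two modes give different smoothed values for that entry.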