Abstract
Spatial transcriptomics allows researchers to analyze transcriptome data within the spatial context of the tissue sample. Various methods have been developed for detecting spatially variable (SV) genes, whose expression over the tissue space shows strong spatial autocorrelation. Such genes are often used to define clusters of cells or spots downstream. However, highly variable (HV) genes, whose expression varies significantly from cell to cell, are conventionally used in clustering analyses. In this report, we investigate whether adding HV genes to SV genes can improve cell-type clustering performance in spatial transcriptomics data. We tested the clustering performance of HV genes, SV genes, and the union of both gene sets (concatenation) on over 50 real spatial transcriptomics datasets across multiple platforms, using a variety of spatial and non-spatial metrics. Our results show that combining HV genes and SV genes can improve overall cell-type clustering performance.
Keywords: spatial transcriptomics, feature selection, clustering
Background
Spatial omics technologies are among the scientific breakthroughs of the last several years [1–4]. Such technologies measure transcriptome information systematically in the tissue space, thus preserving the spatial context of gene expression. The addition of spatial information allows researchers to further explore biological architecture and function and to gain more insight into various disease mechanisms [5–9]. Many techniques for generating spatially resolved transcriptome data have been developed, including merFISH [10,11], Visium [12–14], and more recent single-cell-resolution platforms such as cosMx SMI [15–17] and Xenium [18,19]. Such technologies can be categorized into two general classes: fluorescence in situ hybridization (FISH)-based methods such as merFISH, cosMx, and Xenium, which directly extract transcriptome information at a molecular level and obtain the spatial locations of the cells through imaging techniques; and Next Generation Sequencing (NGS)-based methods such as Visium, which attach probes with fixed physical locations to cryosections of tissues to obtain transcriptome information.
Many interesting features can be extracted from spatial transcriptomics data for downstream functional analysis, including spatially variable (SV) genes and highly variable (HV) genes. SV genes are features unique to spatial transcriptomics data because of the added spatial context. The expression of an SV gene shows distinct spatial autocorrelation. Such properties are indicative of the partition of the spots or cells [20–24]. In the spatial transcriptomics literature, the clustering of spatial transcriptomics datasets usually refers to defining spatial domains [25–27]. The role of SV genes in clustering cell types or the spot-level cellular composition profile, however, remains relatively uninvestigated. HV genes, on the other hand, are genes whose expression values vary significantly without considering the constraints of the spots’ physical locations. HV genes are conventionally used in clustering analysis to group together cells with similar gene expression profiles. Notably, SV genes and HV genes are often quite distinct feature sets in spatial transcriptomics data (see Supplementary Fig. 1, Supplementary Tables 1–2), despite some overlapping genes. As a result, clustering based on SV genes or HV genes alone may bias downstream functional annotations.
We therefore asked whether adding the conventional HV genes to the SV genes can reveal more biological insights and help improve the cell-type clustering performance of spatial transcriptomics data, a question that is currently unexplored. Towards this goal, we benchmarked the downstream clustering performance of several gene sets: HV genes, SV genes, and the union of HV and SV genes. We tested over 50 ST datasets across 4 ST platforms of single-cell or spot resolution, including Vizgen’s merFISH, Nanostring’s cosMx, and 10X Genomics’ Visium and Xenium, and evaluated the results using a comprehensive set of metrics. Our results show that adding HV genes to SV genes can help improve clustering performance and reveal more biological insights for downstream analysis.
Results
Overview of Computational Workflow
The workflow for this study is shown in Fig. 1. It starts with spatial transcriptomics data which has two components: a gene expression matrix, and spatial data which consists of the spatial coordinates of each spot or cell. After data preprocessing (see Methods), we extracted the HV genes using the gene expression matrix and the SV genes using both gene expression and spatial coordinates. We use Leiden clustering, a community detection-based clustering method commonly used for clustering transcriptomics data, as our default clustering method [28,29]. Since our main interest is to cluster based on the expression profiles of different gene sets, Leiden clustering is a very suitable option. For the gene sets: HV genes, SV genes, and concatenation (union set of HV and SV genes), we reduced the feature dimensions using Principal Component Analysis (PCA). We then constructed shared nearest neighbor networks (sNN) using the top Principal Components (PCs) and performed Leiden clustering on the sNN. Besides Leiden clustering, we also investigated other methods for clustering gene expression profiles such as kmeans clustering [30], Monocle3 [31], cellTree [32], and SC3 [33] (see Supplementary Table 3). We analyzed the clustering performance using supervised metrics such as AMI and weighted F1 scores across clusters, unsupervised metric Pearson Gamma coefficients, and supervised spatial metrics such as Spatial Concordance (SC) and mean Spatial AMI. Besides the overall clustering performance, we also examined local, cluster-specific and spot/cell-specific metrics. We applied the above pipeline to a total of 51 real datasets across four spatial transcriptomics platforms, including Visium, Xenium, merFISH and CosMx.
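The graph-construction step of this workflow can be illustrated with a small sketch. It builds a shared nearest neighbor graph from toy principal component coordinates in plain Python. The Jaccard weighting of shared neighbors, the inclusion of each cell in its own neighborhood, and the function names `knn_indices` and `snn_graph` are illustrative assumptions, not the implementation used in the study; community detection (e.g., Leiden) would then be run on the resulting graph.

```python
import math


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def snn_graph(points, k):
    """Build a shared-nearest-neighbor graph as {(i, j): weight}.

    The weight is the Jaccard overlap of the two neighborhoods (an assumed,
    commonly used choice); pairs with no shared neighbor get no edge.
    """
    nn = knn_indices(points, k)
    # Include each point in its own neighborhood so direct neighbors overlap.
    nn = [s | {i} for i, s in enumerate(nn)]
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(nn[i] & nn[j])
            if shared:
                edges[(i, j)] = shared / len(nn[i] | nn[j])
    return edges


# Toy "principal component" coordinates: two well-separated groups of three cells.
pcs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
graph = snn_graph(pcs, k=2)
```

In this toy example, edges appear only within each group of three cells, which is the structure a community detection method would then recover as two clusters.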
Figure 1.
Study workflow. The workflow is composed of four general steps. Step 1: extraction of the HV and SV genes. Step 2: add the HV genes to the SV genes. Step 3: perform cell-type clustering analysis on HV genes, SV genes, the union of SV and HV genes. Step 4: evaluate the cell-type clustering performance using non-spatial and spatial clustering metrics.
Clustering Accuracy on Real Spatial Transcriptomics Datasets
We compared the clustering accuracy performance of computational strategies on diverse technical platforms and tissue types (Supplementary Table 4) with matching ground truth labels (Methods). We selected datasets where the ground truth labels have been verified in the original publications. These datasets include both human and animal tissue, as well as multiple underlying conditions such as ovarian cancer, breast cancer, etc.
The accuracies of the main clustering method on the HV genes, SV genes, and their union gene set across the real datasets are shown in Fig. 2. Both the AMI and weighted F1 metrics generally increase significantly when combining HV genes and SV genes, as compared to using HV or SV genes alone. Moreover, we included the clustering performance of using all genes to see whether the improvement in clustering accuracy was simply due to having more genes. The results show that including all genes does not significantly increase the accuracies further compared to combining HV and SV genes; rather, it can decrease the accuracies (e.g., in Visium and Xenium). Similarly, we observed significant improvement in the unsupervised metric Pearson Gamma when combining HV and SV genes, as opposed to using HV or SV genes alone (Fig. 3). Again, including all genes does not significantly increase Pearson Gamma compared to combining HV and SV genes.
Figure 2.
Comparisons of supervised, non-spatial clustering performance of the default clustering method (Leiden) on real spatial transcriptomics datasets, in four representative platforms including merFISH, cosMx, Xenium, and Visium, at the default HV genes threshold level (low) and default SV genes threshold level (low). (a, c) boxplots of AMI and weighted F1 for 51 real datasets, divided by platform. (b, d) heatmaps of AMI and weighted F1 for all 51 real datasets, ordered by platform. Note: ****: p-value<1e-3; ***: 1e-3 ≤ p-value < 1e-2; **: 1e-2 ≤ p-value < 5e-2; *: 5e-2 ≤ p-value < 0.1; ns: 0.1 ≤ p-value ≤ 1.
Figure 3.
Comparisons of unsupervised, non-spatial clustering performance of default clustering method (Leiden) on 51 real spatial transcriptomics datasets, in four representative platforms including merFISH, cosMx, Xenium, and Visium at default HV genes threshold level (low) and default SV genes threshold level (low). (a) boxplot of Pearson Gamma, divided by platform. (b) heatmap of Pearson Gamma, ordered by platform. Note: ****: p-value<1e-3; ***: 1e-3 ≤ p-value < 1e-2; **: 1e-2 ≤ p-value < 5e-2; *: 5e-2 ≤ p-value < 0.1; ns: 0.1 ≤ p-value ≤ 1.
We further examined the specific advantages of combining HV genes and SV genes through a closer look at spot/cell-level clustering performance (Fig. 4, Supplementary Fig. 2, 3). In the cosMx dataset, for patient 5–2’s field of view (FOV) 7 (Fig. 4a–b), combining the HV genes and SV genes led to more accurate identification of tumor cells, as well as immune cell types such as B-cells, neutrophils, and plasmablast cells. In another representative kidney Xenium dataset (Fig. 4c–d), combining HV genes and SV genes improved the delineation between the proximal convoluted tube (PCT) and proximal convoluted tube – thick ascending limb (PCT-TAL), as well as the classification of other cell types such as endothelial cells (ENDO), mesangial cells (MES), and thick ascending limb (TAL). Similar improvements in the delineation of specific cell types were observed in other datasets. For example, combining HV genes and SV genes led to better classification of inhibitory neurons in the merFISH mouse hypothalamus data (Supplementary Fig. 2a–b), as well as better identification of cancer cells, connective tissues, and immune cells in the Visium breast cancer dataset (Supplementary Fig. 2c–d).
Figure 4.
Comparison of cluster performance of SV genes, HV genes, and their union set for Leiden for representative datasets on the tSNE space for the union set. (a) comparison of clustering labels for cosMx NSCLC dataset for patient 2, FOV 7. (b) comparison of tSNE space highlighting mis-classified clusters for each gene set for cosMx NSCLC dataset for patient 2 FOV 7, with cluster-specific F1 scores for each gene set summarized in a table. (c) comparison of clustering labels for Xenium Kidney dataset sample N7. (d) comparison of tSNE space highlighting mis-classified clusters for each gene set in Xenium Kidney dataset sample N7, with cluster-specific F1 scores for each gene set summarized in a table. Annotation: TAL: thick ascending limb; ENDO: endothelial cells; PCT-TAL: proximal convoluted tube – thick ascending limb; MES: mesangial cell; PCT: proximal convoluted tube.
Spatially adjusted Clustering Accuracy on Real Spatial Transcriptomics Datasets
Since spatial transcriptomics data measure gene expression in situ, we also evaluated the clustering performance of each computational strategy by taking into account the spatial distribution of the spots. To this end, we derived two novel spatially adjusted clustering metrics, spatial concordance (SC) and mean spatial AMI, to measure the clustering accuracy in the spatial context (see Methods). As shown in Fig. 5, we observed a significant improvement in spatially adjusted clustering accuracy in most platforms when combining HV genes and SV genes, as compared to using SV genes alone (the baseline). As with the conventional clustering metrics, we also observed platform-specific variations in the clustering performance of the gene sets. We also examined the clustering labels and cell/spot-level spatial clustering performance (Fig. 6, Supplementary Fig. 4, 5) in the tissue context. For the representative cosMx dataset (Fig. 6a–b), we observed that the tumor cells have distinct spatial patterns, whereas immune cell types such as B-cells, neutrophils, and plasmablasts have much more subtle distributions across the tissue. Combining the HV and SV genes significantly improves the classification of tumor cells according to the spatial AMI metric, but yields little improvement for the other cell types. In the representative Xenium dataset (Fig. 6c), we also observed improved cluster-specific mean spatial AMI for the cell types thick ascending limb (TAL), proximal convoluted tube (PCT), proximal convoluted tube – thick ascending limb (PCT-TAL), endothelial cells (ENDO), and mesangial cells (MES). Notably, the improvement is more striking for ENDO and MES.
In addition to the cosMx and Xenium datasets, we observed similar improvements from combining HV genes and SV genes in the representative merFISH dataset, where mean spatial AMI improved in inhibitory neurons (see Supplementary Fig. 4b), as well as in the representative Visium dataset, where mean spatial AMI improved in cancer cells, connective tissues, and immune cells (see Supplementary Fig. 4d).
Figure 5.
Comparisons of spatial clustering performance of the default clustering method (Leiden) on 51 real spatial transcriptomics datasets, in four representative platforms including merFISH, cosMx, Xenium, and Visium, at the default HV genes threshold level (low) and default SV genes threshold level (low). (a, c) boxplots of SC (Spatial Concordance) and Mean Spatial AMI, divided by platform. (b, d) heatmaps of SC and Mean Spatial AMI, ordered by platform. Note: ****: p-value<1e-3; ***: 1e-3 ≤ p-value < 1e-2; **: 1e-2 ≤ p-value < 5e-2; *: 5e-2 ≤ p-value < 0.1; ns: 0.1 ≤ p-value ≤ 1.
Figure 6.
Comparison of cluster performance of SV genes, HV genes, and their union set for Leiden for representative datasets. (a) comparison of clustering labels for cosMx NSCLC dataset for patient 2 FOV 7. (b) comparison of tissue space highlighting mis-classified clusters for each gene set in cosMx NSCLC dataset for patient 2 FOV 7, with cluster-specific spatial AMI scores for each gene set summarized in a table. (c) comparison of clustering labels for Xenium Kidney dataset sample N7. (d) comparison of tissue space highlighting mis-classified clusters for each gene set in Xenium Kidney dataset sample N7, with cluster-specific spatial AMI scores for each gene set summarized in a table. Annotation: TAL: thick ascending limb; ENDO: endothelial cells; PCT-TAL: proximal convoluted tube – thick ascending limb; MES: mesangial cell; PCT: proximal convoluted tube.
The Effect of Clustering Method and Gene Set Selection Threshold
To further validate our findings, we also examined how our conclusion is affected by the choice of clustering method and by the stringency thresholds of HV and SV gene selection. Besides Leiden clustering, whose results are featured in the main figures, we also performed clustering analyses using kmeans (using Pearson correlation, Spearman correlation, and Euclidean distance as distance measures), SC3, cellTree, and Monocle3 (see Supplementary Table 3). In general, we observed strong consistency between many clustering methods with respect to supervised non-spatial and spatial clustering metrics in terms of gene set clustering performance rankings (see Supplementary Fig. 6). Specifically, the Monocle3 results (see Supplementary Fig. 7) were very similar to our main results from Leiden clustering, where combining HV and SV genes led to improved clustering performance. For the remaining methods, namely SC3, cellTree, and kmeans with each of the three distance measures, we also generally observed better results by combining HV and SV genes, as compared to using either SV genes or HV genes alone, suggesting a complementary relationship between the two gene sets (see Supplementary Fig. 8–12).
To evaluate the effect of the stringency thresholds of HV and SV gene selection, we defined three threshold levels for HV and SV genes (low, medium, and high) based on the gene set and the data platform (see Selection of HV and SV genes). As the threshold level rises, the number of HV and SV genes selected tends to decrease, and the degree of overlap between the respective gene sets also decreases. As shown in Supplementary Fig. 13, as the threshold level for the HV genes rises, the clustering accuracy tends to decrease, as does the accuracy of the respective concatenation gene sets. Nonetheless, we observed an improvement in clustering accuracy when combining HV and SV genes, regardless of the HV gene threshold level. Similarly, as the threshold of SV genes rises, we observed a decrease in clustering accuracy for SV genes and the respective concatenation gene sets. However, combining HV and SV genes improves clustering accuracy regardless (see Supplementary Fig. 14).
Discussion
Spatial transcriptomics technologies allow the creation of a more comprehensive map of biological systems. Relative to single cell RNA-Seq technologies, the addition of spatial information has the potential to help discover novel SV markers, which are then used for identifying “spatial domains” in the transcriptomics data; SV genes that share similar spatial expression patterns are also used to define cell types as well as to relate cell type composition to tissue structure [20–24]. However, these SV genes are not necessarily the best markers for identifying biologically insightful clusters, e.g., cell types and homogeneity in cell type compositions, a task conventionally accomplished with HV gene markers [34–37]. We asked two questions in this study: (1) whether SV gene based clustering can be improved by adding HV genes, which are conventionally used in single cell RNA-Seq and bulk RNA-Seq analysis for clustering; and (2) if so, how the clustering performance of these gene sets is affected by the clustering method, the data platform, and the HV and SV selection thresholds. Since spatial transcriptomics platforms are becoming increasingly diverse, it is important to recognize the platform and tissue context when interpreting cell type or spot-level clustering. We therefore chose to rely on the original study results for ground truth labels. By analyzing multiple datasets across various platforms, we hope to uncover consistent biological truths while minimizing the impact of noise associated with the ground truth. Using multiple metrics, including supervised, unsupervised, and spatially adjusted metrics, as well as closer looks at the spot/cell-level clustering performance, we demonstrated a complementary effect between SV genes and HV genes in terms of cell type clustering. As shown in Fig. 7, clustering metrics show that it is more desirable to use a combination of HV and SV genes rather than either gene set alone.
Since no current gold-standard pipelines exist for obtaining the most biologically insightful clustering in spatial transcriptomics data, our study fills the niche to provide recommendations through conducting a systematic evaluation study. We have confirmed through our evaluation study that combining these two types of markers is a desirable strategy to improve the cell-type clustering accuracy, as compared to the current strategy of using SV genes only for such tasks.
Figure 7.
Summary of cell-type clustering performance for HV genes, SV genes and concatenation. (a) radar chart using the mean AMI, weighted F1, ASC, Pearson Gamma, Spatial Concordance, and Mean Spatial AMI for HV genes, SV genes, and concatenation. (b) ranking of HV genes, SV genes, and concatenation for AMI, weighted F1, Pearson Gamma, Spatial Concordance, and Mean Spatial AMI.
Methods
Ground truths for real data sets
The quality of the ground truth labels is essential to the evaluation of the methods’ performance. For the real datasets, we obtained ground truth labels from the original studies (see Supplementary Table 4). The ground truth labels were obtained through either manual supervised annotation with scRNA-seq references, or through supervised cell segmentation using platform-specific topological data (for some Xenium datasets and cosMx datasets). The ground truth labels were validated in the original studies and are therefore suitable for the purpose of our benchmark study.
Real Datasets and Preprocessing
The present study utilized a set of 51 real datasets in order to account for potential confounding effects arising from a range of factors, including technology platform, resolution, tissue type, and clinical phenotype. To ensure a comprehensive representation of major current Spatial Transcriptomics platforms, we chose 10 datasets from Visium, including a Mouse Olfactory Bulb study [12], an Ovarian Cancer study [38], and a Breast Cancer study [39]. We also included 12 datasets from merFISH (Vizgen) on the Mouse Brain Hypothalamic region [10], 10 datasets from Xenium on the human kidney [40] and Mouse Brain Anterior Thalamic Nuclei (ATN) [41], and 20 datasets from CosMx (Nanostring) on human Non-Small Cell Lung Cancer (NSCLC) [40]. The Visium datasets provide non-single-cell resolution, whereas the remaining datasets offer single-cell resolution. The datasets cover a diverse range of tissue types and disease types, allowing for robust and comprehensive analysis.
Prior to extracting the HV and SV genes, we preprocessed the data by first filtering the raw gene expression dataset. To avoid over-filtering the data prior to analysis, we computed the average expression across genes and across cells and removed those that were statistical outliers. Furthermore, we removed small cell-type populations that made up less than 5% of the entire cell population. For the merFISH datasets specifically, we rescaled the raw data by a factor of 1000, similar to the preprocessing steps in the SPARK paper [22]. For data normalization, we used log-normalization for the merFISH datasets. For the remaining datasets, we normalized the data using the method developed by Lause et al. [42] and performed downstream dimension reduction analysis on the Pearson residuals.
Selection of HV and SV genes
For merFISH datasets, we used a LOESS regression model with each gene’s log mean expression as the independent variable and the coefficient of variation as the dependent variable. We obtained the difference between the genes’ predicted and actual coefficient of variation values, and retained genes for which this difference is larger than zero. We set the following three thresholds for HV genes in the merFISH datasets: the 50th percentile for the low threshold, the 70th percentile for the medium threshold, and the 90th percentile for the high threshold. For the remaining datasets, we obtained the HV genes by selecting the genes whose residual variance of the normalized gene expression is larger than 1. Similar to the merFISH datasets, we set the low threshold at the 50th percentile, the medium threshold at the 70th percentile, and the high threshold at the 90th percentile.
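The percentile-based thresholding for the non-merFISH datasets can be sketched as follows. The sketch assumes that the percentile is taken over the residual variances of the genes that already pass the base filter (residual variance larger than 1); the text does not spell out this detail, so it is an assumption, as are the names `percentile`, `HV_LEVELS`, and `select_hv_genes` and the toy scores.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


# Threshold levels described in the text: low, medium, high.
HV_LEVELS = {"low": 50, "medium": 70, "high": 90}


def select_hv_genes(residual_variance, level="low", min_variance=1.0):
    """Keep genes passing the base filter, then those at or above the level's percentile.

    Assumption: the percentile is computed among genes passing the base filter.
    """
    passing = {g: v for g, v in residual_variance.items() if v > min_variance}
    if not passing:
        return []
    cutoff = percentile(list(passing.values()), HV_LEVELS[level])
    return sorted(g for g, v in passing.items() if v >= cutoff)


# Hypothetical residual variances for six genes.
scores = {"g1": 0.4, "g2": 1.2, "g3": 2.0, "g4": 3.5, "g5": 8.0, "g6": 0.9}
low = select_hv_genes(scores, "low")
high = select_hv_genes(scores, "high")
```

Under these assumptions, raising the threshold level can only shrink the selected set, which matches the observation that fewer HV genes are selected at higher stringency.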
We used SPARK to detect SV genes. We set the low threshold at the common p-value cutoff at 0.05. We used an additional custom procedure to further threshold SV genes. For the medium threshold level, we set the p-value cutoff at the 25th percentile. For the high threshold level, we set the p-value threshold at the 50th percentile. See Supplementary Tables 1–2 for details on the number of HV genes, SV genes, concatenation genes for each dataset. For the concatenation gene sets, we remove any potential duplicated genes that are both highly variable and spatially variable so that each gene appears in the concatenation gene set at most once.
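The de-duplication step for the concatenation gene set amounts to an order-preserving union. A minimal sketch, in which the function name `concatenate_gene_sets` and the gene symbols are purely illustrative:

```python
def concatenate_gene_sets(sv_genes, hv_genes):
    """Union of SV and HV genes, preserving order and keeping each gene once."""
    seen = set()
    union = []
    for gene in list(sv_genes) + list(hv_genes):
        if gene not in seen:
            seen.add(gene)
            union.append(gene)
    return union


# Illustrative gene symbols; "Mbp" and "Gad1" are in both lists.
sv = ["Gad1", "Slc17a6", "Mbp"]
hv = ["Mbp", "Cd74", "Gad1", "Apoe"]
combined = concatenate_gene_sets(sv, hv)
```

Genes that are both spatially variable and highly variable therefore contribute a single feature column to the downstream PCA.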
Clustering Methods
We focus on results generated via Leiden clustering on shared nearest neighbor networks, a commonly used approach in Spatial Transcriptomics analysis [29]. We further validated our results using multiple additional common clustering methods for transcriptomics data, including kmeans clustering [30], Monocle3 [31], cellTree [32], and SC3 [33].
Leiden [29] is a community detection clustering algorithm. Using the principal components of each dataset and gene set, Leiden builds a shared nearest neighbor network and clusters the cells based on the connectivity information in said network. We tuned the resolution parameter using a grid search strategy. At each attempted resolution value, we repeatedly run Leiden clustering 10 times under different random seeds. We keep the resolution parameters corresponding to the same number of clusters as there are in the ground truth labels, and select the final clustering labels by picking the majority set. The number of nearest neighbors for building the network is set to 15 for all datasets. We used the euclidean distance between the principal components to compute unsupervised clustering metrics for Leiden.
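The resolution search described above can be sketched generically. The sketch takes any clustering callable, keeps the runs whose number of clusters matches the ground truth, and returns the most frequent partition. Interpreting "the majority set" as the most frequent partition is an assumption, and `tune_resolution`, `canonical`, and `toy_cluster` are hypothetical names; `toy_cluster` is only a stand-in for an actual Leiden call.

```python
from collections import Counter


def canonical(labels):
    """Relabel clusters by order of first appearance so equal partitions compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(lab, len(mapping)) for lab in labels)


def tune_resolution(cluster_fn, resolutions, n_target, n_seeds=10):
    """Grid-search resolutions with repeated seeds.

    Keeps only runs producing n_target clusters and returns the most
    frequent partition among them, or None if no run qualifies.
    """
    kept = Counter()
    for res in resolutions:
        for seed in range(n_seeds):
            labels = canonical(cluster_fn(resolution=res, seed=seed))
            if len(set(labels)) == n_target:
                kept[labels] += 1
    if not kept:
        return None
    return list(kept.most_common(1)[0][0])


def toy_cluster(resolution, seed):
    """Hypothetical stand-in for a Leiden run on six cells."""
    if resolution < 1.0:
        return ["a", "a", "a", "a", "b", "b"]  # low resolution: two clusters
    # Same partition under different arbitrary label names, depending on the seed.
    if seed % 2 == 0:
        return ["x", "x", "y", "y", "z", "z"]
    return ["p", "p", "q", "q", "r", "r"]


best = tune_resolution(toy_cluster, [0.5, 1.0, 2.0], n_target=3)
```

Canonical relabeling matters here because community detection assigns arbitrary cluster names, so identical partitions from different seeds would otherwise be counted separately.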
Kmeans [30] is a very common clustering algorithm that partitions the cells into a predefined number of clusters with the nearest centroid. We set the predefined number of clusters to the number of ground truth clusters. We performed kmeans clustering on the selected principal components of each dataset and gene set. We explored three different distance measures: euclidean distance, pearson correlation, and spearman correlation. These measures were directly used to compute the unsupervised clustering metric, Pearson Gamma Coefficient, for kmeans.
cellTree [32] is a clustering algorithm originally developed for scRNA-seq data. cellTree uses Latent Dirichlet Allocation (LDA) to model single-cell data. The fitted LDA model is composed of a set of topic distributions for each cell, and per-topic gene distributions. Per-cell topic histograms can then be used as a low-dimensional embedding to evaluate cell similarity and infer hierarchical relationship, while analysis of the topics themselves can provide useful biological insights on the sets of genes driving the different stages of the process studied. The result of cellTree contains the empirical topic probability per cell. By extracting the maximum probability topic, we can assign cluster labels. We use the raw gene expression for each respective gene set as input, and set the number of topics (clusters) to the number of ground truth clusters. We further use the cosine distance between the topic distribution of the cells to compute the unsupervised clustering metric Pearson Gamma for cellTree.
Monocle3 [31] also uses a community detection algorithm. Instead of a shared nearest neighbor network, Monocle3 uses a k-nearest neighbor network. We also tuned and selected the optimal resolution parameter using the same optimization strategy employed for Leiden. The number of nearest neighbors for building the network is set to 15 for all datasets. We used the cosine distance between the principal components to compute the unsupervised clustering metric for Monocle3.
SC3 [33] clusters cells via consensus clustering. SC3 runs kmeans under a combination of distance (Euclidean, Pearson correlation, and Spearman correlation) and dimension reduction (PCA, graph Laplacian) strategies. SC3 then computes a consensus matrix measuring the similarity between each set of clustering labels using a cluster-based similarity partitioning algorithm (CSPA). Finally, SC3 performs hierarchical clustering on the consensus matrix using Euclidean distance to obtain the final consensus cluster labels. We used the Euclidean distance of the consensus matrix to compute the unsupervised clustering metric Pearson Gamma for SC3.
Evaluation Metrics
Adjusted Mutual Information (AMI):
To evaluate the general accuracy of clustering in spatial transcriptomics data, we computed the Adjusted Mutual Information (AMI) [43] of each set of clustering results compared with the ground truth. AMI measures the level of concordance between two sets of labels and is widely used for measuring clustering accuracy in scRNA-Seq datasets with no spatial context [44–46]. AMI equals 1 for identical labelings and is close to 0 for random assignments, with higher values indicating better clustering performance.
Weighted F1:
To evaluate clustering accuracy while accounting for the specific classification accuracy of each cluster / cell type, we used weighted F1. We match the clustering labels with the ground truth labels via cluster / cell-type-specific average gene expression. Then we computed the F1 score for each cluster and obtained weighted F1 by computing the average of cluster-specific F1 scores, weighted by the cluster sample size.
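The weighted F1 computation can be sketched in a few lines. The sketch assumes that the clustering labels have already been matched to ground-truth cell-type names (the matching via average gene expression is not shown) and that the weights are the ground-truth class sizes; the function name and the toy labels are illustrative.

```python
from collections import Counter


def weighted_f1(truth, predicted):
    """Average of per-class F1 scores, weighted by each class's size in the ground truth."""
    support = Counter(truth)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(truth, predicted) if t == label and p == label)
        fp = sum(1 for t, p in zip(truth, predicted) if t != label and p == label)
        fn = n - tp  # members of this class assigned to another label
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(truth)


# Toy example: one tumor cell is misassigned to the B-cell cluster.
truth = ["tumor", "tumor", "tumor", "B-cell", "B-cell", "B-cell"]
predicted = ["tumor", "tumor", "B-cell", "B-cell", "B-cell", "B-cell"]
score = weighted_f1(truth, predicted)
```

Here the tumor class has F1 = 0.8 and the B-cell class has F1 = 6/7, so the weighted average with equal class sizes is their mean.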
Pearson Gamma:
To evaluate the consistency of clustering results, we also computed the Pearson Gamma coefficient of the clustering labels, which measures the correlation between the pairwise distances between data points and their cluster memberships. Specifically, we compute a binary membership matrix where an entry is 1 if the respective data points are assigned the same label and 0 otherwise. The correlation between this pairwise membership and the pairwise distances is computed as the Pearson Gamma coefficient.
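A minimal sketch of this computation on toy coordinates follows. One deliberate assumption: the sketch codes the pair indicator as 1 when two points fall in different clusters, the complement of the membership coding described above. The two codings give the same coefficient up to sign, and this one makes larger values correspond to better-separated clusters. Euclidean distance and the function names are also illustrative choices.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def pearson_gamma(points, labels):
    """Correlate pairwise distances with a 0/1 indicator of 'different cluster'."""
    dists, different = [], []
    for i, j in combinations(range(len(points)), 2):
        dists.append(math.dist(points[i], points[j]))
        different.append(0.0 if labels[i] == labels[j] else 1.0)
    return pearson(dists, different)


# Two tight groups; "good" labels follow the groups, "bad" labels cut across them.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = pearson_gamma(points, [0, 0, 1, 1])
bad = pearson_gamma(points, [0, 1, 0, 1])
```

With this coding, labels that follow the two groups score close to 1, while labels that cut across the groups score below 0.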
Spatial Concordance (SC):
To evaluate the clustering accuracy in the spatial context, we developed a metric that takes into account the local spatial heterogeneity of cell types, which we denote as spatial concordance (SC). The larger SC is, the more accurate the spatial clustering is compared to the ground truth. We denote the set of ground truth labels as $Y$ and the set of clustering labels of a certain computational strategy as $\hat{Y}$ (assuming the labels in $\hat{Y}$ have been matched to the ones in $Y$). For each spot $i$, we compute the entropy of ground truth labels within its local spatial neighborhood, denoted as $H_i$. The higher $H_i$ is, the more heterogeneous the neighborhood of spot $i$. The set of local entropy values is then standardized, denoted as $\tilde{H}_i$.
The spatial concordance is then computed from the agreement between $Y$ and $\hat{Y}$, where a match in a more heterogeneous neighborhood is weighed higher than one in a relatively homogeneous neighborhood, prioritizing edge cases on domain borders.
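The idea of an entropy-weighted match rate can be illustrated with a small sketch. The specific choices below are illustrative assumptions rather than the definitions used in the study: neighborhoods are the k nearest spots, entropies are standardized by min-max scaling, and each spot's weight is one plus its standardized entropy. The function names are hypothetical.

```python
import math
from collections import Counter


def local_entropy(coords, truth, k):
    """Shannon entropy of ground-truth labels among each spot and its k nearest spots."""
    entropies = []
    for p in coords:
        order = sorted(range(len(coords)), key=lambda j: math.dist(p, coords[j]))
        hood = order[: k + 1]  # the spot itself (distance 0) plus k neighbors
        counts = Counter(truth[j] for j in hood)
        n = len(hood)
        entropies.append(-sum(c / n * math.log(c / n) for c in counts.values()))
    return entropies


def spatial_concordance(coords, truth, predicted, k=2):
    """Weighted fraction of matching labels; heterogeneous neighborhoods weigh more."""
    h = local_entropy(coords, truth, k)
    lo, hi = min(h), max(h)
    # Assumed weighting: 1 + min-max standardized entropy (all weights 1 if entropies are equal).
    w = [1.0 + ((x - lo) / (hi - lo) if hi > lo else 0.0) for x in h]
    matched = sum(wi for wi, t, p in zip(w, truth, predicted) if t == p)
    return matched / sum(w)


# Six spots on a line with a domain border between spots 2 and 3.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
truth = ["A", "A", "A", "B", "B", "B"]
border_error = ["A", "A", "B", "B", "B", "B"]    # mistake at the border (spot 2)
interior_error = ["B", "A", "A", "B", "B", "B"]  # mistake in a homogeneous region (spot 0)
sc_border = spatial_concordance(coords, truth, border_error)
sc_interior = spatial_concordance(coords, truth, interior_error)
```

Both predictions have the same plain accuracy (5 of 6 spots), but under these assumptions the error at the domain border lowers the score more (0.75 versus 0.875), which is the behavior the metric is designed to capture.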
Mean Spatial AMI:
Similar to Spatial Concordance, we compute the AMI within the local neighborhood of each spot. Then, we compute the mean spatial AMI as the average of these local AMI values, weighted by the entropy of each spot / cell.
Hypothesis Testing
We performed paired permutation hypothesis tests on the evaluation metrics for different gene sets in order to assess the robustness of the clustering performance. Each hypothesis test was performed with 10,000 permutations.
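A paired permutation test of this kind can be sketched with random sign flips of the per-dataset differences. The test statistic (mean paired difference), the two-sided alternative, the add-one correction, and the toy values are assumptions made for illustration; the text specifies only that the tests are paired and use 10,000 permutations.

```python
import random


def paired_permutation_test(a, b, n_permutations=10_000, seed=0):
    """Two-sided paired permutation (sign-flip) test on the mean difference a - b."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs) / len(diffs))
    extreme = 0
    for _ in range(n_permutations):
        # Under the null, each paired difference is equally likely to have either sign.
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if abs(permuted) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical AMI values for 12 datasets: one gene set is uniformly 0.05 higher.
sv_only = [0.50, 0.42, 0.61, 0.55, 0.47, 0.52, 0.60, 0.44, 0.58, 0.49, 0.53, 0.46]
combined_sets = [x + 0.05 for x in sv_only]
p_value = paired_permutation_test(combined_sets, sv_only)
```

Pairing by dataset removes between-dataset variability from the comparison, so a consistent improvement across datasets yields a small p-value even when the absolute metric values vary widely.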
Acknowledgment
We thank the members of the Garmire lab for helpful discussions throughout the project.
Funding
L.X. Garmire was supported by NIH/NIGMS, R01 LM012373 and R01 LM012907 awarded by NLM, and R01 HD084633 awarded by NICHD. J. Kang’s research was supported by NIH grants R01 DA048993, R01 MH105561, and R01 GM124061.
Declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
All data utilized in this study have been previously published and are publicly available. The research performed in this study conformed to the principles of the Helsinki Declaration.
Consent for publication
Not Applicable
Availability of data and materials
All data and code used in this evaluation study can be found at the following link: https://github.com/lanagarmire/ST_benchmark
References
- 1. Rao A, Barkley D, França GS, Yanai I. Exploring tissue architecture using spatial transcriptomics. Nature. 2021;596:211–20.
- 2. Liu Y, Yang M, Deng Y, Su G, Enninful A, Guo CC, et al. High-Spatial-Resolution Multi-Omics Sequencing via Deterministic Barcoding in Tissue. Cell. 2020;183:1665–1681.e18.
- 3. Deng Y, Bartosovic M, Kukanja P, Zhang D, Liu Y, Su G, Enninful A, Bai Z, Castelo-Branco G, Fan R. Spatial-CUT&Tag: spatially resolved chromatin modification profiling at the cellular level. Science. 2022;375(6581):681–6.
- 4. Zhang D, Deng Y, Kukanja P, Agirre E, Bartosovic M, Dong M, et al. Spatial epigenome-transcriptome co-profiling of mammalian tissues. Nature. 2023;616:113–22.
- 5. Lee JK, Wang J, Sa JK, Ladewig E, Lee HO, Lee IH, Kang HJ, Rosenbloom DS, Camara PG, Liu Z, Van Nieuwenhuizen P. Spatiotemporal genomic architecture informs precision oncology in glioblastoma. Nature Genet. 2017;49(4):594–9.
- 6. Hunter MV, Moncada R, Weiss JM, Yanai I, White RM. Spatially resolved transcriptomics reveals the architecture of the tumor-microenvironment interface. Nat Commun. 2021;12:6278.
- 7. Moncada R, Barkley D, Wagner F, Chiodin M, Devlin JC, Baron M, Hajdu CH, Simeone DM, Yanai I. Integrating microarray-based spatial transcriptomics and single-cell RNA-seq reveals tissue architecture in pancreatic ductal adenocarcinomas. Nature Biotech. 2020;38(3):333–42.
- 8. Takei Y, Yun J, Zheng S, Ollikainen N, Pierson N, White J, Shah S, Thomassie J, Suo S, Eng CH, Guttman M. Integrated spatial genomics reveals global architecture of single nuclei. Nature. 2021;590(7845):344–50.
- 9. Ravasio A, Myaing MZ, Chia S, Arora A, Sathe A, Cao EY, et al. Single-cell analysis of EphA clustering phenotypes to probe cancer cell heterogeneity. Commun Biol. 2020;3:429.
- 10. Moffitt JR, Bambah-Mukku D, Eichhorn SW, Vaughn E, Shekhar K, Perez JD, Rubinstein ND, Hao J, Regev A, Dulac C, Zhuang X. Molecular, spatial, and functional single-cell profiling of the hypothalamic preoptic region. Science. 2018;362(6416):eaau5324.
- 11. Xia C, Fan J, Emanuel G, Hao J, Zhuang X. Spatial transcriptome profiling by MERFISH reveals subcellular RNA compartmentalization and cell cycle-dependent gene expression. PNAS. 2019;116(39):19490–9.
- 12. Ståhl PL, Salmén F, Vickovic S, Lundmark A, Navarro JF, Magnusson J, Giacomello S, Asp M, Westholm JO, Huss M, Mollbrink A. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science. 2016;353(6294):78–82.
- 13. Thrane K, Eriksson H, Maaskola J, Hansson J, Lundeberg J. Spatially Resolved Transcriptomics Enables Dissection of Genetic Heterogeneity in Stage III Cutaneous Malignant Melanoma. Cancer Res. 2018;78:5970–9.
- 14. Asp M, Giacomello S, Larsson L, Wu C, Fürth D, Qian X, et al. A Spatiotemporal Organ-Wide Gene Expression and Cell Atlas of the Developing Human Heart. Cell. 2019;179:1647–1660.e19.
- 15. Sankowski R, Süß P, Benkendorff A, Böttcher C, Fernandez-Zapata C, Chhatbar C, Cahueau J, Monaco G, Gasull AD, Khavaran A, Grauvogel J. Multiomic spatial landscape of innate immune cells at human central nervous system borders. Nature Medicine. 2024;30(1):186–98.
- 16. He S, Bhatt R, Birditt B, Brown C, Brown E, Chantranuvatana K, Danaher P, Dunaway D, Filanoski B, Garrison RG, Geiss G. High-plex multiomic analysis in FFPE tissue at single-cellular and subcellular resolution by spatial molecular imaging. bioRxiv. 2021.
- 17. Moffet JJ, Fatunla OE, Freytag L, Kriel J, Jones JJ, Roberts-Thomson SJ, Pavenko A, Scoville DK, Zhang L, Liang Y, Morokoff AP. Spatial architecture of high-grade glioma reveals tumor heterogeneity within distinct domains. Neuro-Oncology Advances. 2023;5(1):vdad142.
- 18. Floriddia EM, Lourenço T, Zhang S, van Bruggen D, Hilscher MM, Kukanja P, Gonçalves dos Santos JP, Altinkök M, Yokota C, Llorens-Bobadilla E, Mulinyawe SB. Distinct oligodendrocyte populations have spatial preference and different responses to spinal cord injury. Nature Commun. 2020;11(1):5860.
- 19. Chen WT, Lu A, Craessaerts K, Pavie B, Frigerio CS, Corthout N, Qian X, Laláková J, Kühnemund M, Voytyuk I, Wolfs L. Spatial transcriptomics and in situ sequencing to study Alzheimer’s disease. Cell. 2020;182(4):976–91.
- 20. Edsgärd D, Johnsson P, Sandberg R. Identification of spatial expression trends in single-cell gene expression data. Nat Methods. 2018;15:339–42.
- 21. Svensson V, Teichmann SA, Stegle O. SpatialDE: identification of spatially variable genes. Nature Methods. 2018;15(5):343–6.
- 22. Sun S, Zhu J, Zhou X. Statistical analysis of spatial expression patterns for spatially resolved transcriptomic studies. Nature Methods. 2020;17(2):193–200.
- 23. Dries R, Zhu Q, Dong R, Eng CH, Li H, Liu K, Fu Y, Zhao T, Sarkar A, Bao F, George RE. Giotto: a toolbox for integrative analysis and visualization of spatial expression data. Genome Biology. 2021;22:1–31.
- 24. Miller BF, Bambah-Mukku D, Dulac C, Zhuang X, Fan J. Characterizing spatial gene expression heterogeneity in spatially resolved single-cell transcriptomic data with nonuniform cellular densities. Genome Res. 2021;31:1843–55.
- 25. Hu J, Li X, Coleman K, Schroeder A, Ma N, Irwin DJ, et al. SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nat Methods. 2021;18:1342–51.
- 26. Dong K, Zhang S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nature Communications. 2022;13(1):1739.
- 27. Xu C, Jin X, Wei S, Wang P, Luo M, Xu Z, Yang W, Cai Y, Xiao L, Lin X, Liu H. DeepST: identifying spatial domains in spatial transcriptomics by deep learning. Nucleic Acids Research. 2022;50(22):e131.
- 28. Heumos L, Schaar AC, Lance C, Litinetskaya A, Drost F, Zappia L, Lücken MD, Strobl DC, Henao J, Curion F. Best practices for single-cell analysis across modalities. Nature Reviews Genetics. 2023;24(8):550–72.
- 29. Traag VA, Waltman L, van Eck NJ. From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep. 2019;9:5233.
- 30. Lloyd S. Least squares quantization in PCM. IEEE Transactions on Information Theory. 1982;28(2):129–37.
- 31. Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, Zhang F, Mundlos S, Christiansen L, Steemers FJ, Trapnell C. The single-cell transcriptional landscape of mammalian organogenesis. Nature. 2019;566(7745):496–502.
- 32. duVerle DA, Yotsukura S, Nomura S, Aburatani H, Tsuda K. CellTree: an R/bioconductor package to infer the hierarchical structure of cell populations from single-cell RNA-seq data. BMC Bioinformatics. 2016;17:1–7.
- 33. Kiselev VY, Kirschner K, Schaub MT, Andrews T, Yiu A, Chandra T, Natarajan KN, Reik W, Barahona M, Green AR, Hemberg M. SC3: consensus clustering of single-cell RNA-seq data. Nature Methods. 2017;14(5):483–6.
- 34. Zhu X, Wolfgruber TK, Tasato A, Arisdakessian C, Garmire DG, Garmire LX. Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists. Genome Medicine. 2017;9:108.
- 35. Garmire DG, Zhu X, Mantravadi A, Huang Q, Yunits B, Liu Y, et al. GranatumX: A Community-engaging, Modularized, and Flexible Webtool for Single-cell Data Analysis. Genomics Proteomics Bioinformatics. 2021;19:452–60.
- 36. Poirion OB, Zhu X, Ching T, Garmire L. Single-cell transcriptomics bioinformatics and computational challenges. Frontiers in Genetics. 2016;7:163.
- 37. Huang Q, Liu Y, Du Y, Garmire LX. Evaluation of cell type annotation R packages on single-cell RNA-seq data. Genomics, Proteomics and Bioinformatics. 2021;19(2):267–81.
- 38. Sanders BE, Wolsky R, Doughty ES, Wells KL, Ghosh D, Ku L, et al. Small cell carcinoma of the ovary hypercalcemic type (SCCOHT): A review and novel case with dual germline SMARCA4 and BRCA2 mutations. Gynecologic Oncology Reports. 2022;44:101077.
- 39. Andersson A, Larsson L, Stenbeck L, Salmén F, Ehinger A, Wu SZ, et al. Spatial deconvolution of HER2-positive breast cancer delineates tumor-associated cell type interactions. Nature Communications. 2021;12:6012.
- 40. Benjamin K, Bhandari A, Kepple JD, Qi R, Shang Z, Xing Y, An Y, Zhang N, Hou Y, Crockford TL, McCallion O. Multiscale topology classifies cells in subcellular spatial transcriptomics. Nature. 2024;1–7.
- 41. Kapustina M, Zhang AA, Tsai JY, Bristow BN, Kraus L, Sullivan KE, Erwin SR, Wang L, Stach TR, Clements J, Lemire AL. The cell-type-specific spatial organization of the anterior thalamic nuclei of the mouse brain. Cell Reports. 2024;43(3).
- 42. Lause J, Berens P, Kobak D. Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data. Genome Biology. 2021;22:1–20.
- 43. Vinh NX, Epps J, Bailey J. Information theoretic measures for clusterings comparison: is a correction for chance necessary? In Proceedings of the 26th Annual International Conference on Machine Learning. 2009;1073–1080.
- 44. Peyvandipour A, Shafi A, Saberian N, Draghici S. Identification of cell types from single cell data using stable clustering. Scientific Reports. 2020;10(1):12349.
- 45. Wang X, Sun Z, Zhang Y, Xu Z, Xin H, Huang H, Duerr RH, Chen K, Ding Y, Chen W. BREM-SC: a bayesian random effects mixture model for joint clustering single cell multi-omics data. Nucleic Acids Research. 2020;48(11):5814–24.
- 46. Li R, Guan J, Zhou S. Single-cell RNA-seq data clustering: A survey with performance comparison study. J Bioinform Comput Biol. 2020;18:2040005.
- 47. Ramazzotti D, Lal A, Wang B, Batzoglou S, Sidow A. Multi-omic tumor data reveal diversity of molecular mechanisms that correlate with survival. Nature Communications. 2018;9(1):1–4.
- 48. Argelaguet R, Arnol D, Bredikhin D, Deloro Y, Velten B, Marioni JC, Stegle O. MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data. Genome Biology. 2020;21:1–7.
- 49. Lopez R, Regier J, Cole MB, Jordan MI, Yosef N. Deep generative modeling for single-cell transcriptomics. Nature Methods. 2018;15(12):1053–8.
- 50. Wang B, Mezlini AM, Demir F, Fiume M, Tu Z, Brudno M, Haibe-Kains B, Goldenberg A. Similarity network fusion for aggregating data types on a genomic scale. Nature Methods. 2014;11(3):333–7.
- 51. Hao Y, Hao S, Andersen-Nissen E, Mauck WM, Zheng S, Butler A, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29.