Biostatistics. 2022 Sep 5;25(1):188–202. doi: 10.1093/biostatistics/kxac035

Fig. 5.

The Inline graphic metric is an internal validity measure for assessing the performance of induced cluster labels. Multidimensional scaling (MDS) plots with shapes representing true cell type labels from the Inline graphic scRNA-seq data set and colors representing induced (or predicted) cluster labels from four hierarchical clustering methods implemented in the hclust() function in the base R stats package: (a) Ward’s method, (b) single linkage, (c) complete linkage, and (d) unweighted pair group method with arithmetic mean (UPGMA). (e) Scatter plot of Inline graphic (an internal validity metric) against the Adjusted Rand Index (ARI) (an external validity metric), demonstrating shared information between the two metrics, which Inline graphic (calculated with the HPE algorithm 1 using Inline graphic) recovers without the need for an externally labeled set of observations. (f) A performance plot with three internal validity metrics (Inline graphic-axis scaled between 0 and 1): (i) Inline graphic (for ease of comparison) calculated from labels induced using with Inline graphic (Inline graphic-axis), (ii) mean silhouette score, and (iii) within-cluster sum of squares (WCSS). The “peak” of the Inline graphic metric at the correct Inline graphic indicates that Inline graphic identifies the most accurate labels in a fashion comparable to established internal fitness measures, namely a “peak” in the mean silhouette score and a “bend” in the WCSS curve.
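
To make the internal versus external distinction in panel (e) concrete, the following is a minimal Python sketch of the standard Adjusted Rand Index computed from the contingency table of two labelings. It is not the authors' code and is independent of the R tooling named in the caption; the function name `adjusted_rand_index` and the toy labels are assumptions made for illustration. The point it shows is that ARI cannot be evaluated without the true labels, which is what makes it an external validity metric.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Standard ARI between two labelings (hypothetical helper, illustration only)."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    if n < 2:
        raise ValueError("at least two observations are required")

    # Contingency counts: joint cells and the two marginals.
    cell_counts = Counter(zip(true_labels, pred_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)

    # Numbers of observation pairs that are co-clustered.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    # Expected agreement under random labeling with fixed marginals,
    # and the maximum attainable agreement.
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2

    if max_index == expected:
        # Degenerate case: both partitions are trivial (one cluster or all singletons).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy example (made-up labels, not data from the paper).
true_cell_types = ["A", "A", "B", "B"]
induced_clusters = [1, 1, 2, 3]
ari = adjusted_rand_index(true_cell_types, induced_clusters)
```

An ARI of 1 indicates that the induced labels match the true labels up to renaming, and values near 0 indicate agreement no better than chance. An internal metric, by contrast, would take only the data (or a dissimilarity matrix) and the induced labels as input.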