Defining the CTC-S and CTC-N subgroups. (A) Principal component analysis (PCA) loading plot of the marker set. The x and y axes represent principal component 1 (PC1; 16.6% of variance) and principal component 2 (PC2; 6.1% of variance), respectively. Three of the five stem markers (Aldh1a1, Aldh1a2, and Klf4) are located in the first quadrant, indicating that stem markers tend to correlate positively with both PC1 and PC2. The PCA loading data were downloaded from ClustVis and visualized with imageGP (available online: http://www.ehbio.com/ImageGP/index.php/Home/Index/index.html, accessed on 23 April 2020). (B) PCA scores plot of the 72 samples. All samples were divided into three clusters. The cluster located in the first quadrant was defined as CTCs with stem-like features (CTC-S) because its position matches that of the stem-marker PCA loadings in (A), whereas the other two clusters were combined and defined as CTCs with non-stem-like features (CTC-N). The ellipses were drawn such that a new observation from the same group falls inside the corresponding ellipse with 95% probability. (C) Correlation heatmap visualized with MORPHEUS (available online: https://software.broadinstitute.org/morpheus/, accessed on 7 March 2020). Hierarchical clustering was performed with the average linkage method. The heatmap shows that CTC-S and CTC-N form distinct subgroups. The color of each cell represents Pearson’s correlation coefficient between two samples, with red indicating a strong correlation and green a weak correlation. Samples are listed in the same order on the horizontal and vertical axes.
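The clustering behind panel (C) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MORPHEUS implementation: the distance 1 - r between samples, the function names (`pearson`, `correlation_matrix`, `average_linkage`), and the toy expression profiles are all hypothetical and chosen only for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(profiles):
    """Symmetric sample-by-sample matrix of Pearson's r (dict of dicts)."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}


def average_linkage(corr, n_clusters):
    """Agglomerative clustering with average linkage.

    Assumption: the distance between two samples is 1 - r.
    """
    clusters = [[name] for name in corr]

    def distance(c1, c2):
        # Average linkage: mean pairwise distance between members of two clusters.
        return sum(1.0 - corr[a][b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge it.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: distance(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


# Hypothetical toy expression profiles (not data from the study).
toy_profiles = {
    "s1": [5.0, 4.0, 1.0, 0.5],
    "s2": [6.0, 5.0, 1.5, 1.0],
    "s3": [0.5, 1.0, 4.0, 5.0],
    "s4": [1.0, 1.5, 5.0, 6.0],
}

corr = correlation_matrix(toy_profiles)
groups = average_linkage(corr, n_clusters=2)
print(sorted(sorted(g) for g in groups))
```

In this toy case the two pairs of similar profiles end up in separate groups. Reordering the rows and columns of `corr` by group membership would give the block structure that a clustered correlation heatmap displays.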