Proc Natl Acad Sci U S A. 2013 Jul 8;110(30):12253–12258. doi: 10.1073/pnas.1304376110

Fig. 1.

The data set contains two clusters determined by two informative variables: the points concentrate around two distinct centers in the plane of these variables and so naturally form clusters. There are 200 observations (100 per cluster) and 1,002 variables (the two informative variables plus 1,000 random noise variables). The data are plotted in the 2D space of the two informative variables. From left to right, the panels show the true cluster labels, the labels predicted by clustering on the two informative variables only, and the labels predicted by clustering on all variables. The predicted labels match the true labels only when the two informative variables alone are used for clustering; performance is much worse when all variables are used.
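The setup described in the caption can be imitated with a small simulation. The following is a minimal sketch, not the authors' procedure: the caption does not give the cluster centers, the within-cluster spread, or the clustering method, so the centers `(0, 0)` and `(2, 2)`, the spread `0.3`, the unit-variance noise, the use of plain 2-means (Lloyd's algorithm), and all function names are assumptions chosen for illustration.

```python
import random


def simulate(n_per_cluster=100, n_noise=1000, seed=0):
    """Two clusters in the first two variables, plus pure-noise variables."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (2.0, 2.0)]  # assumed centers, not from the source
    spread = 0.3                        # assumed within-cluster std. dev.
    rows, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_cluster):
            informative = [rng.gauss(cx, spread), rng.gauss(cy, spread)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            rows.append(informative + noise)
            labels.append(label)
    return rows, labels


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(rows, seed=0, max_iter=30):
    """Plain Lloyd's algorithm with k = 2, started from two random rows."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, 2)]
    assign = None
    for _ in range(max_iter):
        new_assign = [
            0 if sq_dist(r, centroids[0]) <= sq_dist(r, centroids[1]) else 1
            for r in rows
        ]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for k in (0, 1):
            members = [r for r, a in zip(rows, assign) if a == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def accuracy(pred, truth):
    """Fraction of matching labels, allowing for the arbitrary 0/1 labelling."""
    agree = sum(p == t for p, t in zip(pred, truth)) / len(truth)
    return max(agree, 1.0 - agree)


rows, truth = simulate()
acc_two = accuracy(two_means([r[:2] for r in rows]), truth)
acc_all = accuracy(two_means(rows), truth)
print(f"informative variables only: {acc_two:.2f}")
print(f"all 1,002 variables:        {acc_all:.2f}")
```

Under these assumed settings, clustering on the two informative variables should recover the true labels almost exactly, while clustering on all 1,002 variables should tend to do noticeably worse, because the 1,000 noise coordinates dominate the distances. The exact accuracies depend on the seed and on the assumed centers and spread.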