Figure 1.
Gene selection based on the loadings on the principal components. Graphs A and B show the score plot of the samples and the loading plot of the genes, respectively, before any filtering is applied. Graphs C and D show the score and loading plots after filtering. Graph E shows quantitatively the basis for choosing the filtering threshold: the distortion in the observed patterns, measured by the squared difference, and the number of genes retained for analysis, both as the threshold is varied. The chosen filter threshold was 0.001. Filtering reduces the number of genes from 7070 to 425. At the same time, the score plot of the samples remains largely unchanged and displays the same initial patterns, indicating minimal loss of information. The loading plot displays strong linear structures of genes. (For more details about the samples used, see Supplementary Material online at