2018 Sep 25;19:142. doi: 10.1186/s13059-018-1511-4

Table 1.

Clinical characteristics of selected training and validation sets used in this study

| Dataset | Accession ID | Platform | Tumor / Normal samples (n) | Late stage tumors (%) | Staging system | Availability of metastasis info |
|---|---|---|---|---|---|---|
| Training sets | | | | | | |
| Jorissen and Sieber, 2008b [53] | GSE13294 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 155/0 | | | No |
| Watanabe and Hashimoto, 2008 [54] | GSE14095 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 189/0 | | | No |
| Jorissen and Sieber, 2008 [55] | GSE14333 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 290/0 | 77.55 | TNM/Duke | Yes |
| Smith and Beauchamp, 2009a [56] | GSE17536 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 177/0 | 80 | TNM/Duke | Yes |
| Mori, Mimori, Yokobori T, 2010 [57] | GSE21815 | Agilent-014850 Whole Human Genome Microarray 4x44K G4112F (Probe Name version) | 131/9 | 59.54 | TNM/Duke | Yes |
| Vilar and Morgan, 2011a [58] | GSE26682.GPL570 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 176/0 | | | No |
| Vilar and Morgan, 2011b [58] | GSE26682.GPL96 | [HG-U133A] Affymetrix Human Genome U133A Array | 155/0 | | | No |
| NHS-HPFS [41] | GSE32651 | Illumina DASL HumanRef-8 v3 | 718/0 | 13.83 | TNM | No |
| Validation sets | | | | | | |
| Lips and Morreau, 2008 [59] | GSE12225.GPL3676 | NKI-CMF Homo sapiens 35 k oligo array | 42/0 | 28.57 | TNM | Yes |
| Staub and Rosenthal, 2009 [60] | GSE12945 | [HG-U133A] Affymetrix Human Genome U133A Array | 62/0 | 41.94 | TNM | Yes |
| Jorissen and Sieber, 2008a [53] | GSE13067 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 33/0 | | | No |
| Smith and Beauchamp, 2009b [56] | GSE17538.GPL570 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 63/0 | 88.1 | TNM/Duke | Yes |
| expO, IGC, 2005 | GSE2109 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 427/0 | 51.6 | TNM/Duke | Yes |
| Tsukamoto and Sugihara, 2010 [61] | GSE21510 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 123/25 | 79.57 | TNM/Duke | Yes |
| Medema and Tanis, 2011 [62] | GSE33113 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 90/6 | | TNM/Duke | Yes |
| Marisa and Boige, 2012 [63] | GSE39582 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 566/0 | 87.75 | TNM/Duke | Yes |
| TCGAa [5] | TCGA.COAD | Agilent 244 K Custom Gene Expression G4502A-07-3 | 122/4 | 42.4 | TNM | Yes |
| TCGAb [5] | TCGA.RNASeqV2 | [RNASeqV2] Illumina HiSeq RNA sequencing | 181/14 | 53.09 | TNM | Yes |

The normal samples in these datasets were all from adjacent normal tissues. The percentages of late-stage and high-grade samples were calculated where the information was available.
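
The per-dataset tumor/normal counts in Table 1 can be tallied by group. The following is a minimal Python sketch, assuming the counts are transcribed by hand from the table as "tumor/normal" strings keyed by accession ID. The function name `tally` and the two dictionaries are illustrative only and are not part of the study's code.

```python
# Tumor/normal sample counts transcribed from Table 1, keyed by accession ID.
# These dictionaries are an illustrative transcription, not the study's own data files.
TRAINING = {
    "GSE13294": "155/0",
    "GSE14095": "189/0",
    "GSE14333": "290/0",
    "GSE17536": "177/0",
    "GSE21815": "131/9",
    "GSE26682.GPL570": "176/0",
    "GSE26682.GPL96": "155/0",
    "GSE32651": "718/0",
}

VALIDATION = {
    "GSE12225.GPL3676": "42/0",
    "GSE12945": "62/0",
    "GSE13067": "33/0",
    "GSE17538.GPL570": "63/0",
    "GSE2109": "427/0",
    "GSE21510": "123/25",
    "GSE33113": "90/6",
    "GSE39582": "566/0",
    "TCGA.COAD": "122/4",
    "TCGA.RNASeqV2": "181/14",
}


def tally(counts):
    """Sum 'tumor/normal' strings into a (tumor_total, normal_total) pair."""
    tumor = normal = 0
    for pair in counts.values():
        t, n = pair.split("/")
        tumor += int(t)
        normal += int(n)
    return tumor, normal


if __name__ == "__main__":
    print("training:", tally(TRAINING))
    print("validation:", tally(VALIDATION))
```

These totals simply add the sample counts listed per dataset; they do not account for any sample overlap between datasets.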