Table 1.
Dataset | Accession ID | Platform | Tumor / Normal samples (n) | Late-stage tumors (%) | Staging system | Availability of metastasis info |
---|---|---|---|---|---|---|
Training sets | ||||||
Jorissen and Sieber, 2008b [53] | GSE13294 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 155/0 | – | – | No |
Watanabe and Hashimoto, 2008 [54] | GSE14095 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 189/0 | – | – | No |
Jorissen and Sieber, 2008 [55] | GSE14333 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 290/0 | 77.55 | TNM/Duke | Yes |
Smith and Beauchamp, 2009a [56] | GSE17536 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 177/0 | 80 | TNM/Duke | Yes |
Mori, Mimori and Yokobori, 2010 [57] | GSE21815 | Agilent-014850 Whole Human Genome Microarray 4x44K G4112F (Probe Name version) | 131/9 | 59.54 | TNM/Duke | Yes |
Vilar and Morgan, 2011a [58] | GSE26682.GPL570 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 176/0 | – | – | No |
Vilar and Morgan, 2011b [58] | GSE26682.GPL96 | [HG-U133A] Affymetrix Human Genome U133A Array | 155/0 | – | – | No |
NHS-HPFS [41] | GSE32651 | Illumina DASL HumanRef-8 v3 | 718/0 | 13.83 | TNM | No |
Validation sets | ||||||
Lips and Morreau, 2008 [59] | GSE12225.GPL3676 | NKI-CMF Homo sapiens 35k oligo array | 42/0 | 28.57 | TNM | Yes |
Staub and Rosenthal, 2009 [60] | GSE12945 | [HG-U133A] Affymetrix Human Genome U133A Array | 62/0 | 41.94 | TNM | Yes |
Jorissen and Sieber, 2008a [53] | GSE13067 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 33/0 | – | – | No |
Smith and Beauchamp, 2009b [56] | GSE17538.GPL570 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 63/0 | 88.1 | TNM/Duke | Yes |
expO, IGC, 2005 | GSE2109 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 427/0 | 51.6 | TNM/Duke | Yes |
Tsukamoto and Sugihara, 2010 [61] | GSE21510 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 123/25 | 79.57 | TNM/Duke | Yes |
Medema and Tanis, 2011 [62] | GSE33113 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 90/6 | – | TNM/Duke | Yes |
Marisa and Boige, 2012 [63] | GSE39582 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 566/0 | 87.75 | TNM/Duke | Yes |
TCGAa [5] | TCGA.COAD | Agilent 244K Custom Gene Expression G4502A-07-3 | 122/4 | 42.4 | TNM | Yes |
TCGAb [5] | TCGA.RNASeqV2 | [RNASeqV2] Illumina HiSeq RNA sequencing | 181/14 | 53.09 | TNM | Yes |
The normal samples in these datasets were all from adjacent normal tissues. The percentages of late-stage and high-grade samples were calculated where the information was available. A dash (–) indicates that the information was not available.