2018 Dec 27;40:318–326. doi: 10.1016/j.ebiom.2018.12.054

Table 1.

Summary of the eight datasets included in the study.

Dataset	Platform	Sample size	Included cohorts^a
Training set	Affymetrix Human Genome U133A Array	519	TCGA
Validation set 1	Affymetrix Human Genome U133A Array	409	GSE14764, GSE23554, GSE26712, GSE3149
Validation set 2	Affymetrix Human Genome U133 Plus 2.0 Array	606	GSE18520, GSE19829, GSE26193, GSE30161, GSE63885, GSE9891
Validation set 3	Agilent-014850 Whole Human Genome Microarray 4x44K G4112F	634	GSE17260, GSE32062, GSE53963, GSE73614
Validation set 4	Operon human v3 ~35 K 70-mer two-color oligonucleotide microarrays	415	GSE13876
Validation set 5	ABI Human Genome Survey Microarray Version 2	194	GSE49997

^a ComBat was used to adjust for batch effects between cohorts profiled on the same platform. Expression values for all probes were adjusted separately within each dataset.
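The idea behind this per-dataset adjustment can be illustrated with a minimal sketch. It is not ComBat itself: ComBat additionally shrinks the per-batch estimates with an empirical Bayes step, which is omitted here. The sketch only aligns each cohort's per-probe mean and standard deviation to the values pooled across cohorts on one platform. The function name `adjust_batches`, the probe name, the cohort labels, and the expression values are all hypothetical and chosen for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def adjust_batches(expr, batches):
    """Simplified location/scale batch adjustment (NOT full ComBat).

    expr    : dict mapping probe name -> list of expression values, one per sample
    batches : list of cohort labels, one per sample, in the same sample order
    Returns a new dict with the same shape as expr.
    """
    labels = sorted(set(batches))
    # Sample indices belonging to each cohort.
    groups = {b: [i for i, lab in enumerate(batches) if lab == b] for b in labels}

    adjusted = {}
    for probe, values in expr.items():
        grand_mean = mean(values)
        b_means = {b: mean([values[i] for i in idx]) for b, idx in groups.items()}

        # Pooled within-cohort variance, so cohort offsets do not inflate it.
        pooled_var = sum(
            (values[i] - b_means[b]) ** 2 for b, idx in groups.items() for i in idx
        ) / len(values)
        pooled_sd = sqrt(pooled_var)

        out = list(values)
        for b, idx in groups.items():
            b_sd = pstdev([values[i] for i in idx])
            # Leave the spread unchanged if a cohort has no variation for this probe.
            scale = pooled_sd / b_sd if b_sd > 0 else 1.0
            for i in idx:
                out[i] = (values[i] - b_means[b]) * scale + grand_mean
        adjusted[probe] = out
    return adjusted


# Toy data: one hypothetical probe measured in two hypothetical cohorts.
toy_expr = {"probe_1": [1.0, 2.0, 3.0, 10.0, 12.0, 14.0]}
toy_batches = ["cohort_A"] * 3 + ["cohort_B"] * 3
toy_adjusted = adjust_batches(toy_expr, toy_batches)
```

After adjustment, both toy cohorts share the same per-probe mean and spread. In an actual analysis one would call an established ComBat implementation once per dataset in Table 1, passing the cohort label as the batch variable, rather than rely on this simplified version.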