2021 Aug 23;12:702424. doi: 10.3389/fgene.2021.702424

TABLE 1.

Basic information on the datasets included in this study.

Dataset ID | Number of samples | Brief description of the dataset | Number of deaths | Number of relapses
Discovery set | 1042 | | |
GSE19188 | 40 | A genome-wide gene expression analysis on early-stage NSCLC | 24 |
GSE30219 | 85 | Identification of a group of metastatic-prone tumors in lung cancer according to “Off-context” gene expression defined by the authors | 45 | 27 (83)
GSE31210 | 226 | Gene expression analysis on pathological stage I–II lung adenocarcinomas | 35 | 64 (226)
GSE31546 | 16 | Development of an EGFR mutation gene expression signature to predict response and clinical outcome, and identification of genes associated with the EGFR-dependent phenotype | 2 |
GSE37745 | 106 | Biomarker discovery in NSCLC | 77 |
GSE50081 | 127 | Validation of a histology-independent prognostic gene signature for early-stage NSCLC, including stage IA patients | 51 | 37 (124)
GSE68465 | 442 | Gene expression-based survival prediction in LUAD | 236 | 178 (178)
Validation set | 535 | | |
TCGA-LUAD | 535 | The LUAD cohort of TCGA, a landmark cancer genomics program that molecularly characterized over 20,000 primary cancer and matched normal samples spanning 33 cancer types | 187 |
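
As a quick consistency check on the table, the per-dataset sample counts can be summed and compared with the reported set totals. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the only data used are the sample counts transcribed from Table 1.

```python
# Sample counts transcribed from Table 1 (dataset ID -> number of samples).
# All variable and function names here are illustrative, not from the study.
discovery_samples = {
    "GSE19188": 40,
    "GSE30219": 85,
    "GSE31210": 226,
    "GSE31546": 16,
    "GSE37745": 106,
    "GSE50081": 127,
    "GSE68465": 442,
}
validation_samples = {"TCGA-LUAD": 535}

# Set-level totals as reported in the "Discovery set" and "Validation set" rows.
REPORTED_DISCOVERY_TOTAL = 1042
REPORTED_VALIDATION_TOTAL = 535


def total(samples):
    """Sum the sample counts of all datasets in one set."""
    return sum(samples.values())


def shares(samples):
    """Return each dataset's fraction of the set's total sample count."""
    n = total(samples)
    return {dataset: count / n for dataset, count in samples.items()}


if __name__ == "__main__":
    print("Discovery total matches:", total(discovery_samples) == REPORTED_DISCOVERY_TOTAL)
    print("Validation total matches:", total(validation_samples) == REPORTED_VALIDATION_TOTAL)
    for dataset, share in shares(discovery_samples).items():
        print(f"{dataset}: {share:.1%} of discovery samples")
```

The per-dataset shares make it easy to see which cohorts contribute most to the discovery set.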