Table 2. Summary of the total number of BMKs identified per differential expression analysis (padj<0.01) or multivariate analysis and those selected for validation in the discovery set or confirmed in the validation set.
Biomarker selection in discovery set | Patients considered | Number of identified BMKs (padj<0.01) | Number of selected BMKs |
Univariate (DEGs) | | 1,016 | 450
CRC I-IV vs CON | S+K ∩ S | 341 | 341 |
CRC I-IV vs CON | S | 430 | 89 |
HG AA vs CON | S+K | 2* | 2 |
HG AA vs CON | S | 2* | 1 |
CRC I-II vs CON | S | 115 | 13
CRC I-II-III vs CON | S | 128 | 4 |
Multivariate | | 141 | 74
Voom | S+K | 16 | 8 |
glmnet | S+K | 15 | 10 |
NSC | S+K | 90 | 41 |
Evolutionary random forest | S+K | 20 | 15 |
Total unique BMKs | | | 524
Analyses performed on validation set | Patients considered | Number of identified BMKs (padj<0.01) | Number of confirmed BMKs |
Univariate (DEGs) | | 3,733 | 212
CON vs CRC | S+K ∩ S | 614 | 124 |
CON vs CRC | S | 715 | 3 |
CON vs AA | S+K ∩ S | 10 | 1 |
CON vs CRC I-II | S+K ∪ S | 601 | 24 |
CON vs CRC I-II-III | S+K ∪ S | 852 | 19 |
CON vs CRC | S+K ∪ S | 324** | 16 |
CON vs CRC III-IV | S+K ∪ S | 617** | 25 |
Multivariate | | 345 | 14
Voom | S+K | 97 | 2 |
NSC | S+K | 159 | 0 |
glmnet | S+K | 89 | 12 |
Total unique BMKs | | | 226
New BMKs were subsequently added to the selection pool in the discovery set. S=Swiss samples only, K=Korean samples only.
*padj<0.05.
**DEGs confirmed with log2 fold change >0.5 or <−0.5 (absolute value >0.5), regardless of padj, on the validation set.
AA, advanced adenoma; BMK, biomarker; CON, control subjects; CRC, colorectal cancer; DEG, differentially expressed gene; K, Korean; NSC, nearest shrunken centroids; padj, adjusted p value; S, Swiss.
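The selection criteria named in the table (padj cutoff, fold-change confirmation, and the ∩ / ∪ cohort notation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the gene names and values are hypothetical toy data, the helper functions `significant` and `confirmed_by_fold_change` are invented for illustration, and reading "S+K ∩ S" as the intersection of the DEG sets from the two cohorts is an assumption.

```python
# Illustrative sketch only: toy data and hypothetical helper names.

def significant(results, padj_cutoff=0.01):
    """Return gene IDs whose adjusted p value is below the cutoff."""
    return {gene for gene, (_log2fc, padj) in results.items() if padj < padj_cutoff}


def confirmed_by_fold_change(results, min_abs_log2fc=0.5):
    """Return gene IDs with |log2 fold change| above the threshold, ignoring padj."""
    return {gene for gene, (log2fc, _padj) in results.items()
            if abs(log2fc) > min_abs_log2fc}


# Hypothetical per-cohort results: gene -> (log2 fold change, padj)
swiss_korean = {"GENE_A": (1.2, 0.001), "GENE_B": (-0.8, 0.004), "GENE_C": (0.3, 0.2)}
swiss_only = {"GENE_A": (1.0, 0.002), "GENE_B": (-0.6, 0.03), "GENE_D": (0.9, 0.005)}

degs_sk = significant(swiss_korean)  # DEGs in the combined Swiss + Korean cohort
degs_s = significant(swiss_only)     # DEGs in the Swiss-only cohort

in_both = degs_sk & degs_s    # assumed meaning of "S+K ∩ S"
in_either = degs_sk | degs_s  # assumed meaning of "S+K ∪ S"

# Fold-change confirmation (the ** criterion), applied regardless of padj
fc_confirmed = confirmed_by_fold_change(swiss_only)

print(sorted(in_both), sorted(in_either), sorted(fc_confirmed))
```

In this toy example GENE_B passes the fold-change criterion in the Swiss-only cohort even though its padj exceeds 0.01, which is the situation the ** footnote describes.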