Table 1. Estimates of classification performance obtained by a repeated 10-fold cross-validation procedure.
Data used for the classifier | Classification performance (AUC) |
{SNPs} | 0.51 |
{Alc, Smk, Age, Pck} | 0.60 |
{Fh} | 0.66 |
{Fh, Alc, Smk, Age, Pck} | 0.73 |
{SNPs}+{Alc, Smk, Age, Pck} | 0.62 |
{SNPs}+{Fh} | 0.64 |
{SNPs}+{Fh, Alc, Smk, Age, Pck} | 0.73 |
The classification algorithm is a Support Vector Machine (SVM). Only SNPs selected by Recursive Feature Elimination (RFE) are used. The following abbreviations are used for variable names: Age (age at interview), Smk (tobacco use), Alc (alcohol consumption), Fh (family history of esophageal cancer), and Pck (consumption of pickled vegetables). The “+” symbol in the Data column denotes that the analysis was performed using an ensembling approach.
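As a rough illustration of how an AUC estimate of this kind can be produced, the following minimal sketch runs a repeated k-fold cross-validation and averages the AUC over repetitions. It is not the authors' code. The names `auc`, `cv_auc`, and `fit_predict` are hypothetical, the folds are not stratified, and pooling the out-of-fold scores within each repetition is an assumption. `fit_predict` stands in for any classifier, for example an SVM with RFE refit on each training fold.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cv_auc(X, y, fit_predict, k=10, repeats=5, seed=0):
    """Mean AUC over `repeats` random k-fold partitions (hypothetical helper).

    fit_predict(X_train, y_train, X_test) must return one score per test row.
    Any feature selection belongs inside fit_predict so it only sees training data.
    """
    rng = random.Random(seed)
    n = len(y)
    per_repeat = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)                      # new random partition each repetition
        scores = [0.0] * n
        for fold in range(k):
            test = idx[fold::k]               # every k-th shuffled index forms one fold
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            preds = fit_predict([X[i] for i in train],
                                [y[i] for i in train],
                                [X[i] for i in test])
            for i, s in zip(test, preds):
                scores[i] = s                 # store the out-of-fold score
        per_repeat.append(auc(y, scores))     # AUC on pooled out-of-fold scores
    return sum(per_repeat) / len(per_repeat)
```

A call such as `cv_auc(X, y, my_model, k=10, repeats=5)` returns a single averaged AUC. An AUC near 0.5 corresponds to chance-level ranking, and 1.0 to perfect separation of cases and controls.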