Table 4.
Methods | Train sensitivity | Train specificity | Train accuracy | Test sensitivity | Test specificity | Test accuracy |
---|---|---|---|---|---|---|
Logistic regression | 0.75 | 0.89 | 0.82 | 0.73 | 0.86 | 0.79 |
Random forest | 0.73 | 0.77 | 0.75 | 0.73 | 0.78 | 0.75 |
Support vector machine | 0.74 | 0.89 | 0.82 | 0.73 | 0.87 | 0.80 |
Naïve Bayes | 0.63 | 0.89 | 0.76 | 0.63 | 0.88 | 0.75 |
Neural network | 0.76 | 0.87 | 0.81 | 0.72 | 0.81 | 0.76 |
Linear discriminant analysis | 0.73 | 0.88 | 0.80 | 0.71 | 0.87 | 0.79 |
Mixture discriminant analysis | 0.74 | 0.89 | 0.81 | 0.71 | 0.84 | 0.77 |
Flexible discriminant analysis | 0.73 | 0.88 | 0.80 | 0.71 | 0.87 | 0.79 |
The mean methylation percentage of each genomic region served as an independent variable in the models; all models were therefore built on these five independent variables, without adjustment for gender, age, smoking status, or alcohol status. Sensitivity, specificity, and classification accuracy are mean values from five-fold cross-validation with 1000 replications.
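
The evaluation scheme described in the table note (five-fold cross-validation, repeated, with sensitivity, specificity, and accuracy averaged over train and test folds) can be illustrated with a minimal sketch. This is not the authors' code. The data are synthetic, the nearest-centroid classifier is a simple stand-in for the models in the table, and the function names (`confusion_metrics`, `fit_centroids`, `predict`, `repeated_kfold`) are hypothetical. Folds are not stratified here, and the sketch assumes every fold contains both classes.

```python
import random
from statistics import mean


def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Assumes both classes are present in y_true (no zero denominators).
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)


def fit_centroids(X, y):
    """Stand-in classifier (an assumption): mean feature vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, X):
    """Assign each row to the class whose centroid is nearest."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in X]


def repeated_kfold(X, y, k=5, repeats=1000, seed=0):
    """Average train and test metrics over `repeats` runs of k-fold CV."""
    rng = random.Random(seed)
    scores = {"train": [], "test": []}
    for _ in range(repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)  # new random partition for each replication
        for fold in range(k):
            test_idx = idx[fold::k]
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            model = fit_centroids([X[i] for i in train_idx],
                                  [y[i] for i in train_idx])
            for name, part in (("train", train_idx), ("test", test_idx)):
                preds = predict(model, [X[i] for i in part])
                scores[name].append(
                    confusion_metrics([y[i] for i in part], preds))
    # Mean of each metric across all folds and replications.
    return {name: tuple(round(mean(col), 2) for col in zip(*vals))
            for name, vals in scores.items()}


# Synthetic data (an assumption): five features standing in for the mean
# methylation percentage of five regions, 100 controls and 100 cases.
data_rng = random.Random(42)
X = ([[data_rng.gauss(0.30, 0.10) for _ in range(5)] for _ in range(100)]
     + [[data_rng.gauss(0.40, 0.10) for _ in range(5)] for _ in range(100)])
y = [0] * 100 + [1] * 100

# 20 replications keep the demo fast; the table note reports 1000.
summary = repeated_kfold(X, y, k=5, repeats=20)
print(summary)  # {"train": (sens, spec, acc), "test": (sens, spec, acc)}
```

Averaging over many random partitions reduces the dependence of the reported metrics on any single split, which is presumably why the replications were used.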