Table 3.
Methods | Train Sensitivity | Train Specificity | Train Accuracy | Test Sensitivity | Test Specificity | Test Accuracy |
---|---|---|---|---|---|---|
Logistic Regression | 0.683 | 0.873 | 0.773 | 0.645 | 0.830 | 0.732 |
Random Forest | 0.726 | 0.739 | 0.732 | 0.728 | 0.741 | 0.734 |
Support Vector Machine | 0.635 | 0.907 | 0.764 | 0.599 | 0.881 | 0.731 |
Naive Bayes | 0.539 | 0.916 | 0.718 | 0.532 | 0.910 | 0.709 |
Neural Network | 0.701 | 0.841 | 0.768 | 0.667 | 0.794 | 0.726 |
Linear Discriminant Analysis | 0.617 | 0.906 | 0.754 | 0.594 | 0.894 | 0.735 |
Mixture Discriminant Analysis | 0.618 | 0.868 | 0.736 | 0.564 | 0.843 | 0.695 |
Flexible Discriminant Analysis | 0.616 | 0.907 | 0.754 | 0.594 | 0.894 | 0.735 |
Gradient Boosting Machine | 0.826 | 0.856 | 0.840 | 0.699 | 0.728 | 0.713 |
The mean methylation percentage of each genomic region was used as an independent variable when constructing the models; all models were therefore based on these five independent variables, without adjustment for gender, age, smoking status, or alcohol status. Sensitivity, specificity, and classification accuracy are mean values from fivefold cross-validation with 1,000 replications.
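As an illustration of how such averaged metrics can be obtained, the following is a minimal sketch of repeated k-fold cross-validation in plain Python. It is not the authors' code: the source does not state which software was used or whether folds were stratified, so the sketch assumes plain shuffled folds and binary labels coded 0/1. The names `classification_metrics`, `kfold_indices`, `repeated_cv`, and the `fit`/`predict` callables are hypothetical.

```python
import random


def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # A fold with no positives (or no negatives) yields NaN for that metric.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


def kfold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def column_means(rows):
    """Average each metric (column) over all recorded folds."""
    return tuple(sum(col) / len(col) for col in zip(*rows))


def repeated_cv(X, y, fit, predict, k=5, repeats=1000, seed=0):
    """Mean train and test metrics over `repeats` rounds of k-fold CV.

    `fit(X_train, y_train)` returns a model and `predict(model, X)` returns
    a list of 0/1 labels; both are assumed, user-supplied callables.
    """
    rng = random.Random(seed)
    train_scores, test_scores = [], []
    for _ in range(repeats):
        for fold in kfold_indices(len(y), k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            model = fit([X[i] for i in train], [y[i] for i in train])
            train_scores.append(classification_metrics(
                [y[i] for i in train], predict(model, [X[i] for i in train])))
            test_scores.append(classification_metrics(
                [y[i] for i in fold], predict(model, [X[i] for i in fold])))
    return column_means(train_scores), column_means(test_scores)
```

Each returned tuple is ordered as (sensitivity, specificity, accuracy), matching the Train and Test column groups of the table. In this sketch, `X` would hold one row of five mean methylation percentages per sample.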