2018 Sep 5;9:356. doi: 10.3389/fgene.2018.00356

Table 3.

Diagnostic accuracy, sensitivity, and specificity of different classification models under fivefold cross-validation.

Methods	Train Sensitivity	Train Specificity	Train Accuracy	Test Sensitivity	Test Specificity	Test Accuracy
Logistic Regression	0.683	0.873	0.773	0.645	0.830	0.732
Random Forest	0.726	0.739	0.732	0.728	0.741	0.734
Support Vector Machine	0.635	0.907	0.764	0.599	0.881	0.731
Naive Bayes	0.539	0.916	0.718	0.532	0.910	0.709
Neural Network	0.701	0.841	0.768	0.667	0.794	0.726
Linear Discriminant Analysis	0.617	0.906	0.754	0.594	0.894	0.735
Mixture Discriminant Analysis	0.618	0.868	0.736	0.564	0.843	0.695
Flexible Discriminant Analysis	0.616	0.907	0.754	0.594	0.894	0.735
Gradient Boosting Machine	0.826	0.856	0.840	0.699	0.728	0.713

The mean methylation percentage of each genomic region served as an independent variable in every model; that is, all models were built on these five independent variables without adjustment for gender, age, smoking status, or alcohol status. Sensitivity, specificity, and classification accuracy are mean values over 1,000 replications of fivefold cross-validation.
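The evaluation scheme described in the note above (fivefold cross-validation, replicated and averaged) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the nearest-centroid classifier is a stand-in for the models in the table, the folds are stratified by class (the source does not say whether they were), and every function name here is hypothetical.

```python
import random
from statistics import mean


def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels (1 = case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)


def fit_centroids(X, y):
    """Stand-in model (assumption): one mean feature vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, X):
    """Assign each sample to the class with the nearest centroid."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(centroids, key=lambda lab: dist2(x, centroids[lab])) for x in X]


def stratified_folds(y, k, rng):
    """Shuffle indices within each class and deal them round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def repeated_cv(X, y, k=5, replications=1000, seed=0):
    """Average train and test metrics over all folds of all replications."""
    rng = random.Random(seed)
    train_scores, test_scores = [], []
    for _ in range(replications):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            model = fit_centroids([X[i] for i in train_idx],
                                  [y[i] for i in train_idx])
            for idx, scores in ((train_idx, train_scores),
                                (test_idx, test_scores)):
                pred = predict(model, [X[i] for i in idx])
                scores.append(confusion_metrics([y[i] for i in idx], pred))

    def average(scores):
        return tuple(mean(col) for col in zip(*scores))

    return average(train_scores), average(test_scores)


def make_synthetic(n_per_class=50, n_regions=5, seed=1):
    """Synthetic 'mean methylation' values for five regions (not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 40.0), (1, 50.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 10.0) for _ in range(n_regions)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    train, test = repeated_cv(X, y, replications=20)  # 1,000 in the source
    for name, scores in (("Train", train), ("Test", test)):
        print(name, "sens/spec/acc:", " ".join(f"{v:.3f}" for v in scores))
```

Reporting train and test metrics side by side, as the table does, makes overfitting visible: a model whose training accuracy is well above its test accuracy has likely fitted noise in the training folds.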