2017 Dec 15;9:129. doi: 10.1186/s13148-017-0430-7

Table 4.

Diagnostic accuracy, sensitivity, and specificity of different classification models with five-fold cross-validation

Methods	Train sensitivity	Train specificity	Train accuracy	Test sensitivity	Test specificity	Test accuracy
Logistic regression	0.75	0.89	0.82	0.73	0.86	0.79
Random forest	0.73	0.77	0.75	0.73	0.78	0.75
Support vector machine	0.74	0.89	0.82	0.73	0.87	0.80
Naïve Bayes	0.63	0.89	0.76	0.63	0.88	0.75
Neural network	0.76	0.87	0.81	0.72	0.81	0.76
Linear discriminant analysis	0.73	0.88	0.80	0.71	0.87	0.79
Mixture discriminant analysis	0.74	0.89	0.81	0.71	0.84	0.77
Flexible discriminant analysis	0.73	0.88	0.80	0.71	0.87	0.79

The mean methylation percentage of each genomic region served as an independent variable in the models; all models were therefore based on these five independent variables, without adjustment for gender, age, smoking status, or alcohol status. Sensitivity, specificity, and classification accuracy are mean values from five-fold cross-validation with 1000 replications.
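
The averaging described in this footnote can be illustrated with a minimal Python sketch. It is not the authors' code: the nearest-centroid classifier is only a stand-in for the models in the table, the data are synthetic, the stratified fold assignment is an assumption (the footnote does not say how folds were drawn), and every function name (`confusion_metrics`, `stratified_folds`, `repeated_cv`, `make_toy_data`, `fit_centroids`, `predict_centroids`) is hypothetical.

```python
import random
from statistics import mean


def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels (1 = case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


def stratified_folds(y, k, rng):
    """Shuffle indices, group them by class, then deal them round-robin into k folds."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    idx.sort(key=lambda i: y[i])  # stable sort keeps the shuffled order within a class
    return [idx[i::k] for i in range(k)]


def repeated_cv(X, y, fit, predict, k=5, repeats=1000, seed=0):
    """Average train and test metrics over `repeats` runs of k-fold cross-validation."""
    rng = random.Random(seed)
    train_scores, test_scores = [], []
    for _ in range(repeats):
        for held_out in stratified_folds(y, k, rng):
            held = set(held_out)
            train_idx = [i for i in range(len(y)) if i not in held]
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            train_scores.append(confusion_metrics(
                [y[i] for i in train_idx],
                predict(model, [X[i] for i in train_idx])))
            test_scores.append(confusion_metrics(
                [y[i] for i in held_out],
                predict(model, [X[i] for i in held_out])))

    def average(scores):
        return tuple(mean(col) for col in zip(*scores))

    return average(train_scores), average(test_scores)


def make_toy_data(n_per_class=30, n_regions=5, seed=1):
    """Synthetic methylation fractions for illustration only (not study data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.4), (1, 0.6)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.1) for _ in range(n_regions)])
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Stand-in classifier: store the per-class mean of each feature."""
    return {label: [mean(col) for col in zip(*[x for x, t in zip(X, y) if t == label])]
            for label in (0, 1)}


def predict_centroids(centroids, X):
    """Assign each sample to the class with the nearer centroid."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [1 if dist2(x, centroids[1]) < dist2(x, centroids[0]) else 0 for x in X]


if __name__ == "__main__":
    X, y = make_toy_data()
    train, test = repeated_cv(X, y, fit_centroids, predict_centroids, k=5, repeats=20)
    print("train (sens, spec, acc):", train)
    print("test  (sens, spec, acc):", test)
```

Each replication reshuffles the samples before splitting, so the reported figure is the mean over `repeats × k` fitted models; setting `repeats=1000` would mirror the replication count stated in the footnote, at a proportionally higher run time.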