2022 Jan 16;12(1):212. doi: 10.3390/diagnostics12010212

Table 3.

The area under the curve (AUC) of the receiver operating characteristic (ROC) curve, accuracy, and k-fold cross-validation score of prediction models generated from machine-learning algorithms in the Ansan/Ansung cohort.

| Feature set | Metric | Logistic Regression | XGBoost | Decision Tree | KNN | SVM | Random Forest | ANN |
|---|---|---|---|---|---|---|---|---|
| 99 features | AUC of ROC | 0.866 (0.865–0.867) | 0.866 (0.865–0.867) | 0.647 (0.646–0.647) | 0.662 (0.661–0.663) | 0.597 (0.596–0.597) | 0.836 (0.835–0.836) | 0.816 |
| 99 features | Accuracy | 0.867 (0.867–0.868) | 0.868 (0.868–0.869) | 0.793 (0.792–0.793) | 0.826 (0.825–0.827) | 0.859 (0.858–0.859) | 0.841 (0.840–0.841) | |
| 99 features | k-fold | 0.858 (0.853–0.863) | 0.859 (0.856–0.863) | 0.786 (0.764–0.786) | 0.821 (0.818–0.825) | 0.851 (0.848–0.854) | 0.833 (0.831–0.834) | |
| Top 15 features | AUC of ROC | 0.849 (0.848–0.850) | 0.853 (0.853–0.854) | 0.639 (0.638–0.640) | 0.694 (0.693–0.695) | 0.574 (0.574–0.575) | 0.831 (0.830–0.832) | 0.822 |
| Top 15 features | Accuracy | 0.868 (0.867–0.868) | 0.877 (0.876–0.877) | 0.798 (0.797–0.798) | 0.837 (0.836–0.837) | 0.855 (0.854–0.856) | 0.860 (0.859–0.860) | |
| Top 15 features | k-fold | 0.856 (0.850–0.862) | 0.861 (0.853–0.870) | 0.777 (0.768–0.785) | 0.827 (0.818–0.831) | 0.850 (0.846–0.852) | 0.856 (0.853–0.859) | |
| Top 9 features | AUC of ROC | 0.849 (0.848–0.850) | 0.853 (0.852–0.853) | 0.636 (0.635–0.636) | 0.691 (0.690–0.692) | 0.561 (0.560–0.561) | 0.836 (0.835–0.837) | 0.862 |
| Top 9 features | Accuracy | 0.867 (0.867–0.868) | 0.868 (0.867–0.868) | 0.791 (0.790–0.792) | 0.834 (0.833–0.834) | 0.853 (0.852–0.853) | 0.862 (0.862–0.863) | |
| Top 9 features | k-fold | 0.856 (0.851–0.861) | 0.861 (0.857–0.864) | 0.779 (0.764–0.795) | 0.828 (0.824–0.835) | 0.848 (0.843–0.853) | 0.857 (0.853–0.859) | |

Prediction models were generated from a training set comprising 80% of the Ansan/Ansung cohort; the remaining 20% was used as a test set. KNN, K-Nearest Neighbor; SVM, support vector machine; ANN, artificial neural network. The top 15-feature prediction model generated from XGBoost included serum glucose, waist circumference, blood HbA1c, serum total bilirubin, season of enrollment in the study, body fat, pulse, hip circumference, serum HDL, serum ALT, serum γ-GTP, gender, serum creatinine, residence area, and the polygenic risk score (PRS) for insulin resistance. The top 9-feature prediction model generated from XGBoost contained serum glucose, waist circumference, body fat, serum ALT, serum total bilirubin, pulse, serum HDL, and gender.
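
As an illustration of the evaluation scheme described in this footnote, the following is a minimal sketch of an 80/20 split, accuracy, ROC AUC, and k-fold partitioning in plain Python. It is not the authors' code: the function names, the random seed, and the default of five folds are assumptions made for the example, since the table does not state the value of k or the software used.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and hold out `test_fraction` of them as a test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary choice for reproducibility
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]  # (train indices, test indices)


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive case scores higher
    than a random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, validation) index lists for k roughly equal folds.
    k=5 is an assumed default, not a value taken from the table."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation
```

In use, a model would be fitted on the rows selected by the training indices, then scored with `accuracy` and `roc_auc` on the held-out rows; the k-fold score would be the mean of the per-fold accuracies computed on the validation indices from `k_fold_indices`.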