2024 Jan 18;49(3):748–761. doi: 10.1007/s00261-023-04151-1

Table 4.

Diagnostic performance of the different machine learning classifiers in the internal and external validation cohorts

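| Cohort | Model | AUC | Accuracy | Sensitivity | Specificity | PPV | NPV | F1 score |
|---|---|---|---|---|---|---|---|---|
| Internal validation | XGBoost | 0.89 (0.76–1.00) | 0.86 | 0.83 | 0.97 | 0.94 | 0.72 | 0.87 |
| Internal validation | LightGBM | 0.87 (0.74–0.99) | 0.84 | 0.83 | 0.91 | 0.94 | 0.69 | 0.88 |
| Internal validation | Logistic | 0.86 (0.69–1.00) | 0.76 | 0.82 | 0.86 | 0.94 | 0.48 | 0.88 |
| Internal validation | RandomForest | 0.83 (0.66–0.98) | 0.79 | 0.82 | 0.85 | 0.92 | 0.58 | 0.86 |
| Internal validation | MLP | 0.70 (0.50–0.90) | 0.63 | 0.73 | 0.75 | 0.82 | 0.44 | 0.77 |
| External validation | XGBoost | 0.84 (0.69–0.99) | 0.79 | 1.00 | 0.50 | 0.64 | 1.00 | 0.87 |
| External validation | LightGBM | 0.77 (0.60–0.95) | 0.61 | 0.61 | 0.60 | 0.73 | 0.46 | 0.67 |
| External validation | Logistic | 0.75 (0.56–0.94) | 0.68 | 0.67 | 0.70 | 0.80 | 0.54 | 0.73 |
| External validation | RandomForest | 0.75 (0.56–0.93) | 0.71 | 0.78 | 0.60 | 0.78 | 0.60 | 0.78 |
| External validation | MLP | 0.57 (0.33–0.79) | 0.50 | 0.44 | 0.60 | 0.67 | 0.38 | 0.53 |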

AUC, area under curve; LightGBM, light gradient boosting; MLP, multi-layer perceptron; NPV, negative predictive value; PPV, positive predictive value; XGBoost, eXtreme Gradient Boosting
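
For readers who want to see how the threshold-dependent columns relate to one another, the following is a minimal sketch of the standard definitions of accuracy, sensitivity, specificity, PPV, NPV and F1 score computed from a 2×2 confusion matrix. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts in the usage example are illustrative only; they are not taken from the study. AUC and its interval are not covered here, since they require per-case predicted scores rather than counts.

```python
from dataclasses import dataclass
from math import nan


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for the four cells of a 2x2 confusion matrix."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else nan


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the threshold-dependent metrics using their standard definitions."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # recall on the positive class
    specificity = _ratio(c.tn, c.tn + c.fp)   # recall on the negative class
    ppv = _ratio(c.tp, c.tp + c.fp)           # positive predictive value (precision)
    npv = _ratio(c.tn, c.tn + c.fn)           # negative predictive value
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # F1 is the harmonic mean of PPV and sensitivity, written in count form.
    f1 = _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts only (assumption): 50 positive and 50 negative cases.
    example = ConfusionCounts(tp=40, fp=20, tn=30, fn=10)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.2f}")
```

Because PPV and NPV depend on the ratio of positive to negative cases, the same sensitivity and specificity can yield different predictive values in cohorts with a different case mix, which is worth keeping in mind when comparing the internal and external rows.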