Table 4.
Diagnostic performance of the machine learning classifiers in the internal and external validation cohorts
| Model | AUC | Accuracy | Sensitivity | Specificity | PPV | NPV | F1 score |
|---|---|---|---|---|---|---|---|
| Internal validation cohort | | | | | | | |
| XGBoost | 0.89 (0.76–1.00) | 0.86 | 0.83 | 0.97 | 0.94 | 0.72 | 0.87 |
| LightGBM | 0.87 (0.74–0.99) | 0.84 | 0.83 | 0.91 | 0.94 | 0.69 | 0.88 |
| Logistic | 0.86 (0.69–1.00) | 0.76 | 0.82 | 0.86 | 0.94 | 0.48 | 0.88 |
| RandomForest | 0.83 (0.66–0.98) | 0.79 | 0.82 | 0.85 | 0.92 | 0.58 | 0.86 |
| MLP | 0.70 (0.50–0.90) | 0.63 | 0.73 | 0.75 | 0.82 | 0.44 | 0.77 |
| External validation cohort | | | | | | | |
| XGBoost | 0.84 (0.69–0.99) | 0.79 | 1.00 | 0.50 | 0.64 | 1.00 | 0.87 |
| LightGBM | 0.77 (0.60–0.95) | 0.61 | 0.61 | 0.60 | 0.73 | 0.46 | 0.67 |
| Logistic | 0.75 (0.56–0.94) | 0.68 | 0.67 | 0.70 | 0.80 | 0.54 | 0.73 |
| RandomForest | 0.75 (0.56–0.93) | 0.71 | 0.78 | 0.60 | 0.78 | 0.60 | 0.78 |
| MLP | 0.57 (0.33–0.79) | 0.50 | 0.44 | 0.60 | 0.67 | 0.38 | 0.53 |
AUC, area under the curve; LightGBM, light gradient boosting machine; MLP, multi-layer perceptron; NPV, negative predictive value; PPV, positive predictive value; XGBoost, eXtreme Gradient Boosting.
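
For readers who wish to relate the columns of Table 4 to one another, the sketch below shows how accuracy, sensitivity, specificity, PPV, NPV, and F1 score follow from the four cells of a binary confusion matrix under their standard definitions. The class `ConfusionCounts`, the function `diagnostic_metrics`, and the example counts are hypothetical and introduced only for illustration; the counts are not taken from the study cohorts, and the sketch assumes every denominator is non-zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard threshold-based metrics; assumes no denominator is zero."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "Accuracy": (c.tp + c.tn) / total,
        "Sensitivity": c.tp / (c.tp + c.fn),  # recall on the positive class
        "Specificity": c.tn / (c.tn + c.fp),  # recall on the negative class
        "PPV": c.tp / (c.tp + c.fp),          # precision
        "NPV": c.tn / (c.tn + c.fn),
        # F1 is the harmonic mean of PPV and sensitivity
        "F1 score": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Illustrative counts only; NOT data from the validation cohorts.
example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike these threshold-based quantities, the AUC summarizes performance across all decision thresholds and therefore cannot be recovered from a single confusion matrix.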