Table 3.
The seven best machine learning models.
| Model | Area under the ROC<sup>a</sup> curve (rank) | Area under the PRC<sup>b</sup> (rank) | Accuracy (%) (rank) | Δ<sub>i</sub>AIC<sub>C</sub><sup>c</sup> (rank) | Sensitivity (%) (rank) |
|---|---|---|---|---|---|
| Automatic selection: random forest | 0.976 (1) | 0.958 (1) | 92.6 (1) | 0 (1) | 90.7 (1) |
| Manual selection | | | | | |
| CVR<sup>d</sup> | 0.954 (5) | 0.922 (3) | 90.6 (4) | 15 (4) | 89.7 (2) |
| Naïve Bayes | 0.960 (2) | 0.928 (2) | 90.2 (5) | 25 (5) | 89.0 (3) |
| Simple logistic | 0.958 (3) | 0.921 (4) | 90.9 (2) | 6 (2) | 88.2 (4) |
| Logistic model tree | 0.957 (4) | 0.920 (5) | 90.8 (3) | 7 (3) | 88.0 (5) |
| Multi-class classifier | 0.932 (6) | 0.868 (6) | 89.9 (6) | 30 (6) | 86.8 (6) |
| Logistic regression | 0.932 (7) | 0.868 (7) | 89.9 (7) | 30 (7) | 86.8 (7) |

<sup>a</sup>ROC: receiver operating characteristic.
<sup>b</sup>PRC: precision-recall curve.
<sup>c</sup>AIC<sub>C</sub>: corrected Akaike's information criterion (Δ<sub>i</sub>AIC<sub>C</sub> = AIC<sub>C,i</sub> − AIC<sub>C,min</sub>).
<sup>d</sup>CVR: classification via regression.
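
As an illustration of the Δ<sub>i</sub>AIC<sub>C</sub> definition in footnote c, the following minimal Python sketch computes the corrected AIC from a log-likelihood and then the difference of each model from the best (lowest) score. The function names, the model labels, and the numeric scores are hypothetical and are not taken from the study. The small-sample correction term 2k(k+1)/(n−k−1) is the standard one and is assumed here, since the table does not state which form was used.

```python
def aicc(log_likelihood, k, n):
    """Corrected AIC for a model with k parameters fit to n observations.

    Assumes the standard correction: AIC + 2k(k+1) / (n - k - 1).
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def delta_aicc(scores):
    """Return each model's AICc minus the minimum AICc across all models."""
    best = min(scores.values())
    return {name: value - best for name, value in scores.items()}


# Hypothetical AICc scores, for illustration only (not the study's values).
hypothetical_scores = {"model_a": 412.0, "model_b": 418.0, "model_c": 437.0}

deltas = delta_aicc(hypothetical_scores)
# Lower delta means better support; the best model always has delta 0.
ranking = sorted(deltas, key=deltas.get)
```

With these hypothetical scores, the best model receives a difference of 0 and the others are ranked by how far they lie above it, which mirrors how the Δ<sub>i</sub>AIC<sub>C</sub> column assigns rank 1 to the model with a value of 0.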