Table 4.
Measure | Studies Published up to 2011 (n = 20) | | Studies Published after 2011 (n = 20) | | All Studies (n = 40) | |
---|---|---|---|---|---|---|
Validation | n | % | n | % | n | % |
Internal validation | 5 | 25.0 | 13 | 65.0 | 18 | 45.0 |
Cross-validation | 1 | 5.0 | 5 | 25.0 | 6 | 15.0 |
Split sample | 3 | 15.0 | 3 | 15.0 | 6 | 15.0 |
Bootstrapping | 1 | 5.0 | 5 | 25.0 | 6 | 15.0 |
External validation | 1 | 5.0 | 5 | 25.0 | 6 | 15.0 |
Both internal and external validation | 0 | 0.0 | 3 | 15.0 | 3 | 7.5 |
Neither internal nor external validation | 14 | 70.0 | 5 | 25.0 | 19 | 47.5 |
Performance measures | n | % | n | % | n | % |
Calibration (1) | 2 | 10.0 | 9 | 45.0 | 11 | 27.5 |
Hosmer–Lemeshow test | 2 | 10.0 | 7 | 35.0 | 9 | 22.5 |
Graph (plot/intercept/slope) | 0 | 0.0 | 3 | 15.0 | 3 | 7.5 |
Calibration in the large | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
Discrimination (2) | 9 | 45.0 | 20 | 100.0 | 29 | 72.5 |
AUC | 8 | 40.0 | 18 | 90.0 | 26 | 65.0 |
C-index | 0 | 0.0 | 3 | 15.0 | 3 | 7.5 |
Discrimination slope | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
ROC plot (without AUC calculation) | 1 | 5.0 | 0 | 0.0 | 1 | 2.5 |
Overall model performance (3) | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
Brier score | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
Nagelkerke’s R² | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
Reclassification (4) | 0 | 0.0 | 4 | 20.0 | 4 | 10.0 |
Net reclassification improvement | 0 | 0.0 | 4 | 20.0 | 4 | 10.0 |
Integrated discrimination index | 0 | 0.0 | 2 | 10.0 | 2 | 5.0 |
Clinical usefulness | 3 | 15.0 | 8 | 40.0 | 11 | 27.5 |
Sensitivity/specificity | 3 | 15.0 | 5 | 25.0 | 8 | 20.0 |
Decision curve | 0 | 0.0 | 3 | 15.0 | 3 | 7.5 |
No performance measure at all | 11 | 55.0 | 0 | 0.0 | 11 | 27.5 |
* For the extended table with all references, see Supplementary Table S2. (1) Two studies reported multiple calibration measures. (2) Two studies reported multiple discrimination measures. (3) One study reported both overall performance measures. (4) Two studies reported both reclassification measures.