2020 Oct 28;17(21):7919. doi: 10.3390/ijerph17217919

Table 4.

Absolute (n) and relative frequencies (%) of methods used when evaluating risk prediction models for melanoma regarding the methodological type of validation and the type of measures describing model performance (n = 40 studies) *.

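| Validation | Up to 2011 (n = 20), n | Up to 2011, % | After 2011 (n = 20), n | After 2011, % | All studies (n = 40), n | All studies, % |
|---|---|---|---|---|---|---|
| Internal validation | 5 | 25.0 | 13 | 65.0 | 18 | 45.0 |
| - Cross-validation | 1 | 5.0 | 5 | 25.0 | 6 | 15.0 |
| - Split sample | 3 | 15.0 | 3 | 15.0 | 6 | 15.0 |
| - Bootstrapping | 1 | 5.0 | 5 | 25.0 | 6 | 15.0 |
| External validation | 1 | 5.0 | 5 | 25.0 | 6 | 15.0 |
| Both internal and external validation | 0 | 0.0 | 3 | 15.0 | 3 | 7.5 |
| Neither internal nor external validation | 14 | 70.0 | 5 | 25.0 | 19 | 47.5 |

| Performance measures | Up to 2011 (n = 20), n | Up to 2011, % | After 2011 (n = 20), n | After 2011, % | All studies (n = 40), n | All studies, % |
|---|---|---|---|---|---|---|
| Calibration (1) | 2 | 10.0 | 9 | 45.0 | 11 | 27.5 |
| - Hosmer–Lemeshow test | 2 | 10.0 | 7 | 35.0 | 9 | 22.5 |
| - Graph (plot/intercept/slope) | 0 | 0.0 | 3 | 15.0 | 3 | 7.5 |
| - Calibration in the large | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
| Discrimination (2) | 9 | 45.0 | 20 | 100.0 | 29 | 72.5 |
| - AUC | 8 | 40.0 | 18 | 90.0 | 26 | 65.0 |
| - C-index | 0 | 0.0 | 3 | 15.0 | 3 | 7.5 |
| - Discrimination slope | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
| - ROC plot (without AUC calculation) | 1 | 5.0 | 0 | 0.0 | 1 | 2.5 |
| Overall model performance (3) | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
| - Brier score | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
| - Nagelkerke’s R² | 0 | 0.0 | 1 | 5.0 | 1 | 2.5 |
| Reclassification (4) | 0 | 0.0 | 4 | 20.0 | 4 | 10.0 |
| - Net reclassification improvement | 0 | 0.0 | 4 | 20.0 | 4 | 10.0 |
| - Integrated discrimination index | 0 | 0.0 | 2 | 10.0 | 2 | 5.0 |
| Clinical usefulness | 3 | 15.0 | 8 | 40.0 | 11 | 27.5 |
| - Sensitivity/specificity | 3 | 15.0 | 5 | 25.0 | 8 | 20.0 |
| - Decision curve | 0 | 0.0 | 3 | 15.0 | 3 | 7.5 |
| No performance measure at all | 11 | 55.0 | 0 | 0.0 | 11 | 27.5 |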

* For extended table with all references see Supplementary Table S2. (1) Two studies reported multiple calibration measures. (2) Two studies reported multiple discrimination measures. (3) One study reported both performance measures. (4) Two studies reported both reclassification measures.
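The relative frequencies in Table 4 can be recomputed from the absolute counts and the group sizes (20 studies per period, 40 overall). The following is a minimal Python sketch of that arithmetic for the main validation rows only. The names `GROUP_SIZES`, `counts`, and `relative_frequencies` are illustrative and not taken from the article. The final consistency check assumes that studies in the "both" row are also counted in the internal and external rows, which matches the numbers but is not stated in the table.

```python
# Counts (n) copied from the main validation rows of Table 4:
# (studies published up to 2011, studies published after 2011)
GROUP_SIZES = (20, 20)  # number of studies in each period

counts = {
    "Internal validation": (5, 13),
    "External validation": (1, 5),
    "Both internal and external validation": (0, 3),
    "Neither internal nor external validation": (14, 5),
}


def relative_frequencies(early, late, sizes=GROUP_SIZES):
    """Return (% up to 2011, % after 2011, % of all studies), one decimal."""
    n_early, n_late = sizes
    return (
        round(100 * early / n_early, 1),
        round(100 * late / n_late, 1),
        round(100 * (early + late) / (n_early + n_late), 1),
    )


table = {name: relative_frequencies(*c) for name, c in counts.items()}
for name, pct in table.items():
    print(name, pct)

# Assumption (not stated in the table): "both" studies are included in the
# internal and external rows, so internal + external - both + neither
# should equal the number of studies in each period.
for period, size in enumerate(GROUP_SIZES):
    internal = counts["Internal validation"][period]
    external = counts["External validation"][period]
    both = counts["Both internal and external validation"][period]
    neither = counts["Neither internal nor external validation"][period]
    assert internal + external - both + neither == size
```

Under these assumptions, the recomputed percentages agree with the published values for the four rows shown.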