2020 Oct 28;17(21):7919. doi: 10.3390/ijerph17217919

Table 4.

Absolute (n) and relative (%) frequencies of the methods used to evaluate risk prediction models for melanoma, by methodological type of validation and by type of measure describing model performance (n = 40 studies) *.

	Studies Published up to 2011 (n = 20)		Studies Published after 2011 (n = 20)		All Studies (n = 40)
Validation	n	%	n	%	n	%
Internal validation	5	25.0	13	65.0	18	45.0
Cross-validation	1	5.0	5	25.0	6	15.0
Split sample	3	15.0	3	15.0	6	15.0
Bootstrapping	1	5.0	5	25.0	6	15.0
External validation	1	5.0	5	25.0	6	15.0
Both internal and external validation	0	0.0	3	15.0	3	7.5
Neither internal nor external validation	14	70.0	5	25.0	19	47.5
Performance measures	n	%	n	%	n	%
Calibration ⁽¹⁾	2	10.0	9	45.0	11	27.5
Hosmer–Lemeshow test	2	10.0	7	35.0	9	22.5
Graph (plot/intercept/slope)	0	0.0	3	15.0	3	7.5
Calibration in the large	0	0.0	1	5.0	1	2.5
Discrimination ⁽²⁾	9	45.0	20	100.0	29	72.5
AUC	8	40.0	18	90.0	26	65.0
C-index	0	0.0	3	15.0	3	7.5
Discrimination slope	0	0.0	1	5.0	1	2.5
ROC plot (without AUC calculation)	1	5.0	0	0.0	1	2.5
Overall model performance ⁽³⁾	0	0.0	1	5.0	1	2.5
Brier score	0	0.0	1	5.0	1	2.5
Nagelkerke’s R²	0	0.0	1	5.0	1	2.5
Reclassification ⁽⁴⁾	0	0.0	4	20.0	4	10.0
Net reclassification improvement	0	0.0	4	20.0	4	10.0
Integrated discrimination index	0	0.0	2	10.0	2	5.0
Clinical usefulness	3	15.0	8	40.0	11	27.5
Sensitivity/specificity	3	15.0	5	25.0	8	20.0
Decision curve	0	0.0	3	15.0	3	7.5
No performance measure at all	11	55.0	0	0.0	11	27.5

* For extended table with all references see Supplementary Table S2. ⁽¹⁾ Two studies reported multiple calibration measures. ⁽²⁾ Two studies reported multiple discrimination measures. ⁽³⁾ One study reported both performance measures. ⁽⁴⁾ Two studies reported both reclassification measures.
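The arithmetic behind the table can be checked with a short sketch. It recomputes relative frequencies from the counts in the "All Studies" column and shows why the sub-rows under calibration and discrimination add up to more than their category totals. The function names are hypothetical, and the overlap correction assumes that each study flagged in footnotes ⁽¹⁾ and ⁽²⁾ reported exactly two measures; the footnotes say only "multiple", although the table's totals are consistent with that reading.

```python
# Counts are copied from the "All Studies" column of Table 4 (n = 40).
TOTAL_STUDIES = 40


def relative_frequency(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


def distinct_studies(sub_counts, studies_in_two_rows):
    """Number of distinct studies behind a group of sub-rows.

    Assumption (not stated in the source): every study that reported
    multiple measures appears in exactly two sub-rows, so it is counted
    once too often in the plain sum.
    """
    return sum(sub_counts) - studies_in_two_rows


# Calibration: Hosmer-Lemeshow test, graph, calibration in the large.
calibration = distinct_studies([9, 3, 1], studies_in_two_rows=2)

# Discrimination: AUC, C-index, discrimination slope, ROC plot without AUC.
discrimination = distinct_studies([26, 3, 1, 1], studies_in_two_rows=2)

print(calibration, relative_frequency(calibration, TOTAL_STUDIES))
print(discrimination, relative_frequency(discrimination, TOTAL_STUDIES))
```

The same check applies to the validation block: internal (18) plus external (6) minus both (3) plus neither (19) accounts for all 40 studies.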