Table 2.
| Criterion | Description |
| --- | --- |
| **Model performance** | |
| Discriminative ability | How well the DSS distinguishes between outcomes (e.g., risk groups) in external validations, measured with the area under the ROC curve (AUC): <0.6 = poor; 0.6–0.7 = moderate; 0.7–0.8 = strong; >0.8 = very strong [18]. (See the AUC sketch below the table.) |
| van Calster levels of calibration [16] | How well predicted outcomes resemble observed outcomes: mean calibration – the average predicted risk matches the observed event rate; weak calibration – correct average prediction effects (calibration intercept 0 and slope 1); moderate calibration – predicted and observed outcomes agree across the range of predicted risks; strong calibration – the event rate equals the predicted risk for every covariate pattern. (See the calibration sketch below the table.) |
| Reilly levels of evidence [17] | How thoroughly the DSS has been validated: level 1 – derived from a prediction model, not yet externally validated; level 2 – narrow validation in one setting; level 3 – broad validation in varied settings and populations; level 4 – narrow impact analysis of the model as a decision rule in one setting; level 5 – broad impact analysis of the model as a decision rule in varied settings and populations. |
| **User friendliness** | |
| Predictors routinely collected | Are all predictors in the DSS collected on a routine basis in clinical practice, or are special techniques needed? |
| Easy use and access | Can the DSS be calculated easily (manually or using a computer) with an accessible regression formula, scoring system, nomogram, decision tree, or online application? |