Author manuscript; available in PMC: 2013 Feb 18.
Published in final edited form as: Epidemiology. 2010 Jan;21(1):128–138. doi: 10.1097/EDE.0b013e3181c30fb2

Table 1.

Characteristics of some traditional and novel performance measures

| Aspect | Measure | Visualization | Characteristics |
|---|---|---|---|
| Overall performance | R², Brier | Validation graph | Better with lower distance between Y and Ŷ. Captures calibration and discrimination aspects. |
| Discrimination | C statistic | ROC curve | Rank order statistic; interpretation for a pair of patients with and without the outcome |
| | Discrimination slope | Box plot | Difference in mean of predictions between outcomes; easy visualization |
| Calibration | Calibration-in-the-large | Calibration or validation graph | Compare mean(y) versus mean(ŷ); essential aspect for external validation |
| | Calibration slope | | Regression slope of linear predictor; essential aspect for internal and external validation, related to ‘shrinkage’ of regression coefficients |
| | Hosmer-Lemeshow test | | Compares observed to predicted by decile of predicted probability |
| Reclassification | Reclassification table | Cross-table or scatter plot | Compare classifications from 2 models (one with, one without a marker) for changes |
| | Reclassification calibration | | Compare observed and predicted within cross-classified categories |
| | Net Reclassification Index (NRI) | | Compare classifications from 2 models for changes by outcome for a net calculation of changes in the right direction |
| | Integrated Discrimination Index (IDI) | Box plots for 2 models (one with, one without a marker) | Integrates the NRI over all possible cut-offs; equivalent to difference in discrimination slopes |
| Clinical usefulness | Net Benefit (NB) | Cross-table | Net number of true positives gained by using a model compared to no model at a single threshold (NB) or over a range of thresholds (DCA) |
| | Decision curve analysis (DCA) | Decision curve | |
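
Several of the measures in Table 1 reduce to short calculations on a vector of binary outcomes and a vector of predicted probabilities. The sketch below illustrates them under stated assumptions and is not the authors' implementation. The function names and the toy data (`y`, `p_old`, `p_new`) are hypothetical. A patient is classified as positive when the prediction is at or above the threshold. Net benefit is computed in its usual form, TP/n − (FP/n) × pt/(1 − pt). The NRI shown is the two-category version at a single cut-off. Calibration slope and the Hosmer-Lemeshow test are omitted because they require a model-fitting routine.

```python
from statistics import mean


def split_by_outcome(y, p):
    """Return predictions for patients with (y=1) and without (y=0) the outcome."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    return events, nonevents


def brier_score(y, p):
    """Overall performance: mean squared distance between outcome and prediction."""
    return mean((yi - pi) ** 2 for yi, pi in zip(y, p))


def c_statistic(y, p):
    """Discrimination: share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    events, nonevents = split_by_outcome(y, p)
    score = sum(1.0 if e > n else 0.5 if e == n else 0.0
                for e in events for n in nonevents)
    return score / (len(events) * len(nonevents))


def discrimination_slope(y, p):
    """Difference in mean prediction between patients with and without the outcome."""
    events, nonevents = split_by_outcome(y, p)
    return mean(events) - mean(nonevents)


def calibration_in_the_large(y, p):
    """Crude comparison of mean(y) with mean(p); zero means agreement on average."""
    return mean(y) - mean(p)


def idi(y, p_old, p_new):
    """IDI as the difference in discrimination slopes between two models."""
    return discrimination_slope(y, p_new) - discrimination_slope(y, p_old)


def nri(y, p_old, p_new, threshold):
    """Two-category NRI: net upward moves in events minus net upward moves in non-events."""
    def net_up(pairs):
        up = sum(1 for old, new in pairs if old < threshold <= new)
        down = sum(1 for old, new in pairs if new < threshold <= old)
        return (up - down) / len(pairs)

    events = [(o, n) for yi, o, n in zip(y, p_old, p_new) if yi == 1]
    nonevents = [(o, n) for yi, o, n in zip(y, p_old, p_new) if yi == 0]
    return net_up(events) - net_up(nonevents)


def net_benefit(y, p, threshold):
    """Net benefit at one threshold: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data: 4 patients, predictions from a model without
# (p_old) and with (p_new) an added marker.
y = [1, 1, 0, 0]
p_old = [0.6, 0.4, 0.5, 0.2]
p_new = [0.8, 0.6, 0.3, 0.2]

for name, p in (("old", p_old), ("new", p_new)):
    print(name, brier_score(y, p), c_statistic(y, p), net_benefit(y, p, 0.5))
print("NRI", nri(y, p_old, p_new, 0.5), "IDI", idi(y, p_old, p_new))
```

Evaluating `net_benefit` over a grid of thresholds and plotting the result against the threshold gives the decision curve listed in the last row of the table.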