Table 3.
Measures of model predictive performance
| Outcome category | Dichotomized logistic models | Continuation-ratio logit model |
|---|---|---|
| Overall accuracy: Brier score/Bootstrap validation Brier score | ||
| Death | 0.199/0.203 | 0.199/0.202 |
| NDI | 0.194/0.196 | 0.192/0.194 |
| NDI-free survival | 0.186/0.189 | 0.186/0.188 |
| Discrimination: C-statistic (95% CI)/Bootstrap validation C-statistic | ||
| Death | 0.738 (0.722–0.754)/0.729 | 0.738 (0.722–0.753)/0.729 |
| NDI | 0.623 (0.604–0.643)/0.606 | 0.637 (0.618–0.656)/0.619 |
| NDI-free survival | 0.730 (0.714–0.746)/0.720 | 0.730 (0.713–0.746)/0.721 |
| Calibration: mean predicted probability, % (range) | ||
| Death | 40.3 (3.7–91.6) | 40.3 (4.0–91.2) |
| NDI | 28.1 (8.5–48.8) | 28.1 (6.6–52.1) |
| NDI-free survival | 31.5 (1.5–81.0) | 31.6 (0.8–78.9) |
| Sum over all categories | 100.0 (87.7–124.0) | 100.0 (100.0–100.0) |
| Calibration intercept/slope | ||
| Death | 0/1.026 | 0.003/1.051 |
| NDI | 0.001/1.133 | 0/1.184 |
| NDI-free survival | 0/1.035 | -0.003/1.076 |
Footnote: Brier score: mean squared difference between the observed outcome and the predicted probability; lower is better, and a score of 0.25 or greater indicates a worthless model. C-statistic: area under the receiver operating characteristic curve (AUC); 0.5 indicates no discrimination, 0.7 to 0.8 moderate or acceptable discrimination, and 0.8 or greater excellent discrimination. Calibration: agreement between predicted probabilities and observed rates; a calibration intercept of 0 and a slope of 1 indicate perfect calibration. Bootstrap validation performance (Brier score or C-statistic) = apparent model performance - average(performance of each bootstrap model on its own bootstrap sample - performance of that bootstrap model on the original study data).
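The measures defined in the footnote can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`brier_score`, `c_statistic`, `optimism_corrected`) and all numbers in the usage lines are hypothetical, and the per-replicate bootstrap performances are assumed to have been computed elsewhere by refitting the model on each bootstrap sample.

```python
def brier_score(y, p):
    """Mean squared difference between observed 0/1 outcomes and predicted probabilities."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event case has the
    higher predicted probability; tied predictions count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in events
        for b in non_events
    )
    return wins / (len(events) * len(non_events))


def optimism_corrected(apparent, boot_on_sample, boot_on_original):
    """Apparent performance minus the average optimism, where optimism is
    (bootstrap model evaluated on its own bootstrap sample) minus
    (the same bootstrap model evaluated on the original study data)."""
    diffs = [s - o for s, o in zip(boot_on_sample, boot_on_original)]
    return apparent - sum(diffs) / len(diffs)


# Hypothetical illustration only; these are not study data.
y = [1, 0, 1, 0]

# Predicting 0.5 for everyone gives a Brier score of 0.25 and a C-statistic of 0.5.
uninformative = [0.5, 0.5, 0.5, 0.5]
print(brier_score(y, uninformative))   # 0.25
print(c_statistic(y, uninformative))   # 0.5

# Two made-up bootstrap replicates for a C-statistic with apparent value 0.738.
print(optimism_corrected(0.738, [0.75, 0.74], [0.73, 0.74]))
```

For a measure where lower is better, such as the Brier score, the average optimism is typically negative, so the bootstrap-validated value is slightly higher than the apparent one, which matches the direction of the differences in the table (for example 0.199 versus 0.203 for death).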