2024 Oct 31;24:261. doi: 10.1186/s12874-024-02389-x

Table 3.

Measures of model predictive performance

Outcome category	Dichotomized logistic models	Continuation-ratio logit model
	Overall accuracy: Brier score/Bootstrap validation Brier score
Death	0.199/0.203	0.199/0.202
NDI	0.194/0.196	0.192/0.194
NDI-free survival	0.186/0.189	0.186/0.188
	Discrimination: C-statistics (95% CI)/Bootstrap validation C-statistics
Death	0.738 (0.722–0.754)/0.729	0.738 (0.722–0.753)/0.729
NDI	0.623 (0.604–0.643)/0.606	0.637 (0.618–0.656)/0.619
NDI-free survival	0.730 (0.714–0.746)/0.720	0.730 (0.713–0.746)/0.721
	Calibration: mean of predicted probability, % (range)
Death	40.3 (3.7–91.6)	40.3 (4.0–91.2)
NDI	28.1 (8.5–48.8)	28.1 (6.6–52.1)
NDI-free survival	31.5 (1.5–81.0)	31.6 (0.8–78.9)
Sum over all categories	100.0 (87.7–124.0)	100.0 (100.0–100.0)
	Calibration intercept/slope
Death	0/1.026	0.003/1.051
NDI	0.001/1.133	0/1.184
NDI-free survival	0/1.035	-0.003/1.076

Footnote: Brier score: mean squared difference between the observed outcome and the predicted probability; lower is better, and 0.25 or greater indicates a worthless model. C-statistics: area under the receiver operating characteristic curve (AUC); 0.5 indicates no discrimination, 0.7 to 0.8 moderate or acceptable discrimination, and 0.8 or greater excellent discrimination. Calibration: agreement between predicted probabilities and observed rates; a calibration intercept of 0 and a slope of 1 indicate perfect calibration. Bootstrap validation performance (Brier score or C-statistics) = model performance - average(bootstrap model performance on the bootstrap sample data - bootstrap model performance on the original study data).
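
The definitions in the footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names (brier_score, c_statistic, optimism_corrected) and all input values are hypothetical, and the bootstrap performances passed to the last function are made-up numbers standing in for refitted models.

```python
def brier_score(y, p):
    """Mean squared difference between observed outcome (0/1) and predicted probability."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted probability; tied predictions count as half a pair."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in non_events
    )
    return concordant / (len(events) * len(non_events))


def optimism_corrected(apparent, boot_on_sample, boot_on_original):
    """Footnote formula: apparent performance minus the average over bootstrap
    replicates of (performance on the bootstrap sample - performance on the
    original study data)."""
    diffs = [s - o for s, o in zip(boot_on_sample, boot_on_original)]
    return apparent - sum(diffs) / len(diffs)


# Hypothetical toy data: four subjects with observed outcomes and predictions.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(brier_score(y, p))
print(c_statistic(y, p))  # 3 of 4 event/non-event pairs are concordant: 0.75

# An uninformative model that always predicts 0.5 has a Brier score of 0.25,
# which is why the footnote treats 0.25 as the "worthless" threshold.
print(brier_score([0, 1], [0.5, 0.5]))  # 0.25

# Hypothetical C-statistics from two bootstrap replicates (illustrative only).
print(optimism_corrected(0.738, [0.750, 0.740], [0.740, 0.732]))
```

The same correction applies to the Brier score: because lower values are better there, the average difference is typically negative and the corrected score comes out slightly higher than the apparent one, as in the table.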