2022 Jul 6;7(8):844–854. doi: 10.1001/jamacardio.2022.1900

Table 2. Discrimination and Calibration Performance of Risk Prediction Models for Predicting In-Hospital Mortality Among Patients With Heart Failure^a

| Factor | Discrimination: C index (95% CI) | Calibration: Brier score (95% CI), ×10^−5 | Calibration: intercept | Calibration: slope |
|---|---|---|---|---|
| Black patients (n = 1205) | | | | |
| Race-specific ML model | 0.79 (0.77-0.81) | 19 (11-28) | −0.09 | 0.95 |
| Race-agnostic ML model | 0.79 (0.77-0.81) | 20 (11-29) | −0.13 | 0.94 |
| ML model (race as a covariate) | 0.79 (0.77-0.81) | 19 (11-29) | −0.09 | 0.94 |
| GWTG risk score^b | 0.69 (0.67-0.71) | 30 (23-38) | −0.50 | 0.78 |
| LR model (race as a covariate)^b | 0.71 (0.69-0.72) | 29 (23-40) | −0.25 | 0.79 |
| Race-specific LR model^b | 0.74 (0.72-0.76) | 24 (18-33) | −0.14 | 0.88 |
| Non-Black patients (n = 2264) | | | | |
| Race-specific ML model | 0.80 (0.79-0.81) | 16 (12-19) | −0.04 | 0.90 |
| Race-agnostic ML model | 0.80 (0.79-0.81) | 16 (12-18) | −0.05 | 0.92 |
| ML model (race as a covariate) | 0.80 (0.79-0.81) | 16 (12-19) | −0.04 | 0.90 |
| GWTG risk score^b | 0.69 (0.68-0.72) | 23 (20-27) | −0.19 | 0.83 |
| LR model (race as a covariate)^b | 0.70 (0.67-0.73) | 28 (25-31) | −0.16 | 0.82 |
| Race-specific LR model^b | 0.74 (0.73-0.76) | 24 (20-27) | −0.10 | 0.91 |

Abbreviations: ARIC, Atherosclerosis Risk in Communities; GWTG-HF, Get With The Guidelines–Heart Failure; LR, logistic regression; ML, machine learning.

^a A higher C index and a lower Brier score indicate better performance. For the calibration measures, an intercept closer to 0 and a slope closer to 1 indicate better calibration.

^b Indicates a significant difference in C index (DeLong test P value <.005) compared with the race-specific ML model.
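
As a rough illustration of the measures reported in Table 2, the Python sketch below computes a C index, a Brier score, and a calibration intercept and slope for a small toy set of outcomes and predicted risks. The data and the function names (`c_index`, `brier_score`, `calibration_line`) are invented for illustration and do not come from the study. The calibration intercept and slope are assumed here to come from a logistic regression of the outcome on the logit of the predicted risk, which is one common convention and may differ from the authors' method. The DeLong test is not shown.

```python
import math


def c_index(y, p):
    """Share of (event, non-event) pairs in which the event has the
    higher predicted risk; tied predictions count as 0.5."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


def brier_score(y, p):
    """Mean squared difference between predicted risk and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def calibration_line(y, p, iters=25):
    """Fit logit(P(y = 1)) = a + b * logit(p) with Newton's method.

    Returns (a, b): the calibration intercept and slope under the
    convention assumed in this sketch. Predictions must lie strictly
    between 0 and 1, and the outcomes must not be perfectly separated.
    """
    x = [math.log(pi / (1 - pi)) for pi in p]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu           # gradient with respect to a
            g1 += (yi - mu) * xi    # gradient with respect to b
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Toy data (hypothetical): 1 = died in hospital, 0 = survived.
y = [0, 0, 1, 0, 1, 1, 0, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

intercept, slope = calibration_line(y, p)
print(f"C index: {c_index(y, p):.2f}")
print(f"Brier score: {brier_score(y, p):.3f}")
print(f"Calibration intercept: {intercept:.2f}, slope: {slope:.2f}")
```

Read against footnote a, a model that separates events from non-events well yields a C index near 1, predictions close to the observed outcomes yield a Brier score near 0, and a well-calibrated model yields an intercept near 0 and a slope near 1.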