2022 Jul 6;7(8):844–854. doi: 10.1001/jamacardio.2022.1900

Table 2. Discrimination and Calibration Performance of Risk Prediction Models for Predicting In-Hospital Mortality Among Patients With Heart Failure^a.

Factor	Discrimination, C index (95% CI)	Calibration, Brier score (95% CI), ×10⁻⁵	Calibration intercept	Calibration slope
Black patients (n = 1205)
Race-specific ML model	0.79 (0.77-0.81)	19 (11-28)	−0.09	0.95
Race-agnostic ML model	0.79 (0.77-0.81)	20 (11-29)	−0.13	0.94
ML model (race as a covariate)	0.79 (0.77-0.81)	19 (11-29)	−0.09	0.94
GWTG risk score^b	0.69 (0.67-0.71)	30 (23-38)	−0.50	0.78
LR model (race as a covariate)^b	0.71 (0.69-0.72)	29 (23-40)	−0.25	0.79
Race-specific LR model^b	0.74 (0.72-0.76)	24 (18-33)	−0.14	0.88
Non-Black patients (n = 2264)
Race-specific ML model	0.80 (0.79-0.81)	16 (12-19)	−0.04	0.90
Race-agnostic ML model	0.80 (0.79-0.81)	16 (12-18)	−0.05	0.92
ML model (race as a covariate)	0.80 (0.79-0.81)	16 (12-19)	−0.04	0.90
GWTG risk score^b	0.69 (0.68-0.72)	23 (20-27)	−0.19	0.83
LR model (race as a covariate)^b	0.70 (0.67-0.73)	28 (25-31)	−0.16	0.82
Race-specific LR model^b	0.74 (0.73-0.76)	24 (20-27)	−0.10	0.91

Abbreviations: ARIC, Atherosclerosis Risk in Communities; GWTG-HF, Get With The Guidelines–Heart Failure; LR, logistic regression; ML, machine learning.

^a A higher C index and lower Brier score indicate better performance. Among calibration measures, an intercept closer to 0 and a slope closer to 1 indicate better calibration.

^b Indicates a significant difference in C index (DeLong test P value <.005) compared with the race-specific ML model.
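
For readers who want to see how the four metrics in the table are typically computed, the following is a minimal sketch in plain Python. It is not the authors' code. The toy outcomes and predicted risks are invented for illustration, the function names are hypothetical, and the calibration intercept and slope are obtained here by jointly fitting a logistic regression of the outcome on the logit of the predicted risk, which is an assumption (some analyses instead estimate the intercept with the slope fixed at 1).

```python
import math

def c_index(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as 0.5."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))

def brier_score(y, p):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def calibration_intercept_slope(y, p, iters=25):
    """Fit logit(P(y = 1)) = a + b * logit(p) by Newton-Raphson.
    Returns (a, b); a near 0 and b near 1 suggest good calibration.
    Assumes every p is strictly between 0 and 1 and the outcomes are
    not perfectly separated by p."""
    x = [math.log(pi / (1 - pi)) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu            # gradient with respect to a
            g1 += (yi - mu) * xi     # gradient with respect to b
            h00 += w                 # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Toy data, invented for illustration only (1 = died in hospital).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
risk = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

print("C index:", c_index(y_true, risk))
print("Brier score:", brier_score(y_true, risk))
print("Calibration (intercept, slope):", calibration_intercept_slope(y_true, risk))
```

The pairwise C index loop is quadratic in sample size, which is fine for a sketch but slow for large cohorts. Confidence intervals such as those in the table would need an additional step (for example, bootstrapping), which is not shown here.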
