Author manuscript; available in PMC: 2024 Jan 1.

Published in final edited form as: Am J Kidney Dis. 2022 Jul 19;81(1):36–47. doi: 10.1053/j.ajkd.2022.06.004

Table 2.

Predictive performance of models for hospital mortality prediction in the derivation (UKY) and validation (UTSW) cohorts.

Performance metrics (95% CI)	SOFA Score		APACHE II Score		Clinical model*
Performance metrics (95% CI)	Derivation Cohort (UKY)	Validation Cohort (UTSW)	Derivation Cohort (UKY)	Validation Cohort (UTSW)	Derivation Cohort (UKY)	Validation Cohort (UTSW)
AUC	0.71 (0.71–0.71)	0.71 (0.71–0.71)	0.69 (0.68–0.69)	0.67 (0.67–0.67)	0.79 (0.79–0.80)	0.74 (0.73–0.74)
Difference in AUC (vs. SOFA)	-	-	-	-	0.08	0.03
- P value	-	-	-	-	<0.001	<0.001
Difference in AUC (vs. APACHE II)	-	-	-	-	0.10	0.07
- P value	-	-	-	-	<0.001	<0.001
Accuracy	0.65 (0.64–0.65)	0.73 (0.73–0.73)	0.65 (0.64–0.65)	0.52 (0.50–0.54)	0.71 (0.71–0.71)	0.65 (0.64–0.66)
Precision	0.34 (0.34–0.35)	0.19 (0.19–0.19)	0.34 (0.33–0.34)	0.14 (0.13–0.14)	0.41 (0.40–0.42)	0.18 (0.17–0.18)
Sensitivity	0.68 (0.67–0.68)	0.55 (0.55–0.55)	0.63 (0.62–0.64)	0.73 (0.70–0.75)	0.72 (0.72–0.73)	0.69 (0.67–0.71)
Specificity	0.64 (0.63–0.64)	0.75 (0.75–0.75)	0.65 (0.65–0.65)	0.50 (0.47–0.52)	0.71 (0.70–0.71)	0.64 (0.63–0.65)
F1	0.46 (0.45–0.46)	0.29 (0.29–0.29)	0.44 (0.43–0.45)	0.23 (0.23–0.23)	0.52 (0.52–0.53)	0.28 (0.27–0.29)
PPV	0.34 (0.34–0.35)	0.19 (0.19–0.19)	0.34 (0.33–0.34)	0.14 (0.13–0.14)	0.41 (0.40–0.42)	0.18 (0.17–0.18)
NPV	0.88 (0.87–0.88)	0.94 (0.94–0.94)	0.86 (0.86–0.87)	0.94 (0.94–0.94)	0.90 (0.90–0.90)	0.95 (0.95–0.95)
Calibration Intercept	−1.27 (−1.29 to −1.25)	−2.03 (−2.03 to −2.02)	−1.28 (−1.29 to −1.26)	−2.47 (−2.47 to −2.46)	−1.25 (−1.27 to −1.24)	−2.23 (−2.26 to −2.20)
Calibration Slope	1.02 (1.00–1.04)	1.12 (1.08–1.16)	0.99 (0.95–1.03)	0.75 (0.73–0.78)	1.13 (1.11–1.14)	1.17 (1.12–1.21)

NRI % (vs. SOFA)
- Categorical	-	-	-	-	0.12 [0.09 to 0.15]	0.05 [−0.03 to 0.13]
- P value	-	-	-	-	<0.001	0.20
NRI % (vs. APACHE II)
- Categorical	-	-	-	-	0.15 [0.12 to 0.18]	0.12 [0.04 to 0.20]
- P value	-	-	-	-	<0.001	<0.001

* The proposed clinical model included 15 features; refer to the Methods section for details. The reported performance of SOFA and APACHE II was evaluated with logistic regression, and that of the proposed clinical model with random forest.

Abbreviations: APACHE (acute physiologic assessment and chronic health evaluation), AUC (area under the curve), CI (confidence interval), F1 (F1 score), NPV (negative predictive value), NRI (net reclassification index), PPV (positive predictive value), SOFA (sequential organ failure assessment), UKY (University of Kentucky), UTSW (University of Texas Southwestern)
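
For readers who want to relate the threshold-based rows of the table to one another, the following is a minimal sketch of the standard definitions of accuracy, sensitivity, specificity, PPV, NPV, and F1 computed from a 2×2 confusion matrix. It is not the authors' code: the `ConfusionCounts` container, the `threshold_metrics` function, and the example counts are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for the four cells of a 2x2 confusion matrix."""
    tp: int  # died in hospital, predicted to die
    fp: int  # survived, predicted to die
    tn: int  # survived, predicted to survive
    fn: int  # died in hospital, predicted to survive


def threshold_metrics(c: ConfusionCounts) -> dict:
    """Return the standard threshold-based metrics for one set of counts."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = c.tp / (c.tp + c.fn)  # also called recall
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)          # identical to precision
    npv = c.tn / (c.tn + c.fn)
    # F1 is the harmonic mean of precision (PPV) and sensitivity.
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Illustrative counts only; these are NOT the UKY or UTSW data.
example = ConfusionCounts(tp=70, fp=130, tn=270, fn=30)
metrics = threshold_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because precision and PPV are the same quantity, the Precision and PPV rows of the table carry identical values. PPV and NPV also depend on the outcome rate in the cohort, so they are not directly comparable between cohorts with different mortality.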