JAMA Netw Open. 2020 Sep 16;3(9):e2012734. doi: 10.1001/jamanetworkopen.2020.12734

Table 2. Confusion Matrix Metrics for Random Forest and Logistic Regression Models.

| Population at highest risk, %^a | Specificity, %: Random forest | Specificity, %: Logistic regression | Specificity, %: Difference (95% CI)^b | Sensitivity, %: Random forest | Sensitivity, %: Logistic regression | Sensitivity, %: Difference (95% CI)^b | PPV, %: Random forest | PPV, %: Logistic regression | PPV, %: Difference (95% CI)^b |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | 95.5 | 95.1 | 0.4 (0.0 to 0.7) | 16.2 | 8.1 | 8.1 (3.9 to 11.7) | 15.5 | 7.8 | 7.7 (3.7 to 11.3) |
| 10 | 90.4 | 90.1 | 0.2 (−0.2 to 0.7) | 27.3 | 19.9 | 7.4 (3.0 to 14.6) | 12.7 | 9.4 | 3.3 (1.3 to 6.7) |
| 20 | 80.3 | 79.9 | 0.3 (−0.1 to 1.4) | 42.4 | 38.4 | 4.1 (−1.1 to 12.5) | 9.9 | 8.9 | 1.0 (−0.1 to 3.0) |

Abbreviation: PPV, positive predictive value.

^a Binary predictions are obtained from continuous risk scores by classifying this highest-risk percentage as positive.

^b The 95% CIs were estimated using 10 000 bootstrap replications.
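
The two footnotes describe procedures that can be illustrated in code. The following is a minimal sketch in Python, not the authors' implementation. The function names, the tie-breaking rule, the decision to recompute the top-percentage cutoff inside every bootstrap replicate, and the use of a simple percentile interval are all assumptions made for illustration; the source states only that the highest-risk percentage is classified as positive and that 10 000 bootstrap replications were used.

```python
import math
import random


def classify_top_percent(scores, percent):
    """Flag the `percent`% of cases with the highest risk scores as positive (1)."""
    n_pos = round(len(scores) * percent / 100)
    # Sort indices by descending score; ties keep their original order (assumption).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    flagged = set(order[:n_pos])
    return [1 if i in flagged else 0 for i in range(len(scores))]


def confusion_metrics(y_true, y_pred):
    """Return specificity, sensitivity, and PPV as percentages."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def pct(num, den):
        return 100.0 * num / den if den else float("nan")

    return {
        "specificity": pct(tn, tn + fp),
        "sensitivity": pct(tp, tp + fn),
        "ppv": pct(tp, tp + fp),
    }


def bootstrap_difference_ci(y_true, scores_a, scores_b, percent, metric,
                            n_boot=10_000, seed=0):
    """Percentile 95% CI for metric(model A) minus metric(model B).

    Cases are resampled with replacement; both models are evaluated on the
    same resample so the difference is paired (assumption).
    """
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [y_true[i] for i in idx]
        pred_a = classify_top_percent([scores_a[i] for i in idx], percent)
        pred_b = classify_top_percent([scores_b[i] for i in idx], percent)
        a = confusion_metrics(y, pred_a)[metric]
        b = confusion_metrics(y, pred_b)[metric]
        # Skip replicates where the metric is undefined (e.g. no positive cases).
        if not (math.isnan(a) or math.isnan(b)):
            diffs.append(a - b)
    if not diffs:
        raise ValueError("metric was undefined in every bootstrap replicate")
    diffs.sort()
    lower = diffs[int(0.025 * (len(diffs) - 1))]
    upper = diffs[int(0.975 * (len(diffs) - 1))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical toy data, unrelated to the study cohort.
    outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
    model_a = [0.9, 0.2, 0.4, 0.8, 0.1, 0.3, 0.5, 0.7, 0.2, 0.1]
    model_b = [0.6, 0.3, 0.7, 0.5, 0.2, 0.4, 0.8, 0.3, 0.1, 0.2]
    print(confusion_metrics(outcomes, classify_top_percent(model_a, 20)))
    print(bootstrap_difference_ci(outcomes, model_a, model_b, 20,
                                  "sensitivity", n_boot=1000))
```

Because the number of flagged cases is fixed by the chosen percentage, raising that percentage can only increase sensitivity and decrease specificity, which matches the direction of the 5%, 10%, and 20% rows in the table.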