2018 Jan 10;25(12):1586–1592. doi: 10.1093/jamia/ocx093

Table 1.

Model performance on “Sick” task

Model                Training Regime  Precision  Recall  F1-Score (95% CI)  AUPR (95% CI)
J4.8                 Prototype        0.48       0.99    0.65 (0.63-0.67)   0.83 (0.81-0.85)
Logistic regression  Biased           0.05       0.94    0.10 (0.09-0.11)   0.63 (0.55-0.76)
Logistic regression  Gold             0.83       0.88    0.85 (0.83-0.87)   0.90 (0.88-0.92)
Logistic regression  Silver           0.85       0.88    0.87 (0.85-0.88)   0.91 (0.90-0.93)
Random forest        Biased           0.04       0.91    0.07 (0.06-0.09)   0.59 (0.54-0.70)
Random forest        Gold             0.36       0.89    0.51 (0.38-0.68)   0.81 (0.78-0.84)
Random forest        Silver           0.70       0.88    0.78 (0.66-0.85)   0.87 (0.85-0.89)
SVM                  Biased           0.09       0.95    0.16 (0.13-0.20)   0.82 (0.79-0.87)
SVM                  Gold             0.33       0.93    0.49 (0.37-0.67)   0.88 (0.85-0.91)
SVM                  Silver           0.96       0.74    0.83 (0.81-0.85)   0.93 (0.92-0.95)

The underlined value represents the final selected model from among the variants. This is the model we further analyze in the error analysis. Because the bootstrap distribution of some test statistics exhibited non-normal behavior, their corresponding confidence intervals are wider.
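
As an illustration of how bootstrap confidence intervals like those in the table can be obtained, the following is a minimal sketch. It assumes a percentile bootstrap over test-set examples; the table note does not say which bootstrap variant was used, so the resampling scheme, the function names (`f1_from_pr`, `f1_score`, `bootstrap_ci`), and the default parameters are all hypothetical choices made for this example.

```python
import random


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall: F1 = 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_score(y_true, y_pred):
    """F1 for binary labels, where 1 marks the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return f1_from_pr(tp / (tp + fp), tp / (tp + fn))


def bootstrap_ci(y_true, y_pred, metric, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (an assumed variant, not confirmed by the source).

    Resamples examples with replacement, recomputes the metric on each
    resample, and returns the alpha/2 and 1 - alpha/2 empirical quantiles.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Toy labels, invented purely for this example (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1] * 20
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1, 0, 1] * 20
toy_ci = bootstrap_ci(toy_true, toy_pred, f1_score)
```

Because the percentile method reads the interval directly from the empirical quantiles of the resampled statistic, a skewed or otherwise non-normal bootstrap distribution produces an asymmetric and often wider interval. The helper `f1_from_pr` can also be used to check table rows against their reported precision and recall: for the J4.8 prototype, `f1_from_pr(0.48, 0.99)` rounds to the reported 0.65. Small mismatches in other rows are expected, since the published precision and recall are themselves rounded.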