2020 Jun 28;3(2):216–224. doi: 10.1093/jamiaopen/ooaa021

Table 1.

Precision, recall, and F1 score of ClinicNet, institutional order sets, and logistic regression when thresholded to similar levels of recall

Evaluation metrics

| Models | Precision (95% CI) | Recall (95% CI) | F1 (95% CI) | AUROC (95% CI) |
| --- | --- | --- | --- | --- |
| Logistic | 0.204 (0.200–0.208) | 0.469 (0.464–0.473) | 0.285 (0.280–0.289) | 0.815 (0.812–0.817) |
| Institutional | 0.149 (0.147–0.151) | 0.463 (0.458–0.469) | 0.226 (0.223–0.228) | |
| ClinicNet | 0.317 (0.314–0.320) | 0.468 (0.463–0.472) | 0.378 (0.375–0.381) | 0.908 (0.906–0.909) |

Note: Because institutional order sets correspond to a single threshold point, AUROC is left blank. Reported CIs were obtained by bootstrapping with a sample size of 10 000 for 1000 iterations. Evaluation was performed at the patient level rather than the clinical item level. The following thresholds were used: logistic regression = 0.11, ClinicNet = 0.50. Bold indicates highest metric.

Abbreviations: AUROC: area under the receiver operating characteristic curve; CI: confidence interval.
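
The evaluation procedure described in the table note (threshold the predicted probabilities, score each patient, then bootstrap the per-patient scores to get a CI) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the toy patients, the per-patient averaging, and the percentile form of the bootstrap interval are all assumptions made here, since the note only states the sample size, the number of iterations, and the thresholds.

```python
import random


def threshold_predictions(scores, threshold):
    """Keep the items whose predicted probability is at least the threshold."""
    return {item for item, p in scores.items() if p >= threshold}


def precision_recall_f1(predicted, actual):
    """Precision, recall, and F1 for one patient's predicted vs. actual item sets."""
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def bootstrap_ci(values, sample_size, iterations, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-patient metric values.

    Assumption: the CI is taken from the empirical quantiles of the
    resampled means; the source does not say which variant was used.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(iterations):
        sample = rng.choices(values, k=sample_size)  # resample with replacement
        means.append(sum(sample) / sample_size)
    means.sort()
    lower = means[int((alpha / 2) * iterations)]
    upper = means[min(iterations - 1, int((1 - alpha / 2) * iterations))]
    return lower, upper


# Hypothetical toy data: (predicted probabilities, items actually ordered).
patients = [
    ({"cbc": 0.9, "bmp": 0.6, "ct_head": 0.2}, {"cbc", "bmp"}),
    ({"cbc": 0.7, "troponin": 0.55, "ecg": 0.4}, {"troponin", "ecg"}),
    ({"lactate": 0.3, "blood_culture": 0.8}, {"blood_culture", "lactate"}),
]

# 0.50 is the ClinicNet threshold quoted in the note.
per_patient_f1 = [
    precision_recall_f1(threshold_predictions(scores, 0.50), actual)[2]
    for scores, actual in patients
]

# The note reports a sample size of 10 000 and 1000 iterations;
# much smaller values are used here only to keep the toy example fast.
f1_ci = bootstrap_ci(per_patient_f1, sample_size=100, iterations=200)
```

Because the metrics in this sketch are averaged over patients, the mean F1 need not equal the harmonic mean of the mean precision and mean recall.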