Table 6.
Results | Accuracy (CI) (%) | Precision (CI) (%) | Recall (CI) (%) | F1 score (CI) (%) |
---|---|---|---|---|
Extracting clinical findings | 78.4 (70.4–86.4) | 81.8 (74.3–89.3) | 90.0 (84.2–95.8) | 85.7 (78.9–92.5) |
Determination of change on true clinical findings | 99.3 (98.2–100.0) | 97.3 (94.3–100) | 93.4 (88.6–98.2) | 95.2 (91.1–99.3) |
Determination of change on extracted clinical findings (end-to-end) | 99.2 (97.9–100.0) | 96.3 (92.8–99.8) | 93.5 (88.8–98.2) | 94.7 (90.4–99.0) |
Determination of significance on true clinical findings | 78.9 (71.0–86.8) | 79.3 (71.4–87.2) | 78.9 (71.0–86.8) | 78.8 (70.9–86.7) |
Determination of significance on extracted clinical findings (end-to-end) | 75.8 (67.5–84.1) | 75.2 (66.8–83.6) | 75.7 (67.4–84.0) | 75.3 (66.9–83.7) |