2019 Sep 4;10(4):655–669. doi: 10.1055/s-0039-1695791

Table 4. Final scores: Precision (P), recall (R), and F1 scores at initial and final model revisions aggregated over 15 participants.

           Initial           Final
                             P                          R                          F1
           P     R     F1    Range         Mean         Range         Mean         Range         Mean
Reports    0.90  0.19  0.31  [0.67, 0.90]  0.77 ± 0.06  [0.62, 0.81]  0.72 ± 0.05  [0.70, 0.79]  0.75 ± 0.03
Sections   0.86  0.20  0.32  [0.73, 0.86]  0.79 ± 0.04  [0.45, 0.68]  0.60 ± 0.07  [0.57, 0.73]  0.68 ± 0.04
Sentences  0.84  0.13  0.22  [0.75, 0.88]  0.80 ± 0.04  [0.36, 0.62]  0.48 ± 0.06  [0.50, 0.68]  0.60 ± 0.04

Note: The initial model was trained on the same six encounters to bootstrap the learning cycle.
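
As a reading aid for the table, the sketch below recomputes F1 from the initial-model P and R values, assuming the standard definition of F1 as the harmonic mean of precision and recall. The function and variable names (`f1_score`, `initial_rows`, `recompute`) are illustrative and do not come from the article. Because the table reports rounded values, the recomputed scores may differ from the reported ones in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Initial-model rows copied from Table 4: (P, R, reported F1).
initial_rows = {
    "Reports": (0.90, 0.19, 0.31),
    "Sections": (0.86, 0.20, 0.32),
    "Sentences": (0.84, 0.13, 0.22),
}


def recompute(rows):
    """Recompute F1 for each row from its (rounded) P and R values."""
    return {name: f1_score(p, r) for name, (p, r, _) in rows.items()}


if __name__ == "__main__":
    for name, f1 in recompute(initial_rows).items():
        reported = initial_rows[name][2]
        print(f"{name}: recomputed {f1:.3f}, reported {reported:.2f}")
```

The same check does not apply directly to the final-revision means: an average of per-participant F1 scores is generally not equal to the F1 computed from the averaged P and R.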