Table 2.
Method | Letter set | Annotation time, first annotator (minutes) | Annotation time, second annotator (minutes) | Mean time (minutes) | Annotated items N | Pairwise F-measure
---|---|---|---|---|---|---
Manual | 1 | 41.24 | 51.22 | 46.23 | 85 | 0.96
Manual | 2 | 35.36 | 63 | 49.18 | 96 | 0.89
Predictive | 1 | 32.33 | 40.46 | 36.40395 | 91 | 0.93
Predictive | 2 | 33.00 | 41 | 37.00 | 96 | 0.88
Mean decrease in time | 14.2% (11.01 min) (CI 0.62–0.87, p < 0.001) | | | | |
Mean decrease in F | 0.02 (95% CI 0.05–1, p = 1) | | | | |
Each letter set contained 25 pseudo-anonymised letters. Each of 4 annotators manually annotated 1 set and used annotation predictions to annotate the other set (see Method section Comparing Active Learning and Manual Annotation). Pairwise F-measure is defined as the average of F-measures between multiple annotators where each annotator is sequentially selected as the gold standard annotator.
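
The pairwise F-measure defined above can be illustrated with a minimal sketch. It assumes each annotator's output is a set of `(start, end, label)` tuples and that two annotations agree only when they match exactly; the matching criterion, the function names, and the example data are illustrative assumptions rather than details taken from the study.

```python
from itertools import permutations


def f_measure(gold, predicted):
    """F-measure of `predicted` against `gold` (both are sets of annotations)."""
    gold, predicted = set(gold), set(predicted)
    if not gold and not predicted:
        return 1.0  # assumption: two empty annotation sets agree perfectly
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def pairwise_f_measure(annotations):
    """Average F-measure over all ordered annotator pairs.

    `annotations` maps an annotator name to that annotator's set of items.
    Each annotator takes a turn as the gold standard for every other one.
    """
    pairs = list(permutations(annotations, 2))
    if not pairs:
        raise ValueError("at least two annotators are required")
    scores = [f_measure(annotations[g], annotations[p]) for g, p in pairs]
    return sum(scores) / len(scores)


# Hypothetical example data: exact-match (start, end, label) annotations.
example = {
    "annotator_a": {(0, 5, "NAME"), (10, 14, "DATE"), (20, 25, "PLACE")},
    "annotator_b": {(0, 5, "NAME"), (10, 14, "DATE")},
}
score = pairwise_f_measure(example)
```

With exact matching, swapping the gold and predicted roles only swaps precision and recall, so the F-measure for a pair is the same in both directions; averaging over ordered pairs then matters only if a non-symmetric matching rule is used.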