2021 Jul 26;3:598916. doi: 10.3389/fdgth.2021.598916

Table 2.

Comparing manual and predictive annotation.

Method	Letter set	Annotation time, first annotator (minutes)	Annotation time, second annotator (minutes)	Mean time (minutes)	Annotated items (N)	Pairwise F-measure
Manual	1	41.24	51.22	46.23	85	0.96
	2	35.36	63	49.18	96	0.89
Predictive	1	32.33	40.46	36.40	91	0.93
	2	33.00	41	37.00	96	0.88
			Mean decrease in time		14.2% (11.01 min; CI 0.62–0.87, p < 0.001)
			Mean decrease in F		0.02 (95% CI 0.05–1, p = 1)

Each letter set contained 25 pseudo-anonymised letters. Each of the 4 annotators manually annotated 1 set and used annotation predictions to annotate the other set (see Method section Comparing Active Learning and Manual Annotation). Pairwise F-measure is the average of the F-measures between annotators, with each annotator selected in turn as the gold standard.
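The pairwise F-measure defined above can be illustrated with a minimal Python sketch. It rests on assumptions that the table does not state: each annotation is represented as a hashable tuple (here a hypothetical `(letter_id, start, end, label)`), two annotations agree only on an exact match, and the F-measure is the balanced F1. The function names `f_measure` and `pairwise_f_measure` are illustrative and are not taken from the study.

```python
from itertools import permutations


def f_measure(gold, predicted):
    """Balanced F-measure (F1) of `predicted` against `gold`.

    Both arguments are collections of hashable annotation items.
    Exact-match agreement is an assumption of this sketch.
    """
    gold, predicted = set(gold), set(predicted)
    if not gold and not predicted:
        return 1.0  # two empty annotation sets agree trivially
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def pairwise_f_measure(annotations):
    """Average F-measure over all ordered (gold, other) annotator pairs.

    `annotations` maps an annotator name to that annotator's set of items.
    Each annotator takes the gold standard role in turn.
    """
    if len(annotations) < 2:
        raise ValueError("at least two annotators are required")
    scores = [
        f_measure(annotations[gold], annotations[other])
        for gold, other in permutations(annotations, 2)
    ]
    return sum(scores) / len(scores)


# Hypothetical annotations: (letter_id, start, end, label)
first = {("l1", 0, 5, "Dx"), ("l1", 10, 18, "Rx"), ("l2", 3, 9, "Dx")}
second = {("l1", 0, 5, "Dx"), ("l1", 10, 18, "Rx"), ("l2", 4, 9, "Dx")}

score = pairwise_f_measure({"first": first, "second": second})
print(round(score, 2))  # 0.67
```

With exact matching, swapping the gold standard only swaps precision and recall, so F1 is unchanged and each pair contributes the same value in both directions. Averaging over ordered pairs is kept here to mirror the wording of the definition.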