2021 Apr 21;6:88. [Version 1] doi: 10.12688/wellcomeopenres.16718.1

Table 3. Summary table with performance metrics reported as median (95% CI) and F₁ interquartile variance (IQV) after 200 bootstrap iterations.

The performance metrics are compared across pipelines using different fields from PubMed entries.

Pipeline	Precision (%)	Recall (%)	F₁ (%)	F₁ IQV
Title	65.3 (59.0,72.7)	65.8 (55.0,72.8)	65.0 (59.5,71.0)	11.5
Abstract	77.0 (69.8,82.6)	79.8 (73.4,86.1)	78.2 (73.6,82.6)	9.0
Authors *	76.4 (69.4,82.6)	80.4 (72.8,86.1)	78.2 (73.3,82.6)	9.3
Journal *	76.4 (70.2,82.2)	79.8 (72.8,85.4)	78.0 (73.6,82.0)	8.4
Publication Type *	78.0 (71.6,84.3)	81.6 (74.7,87.4)	79.6 (75.5,84.2)	8.7
Keywords *	76.6 (70.2,83.0)	80.4 (72.8,85.5)	78.2 (73.8,82.2)	8.4
MeSH terms *	79.2 (72.1,85.2)	79.8 (72.8,86.1)	79.5 (74.3,83.3)	9.0
Chemicals *	76.0 (69.5,81.9)	80.4 (73.4,86.1)	77.8 (73.0,82.0)	9.0
Affiliations *	76.6 (69.8,82.1)	80.4 (72.8,86.7)	78.3 (73.2,81.9)	8.7
All fields	80.1 (73.0,86.1)	81.6 (74.1,87.4)	80.5 (75.7,84.9)	9.2
Optimal Fields **	80.1 (73.9,86.0)	82.3 (74.1,88.6)	80.6 (75.8,85.2)	9.4

*Tokens from the title and abstract were also included when encoding this field.

**The optimal fields were the title, abstract, MeSH terms and publication type.
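The summary statistics in the table can be illustrated with a minimal bootstrap sketch. It is not the authors' code. It assumes binary labels (1 = positive), resamples a fixed set of test predictions with replacement, and takes percentile intervals; the source does not state any of these details. The function names (`precision_recall_f1`, `percentile`, `bootstrap_summary`) are hypothetical. In every row of the table the reported F₁ IQV equals the width of the reported F₁ interval (for the Title pipeline, 71.0 − 59.5 = 11.5), so the sketch reports that width as the spread; this is an inference from the numbers, not a definition given in the source.

```python
import random
from statistics import median


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) in percent for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    recall = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def percentile(values, q):
    """Linearly interpolated percentile of a list, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def bootstrap_summary(y_true, y_pred, n_iter=200, seed=0):
    """Median, 2.5th/97.5th percentile interval and interval width per metric.

    Assumption: each iteration resamples the evaluation examples with
    replacement and recomputes the metrics on the resampled predictions.
    """
    rng = random.Random(seed)
    n = len(y_true)
    samples = {"precision": [], "recall": [], "f1": []}
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]
        p, r, f = precision_recall_f1(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
        samples["precision"].append(p)
        samples["recall"].append(r)
        samples["f1"].append(f)

    summary = {}
    for name, vals in samples.items():
        lower, upper = percentile(vals, 2.5), percentile(vals, 97.5)
        summary[name] = {
            "median": median(vals),
            "ci": (lower, upper),
            "width": upper - lower,  # spread of the bootstrap distribution
        }
    return summary


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0]
    for metric, stats in bootstrap_summary(y_true, y_pred).items():
        lo, hi = stats["ci"]
        print(f"{metric}: {stats['median']:.1f} ({lo:.1f},{hi:.1f}) width {stats['width']:.1f}")
```

Fixing the seed makes the summary reproducible. With only 200 iterations the percentile bounds are themselves somewhat noisy, which is worth keeping in mind when comparing pipelines whose intervals overlap as heavily as those in the table.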
