Table 11. Comparison of PubMedBERT fine-tuned models, scispaCy and Stanza on BLURB named-entity recognition tasks (relaxed entity-level test F1 score, with overlap counted as correct).
| | scispaCy | | | Stanza | | | | | PubMedBERT | |
|---|---|---|---|---|---|---|---|---|---|---|
| Task | jnlpba | bc5cdr | max | jnlpba | bc5cdr | ncbi-disease | bc4chemd | max | | |
| BC5-chem | 7.70 | 91.42 | 91.42 | – | 94.31 | – | 92.61 | 94.31 | 95.18 | 95.37∗ |
| BC5-disease | 2.09 | 88.88 | 88.88 | – | 92.10 | 77.79 | – | 92.10 | 93.34 | 93.74∗ |
| NCBI-disease | 12.94 | 74.64 | 74.64 | – | 81.83 | 93.37 | – | 93.37 | 95.22 | 95.24∗ |
| BC2GM | 68.87 | 15.92 | 68.87 | – | – | – | – | – | 95.56 | 96.05∗ |
| JNLPBA | 87.25 | 20.50 | 87.25 | 88.33 | – | – | – | 88.33 | 88.79 | 88.81∗ |
| Mean Score | 35.77 | 58.27 | 82.21 | 88.33 | 89.42 | 85.58 | 92.61 | 92.02 | 93.62 | 93.84∗ |
∗ Highest performance for task (row).
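
To make the relaxed criterion in the caption concrete, the following is a minimal sketch of an overlap-based entity-level F1. It is an illustration under stated assumptions, not the evaluation script behind the table: the span representation, the requirement that entity types match, and the function names `overlaps` and `relaxed_f1` are all hypothetical choices made here.

```python
from typing import List, Tuple

# A span is (start, end_exclusive, entity_type). This layout is an assumption.
Span = Tuple[int, int, str]


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share at least one position and have the same type."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def relaxed_f1(gold: List[Span], pred: List[Span]) -> float:
    """Entity-level F1 where any overlap with a same-type span counts as correct."""
    # Precision: fraction of predicted spans that overlap some gold span.
    pred_hits = sum(any(overlaps(p, g) for g in gold) for p in pred)
    # Recall: fraction of gold spans that overlap some predicted span.
    gold_hits = sum(any(overlaps(g, p) for p in pred) for g in gold)
    precision = pred_hits / len(pred) if pred else 0.0
    recall = gold_hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one partial overlap, one miss on each side.
gold = [(0, 2, "Chemical"), (5, 6, "Disease")]
pred = [(1, 2, "Chemical"), (8, 9, "Disease")]
score = relaxed_f1(gold, pred)
```

In this example the first predicted span covers only part of the gold chemical mention but still counts as a hit, so precision and recall are both 0.5. Under strict boundary matching the same prediction would score zero.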