TABLE 2. Performance of Current BioNER Models on NCBI-Disease, BC5CDRdis, and BC5CDRchem. The Best Scores are Highlighted in Bold, and the Second Best Scores are Underlined.
| Training | Model | Test: Overall P | Test: Overall R | Test: Overall F1 | Test: In-depth (1) | Test: In-depth (2) | Test: In-depth (3) | Test: COVID-19 |
|---|---|---|---|---|---|---|---|---|
| NCBI-disease | PubMedBERT [13] | 86.6 | 88.9 | 87.7 | 94.5 | 81.1 | 77.7 | 36.0 |
| | BlueBERT [14] | 85.8 | 89.5 | 87.6 | 95.9 | 80.4 | 76.7 | 13.8 |
| | BioBERT [7] | 86.7 | 90.5 | 88.6 | 95.5 | 80.9 | 84.1 | 45.7 |
| | BERT [10] | 83.8 | 87.6 | 85.6 | 95.0 | 79.0 | 70.7 | 18.7 |
| | DICTsyn | 50.7 | 58.0 | 54.1 | 88.5 | 13.8 | 0.0 | 0.0 |
| | DICTtrain | 52.7 | 55.4 | 54.0 | 88.8 | 0.0 | 0.0 | 0.0 |
| BC5CDRdis | PubMedBERT [13] | 83.1 | 87.3 | 85.2 | 93.1 | 78.3 | 75.7 | 2.2 |
| | BlueBERT [14] | 82.2 | 86.6 | 84.4 | 93.2 | 76.9 | 72.7 | 0.8 |
| | BioBERT [7] | 82.4 | 86.3 | 84.3 | 93.3 | 74.9 | 73.7 | 3.4 |
| | BERT [10] | 78.5 | 81.4 | 79.9 | 91.5 | 64.0 | 63.4 | 0.8 |
| | DICTsyn | 75.4 | 67.8 | 71.4 | 96.0 | 32.9 | 0.0 | 0.0 |
| | DICTtrain | 75.9 | 61.4 | 67.8 | 96.7 | 0.0 | 0.0 | 0.0 |
| BC5CDRchem | PubMedBERT [13] | 92.1 | 94.2 | 93.1 | 98.3 | 85.5 | 88.2 | - |
| | BlueBERT [14] | 92.8 | 92.9 | 92.8 | 98.0 | 81.6 | 86.0 | - |
| | BioBERT [7] | 92.1 | 93.1 | 92.6 | 97.8 | 82.1 | 87.0 | - |
| | BERT [10] | 89.8 | 90.0 | 89.9 | 97.0 | 72.4 | 81.1 | - |
| | DICTsyn | 71.5 | 62.2 | 66.5 | 95.9 | 32.7 | 1.4 | - |
| | DICTtrain | 71.2 | 58.8 | 64.6 | 96.2 | 0.0 | 0.0 | - |
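As a reading aid for the Overall columns of Table 2, the sketch below recomputes F1 as the harmonic mean of precision and recall for three NCBI-disease rows. It assumes the reported F1 follows the standard definition F1 = 2PR / (P + R); the helper name `f1_score` and the 0.15 tolerance are illustrative choices, not taken from the paper. Because P and R are already rounded to one decimal place, the recomputed value can differ slightly from the reported one.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from the NCBI-disease rows of Table 2.
rows = {
    "PubMedBERT": (86.6, 88.9, 87.7),
    "BioBERT": (86.7, 90.5, 88.6),
    "BERT": (83.8, 87.6, 85.6),
}

for model, (p, r, reported) in rows.items():
    recomputed = f1_score(p, r)
    # Hypothetical tolerance: inputs are rounded, so allow a small gap.
    assert abs(recomputed - reported) < 0.15, model
    print(f"{model}: recomputed {recomputed:.2f}, reported {reported}")
```

The same check can be applied to the BC5CDRdis and BC5CDRchem rows by extending `rows` with their P, R, and F1 values.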


