Table 5.
Baseline results of different language models on the biomedical NER tasks.<sup>7</sup>
| Dataset | GPT-3.5 | Llama-2 | Claude-2 |
|---|---|---|---|
| NCBI-disease | 33.39 | 4.58 | 45.75 |
| BC2GM | 31.99 | 5.95 | 40.45 |
| BC5CDR-chem | 41.25 | 12.21 | 58.05 |
| BC5CDR-disease | 32.26 | 5.68 | 50.13 |
| JNLPBA | 31.89 | 4.30 | 34.62 |