[Preprint]. 2023 Oct 17:arXiv:2306.10070v2. [Version 2]

Table 4. Performance of LLMs on relation extraction (RE) compared with the state of the art (SOTA) on selected datasets (F1-score in %).

LM Method BC5CDR CHEMPROT DDI GAD
SOTA Task fine-tuning 57.03 77.24 82.36 83.96
BioGPT Task fine-tuning and few-shot 46.17 40.76
GPT-3 Few-shot 25.90 16.10 66.00
SPIRES Zero-shot 40.65
GPT-3.5 Zero-shot 57.43 33.49
GPT-3.5 One-shot 61.91 34.40
ChatGPT Zero-shot or few-shot 34.16 51.62 52.43
GPT-4 Zero-shot 66.18 63.25
GPT-4 One-shot 65.43 65.58