Table 2.
Out-of-domain evaluation metrics of PubMedBERT fine-tuned on the full SURUS dataset
| Label class | Interventional Precision | Interventional Recall | Interventional F1 | Interventional Support | Observational Precision | Observational Recall | Observational F1 | Observational Support |
|---|---|---|---|---|---|---|---|---|
| Disease | 0.99 | 0.90 | 0.94 | 664 | 0.95 | 0.87 | 0.91 | 302 |
| Drug | 0.91 | 0.85 | 0.87 | 4,759 | 0.81 | 0.74 | 0.76 | 338 |
| Id | 1.00 | 0.98 | 0.99 | 341 | 1.00 | 1.00 | 1.00 | 15 |
| Methodology | 0.96 | 0.89 | 0.92 | 3,851 | 0.91 | 0.77 | 0.82 | 1,627 |
| Parameter | 0.83 | 0.76 | 0.79 | 3,003 | 0.78 | 0.68 | 0.73 | 1,345 |
| Result | 0.96 | 0.96 | 0.96 | 5,164 | 0.93 | 0.91 | 0.92 | 1,976 |
| Therapy | 0.97 | 0.85 | 0.90 | 1,273 | 0.33 | 0.50 | 0.40 | 2 |
| Weighted Mean | 0.93 | 0.88 | 0.90 | 19,055 | 0.88 | 0.80 | 0.84 | 5,605 |
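
The "Weighted Mean" row appears consistent with a support-weighted average of the per-class scores, the same convention as scikit-learn's `average="weighted"`. This is an assumption, not something the table states. The following minimal sketch recomputes the interventional summary row from the rounded per-class values above. The names `INTERVENTIONAL` and `weighted_mean` are illustrative only, and because the inputs are already rounded to two decimals, agreement should be expected only to about two decimals.

```python
# Interventional columns of Table 2: (label, precision, recall, f1, support).
# Values are copied from the table and are already rounded to two decimals.
INTERVENTIONAL = [
    ("Disease", 0.99, 0.90, 0.94, 664),
    ("Drug", 0.91, 0.85, 0.87, 4759),
    ("Id", 1.00, 0.98, 0.99, 341),
    ("Methodology", 0.96, 0.89, 0.92, 3851),
    ("Parameter", 0.83, 0.76, 0.79, 3003),
    ("Result", 0.96, 0.96, 0.96, 5164),
    ("Therapy", 0.97, 0.85, 0.90, 1273),
]


def weighted_mean(rows, column):
    """Average one metric column, weighting each class by its support.

    Assumes support-weighted averaging (hypothetical reading of the
    'Weighted Mean' row); column 1 = precision, 2 = recall, 3 = F1.
    """
    total_support = sum(row[4] for row in rows)
    return sum(row[column] * row[4] for row in rows) / total_support


total_support = sum(row[4] for row in INTERVENTIONAL)
summary = {
    name: round(weighted_mean(INTERVENTIONAL, column), 2)
    for name, column in (("precision", 1), ("recall", 2), ("f1", 3))
}
print(total_support, summary)
```

The same function can be applied to the observational columns. There, the Therapy class has a support of only 2, so its low scores have almost no effect on the weighted mean.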