. 2025 Jul 31;25:184. doi: 10.1186/s12874-025-02624-z

Table 2.

Out-of-domain evaluation metrics of PubMedBERT fine-tuned on the full SURUS dataset

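| Label class | Interventional Precision | Interventional Recall | Interventional F1 | Interventional Support | Observational Precision | Observational Recall | Observational F1 | Observational Support |
|---|---|---|---|---|---|---|---|---|
| Disease | 0.99 | 0.90 | 0.94 | 664 | 0.95 | 0.87 | 0.91 | 302 |
| Drug | 0.91 | 0.85 | 0.87 | 4,759 | 0.81 | 0.74 | 0.76 | 338 |
| Id | 1.00 | 0.98 | 0.99 | 341 | 1.00 | 1.00 | 1.00 | 15 |
| Methodology | 0.96 | 0.89 | 0.92 | 3,851 | 0.91 | 0.77 | 0.82 | 1,627 |
| Parameter | 0.83 | 0.76 | 0.79 | 3,003 | 0.78 | 0.68 | 0.73 | 1,345 |
| Result | 0.96 | 0.96 | 0.96 | 5,164 | 0.93 | 0.91 | 0.92 | 1,976 |
| Therapy | 0.97 | 0.85 | 0.90 | 1,273 | 0.33 | 0.50 | 0.40 | 2 |
| Weighted Mean | 0.93 | 0.88 | 0.90 | 19,055 | 0.88 | 0.80 | 0.84 | 5,605 |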
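
The Weighted Mean row appears to be a support-weighted average of the per-class scores. The table does not state this, so treat it as an assumption. The following minimal sketch recomputes the F1 column under that assumption, using only the rounded values shown in the table. The function name `weighted_mean` and the variable names are illustrative and do not come from the paper.

```python
# Sketch: support-weighted mean of per-class scores.
# Assumption: the "Weighted Mean" row weights each label class by its support.
# Inputs are the rounded F1 and support values copied from the table, in the
# order Disease, Drug, Id, Methodology, Parameter, Result, Therapy.

def weighted_mean(scores, supports):
    """Return sum(score_i * support_i) / sum(support_i)."""
    if len(scores) != len(supports):
        raise ValueError("scores and supports must have the same length")
    total = sum(supports)
    if total == 0:
        raise ValueError("total support must be positive")
    return sum(s * n for s, n in zip(scores, supports)) / total


interventional_f1 = [0.94, 0.87, 0.99, 0.92, 0.79, 0.96, 0.90]
interventional_support = [664, 4759, 341, 3851, 3003, 5164, 1273]

observational_f1 = [0.91, 0.76, 1.00, 0.82, 0.73, 0.92, 0.40]
observational_support = [302, 338, 15, 1627, 1345, 1976, 2]

int_f1 = weighted_mean(interventional_f1, interventional_support)
obs_f1 = weighted_mean(observational_f1, observational_support)

print(f"Interventional weighted F1: {int_f1:.2f}")
print(f"Observational weighted F1: {obs_f1:.2f}")
```

With these inputs the recomputed values round to the 0.90 and 0.84 reported in the table, and the supports sum to the reported totals of 19,055 and 5,605. Because of this weighting, the low Therapy scores in the observational set (support 2) have almost no effect on the observational mean.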