Table 3.
Performance metrics of various models using the SGE feature set evaluated on the CT-ADE-SOC28 test split. All metrics are micro-averaged.
| Model Type | Parameters (×10⁹) | Backbone | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) | Balanced Accuracy (%) |
|---|---|---|---|---|---|---|---|
| MAJ | 0 | — | 0.00 | 0.00 | 0.00 | 88.68 | 50.00 |
| Discriminative | 0.11 | ChemBERTa & PubMedBERT | 51.65 | 55.40 | 53.46 | 89.08 | 74.39 |
| Generative | 7 – 8 | Meditron | 52.82 | 53.84 | 53.32 | 89.33 | 73.85 |
| Generative | 7 – 8 | OpenBioLLM | 52.18 | 54.75 | 53.43 | 89.20 | 74.17 |
| Generative | 7 – 8 | Llama-3 | 53.60 | 58.42 | 55.90 | 89.57 | 75.98 |
| Generative | 70 | Meditron | 61.01 | 44.10 | 51.20 | 90.49 | 70.25 |
| Generative | 70 | OpenBioLLM | 60.28 | 42.42 | 49.79 | 90.32 | 69.42 |
| Generative | 70 | Llama-3 | 62.09 | 49.30 | 54.96 | 90.86 | 72.73 |
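
For readers who want to recompute or sanity-check the columns above, the following is a minimal sketch of micro-averaged metrics for a multilabel task. It assumes binary 0/1 label matrices and pools every (sample, label) decision into a single confusion matrix. The helper names `micro_metrics` and `f1_from` are hypothetical, and defining balanced accuracy as the mean of pooled recall and pooled specificity is an assumption, not a definition confirmed by the source.

```python
from typing import Dict, List


def _ratio(num: float, den: float) -> float:
    """Safe division: return 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    return _ratio(2 * precision * recall, precision + recall)


def micro_metrics(y_true: List[List[int]], y_pred: List[List[int]]) -> Dict[str, float]:
    """Micro-averaged metrics; inputs are assumed to be 0/1 matrices of equal shape."""
    tp = fp = fn = tn = 0
    # Pool every (sample, label) decision into one confusion matrix.
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
            else:
                tn += 1

    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        # Assumed definition: mean of pooled recall and pooled specificity.
        "balanced_accuracy": (recall + specificity) / 2,
    }
```

Under these assumptions, a predictor that outputs all zeros scores 0 for precision, recall, and F1 and 0.5 for balanced accuracy, which is the pattern seen in the MAJ row. As a consistency check on the reported values, `f1_from(51.65, 55.40)` rounds to 53.46, matching the Discriminative row.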