2023 Feb 18;13(2):387. doi: 10.3390/biom13020387

Table 2.

Evaluation metrics when using different classification schemes for GPT-3 synonyms and using manual labels as a proxy for ground truth.

| Index Term | GPT-3 Synonym Criteria | Precision | Recall | F1 Score | F2 Score |
|---|---|---|---|---|---|
| Alprazolam | All generated terms | 0.264 | 1.000 | 0.418 | 0.642 |
| Fentanyl | All generated terms | 0.220 | 1.000 | 0.361 | 0.585 |
| Alprazolam | All RedMed terms | 1.000 | 0.178 | 0.302 | 0.213 |
| Fentanyl | All RedMed terms | 1.000 | 0.115 | 0.206 | 0.140 |
| Alprazolam | Drug name filter | 0.285 | 0.996 | 0.443 | 0.664 |
| Fentanyl | Drug name filter | 0.232 | 1.000 | 0.377 | 0.602 |
| Alprazolam | Drug name & frequency filters | 0.567 | 0.487 | 0.524 | 0.501 |
| Fentanyl | Drug name & frequency filters | 0.521 | 0.465 | 0.491 | 0.475 |
| Alprazolam | Drug name & Google filters | 0.698 | 0.859 | 0.770 | 0.821 |
| Fentanyl | Drug name & Google filters | 0.568 | 0.793 | 0.662 | 0.735 |
| Alprazolam | Drug name, frequency, & Google filters | 0.859 | 0.431 | 0.574 | 0.479 |
| Fentanyl | Drug name, frequency, & Google filters | 0.770 | 0.395 | 0.522 | 0.438 |
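
The F1 and F2 columns can be related to the precision and recall columns with a short sketch. It assumes the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), which this table does not state explicitly; the function name `f_beta` and the `rows` dictionary are illustrative, and the precision/recall pairs are copied from two Alprazolam rows of Table 2.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (precision, recall) pairs taken from Table 2 (Alprazolam rows)
rows = {
    "All generated terms": (0.264, 1.000),
    "Drug name & Google filters": (0.698, 0.859),
}

for criteria, (p, r) in rows.items():
    f1 = f_beta(p, r, beta=1.0)
    f2 = f_beta(p, r, beta=2.0)
    print(f"{criteria}: F1={f1:.3f}, F2={f2:.3f}")
```

Under this assumption the recomputed values for these two rows agree with the reported F1 and F2 scores to three decimals. Because the tabulated precision and recall are themselves rounded, other rows may differ in the last digit when recomputed this way.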