Table 5.
Results for BERT and Llama 2 and 3 in the test of
Cohort | BERT: Recall (accuracy) | BERT: Precision | BERT: Precision (outside) | BERT: F1 | BERT: F1 (outside) | Llama 2: Exact match | Llama 2: 80% match | Llama 3: Exact match | Llama 3: 80% match |
---|---|---|---|---|---|---|---|---|---|
Superhero | 0.85 | 0.54 | 0.98 | 0.66 | 0.91 | 0.00 | 0.03 | 0.02 | 0.03 |
Dinosaur | 0.90 | 0.53 | 0.96 | 0.67 | 0.93 | 0.02 | 0.03 | 0.02 | 0.03 |
Mammal | 0.87 | 0.51 | 0.98 | 0.64 | 0.92 | 0.04 | 0.06 | 0.02 | 0.04 |
Bird | 0.86 | 0.51 | 0.93 | 0.64 | 0.89 | 0.02 | 0.02 | 0.02 | 0.03 |
“Precision (outside)” is the precision measured when the negative samples contain names outside the union of A, B, and C. “F1 (outside)” is the F1 score computed from Recall and Precision (outside).
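As a quick consistency check on the two F1 columns, the sketch below recomputes them from the reported recall and precision values, assuming the standard definition of F1 as the harmonic mean of recall and precision. The names `f1_score`, `bert_rows`, and `recompute` are illustrative and do not come from the original work; the inputs are the two-decimal figures from the BERT columns, so the recomputed values can only be expected to agree with the table up to rounding.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (0.0 when both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision, precision_outside), copied from the BERT columns of Table 5.
bert_rows = {
    "Superhero": (0.85, 0.54, 0.98),
    "Dinosaur": (0.90, 0.53, 0.96),
    "Mammal": (0.87, 0.51, 0.98),
    "Bird": (0.86, 0.51, 0.93),
}


def recompute(rows):
    """Return {cohort: (F1, F1 outside)} rounded to two decimals."""
    return {
        cohort: (
            round(f1_score(recall, precision), 2),
            round(f1_score(recall, precision_outside), 2),
        )
        for cohort, (recall, precision, precision_outside) in rows.items()
    }


if __name__ == "__main__":
    for cohort, (f1, f1_outside) in recompute(bert_rows).items():
        print(f"{cohort}: F1={f1:.2f}, F1 (outside)={f1_outside:.2f}")
```

Under this assumption, the recomputed F1 and F1 (outside) values match the reported ones for all four cohorts to two decimal places.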