Table 4.
Results for BERT and Llama 2 and 3 in the test of
Cohort | BERT: Recall (accuracy) | BERT: Precision | BERT: Precision (outside) | BERT: F1 | BERT: F1 (outside) | Llama 2: Exact match | Llama 2: 80% match | Llama 3: Exact match | Llama 3: 80% match |
---|---|---|---|---|---|---|---|---|---|
Superhero | 0.95 | 0.51 | 0.96 | 0.66 | 0.95 | 0.17 | 0.17 | 0.16 | 0.17 |
Dinosaur | 0.98 | 0.50 | 0.95 | 0.66 | 0.96 | 0.18 | 0.20 | 0.16 | 0.17 |
Mammal | 0.96 | 0.52 | 0.98 | 0.67 | 0.97 | 0.14 | 0.16 | 0.13 | 0.18 |
Bird | 1.00 | 0.52 | 0.96 | 0.68 | 0.98 | 0.10 | 0.12 | 0.14 | 0.13 |
“Precision (outside)” is the precision measured when the negative samples contain names outside the union of A, B, and C. “F1 (outside)” is the F1 score computed from Recall and Precision (outside).
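As a sanity check on the note above, the two F1 columns can be recomputed from the reported recall and precision values. The sketch below assumes that F1 is the usual harmonic mean of recall and precision; the names `f1_score`, `BERT_ROWS`, and `recompute` are illustrative and do not come from the paper. Because the inputs are already rounded to two decimals, agreement with the table is only expected up to rounding.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision; returns 0.0 if both are zero."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# BERT columns of Table 4: (recall, precision, precision_outside) per cohort.
BERT_ROWS = {
    "Superhero": (0.95, 0.51, 0.96),
    "Dinosaur": (0.98, 0.50, 0.95),
    "Mammal": (0.96, 0.52, 0.98),
    "Bird": (1.00, 0.52, 0.96),
}


def recompute(rows):
    """Return {cohort: (F1, F1_outside)}, each rounded to two decimals."""
    return {
        cohort: (
            round(f1_score(recall, precision), 2),
            round(f1_score(recall, precision_outside), 2),
        )
        for cohort, (recall, precision, precision_outside) in rows.items()
    }


if __name__ == "__main__":
    for cohort, (f1, f1_outside) in recompute(BERT_ROWS).items():
        print(f"{cohort}: F1={f1:.2f}, F1 (outside)={f1_outside:.2f}")
```

Under this assumption, the recomputed values match the F1 and F1 (outside) columns for all four cohorts, which is consistent with F1 (outside) pairing the same recall with Precision (outside).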