Table 3.
Results for BERT and Llama 2 and 3 in the test of
Cohort | BERT: Recall (accuracy) | BERT: Precision | BERT: Precision (outside) | BERT: F1 | BERT: F1 (outside) | Llama 2: Exact match | Llama 2: 80% match | Llama 3: Exact match | Llama 3: 80% match |
---|---|---|---|---|---|---|---|---|---|
Superhero | 0.94 | 0.55 | 1.00 | 0.69 | 0.97 | 0.07 | 0.08 | 0.10 | 0.12 |
Dinosaur | 0.96 | 0.54 | 0.98 | 0.69 | 0.97 | 0.10 | 0.12 | 0.10 | 0.12 |
Mammal | 1.00 | 0.56 | 0.93 | 0.72 | 0.96 | 0.14 | 0.14 | 0.11 | 0.15 |
Bird | 0.96 | 0.53 | 0.96 | 0.68 | 0.96 | 0.08 | 0.10 | 0.09 | 0.10 |
“Precision (outside)” is the precision obtained when the negative samples contain names outside the union of A, B, and C. “F1 (outside)” is the F1 score computed from Recall and Precision (outside).
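
As a quick consistency check, the two F1 columns can be recomputed from the reported recall and precision values. The sketch below assumes the standard definition of F1 as the harmonic mean of recall and precision; the helper name `f1_score` and the `rows` dictionary are illustrative and not taken from the original work. The inputs are the rounded values from the table, so the results are only expected to agree to two decimal places.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (standard F1, assumed here)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision, precision_outside) for BERT, copied from Table 3
rows = {
    "Superhero": (0.94, 0.55, 1.00),
    "Dinosaur": (0.96, 0.54, 0.98),
    "Mammal": (1.00, 0.56, 0.93),
    "Bird": (0.96, 0.53, 0.96),
}

for cohort, (recall, precision, precision_outside) in rows.items():
    f1 = f1_score(recall, precision)
    f1_outside = f1_score(recall, precision_outside)
    print(f"{cohort}: F1={f1:.2f}, F1 (outside)={f1_outside:.2f}")
```

With these inputs, the recomputed values match the F1 and F1 (outside) columns reported for all four cohorts.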