Table 2.
Performance of the GPT-4 Turbo model in the zero-shot inference setting at delineating the ground-truth labels, as measured by the recall mid-token distance (d).
| Range | Frequency |
| 0≤d<10 | 58 |
| 10≤d<20 | 15 |
| 20≤d<50 | 13 |
| d≥50 | 16 |
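As a rough illustration of how the ranges in Table 2 could be tallied, the sketch below bins a list of distances into the same four half-open intervals and converts counts to percentage shares. The names `bin_label`, `histogram`, `shares`, and `TABLE_2` are hypothetical helpers introduced here, not part of the original work, and the sketch assumes d is a non-negative number; how d itself is computed is not shown.

```python
from collections import Counter

# Bin edges copied from Table 2; labels mirror the table's Range column.
BINS = [(0, 10, "0≤d<10"), (10, 20, "10≤d<20"), (20, 50, "20≤d<50")]
LAST_LABEL = "d≥50"
LABELS = [label for _, _, label in BINS] + [LAST_LABEL]

# Counts as reported in Table 2.
TABLE_2 = {"0≤d<10": 58, "10≤d<20": 15, "20≤d<50": 13, "d≥50": 16}


def bin_label(d):
    """Return the Table 2 range label for a non-negative distance d."""
    if d < 0:
        raise ValueError("distance must be non-negative")
    for lo, hi, label in BINS:
        if lo <= d < hi:
            return label
    return LAST_LABEL


def histogram(distances):
    """Count how many distances fall into each Table 2 range."""
    counts = Counter(bin_label(d) for d in distances)
    return {label: counts.get(label, 0) for label in LABELS}


def shares(counts):
    """Convert raw counts to percentages of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}
```

Applied to the counts in Table 2, `sum(TABLE_2.values())` gives 102 labels in total, and `shares(TABLE_2)` puts the 0≤d<10 range at about 56.9% of them.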