Table 2: Comparison of micro-F1 performance with recent approaches to RE (all generative except SpERT). Relation triplets/pairs are considered correct only if both corresponding entity types are generated correctly.
| Setting | Method | Params | CONLL | ADE | NYT |
|---|---|---|---|---|---|
| 1. Fully supervised | a. SpERT* (Eberts and Ulges, 2019b) | 110M | 71.54 | 79.22 | − |
| | b. TANL (Paolini et al., 2021) | 220M | 71.48 | 80.61 | 90.83 |
| | c. TANL (MT) (Paolini et al., 2021) | 220M | 72.66 | 80.00 | 90.52 |
| | d. REBEL (Huguet Cabot and Navigli, 2021) | 460M | 75.44 | 82.21 | 92.00 |
| | e. Flan T5 (Large) (Chung et al., 2022) | 760M | 75.28 | 83.15 | 91.03 |
| | f. + GPT-3-generated CoT | 760M | 80.76 | 92.17 | 95.23 |
| 2. Few-shot | a. In-Context GPT-3 (Brown et al., 2020a) | 175B | 76.53 | 82.66 | 61.79 |
| | b. + CoT | 175B | 78.18 | − | − |
| | c. Flan T5 (Large) w/ CoT explanations and reference labels generated from GPT-3 | 760M | 76.13 | − | − |
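To make the scoring criterion in the caption concrete, here is a minimal Python sketch of micro-F1 over relation triplets, where a predicted triplet counts as a true positive only if both of its entity types match the gold triplet. The `(head, head_type, relation, tail, tail_type)` representation and the exact set-matching of spans are illustrative assumptions, not the paper's actual evaluation code.

```python
from typing import List, Tuple

# Assumed triplet layout: (head span, head type, relation, tail span, tail type).
Triplet = Tuple[str, str, str, str, str]

def micro_f1(gold: List[List[Triplet]], pred: List[List[Triplet]]) -> float:
    """Micro-F1 pooled over all sentences.

    A predicted triplet is a true positive only if it exactly matches a
    gold triplet, including BOTH entity types (the strict criterion in
    the Table 2 caption). Exact span matching is an assumption here.
    """
    tp = fp = fn = 0
    for gold_sent, pred_sent in zip(gold, pred):
        gold_set, pred_set = set(gold_sent), set(pred_sent)
        tp += len(gold_set & pred_set)   # correct triplets, types included
        fp += len(pred_set - gold_set)   # predicted but not in gold
        fn += len(gold_set - pred_set)   # gold triplets that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: the wrong head type on "Smith" invalidates the whole triplet.
gold = [[("Smith", "PER", "works_for", "IBM", "ORG")]]
pred = [[("Smith", "ORG", "works_for", "IBM", "ORG")]]
print(micro_f1(gold, pred))  # 0.0 under the strict entity-type criterion
```

Under this criterion, a model that recovers the correct relation and spans but mislabels even one entity type receives no credit for the triplet, which is why the scores above are stricter than span-only relation F1.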