Table 3.
| Subtask | Run | System | Precision | Recall | F1 |
|---|---|---|---|---|---|
| A | 1 | Seq2seq | **0.9093** | **0.8925** | **0.9008** |
| A | 2 | Class-Ensemble | 0.8360 | 0.8692 | 0.8522 |
| A | 3 | Class-RoBERTa | 0.8213 | 0.8580 | 0.8392 |
| B | 1 | Seq2seq | **0.8108** | **0.7400** | **0.7738** |
| B | 2 | Class-RoBERTa | 0.6921 | 0.7256 | 0.7085 |
| B | 3 | Class-Ensemble | 0.6916 | 0.7170 | 0.7041 |
| C | 1 | Seq2seq (SHACM → SHACW) | **0.8906** | **0.8867** | **0.8886** |
| C | 2 | Seq2seq (SHACM + SHACW) | 0.8800 | 0.8804 | 0.8802 |
| C | 3 | Class-RoBERTa (SHACM + SHACW) | 0.7423 | 0.8468 | 0.7911 |
Note: Subtasks A, B, and C correspond to extraction, generalization, and transfer learning, respectively. “Class-Ensemble” is the ensemble of the classification-based approaches, while “Class-RoBERTa” is the classification approach that used RoBERTa alone. In the transfer learning subtask (C), “SHACM → SHACW” means that a model was first fine-tuned on SHACM and then further fine-tuned on SHACW; “SHACM + SHACW” means that a model was fine-tuned on SHACM and SHACW together. The highest scores in each subtask are bolded.
SHAC: Social History Annotation Corpus.
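For concreteness, the two subtask C fine-tuning regimens can be sketched as follows. This is a minimal illustration, not the authors' code: `load_pretrained` and `fine_tune` are hypothetical stand-ins for a pretrained seq2seq model loader and one full training pass, and `shac_m` / `shac_w` stand for the two SHAC subsets referenced in the table.

```python
# A minimal sketch of the two subtask C regimens; hypothetical helpers,
# not the authors' implementation.

def sequential_transfer(load_pretrained, fine_tune, shac_m, shac_w):
    """SHACM -> SHACW: fine-tune on SHACM first, then continue on SHACW."""
    model = load_pretrained()
    model = fine_tune(model, shac_m)  # stage 1: fine-tune on SHACM
    model = fine_tune(model, shac_w)  # stage 2: further fine-tune on SHACW
    return model


def combined_transfer(load_pretrained, fine_tune, shac_m, shac_w):
    """SHACM + SHACW: a single fine-tuning pass on the pooled data."""
    model = load_pretrained()
    return fine_tune(model, shac_m + shac_w)  # one pass on the union
```

The distinction matters because the sequential regimen ends training on the target data alone, whereas the combined regimen mixes both sources throughout; the table shows the sequential Seq2seq run scoring slightly higher on subtask C.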