Table 1.
List of training datasets used. The datasets vary in size, in the proportion of implicit sentences, and in annotation style.
| Dataset | Sentences | Implicit | Domain | Mean tokens per cause / effect |
|---|---|---|---|---|
| MedCaus [44] | 8682 | 17% | Medical | 8.41 / 7.68 |
| CauseNet-noncause [46] | 5000 | 0% | General | 1.61 / 1.5 |
| CauseNet-cause [46] | 5000 | 0% | General | 1.53 / 1.46 |
| SemEval 2010 [43] | 1003 | 34% | General | 1.06 / 1.02 |
| CausalTimeBank [47] | 298 | 54.7% | News | 1 / 0.99 |
| FinCausal2020 [48] | 1719 | 78.7% | Financial | 23.72 / 10.26 |
| Total Train | 15191 | | | |
| Total Test | 6511 | | | |