2023 Aug 22;9(9):e19265. doi: 10.1016/j.heliyon.2023.e19265

Table 1.

List of training datasets used. The datasets vary in size, in the proportion of implicit sentences, and in annotation style.

| Dataset | Sentences | Implicit | Domain | Mean tokens per C/E |
| --- | --- | --- | --- | --- |
| MedCaus [44] | 8682 | 17% | Medical | 8.41 / 7.68 |
| CauseNet-noncause [46] | 5000 | 0% | General | 1.61 / 1.5 |
| CauseNet-cause [46] | 5000 | 0% | General | 1.53 / 1.46 |
| SemEval 2010 [43] | 1003 | 34% | General | 1.06 / 1.02 |
| CausalTimeBank [47] | 298 | 54.7% | News | 1 / 0.99 |
| FinCausal2020 [48] | 1719 | 78.7% | Financial | 23.72 / 10.26 |
| Total Train | 15191 | | | |
| Total Test | 6511 | | | |