Heliyon. 2023 Aug 22;9(9):e19265. doi: 10.1016/j.heliyon.2023.e19265

Table 1.

List of training datasets used. The datasets vary in size, in the proportion of implicit sentences, and in annotation style.

Dataset	Sentences	Implicit	Domain	Mean tokens per cause / effect (C/E)
MedCaus [44]	8682	17%	Medical	8.41 / 7.68
CauseNet-noncause [46]	5000	0%	General	1.61 / 1.5
CauseNet-cause [46]	5000	0%	General	1.53 / 1.46
SemEval 2010 [43]	1003	34%	General	1.06 / 1.02
CausalTimeBank [47]	298	54.7%	News	1 / 0.99
FinCausal2020 [48]	1719	78.7%	Financial	23.72 / 10.26

Total Train	15191
Total Test	6511
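
As a quick consistency check on the table, the per-dataset sentence counts can be summed and compared against the reported train and test totals. The sketch below copies the numbers from Table 1. The variable names are illustrative, and the roughly 70/30 split it prints is an observation from these numbers, not something the table states.

```python
# Sentence counts copied from Table 1 (variable names are illustrative).
sentences = {
    "MedCaus": 8682,
    "CauseNet-noncause": 5000,
    "CauseNet-cause": 5000,
    "SemEval 2010": 1003,
    "CausalTimeBank": 298,
    "FinCausal2020": 1719,
}

# Totals as reported at the bottom of the table.
total_train = 15191
total_test = 6511

# The six datasets together should account for train plus test.
total = sum(sentences.values())
assert total == total_train + total_test

# Share of all sentences that ended up in the training split.
train_fraction = total_train / total

print(f"total sentences: {total}")
print(f"train fraction: {train_fraction:.3f}")
```

The assertion passes, so the train and test totals together cover all six datasets listed above.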