Author manuscript; available in PMC: 2023 Sep 11.

Published in final edited form as: Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:12532–12555. doi: 10.18653/v1/2023.findings-acl.794

Table 5:

A summary of dataset statistics. All datasets are in English. For Art and E-CARE, we report the statistics of our adapted versions. Because E-CARE has a hidden test set, we randomly split its original training set into training and validation sets and use the original validation set as our test set. Each E-CARE example asks for either the cause or the effect of the premise.

Dataset	Train (Likely)	Train (Less Likely)	Val (Less Likely)	Test (Less Likely)
MRIInterpret	10097	1005	121	—
Art	50509	50509	1781	3562
E-CARE (cause)	6855	6855	762	1088
E-CARE (effect)	6580	6580	731	1044
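
The E-CARE split described in the caption can be illustrated with a minimal sketch. The caption does not state the split ratio, the random seed, or the tooling used. The 10% validation fraction below is an assumption inferred from the table counts (762 + 731 = 1493 validation examples out of 6855 + 6580 + 1493 = 14928 original training examples). The function name `split_train_val` and the seed are hypothetical, and integer IDs stand in for the actual examples.

```python
import random


def split_train_val(examples, val_fraction=0.1, seed=0):
    """Randomly split examples into (train, val) lists.

    val_fraction and seed are assumptions; the source only says the
    original training set was split randomly.
    """
    rng = random.Random(seed)      # local RNG so the split is reproducible
    shuffled = list(examples)      # copy to leave the input untouched
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder IDs standing in for the original E-CARE training examples.
# Size taken from the table: 6855 + 6580 + 762 + 731 = 14928.
original_train = list(range(14928))
train, val = split_train_val(original_train)
print(len(train), len(val))
```

A split like this reproduces only the overall sizes. The per-type cause and effect counts in the table depend on the actual random draw and are not reproduced by this sketch.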