Table 5: A summary of dataset statistics. All datasets are in English. For Art and E-CARE, we report the statistics of our adapted versions. Since E-CARE has a hidden test set, we randomly split the original training set into a training and a validation set, and we use the original validation set as our test set. Each example in E-CARE asks for either the cause or the effect of the premise, so its counts are given separately for the two question types.
| Dataset | Train (Likely) | Train (Less Likely) | Val (Less Likely) | Test (Less Likely) |
|---|---|---|---|---|
| MRIInterpret | 10097 | 1005 | 121 | — |
| Art | 50509 | 50509 | 1781 | 3562 |
| E-CARE (cause) | 6855 | 6855 | 762 | 1088 |
| E-CARE (effect) | 6580 | 6580 | 731 | 1044 |
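
The E-CARE split described in the caption can be sketched as follows. This is a minimal illustration, not our exact procedure: the function name `split_train_val`, the 10% hold-out fraction, and the seed are assumptions, and the sketch does not stratify by cause/effect. The E-CARE totals in the table (13,435 training and 1,493 validation examples) are consistent with roughly a 10% hold-out, but the caption does not state the ratio.

```python
import random

def split_train_val(examples, val_fraction=0.1, seed=0):
    """Randomly partition `examples` into (train, val) lists.

    val_fraction and seed are illustrative assumptions, not reported values.
    """
    rng = random.Random(seed)      # local RNG so the split is reproducible
    shuffled = list(examples)      # copy; leave the caller's data untouched
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

# Placeholder IDs standing in for the original E-CARE training examples
# (6855 + 6580 + 762 + 731 = 14928 in the table).
original_train = list(range(14928))
train, val = split_train_val(original_train)
print(len(train), len(val))
```

The original E-CARE validation set is left untouched and serves as the test set, so it does not appear in the sketch.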