Table 3.
Results of one-way ANOVA for the understandability and helpfulness metrics, followed by Tukey’s HSD post-hoc tests between the explanation methods. Note that the mean value for E4 (our counterfactual explanation) is the highest, indicating that our explanations helped users the most in understanding the classifier’s decision.
| Understandability: F(3, 284) = 3.39, p < 0.05 | | | Helpfulness: F(3, 284) = 21.5, p < 0.001 | | |
|---|---|---|---|---|---|
| Explanation method | Compared with | p | Explanation method | Compared with | p |
| E1 (No explanation), M=3.14, SD=1.39 | E2, E3, E4 | * | E1, M=0.05, SD=0.23 | E2, E3, E4 | *** |
| E2 (Saliency Map), M=3.31, SD=1.13 | E1, E3, E4 | | E2, M=0.18, SD=0.39 | E1, E3, E4 | *** |
| E3 (CycleGAN), M=3.24, SD=1.19 | E1, E2, E4 | | E3, M=0.16, SD=0.37 | E1, E2, E4 | *** |
| E4 (Our counterfactual explanation), M=3.72, SD=0.97 | E1, E2, E3 | * | E4, M=0.24, SD=0.42 | E1, E2, E3 | *** *** *** |

*: p < 0.05
***: p < 0.0001
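
For readers unfamiliar with how the F statistics in the table are formed, the following is a minimal sketch of a one-way ANOVA computed from raw per-group ratings. The function name `one_way_anova` and the `ratings` values are hypothetical and chosen only for illustration; they are not the study's data, and the sketch does not cover the Tukey HSD step.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples, one list per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of each rating around its own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings for four conditions (NOT the data behind the table).
ratings = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 5, 6],
]
f_stat, df_b, df_w = one_way_anova(ratings)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With four groups the between-group degrees of freedom are 4 - 1 = 3, and the within-group degrees of freedom are the total number of ratings minus 4. Under that standard formulation, the reported F(3, 284) would correspond to 288 ratings in total across E1 to E4.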