2023 Mar 18;10:144. doi: 10.1038/s41597-023-01974-x

Table 5.

Evaluation of GNN explainers for real-world molecular datasets with ground-truth explanations.

Dataset	Method	GEA (↑)	GEF (↓)
Mutag	Random	0.044 ± 0.007	0.590 ± 0.031
	Grad	0.022 ± 0.006	0.598 ± 0.030
	GradCAM	0.085 ± 0.012	0.672 ± 0.029
	GuidedBP	0.036 ± 0.007	0.649 ± 0.030
	Integrated Grad (IG)	0.049 ± 0.010	0.443 ± 0.031
	GNNExplainer	0.031 ± 0.005	0.618 ± 0.030
	PGMExplainer	0.042 ± 0.007	0.503 ± 0.031
	PGExplainer	0.046 ± 0.007	0.504 ± 0.031
	SubgraphX	0.039 ± 0.007	0.611 ± 0.030
Benzene	Random	0.108 ± 0.003	0.513 ± 0.012
	Grad	0.122 ± 0.007	0.262 ± 0.011
	GradCAM	0.291 ± 0.007	0.551 ± 0.012
	GuidedBP	0.205 ± 0.007	0.438 ± 0.012
	Integrated Grad (IG)	0.044 ± 0.003	0.182 ± 0.010
	GNNExplainer	0.129 ± 0.005	0.444 ± 0.012
	PGMExplainer	0.154 ± 0.006	0.433 ± 0.012
	PGExplainer	0.169 ± 0.007	0.375 ± 0.012
	SubgraphX	0.371 ± 0.009	0.513 ± 0.012
Fl-Carbonyl	Random	0.087 ± 0.007	0.440 ± 0.26
	Grad	0.132 ± 0.010	0.210 ± 0.021
	GradCAM	0.005 ± 0.007	0.500 ± 0.026
	GuidedBP	0.089 ± 0.010	0.315 ± 0.024
	Integrated Grad (IG)	0.091 ± 0.007	0.174 ± 0.019
	GNNExplainer	0.094 ± 0.009	0.423 ± 0.026
	PGMExplainer	0.078 ± 0.008	0.426 ± 0.026
	PGExplainer	0.079 ± 0.009	0.372 ± 0.025
	SubgraphX	0.008 ± 0.002	0.466 ± 0.026

Arrows indicate the direction of better performance: higher is better (↑) for GEA, lower is better (↓) for GEF. Integrated Gradients (IG) explanations obtain the lowest unfaithfulness score (GEF) on all three datasets. Stability and fairness are not reported for these datasets, because generating plausible perturbations of molecules is non-trivial and the datasets contain no protected features.
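The per-dataset comparison can be checked directly against the mean values in Table 5. The following is a minimal sketch, not part of the original evaluation code: the mean scores are transcribed from the table (the ± terms are omitted), and the names `METHODS`, `GEA`, `GEF`, and `best_method` are hypothetical, introduced only for this illustration.

```python
# Mean scores transcribed from Table 5 (uncertainty terms omitted).
# Method order matches the table rows; "IG" stands for Integrated Grad (IG).
METHODS = [
    "Random", "Grad", "GradCAM", "GuidedBP", "IG",
    "GNNExplainer", "PGMExplainer", "PGExplainer", "SubgraphX",
]

# GEA: higher is better.
GEA = {
    "Mutag":       [0.044, 0.022, 0.085, 0.036, 0.049, 0.031, 0.042, 0.046, 0.039],
    "Benzene":     [0.108, 0.122, 0.291, 0.205, 0.044, 0.129, 0.154, 0.169, 0.371],
    "Fl-Carbonyl": [0.087, 0.132, 0.005, 0.089, 0.091, 0.094, 0.078, 0.079, 0.008],
}

# GEF: lower is better.
GEF = {
    "Mutag":       [0.590, 0.598, 0.672, 0.649, 0.443, 0.618, 0.503, 0.504, 0.611],
    "Benzene":     [0.513, 0.262, 0.551, 0.438, 0.182, 0.444, 0.433, 0.375, 0.513],
    "Fl-Carbonyl": [0.440, 0.210, 0.500, 0.315, 0.174, 0.423, 0.426, 0.372, 0.466],
}


def best_method(scores, higher_is_better):
    """Return the name of the method with the best mean score."""
    pick = max if higher_is_better else min
    best_index = pick(range(len(scores)), key=scores.__getitem__)
    return METHODS[best_index]


best_gef = {dataset: best_method(s, higher_is_better=False) for dataset, s in GEF.items()}
best_gea = {dataset: best_method(s, higher_is_better=True) for dataset, s in GEA.items()}

print("Lowest GEF per dataset: ", best_gef)
print("Highest GEA per dataset:", best_gea)
```

On these means, IG has the lowest GEF on every dataset, whereas the method with the highest GEA differs from dataset to dataset. The sketch compares means only and does not account for the reported uncertainties.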