2021 May 21;12:3008. doi: 10.1038/s41467-021-22756-2

Table 2.

Simulation results on continuous outcomes with smaller sample size and/or larger dimension.

ρ	Variable	N = 300, p = 100									N = 500, p = 200
		PermFIT-DNN	HRT-DNN	PermFIT-RF	Vanilla-RF	PermFIT-SVM	SHAP-DNN^a	LIME-DNN^a	SNGM-DNN^a	RFE-SVM^a	PermFIT-DNN	HRT-DNN	PermFIT-RF	Vanilla-RF	PermFIT-SVM	SHAP-DNN^a	LIME-DNN^a	SNGM-DNN^a	RFE-SVM^a
0	X₁	98	99	99	100	98	100	100	100	100	100	100	100	100	100	100	100	100	100
	$X_{p_{0} + 1}$	86	91	97	100	22	86	26	27	100	100	100	100	100	15	100	20	34	100
	$X_{2 p_{0} + 1}$	96	98	100	100	92	100	99	100	100	100	100	100	100	99	100	95	100	100
	$X_{3 p_{0} + 1}$	54	71	9	19	14	19	14	18	30	83	87	15	27	10	42	15	20	26
	$X_{4 p_{0} + 1}$	46	61	9	26	18	34	16	18	36	89	96	14	13	10	29	8	13	27
	S₀	5.0	16.5	5.3	9.1	4.7	7.4	9.4	8.1	6.9	5.5	22.6	5.3	7.4	5.5	3.0	5.4	3.5	3.1
	S₁	5.5	16.7	5.4	9.2	5.8	6.5	6.4	7.5	6.4	5.6	21.5	5.1	7.0	5.5	3.4	2.5	4.0	3.5
0.2	X₁	97	100	98	100	95	100	100	100	100	100	100	100	100	100	100	100	100	100
	$X_{p_{0} + 1}$	89	93	100	100	37	83	20	22	100	100	100	100	100	29	100	13	31	100
	$X_{2 p_{0} + 1}$	94	97	98	100	84	100	99	100	100	100	100	100	100	98	100	94	100	100
	$X_{3 p_{0} + 1}$	56	70	10	24	24	27	21	22	32	93	96	5	19	13	44	10	16	17
	$X_{4 p_{0} + 1}$	53	69	11	25	21	31	16	18	26	98	95	13	25	15	32	3	13	18
	S₀	5.7	16.7	5.6	11.8	6.3	7.2	9.8	8.2	8.7	6.1	22.1	5.8	11.5	6.3	3.6	5.8	3.9	5.5
	S₁	5.9	17.6	5.4	9.5	6.4	6.7	6.0	7.4	5.0	5.9	21.9	5.0	8.1	5.6	2.8	2.3	3.7	1.4
0.5	X₁	97	96	94	100	90	100	100	100	100	100	100	99	100	99	100	100	100	100
	$X_{p_{0} + 1}$	96	98	98	100	66	94	15	24	100	100	100	100	100	52	100	11	33	100
	$X_{2 p_{0} + 1}$	97	95	99	100	84	100	99	98	100	99	100	100	100	85	100	98	100	100
	$X_{3 p_{0} + 1}$	67	64	13	18	40	34	17	17	9	92	91	14	23	32	56	7	22	1
	$X_{4 p_{0} + 1}$	65	66	9	31	37	24	11	14	6	93	97	15	24	33	58	5	16	0
	S₀	9.1	17.4	8.7	31.8	9.4	7.1	9.3	8.5	14.9	8.3	21.1	9.4	32.1	10.5	3.0	5.7	4.2	7.4
	S₁	6.6	18.1	4.7	11.5	6.3	6.5	6.8	7.2	0.3	6.0	22.3	4.7	10.4	6.2	3.0	2.4	3.3	0.0
0.8	X₁	90	87	69	100	83	100	94	96	97	100	92	82	100	95	100	93	100	100
	$X_{p_{0} + 1}$	85	89	95	100	65	95	11	20	91	100	98	99	100	64	100	8	31	94
	$X_{2 p_{0} + 1}$	90	89	73	100	75	100	80	94	90	99	98	90	100	89	100	84	97	83
	$X_{3 p_{0} + 1}$	62	58	16	33	44	34	11	20	0	80	85	16	40	48	42	6	21	0
	$X_{4 p_{0} + 1}$	63	60	14	39	48	17	7	18	0	81	79	21	42	46	44	2	20	0
	S₀	16.4	17.7	17.8	67.7	19.1	8.5	10.3	10.2	16.0	13.4	21.4	18.9	67.7	18.8	4.2	5.6	5.0	7.6
	S₁	7.4	17.5	4.6	18.2	5.9	5.4	6.6	5.6	0.0	7.1	22.1	4.4	17.2	9.5	2.2	2.8	2.5	0.0

Reported is the percentage of times each method detected the variable as important (p value cutoff of 0.05) across 100 repetitions of each simulation scenario, for five true causal features: X₁, $X_{p_{0} + 1}$, $X_{2 p_{0} + 1}$, $X_{3 p_{0} + 1}$, $X_{4 p_{0} + 1}$, and two null feature sets: S₀ and S₁.
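As a rough illustration of how a detection percentage of this kind could be tabulated, the sketch below counts, for each feature, the share of repetitions in which its p value falls below the 0.05 cutoff. The function name `detection_rate`, the strict `<` comparison, and the toy p values are assumptions made for this example; they are not taken from the authors' code or results.

```python
def detection_rate(p_values, cutoff=0.05):
    """Percentage of repetitions in which each feature has p < cutoff.

    p_values: list of repetitions, each a list with one p value per feature.
    The strict inequality is an assumption; the source only states the cutoff.
    """
    n_reps = len(p_values)
    n_features = len(p_values[0])
    return [
        100.0 * sum(1 for rep in p_values if rep[j] < cutoff) / n_reps
        for j in range(n_features)
    ]


# Hypothetical p values: 4 repetitions (rows) x 3 features (columns).
toy_p_values = [
    [0.001, 0.20, 0.04],
    [0.030, 0.60, 0.07],
    [0.004, 0.01, 0.03],
    [0.020, 0.45, 0.50],
]

rates = detection_rate(toy_p_values)
print(rates)
```

With 100 repetitions per scenario, as in the table, each count of detections is already a percentage.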

^aNote that SHAP-DNN, LIME-DNN, SNGM-DNN, and RFE-SVM do not perform formal statistical testing, so features can only be ranked, with no associated p values. The results reported for each of these four methods are based on the top 10 selected features, as a simple illustration. For the PermFIT methods and Vanilla-RF, p values are calculated from a one-sided Z test.
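The two selection rules described in this footnote can be sketched as follows, under stated assumptions. The first helper computes an upper-tail p value from an importance estimate and its standard error, assuming the one-sided alternative is that the importance is greater than zero (the footnote does not state the direction). The second marks a feature as selected when its score ranks in the top k. The names `one_sided_z_p_value` and `top_k_features` and all numeric inputs are hypothetical and do not come from the authors' implementation.

```python
import math


def one_sided_z_p_value(estimate, std_error):
    """Upper-tail p value for H0: importance = 0 vs. H1: importance > 0.

    Uses 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)) for the standard normal.
    The direction of the alternative is an assumption of this sketch.
    """
    z = estimate / std_error
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def top_k_features(scores, k=10):
    """Indices of the k features with the largest ranking scores."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])


# Hypothetical importance estimate and standard error for one feature.
p = one_sided_z_p_value(estimate=0.30, std_error=0.10)
detected_by_test = p < 0.05

# Hypothetical ranking scores for 12 features; feature 3 has the top score.
toy_scores = [0.2, 0.1, 0.05, 0.9, 0.3, 0.0, 0.4, 0.6, 0.7, 0.8, 0.5, 0.01]
detected_by_rank = 3 in top_k_features(toy_scores, k=10)
```

The two rules are not directly comparable: a fixed top-10 cutoff always selects ten features regardless of the evidence, whereas a p value cutoff controls how often a null feature is flagged.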