Table 2.
Simulation results on continuous outcomes with smaller sample size and/or larger dimension.
| ρ | Variable | N = 300, p = 100 | | | | | | | | | N = 500, p = 200 | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | PermFIT-DNN | HRT-DNN | PermFIT-RF | Vanilla-RF | PermFIT-SVM | SHAP-DNNᵃ | LIME-DNNᵃ | SNGM-DNNᵃ | RFE-SVMᵃ | PermFIT-DNN | HRT-DNN | PermFIT-RF | Vanilla-RF | PermFIT-SVM | SHAP-DNNᵃ | LIME-DNNᵃ | SNGM-DNNᵃ | RFE-SVMᵃ |
| 0 | X1 | 98 | 99 | 99 | 100 | 98 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | 86 | 91 | 97 | 100 | 22 | 86 | 26 | 27 | 100 | 100 | 100 | 100 | 100 | 15 | 100 | 20 | 34 | 100 |
| | | 96 | 98 | 100 | 100 | 92 | 100 | 99 | 100 | 100 | 100 | 100 | 100 | 100 | 99 | 100 | 95 | 100 | 100 |
| | | 54 | 71 | 9 | 19 | 14 | 19 | 14 | 18 | 30 | 83 | 87 | 15 | 27 | 10 | 42 | 15 | 20 | 26 |
| | | 46 | 61 | 9 | 26 | 18 | 34 | 16 | 18 | 36 | 89 | 96 | 14 | 13 | 10 | 29 | 8 | 13 | 27 |
| | S0 | 5.0 | 16.5 | 5.3 | 9.1 | 4.7 | 7.4 | 9.4 | 8.1 | 6.9 | 5.5 | 22.6 | 5.3 | 7.4 | 5.5 | 3.0 | 5.4 | 3.5 | 3.1 |
| | S1 | 5.5 | 16.7 | 5.4 | 9.2 | 5.8 | 6.5 | 6.4 | 7.5 | 6.4 | 5.6 | 21.5 | 5.1 | 7.0 | 5.5 | 3.4 | 2.5 | 4.0 | 3.5 |
| 0.2 | X1 | 97 | 100 | 98 | 100 | 95 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | 89 | 93 | 100 | 100 | 37 | 83 | 20 | 22 | 100 | 100 | 100 | 100 | 100 | 29 | 100 | 13 | 31 | 100 |
| | | 94 | 97 | 98 | 100 | 84 | 100 | 99 | 100 | 100 | 100 | 100 | 100 | 100 | 98 | 100 | 94 | 100 | 100 |
| | | 56 | 70 | 10 | 24 | 24 | 27 | 21 | 22 | 32 | 93 | 96 | 5 | 19 | 13 | 44 | 10 | 16 | 17 |
| | | 53 | 69 | 11 | 25 | 21 | 31 | 16 | 18 | 26 | 98 | 95 | 13 | 25 | 15 | 32 | 3 | 13 | 18 |
| | S0 | 5.7 | 16.7 | 5.6 | 11.8 | 6.3 | 7.2 | 9.8 | 8.2 | 8.7 | 6.1 | 22.1 | 5.8 | 11.5 | 6.3 | 3.6 | 5.8 | 3.9 | 5.5 |
| | S1 | 5.9 | 17.6 | 5.4 | 9.5 | 6.4 | 6.7 | 6.0 | 7.4 | 5.0 | 5.9 | 21.9 | 5.0 | 8.1 | 5.6 | 2.8 | 2.3 | 3.7 | 1.4 |
| 0.5 | X1 | 97 | 96 | 94 | 100 | 90 | 100 | 100 | 100 | 100 | 100 | 100 | 99 | 100 | 99 | 100 | 100 | 100 | 100 |
| | | 96 | 98 | 98 | 100 | 66 | 94 | 15 | 24 | 100 | 100 | 100 | 100 | 100 | 52 | 100 | 11 | 33 | 100 |
| | | 97 | 95 | 99 | 100 | 84 | 100 | 99 | 98 | 100 | 99 | 100 | 100 | 100 | 85 | 100 | 98 | 100 | 100 |
| | | 67 | 64 | 13 | 18 | 40 | 34 | 17 | 17 | 9 | 92 | 91 | 14 | 23 | 32 | 56 | 7 | 22 | 1 |
| | | 65 | 66 | 9 | 31 | 37 | 24 | 11 | 14 | 6 | 93 | 97 | 15 | 24 | 33 | 58 | 5 | 16 | 0 |
| | S0 | 9.1 | 17.4 | 8.7 | 31.8 | 9.4 | 7.1 | 9.3 | 8.5 | 14.9 | 8.3 | 21.1 | 9.4 | 32.1 | 10.5 | 3.0 | 5.7 | 4.2 | 7.4 |
| | S1 | 6.6 | 18.1 | 4.7 | 11.5 | 6.3 | 6.5 | 6.8 | 7.2 | 0.3 | 6.0 | 22.3 | 4.7 | 10.4 | 6.2 | 3.0 | 2.4 | 3.3 | 0.0 |
| 0.8 | X1 | 90 | 87 | 69 | 100 | 83 | 100 | 94 | 96 | 97 | 100 | 92 | 82 | 100 | 95 | 100 | 93 | 100 | 100 |
| | | 85 | 89 | 95 | 100 | 65 | 95 | 11 | 20 | 91 | 100 | 98 | 99 | 100 | 64 | 100 | 8 | 31 | 94 |
| | | 90 | 89 | 73 | 100 | 75 | 100 | 80 | 94 | 90 | 99 | 98 | 90 | 100 | 89 | 100 | 84 | 97 | 83 |
| | | 62 | 58 | 16 | 33 | 44 | 34 | 11 | 20 | 0 | 80 | 85 | 16 | 40 | 48 | 42 | 6 | 21 | 0 |
| | | 63 | 60 | 14 | 39 | 48 | 17 | 7 | 18 | 0 | 81 | 79 | 21 | 42 | 46 | 44 | 2 | 20 | 0 |
| | S0 | 16.4 | 17.7 | 17.8 | 67.7 | 19.1 | 8.5 | 10.3 | 10.2 | 16.0 | 13.4 | 21.4 | 18.9 | 67.7 | 18.8 | 4.2 | 5.6 | 5.0 | 7.6 |
| | S1 | 7.4 | 17.5 | 4.6 | 18.2 | 5.9 | 5.4 | 6.6 | 5.6 | 0.0 | 7.1 | 22.1 | 4.4 | 17.2 | 9.5 | 2.2 | 2.8 | 2.5 | 0.0 |
Reported is the percentage of important variables detected by each method (p value cutoff of 0.05) out of 100 repetitions for each simulation scenario, for five true causal features (X1 and the four unlabeled rows beneath it) and two null feature sets (S0 and S1).
ᵃ SHAP-DNN, LIME-DNN, SNGM-DNN, and RFE-SVM do not perform formal statistical testing; they only rank features, with no associated p values. The results reported for these four methods are based on the top 10 selected features, as a simple illustration. For PermFIT methods and Vanilla-RF, p values are calculated from a one-sided Z test.
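The quantities described in the table notes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the upper-tail direction of the Z test (importance greater than zero), the strict `<` comparison against the cutoff, and all input values are assumptions made here for illustration.

```python
import math


def one_sided_z_pvalue(estimate, std_error):
    """Upper-tail p value, assuming H0: importance <= 0 vs. H1: importance > 0."""
    z = estimate / std_error
    # Standard normal survival function: P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def detection_percentage(p_values, cutoff=0.05):
    """Percentage of repetitions whose p value falls below the cutoff."""
    hits = sum(p < cutoff for p in p_values)
    return 100.0 * hits / len(p_values)


def top_k_features(scores, k=10):
    """Indices of the k highest-scoring features, for ranking-only methods."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Hypothetical (estimate, standard error) pairs for one feature over four repetitions.
reps = [(0.30, 0.10), (0.05, 0.10), (0.25, 0.10), (0.00, 0.10)]
p_vals = [one_sided_z_pvalue(est, se) for est, se in reps]
rate = detection_percentage(p_vals)

# Hypothetical importance scores for a method that only ranks features.
selected = top_k_features([0.1, 0.9, 0.5, 0.3], k=2)
```

For a ranking-only method, a feature would count as detected in a given repetition when its index appears in the output of `top_k_features`, which mirrors the top 10 rule used for the four methods marked ᵃ.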