Table 1.
| Setting | Functional form of interaction | True optimal treatment | Covariate type | Method | Misclassification rate | Value |
|---|---|---|---|---|---|---|
| 1 | Tree | 1 or 2 or 3 | Continuous | l1-PLS-HGL | 0.101 (0.016) | 1.226 (0.010) |
| | | | | l1-PLS-GL | 0.094 (0.015) | 1.229 (0.010) |
| | | | | ACWL | 0.028 (0.040) | 1.276 (0.023) |
| | | | | D-learning | 0.100 (0.020) | 1.226 (0.013) |
| | | | | BART | **0.010 (0.004)** | **1.286 (0.003)** |
| 2 | Linear | 1 or 2 or 3 | Continuous | l1-PLS-HGL | 0.015 (0.004) | 1.736 (0.003) |
| | | | | l1-PLS-GL | **0.013 (0.004)** | **1.737 (0.003)** |
| | | | | ACWL | 0.171 (0.020) | 1.662 (0.016) |
| | | | | D-learning | 0.018 (0.005) | **1.737 (0.004)** |
| | | | | BART | 0.056 (0.004) | 1.730 (0.006) |
| 3 | Nonlinear | 1 or 2 or 3 | Continuous | l1-PLS-HGL | 0.566 (0.012) | 1.089 (0.008) |
| | | | | l1-PLS-GL | 0.565 (0.014) | 1.088 (0.011) |
| | | | | ACWL | 0.561 (0.016) | 1.089 (0.009) |
| | | | | D-learning | 0.572 (0.013) | 1.087 (0.008) |
| | | | | BART | **0.192 (0.038)** | **1.209 (0.010)** |
| 4 | Nonlinear | 1 or 2 or 3 | Continuous | l1-PLS-HGL | 0.350 (0.011) | 1.129 (0.004) |
| | | | | l1-PLS-GL | 0.352 (0.010) | 1.128 (0.004) |
| | | | | ACWL | 0.362 (0.011) | 1.118 (0.004) |
| | | | | D-learning | 0.359 (0.016) | 1.129 (0.004) |
| | | | | BART | **0.163 (0.045)** | **1.220 (0.012)** |
| 5 | Nonlinear | 1 or 2 or 3 | Continuous + binary + categorical (discrete uniform {1, 5}) | l1-PLS-HGL | 0.077 (0.019) | 1.101 (0.012) |
| | | | | l1-PLS-GL | 0.078 (0.018) | 1.101 (0.011) |
| | | | | ACWL | 0.029 (0.028) | 1.129 (0.016) |
| | | | | D-learning | 0.090 (0.032) | 1.094 (0.019) |
| | | | | BART | **0.007 (0.005)** | **1.142 (0.002)** |
| 6 | Tree | 1 | Continuous | l1-PLS-HGL | **0.000 (0.000)** | **2.093 (0.000)** |
| | | | | l1-PLS-GL | **0.000 (0.000)** | **2.093 (0.000)** |
| | | | | ACWL | **0.000 (0.000)** | **2.093 (0.000)** |
| | | | | D-learning | **0.000 (0.000)** | **2.093 (0.000)** |
| | | | | BART | **0.000 (0.000)** | **2.093 (0.000)** |
Note: The methods under comparison are l1-penalized least squares with hierarchical group LASSO variable selection (l1-PLS-HGL), l1-penalized least squares with group LASSO variable selection (l1-PLS-GL), adaptive contrast weighted learning (ACWL), direct learning (D-learning), and Bayesian additive regression trees (BART). For each setting, the smallest misclassification rate and the largest value are in bold; ties are all bolded.