Author manuscript; available in PMC: 2016 Apr 16.
Published in final edited form as: J Am Stat Assoc. 2015 Apr 16;110(512):1770–1784. doi: 10.1080/01621459.2015.1036994

Table 3. Tuning parameter settings

Lasso: 10-fold cross-validation is used with α = 1 for the lasso penalty. We use lambda.min and lambda.1se for λ.
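The lasso fits were presumably run with `cv.glmnet` in R; a rough scikit-learn analogue (a sketch, not the authors' code) uses `LassoCV` for 10-fold cross-validation. `LassoCV` reports only the CV-minimizing λ (glmnet's `lambda.min`); the `lambda.1se` rule, the largest λ whose CV error lies within one standard error of the minimum, must be computed from the CV path by hand:

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Toy data: only the first variable carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

# 10-fold CV over a lambda path (called alpha in scikit-learn);
# model.alpha_ is the analogue of glmnet's lambda.min.
model = LassoCV(cv=10, random_state=0).fit(X, y)

# lambda.1se by hand: mean and standard error of CV error per lambda,
# then the largest lambda within one SE of the minimum.
mse = model.mse_path_.mean(axis=1)
se = model.mse_path_.std(axis=1) / np.sqrt(model.mse_path_.shape[1])
ok = mse <= mse.min() + se[mse.argmin()]
lam_1se = model.alphas_[ok].max()
```

By construction `lam_1se >= model.alpha_`, so the one-standard-error rule always selects at least as much shrinkage as the CV minimum.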
Boosting: A total of 1000 trees are fit; testing error is calculated every 20 trees. n.minobsinnode = 2, n^(1/3), 10; learning rate (shrinkage) = 0.001, 0.01, 0.1; interaction.depth = 1, 3, 5.
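The boosting settings map onto the R `gbm` package; as a hedged sketch of one grid cell, scikit-learn's `GradientBoostingRegressor` (an analogue, not the authors' code) can reproduce the "1000 trees, test error every 20 trees" evaluation via `staged_predict`, with `max_depth` playing the role of interaction.depth and `min_samples_leaf` that of n.minobsinnode:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One grid cell: 1000 trees, shrinkage 0.01, interaction depth 3,
# minimum node size 2.
gbm = GradientBoostingRegressor(
    n_estimators=1000, learning_rate=0.01, max_depth=3,
    min_samples_leaf=2, random_state=0,
).fit(X_tr, y_tr)

# Test error recorded at every 20th tree along the boosting path.
test_mse = [
    np.mean((y_te - pred) ** 2)
    for i, pred in enumerate(gbm.staged_predict(X_te))
    if (i + 1) % 20 == 0
]
```

With 1000 trees this yields 50 checkpoints, from which the best stopping point can be read off.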
BART: A total of 18 settings: ntrees = 50 or 200; sigma prior: (3, 0.90), (3, 0.99), (10, 0.75); μ prior: 2, 3, 5.
RF: A total of 36 settings: ntrees = 500, 1000; mtry = √p, p/3, p; nodesize = 2, n^(1/3); bootstrap sample ratio = 1, 0.8, 2/3.
RF-√p: Select the top √p important variables from each RF model and refit with the same settings as RF (with mtry recalculated accordingly).
RF-log(p): Same as RF-√p, but with the top log(p) variables selected.
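The RF rows translate naturally to scikit-learn's `RandomForestRegressor` (again an analogue of the R package, not the authors' code): `max_features` plays the role of mtry, `min_samples_leaf` that of nodesize, and `max_samples` that of the bootstrap sample ratio. A minimal sketch of one RF grid cell followed by the top-log(p) refit:

```python
import math
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy data: variables 0 and 1 carry the signal.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

# One RF grid cell: ntrees = 500, mtry = sqrt(p),
# nodesize = n^(1/3), bootstrap sample ratio = 0.8.
rf = RandomForestRegressor(
    n_estimators=500,
    max_features=max(1, round(math.sqrt(p))),      # mtry
    min_samples_leaf=max(1, round(n ** (1 / 3))),  # nodesize
    max_samples=0.8,                               # bootstrap ratio
    random_state=0,
).fit(X, y)

# RF-log(p): keep the top log(p) variables by importance and refit,
# with mtry recalculated on the reduced dimension.
k = max(1, round(math.log(p)))                     # log(20) ~ 3
top = np.argsort(rf.feature_importances_)[::-1][:k]
rf_logp = RandomForestRegressor(
    n_estimators=500,
    max_features=max(1, round(math.sqrt(k))),
    random_state=0,
).fit(X[:, top], y)
```

On this toy example the importance ranking recovers the two signal variables, so the refit model sees only the informative coordinates plus at most one noise variable.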
ET: ntrees = 500, 1000; mtry = √p, p/3, p; nodesize = 2, n^(1/3); numRandomCuts = 1, 5.
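The ET settings correspond to the `extraTrees` R package. scikit-learn's `ExtraTreesRegressor` (a hedged analogue, not the authors' code) draws a single random cut point per candidate feature, i.e. the numRandomCuts = 1 case; numRandomCuts = 5 has no direct scikit-learn equivalent. One grid cell:

```python
import math
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Toy data: only variable 4 carries signal.
rng = np.random.default_rng(0)
n, p = 200, 9
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 4] + rng.normal(scale=0.1, size=n)

# One ET grid cell: ntrees = 500, mtry = p/3, nodesize = n^(1/3),
# one random cut per candidate feature (numRandomCuts = 1).
et = ExtraTreesRegressor(
    n_estimators=500,
    max_features=max(1, p // 3),                   # mtry
    min_samples_leaf=max(1, round(n ** (1 / 3))),  # nodesize
    random_state=0,
).fit(X, y)
best = int(np.argmax(et.feature_importances_))
```

Despite the fully random cut points, the importance ranking still singles out the signal variable.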
RLT-naive: ntrees = 1000; nodesize = 2, n^(1/3); muting rate = 0%, 50%, 80%; bootstrap sample ratio = 1, 0.8, 2/3; number of random splits = 10 or all possible splits.
RLT: M = 100 trees with nmin = n^(1/3) are fit for each RLT model. We consider a total of 9 settings: k = 1, 2, 5, each with no muting (p_d = 0), moderate muting (p_d = 50%·|P\P_A^d|), and aggressive muting (p_d = 80%·|P\P_A^d|), as discussed in Remark 2.3. We set the number of protected variables p_0 = log(p) to be on par with RF-log(p). Note that when p_d = 0, all variables are considered at each internal node, so no protection is needed; this is on par with RF.
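To make the muting rates concrete, the following is an illustrative sketch (not the authors' implementation) of how many candidate variables survive when a fixed fraction of the non-protected variables is muted at each of the first few splits, with a protected set of size p_0 = log(p) that is never muted:

```python
import math

def remaining_after_muting(p, muting_rate, protected, depth):
    """Candidate-set size after `depth` rounds of muting, where each
    round mutes `muting_rate` of the variables outside the protected set.
    Hypothetical helper for illustration only."""
    remaining = p
    for _ in range(depth):
        muted = math.floor(muting_rate * (remaining - protected))
        remaining -= muted
    return remaining

p = 100
p0 = max(1, round(math.log(p)))  # protected variables, log(100) ~ 5

no_mute = remaining_after_muting(p, 0.0, p0, 3)    # all 100 remain
moderate = remaining_after_muting(p, 0.5, p0, 3)   # 100 -> 53 -> 29 -> 17
aggressive = remaining_after_muting(p, 0.8, p0, 3)  # 100 -> 24 -> 9 -> 6
```

Even aggressive muting never drops below the p_0 protected variables, which is the point of protection: the candidate set shrinks geometrically toward a guaranteed floor.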