Table 1.
Model | Implementation | Hyperparameter levels |
---|---|---|
Decision tree | rpart (R) | cost_complexity: 0.00001, 0.0001, 0.001, 0.01, 0.1 |
Logistic regression | glmnet (R) | penalty: 0.000001, 0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1, 10; mixture: 0, 0.5, 1 |
Random forest | ranger (R) | mtry: 1, 4, 7; trees: 500, 1000, 2000 |
XGBoost | xgboost (Python, GPU-accelerated) | max_depth: 3, 6, 9; n_estimators: 100, 500, 1000; gamma: 1, 10, 100 |
All models (including the EBMs) were trained on a compute cluster with 512 GB of memory, two 20-core Intel Xeon E5-2698 v4 CPUs, and eight Nvidia Tesla V100 GPUs per node. Each job was allotted three days of compute time. Of all the algorithms used (including the EBMs), only XGBoost was able to take advantage of GPU acceleration.
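As an illustration, the sketch below shows one way the XGBoost grid in Table 1 could be swept with GPU acceleration in Python. The cross-validation scheme, scoring metric, and the data variables `X` and `y` are assumptions for the example only and are not taken from the experimental setup described above.

```python
# Minimal sketch: sweeping the XGBoost hyperparameter grid from Table 1.
# Assumes a classification task with features X and labels y already loaded;
# cv and scoring below are illustrative choices, not those used in the paper.
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

param_grid = {
    "max_depth": [3, 6, 9],
    "n_estimators": [100, 500, 1000],
    "gamma": [1, 10, 100],
}

# tree_method="gpu_hist" enables GPU training in older xgboost releases;
# xgboost >= 2.0 instead uses tree_method="hist" together with device="cuda".
model = XGBClassifier(tree_method="gpu_hist", eval_metric="logloss")

search = GridSearchCV(model, param_grid, cv=5, scoring="roc_auc")
# search.fit(X, y)            # X, y: training data (not shown here)
# print(search.best_params_)  # best hyperparameter combination found
```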