Table 1.
Algorithm | R package | Description |
---|---|---|
I. Spline | | • adaptive spline regression flexibly captures interactions and linear and non‐linear associations |
Adaptive splines | earth (Milborrow, Hastie, Tibshirani, Miller, & Lumley, 2016) | • linear segments (splines) of varying slopes are connected and smoothed to create piece‐wise curves (basis functions) |
 | | • final fit is built using a stepwise procedure that selects the optimal combination of basis functions |
Adaptive polynomial splines | polspline (Kooperberg, 2015) | • earth and polymars are generally similar, but differ in the order in which basis functions (e.g. linear versus non‐linear) are added to build the final model |
II. Decision tree | | • decision tree methods capture interactions and non‐linear associations |
Random forest | randomForest (Liaw & Wiener, 2002) | • independent variables are partitioned (split on their values) and these splits are stacked to build decision trees, which are then ensembled into an aggregate “forest” |
 | | • random forest builds numerous trees in bootstrapped samples and generates an aggregate tree by averaging across trees (reducing overfit) |
Bayesian additive regression trees (BART) | BayesTree (Chipman & McCulloch, 2016) | • Bayesian trees are based on an underlying probability model: priors for the tree structure and a likelihood for the data in terminal nodes; the aggregate tree is generated by averaging across tree posteriors (reducing overfit) |
III. Support vector machines (SVM) | e1071 (Meyer et al., 2015) | • support vector machines treat each independent variable as a dimension in a high‐dimensional space and attempt to identify the best hyperplane to separate the sample into classes (e.g. cases and non‐cases) |
Linear kernel | | • goal is to find the hyperplane with the maximum margin between the closest points (support vectors) of the two classes |
Polynomial kernel | | • the linear kernel captures linear associations, but alternate kernels can be used to capture non‐linearities (polynomial and radial basis kernels were used here) |
Radial kernel | | |
IV. Generalized boosted regression models | | |
Adaptive boosting | gbm (Freund & Schapire, 1999) | • adaptive boosting is a meta‐algorithm that iteratively fits decision trees, using weights to adjust for cases classified incorrectly in the prior iteration |
 | | • this allows subsequent iterations to focus on predicting more difficult cases |
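
The sketches below illustrate how each algorithm family in Table 1 might be fit in R with the listed packages. They are minimal, assumption-laden examples: the data frame `dat` (predictors `x1`, `x2`; binary outcome `y`) and every tuning value are hypothetical and are not taken from the original analysis. First, the adaptive spline methods (earth and polspline):

```r
# Hypothetical data used by all sketches below
set.seed(1)
n   <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 * dat$x1 - 0.8 * dat$x2 + 0.6 * dat$x1 * dat$x2))

library(earth)
library(polspline)

# Adaptive splines (MARS): degree = 2 permits interaction basis functions;
# the glm argument refits the selected basis functions as a logistic model
fit_earth <- earth(y ~ x1 + x2, data = dat, degree = 2,
                   glm = list(family = binomial))
summary(fit_earth)

# Adaptive polynomial splines: polymars adds basis functions in a different
# order; classify = TRUE treats the 0/1 outcome as a class label
fit_pmars <- polymars(responses = dat$y,
                      predictors = as.matrix(dat[, c("x1", "x2")]),
                      classify = TRUE)
fit_pmars$model  # selected basis functions and coefficients
```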
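
A similar sketch for the decision-tree methods (randomForest and BayesTree), reusing the hypothetical `dat` from the spline example; the tree counts and MCMC settings are illustrative defaults, not the article's choices.

```r
library(randomForest)
library(BayesTree)

# Random forest: many trees grown on bootstrapped samples, aggregated by vote
fit_rf <- randomForest(x = dat[, c("x1", "x2")], y = factor(dat$y), ntree = 500)
print(fit_rf)

# BART: a 0/1 outcome triggers the probit formulation; predictions average
# over posterior draws of the sum-of-trees model
fit_bart <- bart(x.train = dat[, c("x1", "x2")], y.train = dat$y,
                 ndpost = 500, nskip = 100)
```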
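
For the SVM entries, a sketch using e1071::svm with the three kernels named in the table; the cost/gamma grid is an arbitrary illustration.

```r
library(e1071)

dat$yf <- factor(dat$y)  # svm() expects a factor outcome for classification

# One fit per kernel listed in the table
fit_lin  <- svm(yf ~ x1 + x2, data = dat, kernel = "linear")
fit_poly <- svm(yf ~ x1 + x2, data = dat, kernel = "polynomial", degree = 2)
fit_rbf  <- svm(yf ~ x1 + x2, data = dat, kernel = "radial")

# Cross-validated grid search over cost and gamma for the radial kernel
tuned <- tune(svm, yf ~ x1 + x2, data = dat, kernel = "radial",
              ranges = list(cost = c(0.1, 1, 10), gamma = c(0.1, 0.5, 1)))
summary(tuned)
```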
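
Finally, a sketch of adaptive boosting with gbm; the loss, number of trees, depth, and shrinkage below are illustrative assumptions rather than the article's settings.

```r
library(gbm)

# distribution = "adaboost" applies the AdaBoost exponential loss to the 0/1
# outcome, so successive trees concentrate on previously misclassified cases
fit_gbm <- gbm(y ~ x1 + x2, data = dat, distribution = "adaboost",
               n.trees = 2000, interaction.depth = 2, shrinkage = 0.01,
               cv.folds = 5)

# Choose the number of iterations by cross-validation to limit overfitting
best_iter <- gbm.perf(fit_gbm, method = "cv")
pred <- predict(fit_gbm, newdata = dat, n.trees = best_iter, type = "response")
```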