Table 1.
Algorithm | R package | Description |
---|---|---|
I. Spline | | • adaptive spline regression flexibly captures interactions and linear and non‐linear associations |
Adaptive splines | earth (Milborrow, Hastie, Tibshirani, Miller, & Lumley, 2016) | • linear segments (splines) of varying slopes are connected and smoothed to create piece‐wise curves (basis functions) |
 | | • final fit is built using a stepwise procedure that selects the optimal combination of basis functions |
Adaptive polynomial splines | polspline (Kooperberg, 2015) | • earth and polymars are generally similar, but differ in the order in which basis functions (e.g. linear versus non‐linear) are added to build the final model |
II. Decision tree | | • decision tree methods capture interactions and non‐linear associations |
Random forest | randomForest (Liaw & Wiener, 2002) | • independent variables are partitioned (split on their values) and these splits are stacked to build decision trees, which are then ensembled into an aggregate “forest” |
 | | • random forest builds numerous trees in bootstrapped samples and generates an aggregate tree by averaging across trees (reducing overfit) |
Bayesian additive regression trees (BART) | BayesTree (Chipman & McCulloch, 2016) | • Bayesian trees are based on an underlying probability model: priors for the tree structure and a likelihood for the data in terminal nodes; the aggregate tree is generated by averaging across tree posteriors (reducing overfit) |
III. Support vector machines (SVM) | e1071 (Meyer et al., 2015) | • support vector machines treat each independent variable as a dimension in a high‐dimensional space and attempt to identify the best hyperplane to separate the sample into classes (e.g. cases and non‐cases) |
Linear kernel | | • goal is to find the hyperplane with the maximum margin between the closest points (support vectors) of the two classes |
Polynomial kernel | | • the linear kernel captures linear associations, but alternate kernels can be used to capture non‐linearities (polynomial and radial basis kernels were used here) |
Radial kernel | | |
IV. Generalized boosted regression models | | |
Adaptive boosting | gbm (Freund & Schapire, 1999) | • adaptive boosting is a meta‐algorithm that iteratively fits decision trees, using weights to adjust for cases classified incorrectly in the prior iteration |
 | | • this allows subsequent iterations to focus on predicting more difficult cases |
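
The sketches below illustrate how each algorithm family in Table 1 might be fit in R with the listed packages. They are minimal, assumption-laden examples: the data frame `dat` (predictors `x1`, `x2`; binary outcome `y`) and every tuning value are hypothetical and are not taken from the original analysis. First, the adaptive spline methods (earth and polspline):

```r
# Hypothetical data used by all sketches below
set.seed(1)
n   <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 * dat$x1 - 0.8 * dat$x2 + 0.6 * dat$x1 * dat$x2))

library(earth)
library(polspline)

# Adaptive splines (MARS): degree = 2 permits interaction basis functions;
# the glm argument refits the selected basis functions as a logistic model
fit_earth <- earth(y ~ x1 + x2, data = dat, degree = 2,
                   glm = list(family = binomial))
summary(fit_earth)

# Adaptive polynomial splines: polymars adds basis functions in a different
# order; classify = TRUE treats the 0/1 outcome as a class label
fit_pmars <- polymars(responses = dat$y,
                      predictors = as.matrix(dat[, c("x1", "x2")]),
                      classify = TRUE)
fit_pmars$model  # selected basis functions and coefficients
```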
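
A similar sketch for the decision-tree methods (randomForest and BayesTree), reusing the hypothetical `dat` from the spline example; the tree counts and MCMC settings are illustrative defaults, not the article's choices.

```r
library(randomForest)
library(BayesTree)

# Random forest: many trees grown on bootstrapped samples, aggregated by vote
fit_rf <- randomForest(x = dat[, c("x1", "x2")], y = factor(dat$y), ntree = 500)
print(fit_rf)

# BART: a 0/1 outcome triggers the probit formulation; predictions average
# over posterior draws of the sum-of-trees model
fit_bart <- bart(x.train = dat[, c("x1", "x2")], y.train = dat$y,
                 ndpost = 500, nskip = 100)
```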
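
For the SVM entries, a sketch using e1071::svm with the three kernels named in the table; the cost/gamma grid is an arbitrary illustration.

```r
library(e1071)

dat$yf <- factor(dat$y)  # svm() expects a factor outcome for classification

# One fit per kernel listed in the table
fit_lin  <- svm(yf ~ x1 + x2, data = dat, kernel = "linear")
fit_poly <- svm(yf ~ x1 + x2, data = dat, kernel = "polynomial", degree = 2)
fit_rbf  <- svm(yf ~ x1 + x2, data = dat, kernel = "radial")

# Cross-validated grid search over cost and gamma for the radial kernel
tuned <- tune(svm, yf ~ x1 + x2, data = dat, kernel = "radial",
              ranges = list(cost = c(0.1, 1, 10), gamma = c(0.1, 0.5, 1)))
summary(tuned)
```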
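
Finally, a sketch of adaptive boosting with gbm; the loss, number of trees, depth, and shrinkage below are illustrative assumptions rather than the article's settings.

```r
library(gbm)

# distribution = "adaboost" applies the AdaBoost exponential loss to the 0/1
# outcome, so successive trees concentrate on previously misclassified cases
fit_gbm <- gbm(y ~ x1 + x2, data = dat, distribution = "adaboost",
               n.trees = 2000, interaction.depth = 2, shrinkage = 0.01,
               cv.folds = 5)

# Choose the number of iterations by cross-validation to limit overfitting
best_iter <- gbm.perf(fit_gbm, method = "cv")
pred <- predict(fit_gbm, newdata = dat, n.trees = best_iter, type = "response")
```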