Author manuscript; available in PMC: 2020 Nov 6.
Published in final edited form as: Stat Med. 2020 Jun 24;39(23):3059–3073. doi: 10.1002/sim.8591

TABLE 3.

Candidate algorithms included in the ensemble Super Learner library

| Prediction Algorithm | Controls per Case (approximate) | Eligible Covariates | Notes |
|---|---|---|---|
| Logistic Regression (14 variants) | 50, 20, or 10 | All | Weighted |
| | 50, 20, or 10 | Pre-selected | Weighted |
| | 50, 20, or 10 | All | |
| | 50, 20, or 10 | Pre-selected | |
| | 50 | All | Backward selection |
| | 50 | Pre-selected | Backward selection |
| Lasso (8 variants) | 20 or 10 | All | Deviance loss |
| | 20 or 10 | All | Negative AUC loss |
| | 20 or 10 | Pre-selected | Deviance loss |
| | 20 or 10 | Pre-selected | Negative AUC loss |
| Ridge Regression (8 variants) | 20 or 10 | All | Deviance loss |
| | 20 or 10 | All | Negative AUC loss |
| | 20 or 10 | Pre-selected | Deviance loss |
| | 20 or 10 | Pre-selected | Negative AUC loss |
| Random Forest (6 variants) | 50, 20, or 10 | All | 10,000 trees, 1/3 of covariates sampled per split |
| | 50, 20, or 10 | Pre-selected | 10,000 trees, 1/3 of covariates sampled per split |
| Support Vector Machine (2 variants) | 50 | All | Tuning parameters chosen by cross-validation |
| | 50 | Pre-selected | Tuning parameters chosen by cross-validation |
| Neural Network (4 variants) | 20 or 10 | Pre-selected | 10 nodes in 1 hidden layer |
| | 20 or 10 | Pre-selected | 5 nodes in 1 hidden layer |
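
The variant counts in the table follow from crossing each algorithm's control-to-case ratios, covariate sets, and note-column options. The sketch below enumerates that grid in Python as a bookkeeping check only. The names `LIBRARY_SPEC`, `expand_library`, and `count_by_algorithm` are hypothetical, and the empty string standing for an unweighted logistic regression is an assumption based on the blank Notes cells; none of this is taken from the authors' code.

```python
from itertools import product

BOTH = ["All", "Pre-selected"]
LOSSES = ["Deviance loss", "Negative AUC loss"]

# Hypothetical encoding of Table 3:
# (algorithm, controls-per-case options, eligible covariate sets, note options).
# An empty note stands for the rows whose Notes cell is blank.
LIBRARY_SPEC = [
    ("Logistic Regression", [50, 20, 10], BOTH, ["Weighted", ""]),
    ("Logistic Regression", [50], BOTH, ["Backward selection"]),
    ("Lasso", [20, 10], BOTH, LOSSES),
    ("Ridge Regression", [20, 10], BOTH, LOSSES),
    ("Random Forest", [50, 20, 10], BOTH,
     ["10,000 trees, 1/3 of covariates sampled per split"]),
    ("Support Vector Machine", [50], BOTH,
     ["Tuning parameters chosen by cross-validation"]),
    ("Neural Network", [20, 10], ["Pre-selected"],
     ["10 nodes in 1 hidden layer", "5 nodes in 1 hidden layer"]),
]


def expand_library(spec):
    """Cross ratios, covariate sets, and notes into one tuple per variant."""
    variants = []
    for algorithm, ratios, covariate_sets, notes in spec:
        for ratio, covariates, note in product(ratios, covariate_sets, notes):
            variants.append((algorithm, ratio, covariates, note))
    return variants


def count_by_algorithm(variants):
    """Count the variants listed under each algorithm name."""
    counts = {}
    for algorithm, *_ in variants:
        counts[algorithm] = counts.get(algorithm, 0) + 1
    return counts


variants = expand_library(LIBRARY_SPEC)
counts = count_by_algorithm(variants)
for algorithm, n in counts.items():
    print(f"{algorithm}: {n} variants")
print(f"Total candidate algorithms: {len(variants)}")
```

Under this encoding the per-algorithm counts match the "(n variants)" labels in the table, and they sum to 42 candidate algorithms.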

Note: Risk scores were re-scaled to account for undersampling of controls, except for weighted logistic regression algorithms.
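
The note does not state which re-scaling was applied. As an illustration, the sketch below uses one standard correction for control undersampling: if only a fraction of controls is retained, the odds of being a case in the sampled data are inflated by the inverse of that fraction, so multiplying the predicted odds by the fraction returns them to the full-population scale. This is an assumed method, not necessarily the authors', and the function names `control_sampling_fraction` and `rescale_risk` are hypothetical.

```python
def control_sampling_fraction(n_cases, n_controls, controls_per_case):
    """Fraction of controls kept when retaining about `controls_per_case` per case."""
    if n_cases <= 0 or n_controls <= 0 or controls_per_case <= 0:
        raise ValueError("counts and ratio must be positive")
    # Never keep more controls than exist in the full data.
    return min(1.0, controls_per_case * n_cases / n_controls)


def rescale_risk(p_sampled, sampling_fraction):
    """Map a risk estimated on control-undersampled data to the population scale.

    Assumed correction: population odds = sampling_fraction * sampled odds.
    """
    if not 0.0 <= p_sampled <= 1.0:
        raise ValueError("p_sampled must lie in [0, 1]")
    if not 0.0 < sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must lie in (0, 1]")
    numerator = sampling_fraction * p_sampled
    return numerator / (numerator + (1.0 - p_sampled))


# Illustrative numbers only (not from the study): 100 cases, 100,000 controls,
# and about 10 controls retained per case.
tau = control_sampling_fraction(n_cases=100, n_controls=100_000, controls_per_case=10)
adjusted = rescale_risk(0.5, tau)
print(f"sampling fraction = {tau:.4f}, rescaled risk = {adjusted:.6f}")
```

A weighted fit, in which each retained control carries a weight equal to the inverse of its sampling fraction, already targets the population scale, which is consistent with the note exempting the weighted logistic regression algorithms.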