Table 3. Comparison of Prediction Performance among Models.
“Exploratory” models used all 109 predictors. “Final” models used the 9 predictors listed in Table 2, which were selected based on previous literature and on variable importance from the exploratory gradient boosting model. Discrimination was assessed via AUC and calibration via Brier score, both estimated using the 0.632+ bootstrap validation method. The scaled Brier score is the relative reduction in Brier score achieved by a prediction model compared with a “null” model.
Model | AUC | Brier Score | Scaled Brier Score |
---|---|---|---|
Gradient Boosting | |||
Exploratory | 0.660 (0.595, 0.707) | 0.167 (0.152, 0.183) | 1.7% (−5.7%, 6.7%) |
Final | 0.704 (0.648, 0.761) | 0.159 (0.145, 0.176) | 4.6% (−3.2%, 11.0%) |
Random Forests | |||
Exploratory | 0.693 (0.640, 0.757) | 0.157 (0.141, 0.176) | 6.6% (3.9%, 10.2%) |
Final | 0.732 (0.695, 0.786) | 0.151 (0.133, 0.168) | 8.3% (2.1%, 14.0%) |
Logistic Regression | |||
Final | 0.717 (0.699, 0.732) | 0.148 (0.141, 0.154) | 10.8% (8.9%, 12.5%) |
Abbreviations: AUC, area under the receiver operating characteristic curve.
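
The scaled Brier score described in the caption can be illustrated with a minimal sketch. It assumes the “null” model predicts the observed event rate for every subject, which is a common convention but is not stated in the source. The function names and the toy data below are hypothetical, and the 0.632+ bootstrap correction used for the table is not reproduced here.

```python
import numpy as np


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    return float(np.mean((p - y) ** 2))


def scaled_brier_score(y_true, p_pred):
    """Relative reduction in Brier score versus a null model.

    Assumption (not from the source): the null model assigns every
    subject the observed event rate.
    """
    y = np.asarray(y_true, dtype=float)
    p_null = np.full_like(y, y.mean())   # constant prediction = prevalence
    bs_null = brier_score(y, p_null)     # equals prevalence * (1 - prevalence)
    return 1.0 - brier_score(y, p_pred) / bs_null


# Toy data for illustration only (not from the study).
y = np.array([0, 0, 0, 1, 0, 1, 0, 0, 1, 0])
p = np.array([0.1, 0.2, 0.1, 0.7, 0.3, 0.6, 0.2, 0.1, 0.5, 0.2])

print(f"Brier score:        {brier_score(y, p):.3f}")
print(f"Scaled Brier score: {scaled_brier_score(y, p):.1%}")
```

Under this definition a scaled Brier score of 0% means the model does no better than predicting the event rate for everyone, and a negative value means it does worse, which is how the negative lower bounds in the gradient boosting rows can be read.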