Table 5. General Algorithm for Boosting
| Step Number | Statistical Procedure | Conceptual Purpose |
|---|---|---|
| 1 | Initialize predicted values of y to sample mean | Need an initial unconditional “best guess” for outcome |
| | **Iterate:** | |
| 2 | Draw a bootstrap sample from the data | Mimic sampling variation in the observations used to fit the model |
| 3 | Compute current residuals: the loss function of y vs. the current prediction | Obtain the part of the outcome currently unaccounted for by the model |
| 4 | Fit a model to the current residuals | Attempt to predict a portion of the unaccounted-for outcome |
| 5 | Generate predictions from the current model | Predict an additional portion of the unaccounted-for outcome |
| 6 | Take only the predictions from a randomly selected 50% of cases in the bootstrap sample | Introduce further random variation in an attempt to avoid overfitting |
| 7 | Add the scaled predictions of the current model to the running prediction. Scaling involves multiplying by a learning rate, a constant between 0 and 1. | Update the current prediction of the outcome with the results of the current iteration, but scale the update so that no single iteration exerts undue influence on the cumulative prediction |
| 8 | Go back to step 2 | Move to the next iteration |
Notes: The scaling factor in step 7 is λ, the learning rate. Smaller values correspond to greater shrinkage at each step and a slower fitting process that may require more iterations. The iterative procedure is stopped when some form of cross-validation error, usually the error on the out-of-bag observations, stops declining.
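To make the table concrete, here is a minimal sketch of the loop, assuming squared-error loss (so the step-3 residuals are simply y minus the running prediction) and a shallow regression tree as the base learner. The function name `boost`, the parameters `n_iter`, `learning_rate`, and `max_depth`, and the use of scikit-learn's `DecisionTreeRegressor` are illustrative choices, not part of the table; for simplicity the sketch also runs a fixed number of iterations rather than monitoring out-of-bag error as described in the notes.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def boost(X, y, n_iter=500, learning_rate=0.01, max_depth=2):
    n = len(y)
    # Step 1: initialize the running prediction at the sample mean of y.
    pred = np.full(n, y.mean())
    trees = []
    for _ in range(n_iter):
        # Step 2: draw a bootstrap sample (cases drawn with replacement).
        boot = rng.integers(0, n, size=n)
        # Step 3: current residuals; with squared-error loss these are
        # simply y minus the running prediction.
        resid = y[boot] - pred[boot]
        # Step 4: fit a weak model to the current residuals.
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X[boot], resid)
        trees.append(tree)
        # Steps 5-6: generate predictions, but keep them for a randomly
        # selected 50% of the cases in the bootstrap sample only.
        cases = np.unique(boot)
        half = rng.choice(cases, size=cases.size // 2, replace=False)
        # Step 7: add the learning-rate-scaled predictions to the running
        # prediction; step 8 is the next pass through the loop.
        pred[half] += learning_rate * tree.predict(X[half])
    return pred, trees

# Hypothetical usage on simulated data.
X = rng.normal(size=(200, 3))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)
fitted, trees = boost(X, y)
```

Under a different loss function, step 3 would replace the plain residual with the negative gradient of that loss evaluated at the current prediction.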