Author manuscript; available in PMC: 2016 Dec 5.
Published in final edited form as: Psychol Methods. 2016 Jul 25;21(4):603–620. doi: 10.1037/met0000088

Table 5.

General Algorithm for Boosting

| Step | Statistical Procedure | Conceptual Purpose |
|---|---|---|
| 1a | Initialize predicted values of y to the sample mean | Provides an initial, unconditional "best guess" for the outcome |
|  | ***Iterate steps 2–8:*** |  |
| 2 | Draw a bootstrap sample from the data | Mimics sampling variation in the observations used to fit the model |
| 3b | Compute current residuals: loss function of y vs. current prediction | Obtains the part of the outcome currently unaccounted for by the model |
| 4c | Fit a model to the current residuals | Attempts to predict a portion of the unaccounted-for outcome |
| 5d | Generate predictions from the current model | Predicts an additional portion of the unaccounted-for outcome |
| 6e | Keep predictions only for a randomly selected 50% of cases in the bootstrap sample | Introduces further random variation in an attempt to avoid overfitting |
| 7f | Add scaled predictions of the current model to the running prediction; scaling multiplies by a learning rate, a constant between 0 and 1 | Updates the current prediction of the outcome with the results of the current iteration, scaled so that no single iteration exerts undue influence on the cumulative prediction |
| 8g | Return to step 2 | Moves to the next iteration |

Notes: The scaling factor in step 7 is λ, the learning rate. Smaller values correspond to greater shrinkage at each step and a slower fitting process that may require more iterations. The iterative procedure stops when some form of GCV error, usually the error on the out-of-bag observations, stops decreasing.
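
The steps above can be sketched in code. This is a minimal illustration, assuming squared-error loss and a depth-1 regression stump on a single predictor as the base learner (the table does not fix either choice); `fit_stump`, `predict_stump`, and `boost` are hypothetical names, not from the article:

```python
import numpy as np

def fit_stump(x, r):
    """Depth-1 regression stump: find the split on x that minimizes
    the squared error of the residuals r."""
    order = np.argsort(x)
    xs, rs = x[order], r[order]
    best_sse, best = np.inf, (xs[0], rs.mean(), rs.mean())
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue
        lm, rm = rs[:i].mean(), rs[i:].mean()
        sse = ((rs[:i] - lm) ** 2).sum() + ((rs[i:] - rm) ** 2).sum()
        if sse < best_sse:
            best_sse = sse
            best = ((xs[i - 1] + xs[i]) / 2, lm, rm)
    return best  # (threshold, left-side value, right-side value)

def predict_stump(stump, x):
    t, left_val, right_val = stump
    return np.where(x <= t, left_val, right_val)

def boost(x, y, n_iter=300, learning_rate=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    pred = np.full(n, y.mean())                 # Step 1: start at the sample mean
    for _ in range(n_iter):
        boot = rng.integers(0, n, size=n)       # Step 2: bootstrap sample
        resid = y - pred                        # Step 3: current residuals
        stump = fit_stump(x[boot], resid[boot]) # Step 4: fit model to residuals
        update = predict_stump(stump, x)        # Step 5: current predictions
        half = rng.random(n) < 0.5              # Step 6: random 50% of cases
        pred[half] += learning_rate * update[half]  # Step 7: scaled update
    return pred                                 # Step 8: loop back each iteration
```

Because each stump's contribution is shrunk by the learning rate (step 7) and applied to only half the cases (step 6), the cumulative prediction converges slowly, which is why smaller λ values typically require more iterations.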