Table III.
Hyperparameter | Definition | Significance/usefulness | Grid search range |
---|---|---|---|
ETA | Step-size shrinkage (learning rate) applied to each new tree's contribution at every boosting step | ETA helps prevent overfitting by scaling down the contribution of each tree. | 0.01 to 1 in steps of 0.01 |
Max Depth | Maximum depth of each tree | Max depth controls the complexity of each tree; deeper trees are more likely to overfit the data. | 1, 2, 3, 4 and 5 |
Minimum Child Weight | Minimum sum of instance weights required in a child node for a split to create that node | Increasing the value makes the model more conservative, which can prevent overfitting and reduce model complexity. | 1, 2, 3, 4 and 5 |
γ | Minimum loss reduction required to create a further partition on a tree's leaf node | Increasing γ causes the model to be more conservative. | 1, 2, 3, 4 and 5 |
Nround | Number of boosting rounds (trees) used to train the model | Increasing Nround can reduce bias; early stopping limits the overfitting that too many rounds can cause. | Varies with model error/loss; early stopping after 200 rounds without improvement |
ETA, the XGBoost parameter eta (η), referred to as the learning rate in the R user documentation for XGBoost.
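
To make the size of the search space in Table III concrete, the sketch below enumerates the grid using only the Python standard library. The dictionary keys `eta`, `max_depth`, `min_child_weight` and `gamma` are the XGBoost parameter names; the variable names and the `build_grid` helper are hypothetical and introduced only for illustration. The table's source refers to the R interface, so this Python version is an assumption about how the same grid could be expressed, not the authors' code.

```python
from itertools import product

# Grid values as listed in Table III.
eta_values = [round(0.01 * i, 2) for i in range(1, 101)]  # 0.01, 0.02, ..., 1.00
max_depth_values = [1, 2, 3, 4, 5]
min_child_weight_values = [1, 2, 3, 4, 5]
gamma_values = [1, 2, 3, 4, 5]


def build_grid():
    """Return one parameter dictionary per grid point (hypothetical helper)."""
    return [
        {"eta": eta, "max_depth": depth, "min_child_weight": weight, "gamma": gamma}
        for eta, depth, weight, gamma in product(
            eta_values, max_depth_values, min_child_weight_values, gamma_values
        )
    ]


grid = build_grid()
# 100 ETA values x 5 depths x 5 child weights x 5 gamma values.
print(len(grid))
```

Each dictionary could then be passed to a training call, with Nround left to early stopping (for example, `early_stopping_rounds=200` in `xgboost.train`) rather than searched as a fixed grid dimension.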