2019 Dec 2;16(23):4842. doi: 10.3390/ijerph16234842

Table 1.

Summary of the parameter values used in each model for predicting hepatitis B virus (HBV) infection. DT: decision tree; RF: random forest; XGBoost: extreme gradient boosting.

| Algorithm | Parameter | Value | Meaning |
| --- | --- | --- | --- |
| XGBoost | nrounds | 120 | The number of boosting rounds. |
| XGBoost | max_depth | 8 | Maximum depth of a tree. |
| XGBoost | eta | 0.09 | Step size shrinkage used in each update to prevent overfitting. |
| XGBoost | gamma | 0.04 | Minimum loss reduction required to make a further partition on a leaf node of the tree. |
| XGBoost | colsample_bytree | 0.8 | The subsample ratio of columns when constructing each tree. |
| XGBoost | min_child_weight | 18 | Minimum sum of instance weight (Hessian) needed in a child. If a partition step would produce a leaf node whose sum of instance weight is less than this value, the building process stops partitioning further. |
| XGBoost | subsample | 0.89 | Subsample ratio of the training instances. |
| XGBoost | n_estimators | 600 | Number of base learners in the integrated model. |
| XGBoost | max_delta_step | 9 | Maximum delta step allowed for each leaf output. Setting it to a positive value can help make the update step more conservative. |
| DT | minsplit | 20 | The minimum number of observations that must exist in a node for a split to be attempted. |
| DT | minbucket | 20 | The minimum number of observations in any terminal node. |
| DT | maxdepth | 10 | The maximum depth of any node of the final tree. |
| DT | xval | 5 | Number of cross-validations. |
| DT | cp (complexity parameter) | 0.001 | The minimum improvement in the model needed at each node. |
| RF | mtry | 6 | Number of variables available for splitting at each tree node. |
| RF | ntree | 700 | Number of trees to grow. |