Table 5. The selection of hyper-parameters.
| Algorithm | Hyper-parameter: search range | Best value |
|---|---|---|
| Random Forest | n_estimators: linspace(start=50, stop=3000, num=60) | 2800 |
| | max_features: ['auto', 'sqrt'] | auto |
| | max_depth: linspace(start=10, stop=500, num=50) | 430 |
| | min_samples_split: [2, 5, 10] | 2 |
| | min_samples_leaf: [1, 2, 4, 8] | 1 |
| Decision Tree | criterion: ['gini', 'entropy'] | gini |
| | min_samples_leaf: linspace(start=1, stop=50, num=10) | 46 |
| | min_impurity_decrease: linspace(start=0, stop=0.5, num=20) | 0.0 |
| | max_depth: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | 6 |
| BernoulliNB | Default | Default |
| KNeighbors | n_neighbors: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] | 11 |
| | metric: ['euclidean', 'manhattan', 'chebyshev', 'minkowski'] | minkowski |
| | weights: ['uniform', 'distance'] | distance |
| | p: [1, 2, 3, 4, 5, 6] | 4 |
| Logistic Regression | C: [0.25, 0.5, 0.75, 1] | 0.5 |
| | solver: ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'] | newton-cg |
| | penalty: ['l1', 'l2'] | l2 |
| GaussianNB | Default | Default |
| XGBoost | n_estimators: linspace(start=50, stop=3000, num=60) | 250 |
| | max_features: [1, 3, 5, 7, 9] | 1 |
| | max_depth: [3, 4, 5, 6, 7, 8, 9, 10] | 10 |
| | min_samples_leaf: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | 9 |
| | min_samples_split: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | 9 |
| CatBoost | iterations: linspace(start=50, stop=3000, num=10) | 600 |
| | max_depth: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | 9 |
| | subsample: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | 1 |
| LightGBM | learning_rate: [0.01, 0.02, 0.05, 0.1, 0.15] | 0.1 |
| | feature_fraction: [0.6, 0.7, 0.8, 0.9, 0.95] | 0.6 |
| | max_depth: [15, 20, 25, 30, 35] | 30 |
| | bagging_fraction: [0.6, 0.7, 0.8, 0.9, 0.95] | 0.6 |
| | lambda_l1: [0, 0.1, 0.4, 0.5, 0.6] | 0 |
| | lambda_l2: [0, 10, 15, 35, 40] | 0 |
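A search over ranges like those in Table 5 is typically run with a randomized or exhaustive grid search. Below is a minimal sketch, assuming scikit-learn's RandomizedSearchCV and using the Random Forest row as an example; the placeholder dataset, number of sampled configurations, cross-validation folds, and random seeds are illustrative assumptions, not the study's actual protocol.

```python
# Minimal sketch of a randomized hyperparameter search over the
# Random Forest ranges from Table 5 (assumed setup, not the paper's).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Placeholder data standing in for the study's dataset (assumption).
X, y = make_classification(n_samples=500, n_features=20, random_state=42)

# Search ranges copied from the Random Forest row of Table 5.
# Note: 'auto' for max_features (listed in the table) is deprecated in
# scikit-learn >= 1.3, so only 'sqrt' is kept here.
param_distributions = {
    "n_estimators": np.linspace(start=50, stop=3000, num=60, dtype=int),
    "max_features": ["sqrt"],
    "max_depth": np.linspace(start=10, stop=500, num=50, dtype=int),
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4, 8],
}

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=100,   # number of sampled configurations (assumption)
    cv=5,         # 5-fold cross-validation (assumption)
    n_jobs=-1,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_)  # Table 5 reports, e.g., n_estimators=2800, max_depth=430
```

The same pattern applies to the other rows of the table by swapping in the corresponding estimator and its parameter grid.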