. 2022 Sep 28;19(19):12378. doi: 10.3390/ijerph191912378

Table 6.

The highest achievable AUC on the DDC dataset for the six ML models with hyperparameter tuning.

Classifier	Tuned hyperparameters	AUC (with GSO)	AUC (without GSO)
GNB	Class prior probabilities (=None) and the portion of the largest feature variance added to the variances for calculation stability (= $0.01$).	$0.637 \pm 0.008$	$0.628 \pm 0.009$
BNB	Additive (Laplace) smoothing parameter (=1.0), class prior probabilities (=None), and whether to learn class priors (=True).	$0.637 \pm 0.009$	$0.632 \pm 0.003$
RF	Whether to use bootstrap samples (=True), split quality function (=gini), number of features considered for the best split (=auto), number of leaf nodes used to grow trees (=3), samples required at a leaf node (=0.4), samples required to split an internal node (= $2$), number of trees in the forest (=100), whether to use out-of-bag samples to calculate the generalization score (=False), and the random state controlling bootstrap sampling and feature sampling at node splits (=100).	$0.628 \pm 0.000$	$0.628 \pm 0.000$
DT	Split quality function (=entropy), number of features considered for the best split (=auto), samples required at a leaf node (=0.5), samples required to split an internal node (=0.1), the random state controlling bootstrap sampling and feature sampling at node splits (=100), and node partition strategy (=best).	$0.792 \pm 0.025$	$0.675 \pm 0.009$
XGB	Initial prediction score (= $0.5$), booster (=gbtree), subsample ratio for each level (=1), subsample ratio for each node (=1), evaluation metric for validation data (=error), minimum loss reduction for a further partition on a leaf node (=1.5), L2 regularization on weights (=1.5), tree depth (=5), hessian sum in a child (=5), number of trees in the forest (=100), number of parallel trees built during each iteration (=1), the random state controlling bootstrap sampling and feature sampling at node splits (=100), class imbalance control (=1), and training subsample ratio (=1.0).	$0.830 \pm 0.007$	$0.811 \pm 0.008$
LGB	Boosting method (=gbdt), class weight (=True), column subsample ratio for tree construction (=1.0), base learner tree depth (= $-1$), number of trees in the forest (=50), the random state controlling bootstrap sampling and feature sampling at node splits (=100), number of leaves per base learner tree (=25), and training instance subsample ratio (=0.25).	$0.796 \pm 0.010$	$0.793 \pm 0.012$
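
To compare the effect of tuning across classifiers, the mean AUC values in Table 6 can be differenced directly. The following is a minimal Python sketch and is not part of the original study: the names `auc`, `gain`, `ranked`, and `best_tuned` are illustrative, and the ± spreads reported in the table are ignored, so small differences may not be meaningful.

```python
# Mean AUC values copied from Table 6 as (with GSO, without GSO).
# The +/- spreads are deliberately left out of this simple comparison.
auc = {
    "GNB": (0.637, 0.628),
    "BNB": (0.637, 0.632),
    "RF": (0.628, 0.628),
    "DT": (0.792, 0.675),
    "XGB": (0.830, 0.811),
    "LGB": (0.796, 0.793),
}

# Absolute AUC gain associated with GSO, rounded to three decimals
# to match the precision reported in the table.
gain = {name: round(with_gso - without_gso, 3)
        for name, (with_gso, without_gso) in auc.items()}

# Classifiers ordered from largest to smallest gain.
ranked = sorted(gain.items(), key=lambda item: item[1], reverse=True)

# Classifier with the highest tuned (with GSO) mean AUC.
best_tuned = max(auc, key=lambda name: auc[name][0])

for name, delta in ranked:
    print(f"{name}: {delta:+.3f}")
print("Highest tuned AUC:", best_tuned)
```

Under these assumptions, DT shows the largest gain with GSO (0.117), RF shows none, and XGB retains the highest tuned AUC (0.830).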