| Model | Parameter | Description | Value |
|---|---|---|---|
| RF | ntree | Number of trees to grow | 625 |
| RF | mtry | Number of variables randomly sampled as candidates at each split | 1 |
| RF | nodesize | Minimum size of terminal nodes; increasing nodesize leads to smaller trees and reduces the time required to fit the model | 4 |
| SVM | kernel | Kernel function used for model training and prediction (e.g., linear or radial) | linear |
| SVM | cost | Cost of constraints violation | 0.4 |
| XGBoost | eta | The learning rate; a smaller eta makes the boosting process more conservative, reducing the risk of overfitting but requiring more iterations, while a larger value may lead to overfitting | 0.05 |
| XGBoost | max_depth | Maximum depth of individual learners (classification trees) | 2 |
| XGBoost | subsample | Proportion of training instances sampled for each learner; a value of 0.5 means half of the training samples are randomly selected per learner, helping to prevent overfitting | 0.5 |
| XGBoost | colsample_bytree | Fraction of columns sampled when training each individual learner | 0.3 |
| XGBoost | gamma | Minimum loss reduction required for a further split on a leaf node of an individual learner (classification tree) | 10 |
| XGBoost | nrounds | Maximum number of boosting iterations | 150 |
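
The parameter names above match the R packages randomForest, e1071, and xgboost, so the settings can be applied as in the following sketch. The feature matrix `x` and label vector `y` are hypothetical placeholders, not objects defined in the source:

```r
library(randomForest)
library(e1071)
library(xgboost)

# `x` (features) and `y` (binary factor label) are assumed to exist.

# Random forest with the tabulated settings
rf_fit <- randomForest(x, y, ntree = 625, mtry = 1, nodesize = 4)

# Linear-kernel SVM with the tabulated cost
svm_fit <- svm(x, y, kernel = "linear", cost = 0.4)

# Gradient-boosted trees; labels recoded to 0/1 for binary:logistic
xgb_fit <- xgboost(data = as.matrix(x),
                   label = as.numeric(y) - 1,
                   eta = 0.05, max_depth = 2,
                   subsample = 0.5, colsample_bytree = 0.3,
                   gamma = 10, nrounds = 150,
                   objective = "binary:logistic")
```

This is a configuration sketch only; preprocessing, cross-validation, and evaluation steps from the original analysis are not shown.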