Sci Rep. 2023 Sep 14;13:15213. doi: 10.1038/s41598-023-42542-y

Table 2.

Performance comparison of machine learning models: random forest (RF), XGBoost (XGB), decision trees (DT), logistic regression (LogR), and support vector machines (SVM) for a varying number of features selected with permutation importance.

| Metric | # of features | DT | LogR | RF | XGB | SVM |
| --- | --- | --- | --- | --- | --- | --- |
| F1 | 8 | 0.489 ± 0.138 | 0.56 ± 0.13 | 0.429 ± 0.151 | 0.546 ± 0.127 | 0.361 ± 0.114 |
| F1 | 12 | 0.502 ± 0.134 | 0.529 ± 0.123 | 0.316 ± 0.162 | 0.547 ± 0.135 | 0.409 ± 0.1 |
| F1 | 16 | 0.493 ± 0.155 | 0.514 ± 0.13 | 0.31 ± 0.168 | 0.54 ± 0.134 | 0.483 ± 0.063 |
| F1 | 20 | 0.49 ± 0.149 | 0.514 ± 0.125 | 0.214 ± 0.167 | 0.514 ± 0.145 | 0.483 ± 0.064 |
| F1 | All | 0.421 ± 0.13 | 0.374 ± 0.117 | 0.064 ± 0.107 | 0.433 ± 0.145 | 0.487 ± 0.061 |

Performance is reported as the F1 score ± standard deviation (SD) on the test set.

Parameters of the machine learning methods: RF, DT, SVM: class_weight = 'balanced'; XGB: scale_pos_weight = counts[class1]/counts[class2]; LogR: max_iter = 10,000.
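These settings map directly onto the standard scikit-learn and XGBoost estimators. Below is a minimal configuration sketch, assuming scikit-learn and xgboost; the label array `y_train` is a placeholder, and the interpretation of class1/class2 as the majority and minority classes is an assumption, not stated in the paper.

```python
# Sketch of the model configurations listed above, assuming scikit-learn and xgboost.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

y_train = np.array([0, 0, 0, 1, 0, 1])  # placeholder binary labels

# scale_pos_weight = counts[class1] / counts[class2]; here class1 is assumed to be
# the majority (negative) class and class2 the minority (positive) class.
counts = np.bincount(y_train)
scale_pos_weight = counts[0] / counts[1]

models = {
    "DT":   DecisionTreeClassifier(class_weight="balanced"),
    "LogR": LogisticRegression(max_iter=10_000),
    "RF":   RandomForestClassifier(class_weight="balanced"),
    "XGB":  XGBClassifier(scale_pos_weight=scale_pos_weight),
    "SVM":  SVC(class_weight="balanced"),
}
```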

Significant values are in bold.
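For context on how the feature subsets in the table could be produced, the sketch below ranks features by permutation importance and retrains a model on the top-k features, evaluating F1 on held-out data. It is an illustrative reconstruction using scikit-learn's `permutation_importance` with synthetic data and logistic regression as the ranking model, not the authors' exact pipeline.

```python
# Illustrative sketch of permutation-importance feature ranking and top-k selection,
# assuming scikit-learn; the dataset and ranking model here are placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

X, y = make_classification(n_samples=500, n_features=30, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Rank features by the drop in F1 when each column is permuted on held-out data.
base = LogisticRegression(max_iter=10_000).fit(X_train, y_train)
result = permutation_importance(base, X_test, y_test, scoring="f1",
                                n_repeats=20, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]

# Retrain on the top-k features only, mirroring the 8/12/16/20-feature settings above.
for k in (8, 12, 16, 20):
    cols = ranking[:k]
    model = LogisticRegression(max_iter=10_000).fit(X_train[:, cols], y_train)
    print(k, round(f1_score(y_test, model.predict(X_test[:, cols])), 3))
```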