Table 2.
Comparison of models for classifying the six clusters obtained for Physical Health (pre-injury) with the HDclassif method.
Model | Mean accuracy % (f1_macro) | 95% CI for accuracy | Optimized hyper-parameters |
---|---|---|---|
Logistic regression | 36.53 (33.33) | [35.60–37.46] | Solver = ‘newton-cg’, C = 10, penalty = ‘l2’ |
Logistic regression (under-sampling) | 36.11 (34.10) | [34.44–37.79] | Solver = ‘newton-cg’, C = 10⁻², penalty = ‘l2’ |
Logistic regression (SMOTE) | 37.88 (35.53) | [36.83–38.94] | Solver = ‘saga’, C = 10⁻², penalty = ‘l2’ |
Logistic regression (over-sampling) | 37.20 (35.64) | [35.96–38.45] | Solver = ‘liblinear’, C = 10⁻², penalty = ‘l2’ |
Random forest | 36.98 (35.89) | [34.69–39.27] | Estimators = 200, max depth = 15, min samples split = 5 |
Random forest (under-sampling) | 36.07 (35.12) | [34.24–37.89] | Estimators = 50, max depth = 5, min samples split = 10 |
Random forest (SMOTE) | 54.75 (53.67) | [52.79–56.70] | Estimators = 500, max depth = 50, min samples split = 2 |
Random forest (over-sampling) | 69.12 (68.71) | [67.81–70.44] | Estimators = 500, max depth = 50, min samples split = 2 |
XGBClassifier (over-sampling) | 68.52 (67.78) | [67.37–69.67] | Estimators = 500, max depth = 10 |
XGBClassifier (SMOTE) | 57.14 (56.23) | [55.95–58.34] | Estimators = 500, max depth = 20 |
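The metrics in the table (mean accuracy, macro-averaged F1, and a 95% confidence interval for accuracy) can be reproduced with scikit-learn's cross-validation utilities. The sketch below is illustrative only: it uses a synthetic six-class dataset in place of the study's data, the best random-forest configuration from the table, and a normal approximation for the CI over fold scores (the paper's exact CI procedure is not stated, so this is an assumption).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the study's data: six classes play the role
# of the six HDclassif clusters (the real features are not public).
X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=6, n_clusters_per_class=1, random_state=0)

# Hyper-parameters mirror the table's best random-forest row.
clf = RandomForestClassifier(n_estimators=500, max_depth=50,
                             min_samples_split=2, random_state=0)

cv = cross_validate(clf, X, y, cv=5, scoring=["accuracy", "f1_macro"])

acc = cv["test_accuracy"] * 100          # per-fold accuracy, in %
f1 = cv["test_f1_macro"] * 100           # per-fold macro F1, in %

# 95% CI for the mean accuracy from fold scores (normal approximation).
half_width = 1.96 * acc.std(ddof=1) / np.sqrt(len(acc))
print(f"accuracy {acc.mean():.2f} ({f1.mean():.2f}), "
      f"95% CI [{acc.mean() - half_width:.2f}-{acc.mean() + half_width:.2f}]")
```

For the resampled rows (over-sampling, under-sampling, SMOTE), the resampler would need to run inside each cross-validation fold, e.g. via an `imblearn.pipeline.Pipeline`, so that synthetic samples never leak into the test folds.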