Sci Rep. 2022 Oct 10;12:16990. doi: 10.1038/s41598-022-21390-2

Table 2.

Example comparison of models for classifying the six clusters obtained with the HDclassif method for Physical Health (pre-injury).

| Model | Mean accuracy % (f1_macro %) | 95% CI for accuracy | Optimized hyper-parameters |
|---|---|---|---|
| Logistic regression | 36.53 (33.33) | [35.60–37.46] | Solver = ‘newton-cg’, C = 10, penalty = ‘l2’ |
| Logistic regression (under-sampling) | 36.11 (34.10) | [34.44–37.79] | Solver = ‘newton-cg’, C = 10⁻², penalty = ‘l2’ |
| Logistic regression (smote) | 37.88 (35.53) | [36.83–38.94] | Solver = ‘saga’, C = 10⁻², penalty = ‘l2’ |
| Logistic regression (over-sampling) | 37.20 (35.64) | [35.96–38.45] | Solver = ‘liblinear’, C = 10⁻², penalty = ‘l2’ |
| Random forest | 36.98 (35.89) | [34.69–39.27] | Estimators = 200, max depth = 15, min samples split = 5 |
| Random forest (under-sampling) | 36.07 (35.12) | [34.24–37.89] | Estimators = 50, max depth = 5, min samples split = 10 |
| Random forest (smote) | 54.75 (53.67) | [52.79–56.70] | Estimators = 500, max depth = 50, min samples split = 2 |
| Random forest (over-sampling) | 69.12 (68.71) | [67.81–70.44] | Estimators = 500, max depth = 50, min samples split = 2 |
| XGBClassifier (over-sampling) | 68.52 (67.78) | [67.37–69.67] | Estimators = 500, max depth = 10 |
| XGBClassifier (smote) | 57.14 (56.23) | [55.95–58.34] | Estimators = 500, max depth = 20 |
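
The table reports a mean accuracy, a macro-averaged F1 score, and a 95% confidence interval for each model. As a minimal sketch of how such summary figures can be computed, the Python below implements a macro F1 and a mean with a confidence interval over per-fold scores. The function names `macro_f1` and `mean_with_ci`, the normal-approximation interval (z = 1.96), and the per-fold accuracies are all assumptions made for illustration; the source does not state how its intervals were obtained, and the numbers here are not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (the 'f1_macro' convention)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return mean(scores)


def mean_with_ci(scores, z=1.96):
    """Mean and normal-approximation CI of per-fold scores (assumed method)."""
    m = mean(scores)
    half_width = z * stdev(scores) / sqrt(len(scores))
    return m, (m - half_width, m + half_width)


# Hypothetical per-fold accuracies, for illustration only.
fold_accuracies = [0.68, 0.70, 0.69, 0.71, 0.67]
acc_mean, (ci_low, ci_high) = mean_with_ci(fold_accuracies)
print(f"{100 * acc_mean:.2f} [{100 * ci_low:.2f}-{100 * ci_high:.2f}]")
```

With only a handful of folds, a Student t critical value would give a wider and arguably more appropriate interval than the fixed z = 1.96 used here.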