Table 2.
Comparison of models for classifying the six clusters obtained for Physical Health (pre-injury) with the HDclassif method.
Model | Mean accuracy % (f1_macro) | 95% CI for accuracy | Optimized hyper-parameters |
---|---|---|---|
Logistic regression | 36.53 (33.33) | [35.60–37.46] | Solver = ‘newton-cg’, C = 10, penalty = ‘l2’ |
Logistic regression (under-sampling) | 36.11 (34.10) | [34.44–37.79] | Solver = ‘newton-cg’, C = 10⁻², penalty = ‘l2’ |
Logistic regression (SMOTE) | 37.88 (35.53) | [36.83–38.94] | Solver = ‘saga’, C = 10⁻², penalty = ‘l2’ |
Logistic regression (over-sampling) | 37.20 (35.64) | [35.96–38.45] | Solver = ‘liblinear’, C = 10⁻², penalty = ‘l2’ |
Random forest | 36.98 (35.89) | [34.69–39.27] | Estimators = 200, max depth = 15, min samples split = 5 |
Random forest (under-sampling) | 36.07 (35.12) | [34.24–37.89] | Estimators = 50, max depth = 5, min samples split = 10 |
Random forest (SMOTE) | 54.75 (53.67) | [52.79–56.70] | Estimators = 500, max depth = 50, min samples split = 2 |
Random forest (over-sampling) | 69.12 (68.71) | [67.81–70.44] | Estimators = 500, max depth = 50, min samples split = 2 |
XGBClassifier (over-sampling) | 68.52 (67.78) | [67.37–69.67] | Estimators = 500, max depth = 10 |
XGBClassifier (SMOTE) | 57.14 (56.23) | [55.95–58.34] | Estimators = 500, max depth = 20 |
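The metrics in the table (mean accuracy, macro-averaged F1, and a 95% confidence interval for accuracy) can be reproduced with scikit-learn's cross-validation utilities. The sketch below is illustrative only: it uses a synthetic six-class dataset in place of the study's data, the best random-forest configuration from the table, and a normal approximation for the CI over fold scores (the paper's exact CI procedure is not stated, so this is an assumption).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the study's data: six classes play the role
# of the six HDclassif clusters (the real features are not public).
X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=6, n_clusters_per_class=1, random_state=0)

# Hyper-parameters mirror the table's best random-forest row.
clf = RandomForestClassifier(n_estimators=500, max_depth=50,
                             min_samples_split=2, random_state=0)

cv = cross_validate(clf, X, y, cv=5, scoring=["accuracy", "f1_macro"])

acc = cv["test_accuracy"] * 100          # per-fold accuracy, in %
f1 = cv["test_f1_macro"] * 100           # per-fold macro F1, in %

# 95% CI for the mean accuracy from fold scores (normal approximation).
half_width = 1.96 * acc.std(ddof=1) / np.sqrt(len(acc))
print(f"accuracy {acc.mean():.2f} ({f1.mean():.2f}), "
      f"95% CI [{acc.mean() - half_width:.2f}-{acc.mean() + half_width:.2f}]")
```

For the resampled rows (over-sampling, under-sampling, SMOTE), the resampler would need to run inside each cross-validation fold, e.g. via an `imblearn.pipeline.Pipeline`, so that synthetic samples never leak into the test folds.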