Table 4.
Classification performance (accuracy as % correct, AUC, and F1) of five ML classifiers under 10-fold and 5-fold cross-validation, after a preliminary selection of the best attributes.
Classifier | Accuracy (%) | AUC (Cohen's d) | F1 | Correct classification (FMS) | Correct classification (HC) |
---|---|---|---|---|---|
10-FOLD CROSS-VALIDATION | |||||
Naïve Bayes | 86.84 | 0.90 (d = 1.81) | 0.87 | FMS 33/38 | HC 33/38 |
Logistic regression | 84.21 | 0.77 (d = 1.04) | 0.84 | FMS 32/38 | HC 32/38 |
Simple logistics | 94.74 | 0.93 (d = 2.09) | 0.95 | FMS 36/38 | HC 36/38 |
Support vector machine | 90.79 | 0.91 (d = 1.90) | 0.91 | FMS 34/38 | HC 35/38 |
Random forest | 88.16 | 0.93 (d = 2.09) | 0.88 | FMS 34/38 | HC 33/38 |
5-FOLD CROSS-VALIDATION | |||||
Naïve Bayes | 82.89 | 0.90 (d = 1.81) | 0.83 | FMS 32/38 | HC 31/38 |
Logistic regression | 75.00 | 0.74 (d = 0.90) | 0.75 | FMS 30/38 | HC 27/38 |
Simple logistics | 86.84 | 0.87 (d = 1.59) | 0.87 | FMS 33/38 | HC 33/38 |
Support vector machine | 86.84 | 0.87 (d = 1.59) | 0.87 | FMS 33/38 | HC 33/38 |
Random forest | 85.53 | 0.92 (d = 1.98) | 0.86 | FMS 34/38 | HC 31/38 |
Perfect classification of exemplars in the two categories yields an AUC of 1 and an F1 of 1. AUC stands for area under the curve in ROC analysis; F1 is the harmonic mean of precision and recall. To allow comparison of AUC with the best-known effect size measure, the corresponding Cohen's d is given in parentheses. Classifiers were run with Weka's default parameters, that is, without any parameter tuning.
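The relation between the AUC and Cohen's d columns, and between the per-group counts and the accuracy column, can be checked with a minimal sketch. The source does not state which conversion was used, so the formula below is an assumption: the standard equal-variance binormal relation d = √2 · Φ⁻¹(AUC). The function names `auc_to_cohens_d` and `accuracy_percent` are illustrative only, and the default group size of 38 is taken from the denominators in the table.

```python
from math import sqrt
from statistics import NormalDist


def auc_to_cohens_d(auc: float) -> float:
    """Convert an AUC to Cohen's d.

    Assumption (not stated in the source): the two groups' scores are
    normally distributed with equal variance, so AUC = Phi(d / sqrt(2))
    and therefore d = sqrt(2) * Phi^-1(AUC).
    """
    return sqrt(2) * NormalDist().inv_cdf(auc)


def accuracy_percent(fms_correct: int, hc_correct: int, group_size: int = 38) -> float:
    """Overall accuracy in percent from per-group correct counts.

    Assumes both groups have the same size (38 per group in the table).
    """
    return 100 * (fms_correct + hc_correct) / (2 * group_size)


if __name__ == "__main__":
    # Recompute d for a few AUC values that appear in the table.
    for auc in (0.90, 0.93, 0.87):
        print(f"AUC {auc:.2f} -> d = {auc_to_cohens_d(auc):.2f}")
    # Recompute accuracy for a row with 36/38 correct in each group.
    print(f"36/38 and 36/38 correct -> {accuracy_percent(36, 36):.2f}%")
```

Under this assumption the tabulated d values are reproduced to within about 0.01; the small residual differences are presumably due to the AUC values being rounded to two decimals before publication.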