Table 1:
Performance of the classification models on the training and reserved test datasets. Boldface indicates that the DL method is statistically significantly better on that metric than the other methods.
| Dataset | Algorithm | Accuracy | Sensitivity (SENS) | Specificity (SPEC) | F1 score | Balanced accuracy | Computing time per run (s) |
|---|---|---|---|---|---|---|---|
| Training | DL | 0.909 | 0.978 | 0.747 | 0.952 | 0.777 | 570.68 |
| | GBM | 0.906 | 0.600 | 0.945 | 0.666 | 0.772 | 8.291 |
| | LDA | 0.700 | 0.583 | 0.718 | 0.478 | 0.651 | 3.118 |
| | LOG | 0.906 | 0.608 | 0.946 | 0.681 | 0.777 | 5.394 |
| | RF | 0.892 | 0.568 | 0.946 | 0.648 | 0.757 | 21.340 |
| | RPART | 0.801 | 0.605 | 0.895 | 0.620 | 0.750 | 3.525 |
| | SVM | 0.905 | 0.663 | 0.920 | 0.688 | 0.791 | 4.941 |
| Testing | DL | 0.912 | 0.954 | 0.688 | 0.930 | 0.747 | 1.844 |
| | GBM | 0.878 | 0.560 | 0.939 | 0.639 | 0.749 | 0.0152 |
| | LDA | 0.745 | 0.627 | 0.754 | 0.527 | 0.691 | 0.0149 |
| | LOG | 0.873 | 0.550 | 0.943 | 0.634 | 0.747 | 0.0184 |
| | RF | 0.870 | 0.578 | 0.938 | 0.643 | 0.758 | 0.0181 |
| | RPART | 0.767 | 0.609 | 0.861 | 0.589 | 0.735 | 0.0257 |
| | SVM | 0.883 | 0.653 | 0.927 | 0.693 | 0.790 | 0.0218 |
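
For readers unfamiliar with the metric columns, the following is a minimal sketch of how accuracy, sensitivity (SENS), specificity (SPEC), F1, and balanced accuracy are conventionally computed from a binary confusion matrix. It assumes the standard textbook definitions; the table does not state which class is treated as positive or how values were aggregated over runs, so the function name `classification_metrics` and the example counts are purely illustrative and are not taken from the study. If the reported values are averages over repeated runs, identities such as balanced accuracy = (SENS + SPEC) / 2 need not hold exactly within a row.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes standard definitions and that no denominator is zero.
    tp/fp/tn/fn are true/false positive/negative counts (illustrative inputs).
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Balanced accuracy is the mean of sensitivity and specificity.
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=60, fp=10, tn=90, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On imbalanced data, accuracy can be high while sensitivity stays low, which is why balanced accuracy and F1 are reported alongside it.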