Table 4.
Accuracy and area under the ROC curve (AUC, with 95% confidence intervals [CI]) for the bag-of-words (BOW)-based models and the word embedding-based deep learning models.
| Category | Model<sup>a</sup> | AUC (95% CI) | Accuracy | Epochs |
|---|---|---|---|---|
| BOW models | RF | 0.975 (0.967–0.983) | 0.921 | N/A |
| | LASS | 0.973 (0.964–0.982) | 0.912 | N/A |
| | SVM | 0.967 (0.957–0.976) | 0.912 | N/A |
| | MLP | 0.947 (0.934–0.960) | 0.883 | N/A |
| | SDT | 0.934 (0.918–0.950) | 0.911 | N/A |
| | NBC | 0.924 (0.908–0.940) | 0.838 | N/A |
| Deep learning models | CNN_D200 | 0.985 (0.979–0.992) | 0.945 | 30.8 |
| | CNN_W2V | 0.985 (0.979–0.991) | 0.942 | 25.0 |
| | CNN_D50 | 0.984 (0.978–0.991) | 0.944 | 36.6 |

<sup>a</sup>Model abbreviations are described in the text.
The number of training epochs for the deep learning models is determined by the early stopping condition described in the methods. Entries are sorted in descending order of AUC within each category. Bolding indicates results for the best-performing models.
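
For readers who want to reproduce metrics of the kind reported in this table, the following is a minimal sketch of how an AUC, an approximate 95% CI, and an accuracy could be computed from predicted scores. It rests on assumptions: the table does not state how the confidence intervals were derived, so the Hanley-McNeil standard-error approximation is used here purely as an illustration, and the 0.5 decision threshold, the function names (`auc_score`, `hanley_mcneil_ci`, `accuracy_at_threshold`), and the toy data are all hypothetical.

```python
import math


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC (Hanley-McNeil standard error).

    Assumption: this is one common approximation, not necessarily the
    method used to produce the intervals in the table.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(max(variance, 0.0))
    # Clip to [0, 1] because the normal approximation can overshoot.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


def accuracy_at_threshold(labels, scores, threshold=0.5):
    """Fraction of correct predictions when scores >= threshold mean class 1."""
    correct = sum(
        1 for y, s in zip(labels, scores) if (1 if s >= threshold else 0) == y
    )
    return correct / len(labels)


# Hypothetical toy data: 4 positives and 4 negatives.
labels = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.6]

auc = auc_score(labels, scores)  # 14 of 16 pairs ranked correctly -> 0.875
lo, hi = hanley_mcneil_ci(auc, n_pos=labels.count(1), n_neg=labels.count(0))
acc = accuracy_at_threshold(labels, scores)  # 6 of 8 correct -> 0.75

print(f"AUC = {auc:.3f} (95% CI {lo:.3f}-{hi:.3f}), accuracy = {acc:.3f}")
```

With only eight samples the interval is very wide. The narrow intervals in the table would require a much larger test set.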