Table 4:
Average estimates (95% CI)‡

SVM models | Sensitivity | Specificity | PPV | NPV | AUC |
---|---|---|---|---|---|
Unigram only, linear kernel, no tuning | 0.51 (0.34 to 0.68) | 0.99 (0.98 to 1.00) | 0.75 (0.62 to 0.88) | 0.96 (0.95 to 0.98) | 0.92 (0.87 to 0.96)† |
Unigram only, RBF kernel, no tuning | 0.39 (0.32 to 0.46) | 0.98 (0.97 to 0.99) | 0.63 (0.45 to 0.81) | 0.95 (0.94 to 0.96) | 0.93 (0.92 to 0.95)† |
Unigram only, linear kernel, with tuning | 0.53 (0.36 to 0.70) | 0.98 (0.97 to 0.99) | 0.75 (0.65 to 0.85) | 0.96 (0.94 to 0.98) | 0.92 (0.88 to 0.96)† |
Unigram only, RBF kernel, with tuning | 0.55 (0.42 to 0.67) | 0.98 (0.97 to 0.99) | 0.70 (0.54 to 0.86) | 0.96 (0.95 to 0.98) | 0.95 (0.93 to 0.97)† |
Uni + bigrams, linear kernel, no tuning | 0.60 (0.45 to 0.75) | 0.99 (0.98 to 1.00) | 0.85 (0.77 to 0.93) | 0.97 (0.95 to 0.98) | 0.95 (0.90 to 1.00) |
Uni + bigrams, RBF kernel, no tuning | 0.40 (0.33 to 0.47) | 0.98 (0.97 to 0.99) | 0.67 (0.51 to 0.83) | 0.95 (0.94 to 0.96) | 0.95 (0.93 to 0.96)† |
Uni + bigrams, linear kernel, tuning | 0.61 (0.46 to 0.76) | 0.99 (0.98 to 1.00) | 0.84 (0.76 to 0.92) | 0.97 (0.95 to 0.98) | 0.95 (0.90 to 1.00) |
Uni + bigrams, RBF kernel, tuning | 0.66 (0.49 to 0.83) | 0.99 (0.98 to 1.00) | 0.80 (0.68 to 0.93) | 0.97 (0.96 to 0.99) | 0.96 (0.92 to 1.00) |
Uni + bigrams, linear kernel, tuning, all features | 0.78 (0.72 to 0.85) | 0.99 (0.98 to 0.99) | 0.84 (0.76 to 0.91) | 0.98 (0.98 to 0.99) | 0.99 (0.98 to 1.00) |
**Uni + bigrams, RBF kernel, tuning, all features** | 0.79 (0.73 to 0.85) | 0.99 (0.98 to 0.99) | 0.84 (0.75 to 0.92) | 0.98 (0.98 to 0.99) | 0.99 (0.98 to 1.00)* |
Bold typeface highlights the characteristics of the best-performing SVM model.
*p<0.001; statistically significant difference in performance compared with the alternative SVM models.
†Statistically significantly different from the best-performing SVM model (i.e., Uni + bigrams, RBF kernel, tuning, all features).
‡Averages correspond to the mean accuracy estimates obtained after 10 rounds of cross-validation.
AUC, area under the curve; CI, confidence interval; NPV, negative predictive value; PPV, positive predictive value; RBF, radial basis function kernel; SVM, support vector machine.
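
For readers who want to reproduce estimates of this kind, the sketch below shows how sensitivity, specificity, PPV and NPV follow from confusion-matrix counts, and how per-fold values could be averaged with a 95% CI. The function names `confusion_metrics` and `mean_with_ci` and the example counts are hypothetical. The normal-approximation interval is an assumption, since the table does not state how its intervals were computed.

```python
import math
from statistics import mean, stdev


def confusion_metrics(tp, fp, tn, fn):
    """Return the four accuracy measures from confusion-matrix counts.

    Assumes every denominator is non-zero (no guard for empty classes).
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positives among actual positives
        "specificity": tn / (tn + fp),  # true negatives among actual negatives
        "ppv": tp / (tp + fp),          # true positives among predicted positives
        "npv": tn / (tn + fn),          # true negatives among predicted negatives
    }


def mean_with_ci(values, z=1.96):
    """Mean and normal-approximation 95% CI over per-fold estimates.

    The interval method is an assumption, not taken from the table.
    """
    m = mean(values)
    half_width = z * stdev(values) / math.sqrt(len(values))
    return m, m - half_width, m + half_width


# Hypothetical counts for a single cross-validation fold.
fold = confusion_metrics(tp=8, fp=2, tn=88, fn=2)
```

Calling `mean_with_ci` on the ten per-fold sensitivities would give one cell of the form "mean (lower to upper)". With rare positive cases, sensitivity rests on few observations per fold, which is consistent with its intervals being much wider than those for specificity in the table.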