Table 7. E-cigarette Use for Smoking Cessation: 10-Fold Cross-Validation Area Under the Receiver Operating Characteristic Curve (AUC), with the Range Across the 10 Folds in Parentheses.
Encoding Name | Naïve Bayes | Liblinear | Bayesian Logistic Regression | Random Forests |
---|---|---|---|---|
unigram | 0.57 (0.45-0.71) | 0.78 (0.68-0.92) | 0.88 (0.74-1.00) | 0.94 (0.81-0.97) |
bigram | 0.53 (0.38-0.68) | 0.75 (0.65-0.87) | 0.87 (0.73-0.95) | 0.93 (0.86-0.98) |
stem_unigram | 0.59 (0.51-0.78) | 0.80 (0.55-0.90) | 0.89 (0.67-0.98) | 0.93 (0.82-0.99) |
stem_bigram | 0.50 (0.38-0.71) | 0.76 (0.65-0.88) | 0.89 (0.83-0.95) | 0.94 (0.86-0.97) |
stop_unigram | 0.59 (0.49-0.74) | 0.71 (0.40-0.91) | 0.87 (0.76-0.97) | 0.90 (0.81-0.96) |
stop_bigram | 0.59 (0.42-0.70) | 0.69 (0.44-0.82) | 0.86 (0.71-0.98) | 0.88 (0.81-0.98) |
stop_stem_unigram | 0.60 (0.51-0.72) | 0.70 (0.41-0.86) | 0.85 (0.72-0.95) | 0.86 (0.79-0.96) |
stop_stem_bigram | 0.57 (0.46-0.66) | 0.69 (0.41-0.90) | 0.83 (0.37-0.94) | 0.87 (0.80-0.97) |
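
As an illustration of how a cell in the table above could be produced, the following minimal sketch computes an AUC for one held-out fold and then summarizes a list of per-fold AUCs as a headline value with its range. It assumes the headline value is the mean across folds, which the caption does not state. The helper names `auc` and `format_cell` and the per-fold values are hypothetical and are not taken from the study.

```python
from statistics import mean


def auc(labels, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a randomly
    chosen positive is scored above a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def format_cell(fold_aucs):
    """Format per-fold AUCs as 'mean (min-max)'.
    Assumption: the headline value in each table cell is the mean over folds."""
    return f"{mean(fold_aucs):.2f} ({min(fold_aucs):.2f}-{max(fold_aucs):.2f})"


# Hypothetical per-fold AUCs for one classifier/encoding pair (not study data).
fold_aucs = [0.81, 0.97, 0.94, 0.95, 0.96, 0.93, 0.95, 0.96, 0.97, 0.96]
print(format_cell(fold_aucs))
```

In practice, `auc` would be called once per fold on that fold's held-out labels and classifier scores, and the ten resulting values passed to `format_cell`. A wide range, such as several of the Liblinear rows, indicates that performance varies considerably between folds.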