Author manuscript; available in PMC: 2016 Jan 21.
Published in final edited form as: Pac Symp Biocomput. 2016;21:480–491.

Table 7. E-cigarette Use for Smoking Cessation: 10-Fold Cross-Validation Area Under the Receiver Operating Characteristic Curve (AUC) Performance (Range of Performance Across the 10 Folds).

| Encoding Name     | Naïve Bayes      | Liblinear        | Bayesian Logistic Regression | Random Forests   |
|-------------------|------------------|------------------|------------------------------|------------------|
| unigram           | 0.57 (0.45-0.71) | 0.78 (0.68-0.92) | 0.88 (0.74-1.00)             | 0.94 (0.81-0.97) |
| bigram            | 0.53 (0.38-0.68) | 0.75 (0.65-0.87) | 0.87 (0.73-0.95)             | 0.93 (0.86-0.98) |
| stem_unigram      | 0.59 (0.51-0.78) | 0.80 (0.55-0.90) | 0.89 (0.67-0.98)             | 0.93 (0.82-0.99) |
| stem_bigram       | 0.50 (0.38-0.71) | 0.76 (0.65-0.88) | 0.89 (0.83-0.95)             | 0.94 (0.86-0.97) |
| stop_unigram      | 0.59 (0.49-0.74) | 0.71 (0.40-0.91) | 0.87 (0.76-0.97)             | 0.90 (0.81-0.96) |
| stop_bigram       | 0.59 (0.42-0.70) | 0.69 (0.44-0.82) | 0.86 (0.71-0.98)             | 0.88 (0.81-0.98) |
| stop_stem_unigram | 0.60 (0.51-0.72) | 0.70 (0.41-0.86) | 0.85 (0.72-0.95)             | 0.86 (0.79-0.96) |
| stop_stem_bigram  | 0.57 (0.46-0.66) | 0.69 (0.41-0.90) | 0.83 (0.37-0.94)             | 0.87 (0.80-0.97) |
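
Each cell above pairs a single AUC with the lowest and highest AUC observed across the 10 folds. The sketch below shows one way such a cell could be produced. It is not the authors' code. It assumes the headline figure is the mean of the per-fold AUCs, which the caption does not state. The function names `auc` and `summarize_folds` and the example fold values are hypothetical and serve only as illustration.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Equivalent to the area under the ROC curve; tied scores count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_aucs):
    """Format per-fold AUCs as 'mean (min-max)', the layout used in the table.

    Assumption: the leading number is the mean across folds.
    """
    return f"{mean(fold_aucs):.2f} ({min(fold_aucs):.2f}-{max(fold_aucs):.2f})"


# Hypothetical per-fold AUCs, invented for illustration (not data from the paper).
example_fold_aucs = [0.81, 0.90, 0.94, 0.97, 0.95, 0.96, 0.93, 0.97, 0.96, 0.95]
print(summarize_folds(example_fold_aucs))
```

In practice each entry of `example_fold_aucs` would come from calling `auc` on the held-out labels and classifier scores of one fold. Reporting the range alongside the summary value makes unstable configurations visible, such as a fold minimum that falls well below the typical value.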