2013 Aug 29;15(8):e174. doi: 10.2196/jmir.2534

Table 1.

Performance measures for tobacco relevance, positive sentiment, and negative sentiment classification tasks using 500 features (baseline classification accuracies [majority class] are 57% for relevance, 74% for positive sentiment, and 82% for negative sentiment).

Features            Naïve Bayes                   | KNN                           | SVM
                    Acc^a F     Pre^b Rec^c Spe^d | Acc   F     Pre   Rec   Spe   | Acc   F     Pre   Rec   Spe
Relevance
  Unigrams          0.77  0.83  0.73  0.95  0.53  | 0.73  0.78  0.73  0.83  0.59  | 0.82  0.85  0.82  0.88  0.75
  Bigrams           0.66  0.77  0.63  0.97  0.24  | 0.65  0.76  0.63  0.97  0.24  | 0.73  0.75  0.82  0.69  0.79
  Trigrams          0.61  0.74  0.6   0.99  0.1   | 0.6   0.74  0.59  0.97  0.11  | 0.61  0.74  0.59  0.99  0.1
Positive sentiment
  Unigrams          0.76  0.5   0.56  0.45  0.87  | 0.76  0.37  0.58  0.27  0.93  | 0.75  0.38  0.53  0.3   0.91
  Bigrams           0.77  0.44  0.62  0.34  0.93  | 0.76  0.42  0.58  0.33  0.92  | 0.77  0.43  0.61  0.33  0.92
  Trigrams          0.76  0.26  0.62  0.16  0.96  | 0.76  0.26  0.62  0.17  0.96  | 0.76  0.27  0.61  0.17  0.96
Negative sentiment
  Unigrams          0.84  0.52  0.57  0.48  0.92  | 0.72  0.3   0.27  0.33  0.8   | 0.83  0.39  0.53  0.3   0.94
  Bigrams           0.85  0.35  0.73  0.23  0.98  | 0.31  0.3   0.18  0.82  0.2   | 0.84  0.44  0.59  0.35  0.95
  Trigrams          0.84  0.24  0.76  0.14  0.99  | 0.22  0.3   0.18  0.94  0.07  | 0.84  0.37  0.66  0.25  0.97

^a Acc: accuracy.
^b Pre: precision.
^c Rec: recall.
^d Spe: specificity.
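
The footnotes name the measures but do not give their formulas, and the F column is not defined at all. The sketch below shows the standard definitions for a binary task, assuming F is the harmonic mean of precision and recall (the reported values appear consistent with that reading). The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of the F column)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, tn, fn):
    """Standard binary measures from confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives, false negatives.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f": f_measure(precision, recall),
    }


def majority_baseline(n_positive, n_negative):
    """Accuracy of always predicting the larger class (the majority-class baseline)."""
    return max(n_positive, n_negative) / (n_positive + n_negative)


# Hypothetical counts, for illustration only.
example = classification_metrics(tp=40, fp=10, tn=45, fn=5)

# Precision and recall reported for Naive Bayes with unigrams on the relevance task.
f_check = round(f_measure(0.73, 0.95), 2)
```

With the reported precision (0.73) and recall (0.95) for Naïve Bayes with unigrams on the relevance task, `f_measure` rounds to 0.83, which matches the F value in that row. The same rows also illustrate why accuracy alone can mislead here: a classifier with high recall and low specificity can sit close to the majority-class baseline while rarely rejecting irrelevant posts.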