2017 Sep 26;3(3):e63. doi: 10.2196/publichealth.8060

Table 3.

Ten most important features in predicting Twitter users who tweet about e-cigarettes across all user types.

Features^a                               Proportion of feature importance among all variables, %
Statuses count                           5.1
Followers count                          4.1
Original tweet raw keyword count         3.7
Profile description keyword count        3.3
Original tweet cosine similarity mean    3.2
Retweet cosine similarity mean           3.0
Friends count                            3.0
Retweet raw keyword count                3.0
Listed count                             2.9
Original tweet URL count mean            2.7
Favorites count                          2.7

^a Most important feature for each user type. Individual: favorites count (4.9%); Vaper enthusiast: retweet raw keyword count (8.3%); Informed agency: followers count (6.5%); Marketer: original tweet raw keyword count (8.9%); Spammer: statuses count (8.1%).