2019 Nov 11;27(2):225–235. doi: 10.1093/jamia/ocz191

Figure 1.

The overall data analysis workflow. The analysis consists of 4 steps: (1) data preprocessing; (2) rule-based classification of the tweets into either promotional information or consumers’ discussions; (3) applying topic modeling to discover major discussion themes and exploring associations between topics in consumers’ Twitter discussions and responses to the 8 human papillomavirus (HPV)–related Health Information National Trends Survey (HINTS) questions; and (4) based on these analyses, answering 3 research questions (RQs). IBM: Integrated Behavior Model; LDA: latent Dirichlet allocation.
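
Steps 1 and 2 of the workflow can be illustrated with a minimal sketch. The caption does not state the actual preprocessing operations or classification rules, so everything below is an assumption made for illustration: the function names `preprocess` and `classify`, the patterns in `PROMO_PATTERNS`, and the example tweets are all hypothetical.

```python
import re

# Hypothetical promotional cues; the source does not list the actual rules.
PROMO_PATTERNS = [
    re.compile(r"https?://\S+"),  # links often accompany promotional posts
    re.compile(r"\b(learn more|get vaccinated today|visit our)\b", re.I),
]


def preprocess(tweet: str) -> str:
    """Step 1 (assumed): lowercase, drop user mentions, collapse whitespace."""
    text = tweet.lower()
    text = re.sub(r"@\w+", "", text)
    return re.sub(r"\s+", " ", text).strip()


def classify(tweet: str) -> str:
    """Step 2 (assumed rules): label a tweet 'promotional' or 'consumer'."""
    text = preprocess(tweet)
    if any(pattern.search(text) for pattern in PROMO_PATTERNS):
        return "promotional"
    return "consumer"


# Made-up example tweets, not taken from the study's data.
tweets = [
    "Learn more about the HPV vaccine: https://example.org/hpv",
    "@friend my daughter got her second HPV shot today, arm is sore",
]
labels = [classify(t) for t in tweets]
```

In a pipeline like the one in the figure, only the tweets labeled as consumers' discussions would then be passed to the topic-modeling step (LDA); that step is not sketched here.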