Abstract
Introduction:
The hemophilia community on Twitter is diverse, consisting of advocacy groups, patients, physicians, researchers, and other users. However, the scope of this community is uncharacterized, and limited data are available regarding effective participation in this community.
Aim:
To assess the types of users active in the hemophilia community on Twitter, as well as major themes present in hemophilia-related tweets.
Methods:
49,512 tweets between September 2019 and September 2021 were classified using regular expressions. A subset of the classified tweets was manually analyzed to identify prevalent discussion themes.
Results:
Among the top 250 users by post count, the largest categories of users were support and advocacy groups, people with bleeding disorders, and healthcare providers. The largest thematic categories of tweets were gene therapy, contaminated hemophilia blood products, hemophilia research, clinical management of hemophilia, and COVID-19. While misinformation was rare, negative and incorrect perceptions of hemophilia were present among the general public.
Conclusion:
Our results demonstrate patterns of effective Twitter usage for patient care, research, and advocacy purposes among the hemophilia community.
Keywords: hemophilia, education, social media, text classification, natural language processing, data science
Introduction
The global proliferation of social media has important implications for healthcare. More than 40% of healthcare consumers report using social media to obtain healthcare information1, and a growing number of physicians and researchers are using social media for professional purposes, including hematologists2–4. Concurrently, disease advocacy groups use social media as an education and promotion platform5.
Social media use has been studied in multiple chronic disease communities, including cancer6, diabetes7, and hearing loss8. The use of social media for chronic disease has been found to have positive social- and emotional support-related outcomes9–11 and to promote adaptive health behaviors, such as increased physical activity12. Conversely, the prevalence of inaccurate or biased information on these platforms may have a negative impact on patient education13. However, little is known about the use of social media within the hemophilia community or the use of the term “hemophilia” on social media.
Our study addresses these questions in three ways. Firstly, we classify the individuals driving the hemophilia conversation on Twitter, one of the largest social media platforms worldwide. Secondly, we categorize the major themes present in hemophilia-related tweets, the temporality of these tweets, and the sentiments they express. Finally, we investigate public perceptions of hemophilia as well as misinformation among hemophilia-related tweets. Based on the results, we identify strategies for better engagement and reveal unmet needs where the community needs to better educate the public.
Methods
We used Twitter API v2 to retrieve all non-retweet, English-language tweets containing either “hemophilia” or “haemophilia” published between September 1, 2019 and September 1, 2021, yielding 57,027 tweets (Figure 1). We performed quality filtering by removing duplicate tweets (n = 6,159) and tweets containing stock tickers (n = 1,356), leaving 49,512 tweets. To quantify user interaction with tweets, we calculated an engagement score for each tweet, defined as the sum of likes, retweets, and replies.
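The filtering and scoring steps can be sketched as follows. This is a minimal illustration, not the study's code: the dictionary field names, the case-insensitive duplicate rule, and the `CASHTAG` pattern for stock tickers are all assumptions, since the text does not specify how duplicates or tickers were detected.

```python
import re

# Hypothetical pattern for stock tickers ("cashtags") such as $BMRN.
# The study's actual ticker filter is not described in the text.
CASHTAG = re.compile(r"\$[A-Za-z]{1,6}\b")


def quality_filter(tweets):
    """Drop tweets whose text duplicates an earlier tweet or contains a cashtag."""
    seen = set()
    kept = []
    for tweet in tweets:
        key = tweet["text"].strip().lower()  # assumed duplicate rule
        if key in seen or CASHTAG.search(tweet["text"]):
            continue
        seen.add(key)
        kept.append(tweet)
    return kept


def engagement(tweet):
    """Engagement score as defined in the text: likes + retweets + replies."""
    return tweet["likes"] + tweet["retweets"] + tweet["replies"]


sample = [
    {"text": "New hemophilia guideline released", "likes": 3, "retweets": 1, "replies": 0},
    {"text": "New hemophilia guideline released", "likes": 0, "retweets": 0, "replies": 0},
    {"text": "$BMRN up on hemophilia news", "likes": 9, "retweets": 2, "replies": 1},
]
kept = quality_filter(sample)
scores = [engagement(t) for t in kept]
```

In this toy sample, the second tweet is dropped as a duplicate and the third as a ticker tweet, so only the first tweet is scored.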
To classify tweets, we selected 181 topic-differentiating regular expressions from three sources: 1) frequent n-grams identified by the Python Natural Language Toolkit14; 2) topic keywords identified by Latent Dirichlet Allocation using the Python ktrain library15; and 3) author-contributed keywords not identified by the two automated methods. We used 44 of the 181 expressions to identify tweets containing non-medical references to hemophilia (Supplemental Table 1), such as the presence of hemophilia in royal families. Non-medical references were used for the public perceptions analysis. The other 137 expressions were medically relevant; they were grouped into 11 categories (Supplemental Table 2), individually checked for specificity, and applied to the remaining 45,590 tweets. The 200 tweets with the most engagement within each category were independently screened by two authors to identify common discussion themes. Two authors also independently screened the 1,000 tweets with the most engagement among the medically relevant tweets to identify misinformation.
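A regular-expression classifier of this kind can be sketched as below. The three categories and their patterns are illustrative assumptions only; the study's 137 medically relevant expressions are listed in its Supplemental Table 2 and are not reproduced here. A tweet may match several categories, which is what allows the co-occurrence counts reported in the Results.

```python
import re

# Illustrative patterns only (assumed for this sketch); not the study's expressions.
CATEGORY_PATTERNS = {
    "gene therapy": [r"\bgene[\s-]?therap(y|ies)\b", r"\bvalrox\b"],
    "covid-19": [r"\bcovid(-?19)?\b", r"\bsars-cov-2\b"],
    "research": [r"\bclinical trials?\b", r"\bpreprints?\b"],
}

# Compile once so that each tweet is only scanned, not re-parsed.
COMPILED = {
    category: [re.compile(p, re.IGNORECASE) for p in patterns]
    for category, patterns in CATEGORY_PATTERNS.items()
}


def classify(text):
    """Return the sorted list of categories with at least one matching pattern."""
    return sorted(
        category
        for category, patterns in COMPILED.items()
        if any(p.search(text) for p in patterns)
    )
```

For example, `classify("Gene therapy clinical trial results for hemophilia A")` returns both the gene therapy and research labels, while a tweet matching no pattern returns an empty list and stays unclassified.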
To analyze the categories of users responsible for hemophilia tweets, we first identified 250 users with the most authorships among the 45,590 tweets. The descriptions, usernames, and most recent tweets of these users were independently screened and categorized by two authors into 12 categories with 97.2% agreement; the remaining 7 users were arbitrated by the third author.
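The reported agreement figure is simple percent agreement between the two screeners. A minimal sketch, with a hypothetical function name and toy labels, shows that 7 disagreements among 250 users corresponds to 97.2%:

```python
def percent_agreement(labels_a, labels_b):
    """Percentage of items to which two raters assigned the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both raters must label the same items")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Toy data: 250 users, with the two raters disagreeing on the last 7.
rater_1 = ["physician"] * 250
rater_2 = ["physician"] * 243 + ["research"] * 7
agreement = round(percent_agreement(rater_1, rater_2), 1)
```

Percent agreement does not correct for chance agreement; the text does not report a chance-corrected statistic, so none is computed here.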
To perform sentiment analysis of tweets, we used the VADER (Valence Aware Dictionary and Sentiment Reasoner) package, which is available open source (https://github.com/cjhutto/vaderSentiment). VADER is a social media sentiment analysis tool that has previously been used to analyze tweets in medical contexts16,17. The output of VADER is a compound score normalized between −1 (most negative) and +1 (most positive), with scores below −0.05 classified as negative, scores between −0.05 and 0.05 classified as neutral, and scores above 0.05 classified as positive.
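The thresholds described above map a compound score to a label as in the sketch below. The function name is hypothetical, and the compound score itself is assumed to come from VADER (for example, the `compound` entry returned by its sentiment analyzer); only the thresholding step is shown so that the example runs without third-party packages.

```python
def sentiment_label(compound):
    """Label a VADER compound score using the cutoffs stated in the text."""
    if compound < -0.05:
        return "negative"
    if compound > 0.05:
        return "positive"
    return "neutral"  # scores from -0.05 to 0.05 inclusive


labels = [sentiment_label(score) for score in (-0.6, -0.05, 0.0, 0.05, 0.8)]
```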
All data analysis was conducted in Python. Heatmaps and violin plots were generated using the Seaborn and Matplotlib libraries. The Wilcoxon rank-sum test was performed using the SciPy library, and Bonferroni correction was applied using the statsmodels library.
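The Bonferroni step multiplies each p-value by the number of comparisons and caps the result at 1. The sketch below is a hand-written standard-library illustration of that adjustment, not the statsmodels routine used in the study; the function name, the example p-values, and the 0.05 significance level are assumptions.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and whether each test is rejected."""
    m = len(p_values)  # number of comparisons being corrected for
    adjusted = [min(1.0, p * m) for p in p_values]
    rejected = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, rejected


# Hypothetical p-values from three rank-sum comparisons.
adjusted, rejected = bonferroni([0.001, 0.02, 0.6])
```

Here only the first comparison remains significant after correction, because 0.02 × 3 = 0.06 exceeds the assumed 0.05 level.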
Results
User account classification analysis
From September 2019 to September 2021, we identified 16,392 unique users who posted at least one tweet containing either “hemophilia” or “haemophilia.” Within this period, 12,952 users posted only one such tweet, while the most active user posted 968 hemophilia-related tweets. We manually classified the 250 most active accounts into 12 major categories based on primary self-identification (Table 1; Supplemental Table 3). These accounts represented 16 countries (Supplemental Figure 1), including the USA (60), UK (32), and India (9).
Table 1:
Category | User count | Average engagement per post |
---|---|---|
support and advocacy | 70 | 5.82 |
person with bleeding disorder | 35 | 12.51 |
non-journal media | 30 | 1.55 |
physician | 19 | 9.02 |
pharmaceutical organization | 16 | 11.58 |
non-physician provider | 9 | 12.69 |
journal | 8 | 9.13 |
continuing education | 8 | 2.89 |
professional organization | 6 | 10.79 |
pharmacy | 6 | 1.02 |
research | 5 | 11.11 |
hospitals and treatment centers | 2 | 1.06 |
unidentifiable | 36 | 5.52 |
214 users were classified into one of 12 major categories, while 36 users could not be classified. The category “support and advocacy” is defined as individuals or groups supporting or advocating for patients with bleeding disorders or other rare diseases, while the category “non-journal media” is defined as non-journal accounts posting or affiliated with news, blogs, podcasts, and magazines. In cases where users had multiple roles (e.g., physicians who also performed research), categories were assigned based on the following priority: person with bleeding disorder, physician, research, non-physician provider, support and advocacy.
70 of the most active accounts belonged to organizations or individuals participating in support and advocacy for hemophilia, such as the National Hemophilia Foundation (747) and The Haemophilia Society UK (688). 35 people with bleeding disorders (PwBDs), primarily hemophilia A, hemophilia B, and von Willebrand disease, were also among the 250 most active accounts, 18 of whom self-identified as bleeding disorder advocates. Healthcare providers, including 19 physicians and 9 non-physician providers, were also present.
To examine the popularity of tweets, we calculated a user-averaged engagement score, defined as the sum of likes, retweets, and replies, for each category. Although support and advocacy accounts were the most active, as a whole they had an average engagement score of only 5.82 per post (Table 1). In contrast, non-physician providers had the highest score of 12.69, followed by PwBDs (12.51), pharmaceutical organizations (11.58), and researchers (11.11).
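One way to compute an average engagement per post for each user category is sketched below. It pools all posts within a category, which is an assumption: the text does not say whether scores were first averaged within each user. The field names and the `user_category` mapping are hypothetical.

```python
from collections import defaultdict


def mean_engagement_by_category(tweets, user_category):
    """Average likes + retweets + replies per post, grouped by author category."""
    totals = defaultdict(lambda: [0, 0])  # category -> [engagement sum, post count]
    for tweet in tweets:
        category = user_category.get(tweet["user"])
        if category is None:  # author not among the classified accounts
            continue
        totals[category][0] += tweet["likes"] + tweet["retweets"] + tweet["replies"]
        totals[category][1] += 1
    return {category: total / count for category, (total, count) in totals.items()}


user_category = {"org_a": "support and advocacy", "nurse_b": "non-physician provider"}
tweets = [
    {"user": "org_a", "likes": 2, "retweets": 1, "replies": 0},
    {"user": "org_a", "likes": 4, "retweets": 0, "replies": 1},
    {"user": "nurse_b", "likes": 10, "retweets": 2, "replies": 1},
    {"user": "unknown_c", "likes": 50, "retweets": 5, "replies": 5},
]
means = mean_engagement_by_category(tweets, user_category)
```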
The sentiment of tweets posted by most categories of users ranged from neutral to highly positive (Figure 2). However, only tweets made by non-physician providers and support and advocacy groups had a statistically significantly higher compound sentiment than tweets made by all other users (Supplemental Table 4).
Additionally, 135 of the most active accounts belonged to organizations, while 115 belonged to individuals. Organization accounts were more likely than individual users to explicitly state their focus on hemophilia in their username or description, making them more accessible for non-specialists interested in hemophilia content on Twitter.
Major themes of content
We classified 20,316 of the 45,590 (44.6%) tweets into at least one of 11 categories with gene therapy and blood product contamination being the most frequent (Figure 3). Certain categories had a high incidence of co-occurrence; for example, 1,060 tweets related to gene therapy and 339 tweets related to hemophilia management also contained research keywords.
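Because a tweet can carry more than one category label, co-occurrence can be tallied by counting each unordered pair of labels once per tweet. The sketch below assumes each tweet has already been reduced to a list of category names; the function name and toy labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tweet_categories):
    """Count how many tweets carry each unordered pair of category labels."""
    counts = Counter()
    for categories in tweet_categories:
        # Sort and deduplicate so ("a", "b") and ("b", "a") share one key.
        for pair in combinations(sorted(set(categories)), 2):
            counts[pair] += 1
    return counts


labelled = [
    ["gene therapy", "research"],
    ["research", "gene therapy"],
    ["management", "research"],
    ["covid-19"],
]
pairs = cooccurrence(labelled)
```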
The major themes present in tweets varied by user type (Supplemental Figure 2). Analysis of the 250 most active users demonstrated that among the 11 thematic categories, the average patient most often tweeted about contaminated blood products. In comparison, physicians and researchers most often tweeted about gene therapy, research, and hemophilia management. Support and advocacy users most often tweeted about World Hemophilia Day and awareness, contaminated blood products, and hemophilia management.
Tweets in the athletics, World Hemophilia Day, management, and research categories had a statistically significantly higher compound sentiment than all other tweets (Figure 4; Supplemental Table 5). Tweets discussing athletics had the highest median compound sentiment score and discussed the ability of people with hemophilia to participate in physical activities, as well as notable athletes with hemophilia. Tweets discussing World Hemophilia Day were generally positive and aimed to improve patient quality of life and hemophilia awareness. Conversely, tweets in the gene therapy, inhibitors, mental health, symptoms, and contamination categories had a statistically significantly lower compound sentiment than all other tweets. Only one topic category, contamination, had a median negative sentiment.
Gene therapy tweets:
This was the largest identified category. These tweets were primarily informational, delivering updates regarding the development process of therapeutics. Many tweets named a specific pharmaceutical company, such as BioMarin (719), Spark Therapeutics (138), and Sangamo (122), and/or its gene therapy product, such as Valrox (170) or AMT-061 (75). Several gene therapy trial participants also shared their experiences of gene therapy, both for hemophilia A and hemophilia B; these accounted for the posts with the second and fifth most engagement in the gene therapy category. While most tweets had a positive outlook on gene therapy, some raised concerns regarding its durability, long-term risks, and adverse events. 99 tweets also raised concerns regarding the financial accessibility of gene therapy, many in response to the anticipated 2–3 million dollar cost of BioMarin’s product18.
Contaminated hemophilia blood product tweets:
This was the second largest category and the most prominent theme for patients. 1,964 of these mentioned human immunodeficiency virus or acquired immunodeficiency syndrome, while 276 tweets mentioned hepatitis C. Many tweets criticized the mishandling of the crisis and lack of accountability by the UK’s National Health Service (97), paralleling the ongoing Infected Blood Inquiry in the United Kingdom19. Other tweets criticized pharmaceutical companies and prominent politicians for their roles in the crisis. Some users also discussed their own experiences with the contaminated blood scandal, either as a person with hemophilia or as a friend or family member of someone affected by the crisis.
Hemophilia research tweets:
Researchers from many fields actively use Twitter for professional purposes, including asking for advice, disseminating their research, seeking collaborations, and announcing open positions20, and the hemophilia community is no exception. Hemophilia researchers used Twitter to announce publications (212), abstracts (108), successful acquisition of funding (47), preprints (5), and open research positions (4). Tweets covered both translational and clinical research topics, such as novel hemophilia therapies, hemophilia management and patient care (339), and clinical trials (536). Some tweets explicitly referenced scientific conferences, including those of the American Society of Hematology (376), the International Society on Thrombosis and Haemostasis (321), and the American Society of Gene and Cell Therapy (14).
Hemophilia management tweets:
This topic included non-gene therapy therapeutics, such as emicizumab/Hemlibra (410) and efmoroctocog alfa/Eloctate (13), approaches to prophylaxis for hemophilia A (443), and clinical practice guidelines (286). 51 tweets mentioning emicizumab, including the post with the seventh most engagement in this category, raised concerns about its adverse effects, namely thrombotic microangiopathy, thrombosis, and death.
Tweets mentioning COVID-19:
While this category covered a diverse range of topics, only a subset of these tweets made a meaningful connection between COVID-19 and hemophilia. Some users expressed concern over hemophilia care during the pandemic, with one user posting that they were hesitant to take their child with hemophilia to the hospital because of high rates of COVID-19 cases in their area. At least 6 posts expressed concern over shortages of hemophilia medication and blood donations. Clinicians, researchers, and professional groups used Twitter to discuss the intersection of COVID-19 and hemophilia care, such as managing COVID-19 coagulopathy in hemophilia A patients (14), the use of telemedicine (11), and the impact of COVID-19 on hemophilia treatment centers (9). Additionally, 622 posts explicitly mentioned vaccines; while a small number of users expressed concern regarding SARS-CoV-2 vaccination in individuals with hemophilia, many support and advocacy groups and providers advised that those with bleeding disorders should still receive the vaccine. There were also 24 posts discussing reported cases of acquired hemophilia A following SARS-CoV-2 vaccination.
Temporal analysis
The rate of hemophilia-related tweets remained stable between September 2019 and September 2021, as did the number of unique users per month. This is consistent with Twitter as a whole, whose growth has stagnated21 since 2015. In both metrics, there were major spikes on April 17 corresponding to World Hemophilia Day, and minor spikes corresponding to other notable events (Figure 5). For example, the peak in August 2020 can be attributed to the FDA’s rejection of BioMarin’s Valrox on August 18, 2020, which prompted more than 300 gene therapy-related posts in the same week. Tweets related to COVID-19 peaked in March and April 2020 and declined thereafter, although conversations regarding COVID-19 vaccination grew rapidly between November 2020 and January 2021, when vaccination efforts commenced.
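The two monthly metrics described here, tweet count and number of unique users, can be tallied as in the sketch below. The field names and the ISO-formatted timestamps are assumptions about the data layout rather than details given in the text.

```python
from collections import defaultdict
from datetime import datetime


def monthly_activity(tweets):
    """Return {"YYYY-MM": (tweet count, unique user count)} in calendar order."""
    posts = defaultdict(int)
    users = defaultdict(set)
    for tweet in tweets:
        month = datetime.fromisoformat(tweet["created_at"]).strftime("%Y-%m")
        posts[month] += 1
        users[month].add(tweet["user"])
    return {month: (posts[month], len(users[month])) for month in sorted(posts)}


sample = [
    {"user": "a", "created_at": "2020-04-17T09:30:00"},
    {"user": "a", "created_at": "2020-04-17T11:00:00"},
    {"user": "b", "created_at": "2020-04-17T12:15:00"},
    {"user": "a", "created_at": "2020-08-18T16:45:00"},
]
activity = monthly_activity(sample)
```

Grouping by day or week instead of month only requires changing the format string passed to `strftime`, which is how a spike such as World Hemophilia Day could be isolated.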
Minor spikes were also detected whose timing corresponded to hematology research conferences. For example, 274 posts tagged with “ASH” (including “ASH19” and “ASH20”), 288 posts tagged with “ISTH” (including “ISTH20” and “ISTH21”), and 218 posts tagged with “WFHVirtualSummit” appeared primarily during the week of the respective conference. These spikes are in part due to targeted interventions by conference organizers to increase tweets, impressions, and engagements; in fact, ISTH appointed “Twitter Ambassadors” for both the 2020 and 2021 conferences (including author BSJ for 2021)22. However, we identified some hemophilia-related tweets that do not explicitly say “hemophilia,” making them difficult to identify by search (Supplemental Table 6).
In every month between September 2019 and September 2021, the median tweet had either a positive or neutral compound score, and very few tweets had a compound negative score. Although there was a slight decline in the median compound sentiment in March 2020, the sentiment distribution of all tweets did not change in a significant manner between the months before, during, and after the initial onset of COVID-19 (Supplemental Figure 3), and tweets mentioning both hemophilia and COVID-19 had a median positive sentiment.
Public perceptions of hemophilia
Unexpectedly, we found 5,278 posts referencing hemophilia in a non-medical context. For example, 134 tweets used hemophilia as an analogy to refute claims that the COVID-19 death count was inflated by including people with comorbidities. The general argument of these tweets was that if a person with hemophilia dies due to an external event causing blood loss, the cause of death is the external event rather than hemophilia. Similar references to hemophilia in argument from analogy were also made in 44 posts mentioning the case of the State of Minnesota v. Officer Derek Chauvin, for the death of George Floyd. Here again, hemophilia was used in analogy for an underlying condition in arguments regarding the primary cause of this event. The use of hemophilia in these analogies suggests that many people are unaware of modern prophylaxis therapy in hemophilia.
Since hemophilia is sometimes referred to as the “royal disease,” many people associate hemophilia with royal families (2,293). Some tweets discussed two recent television series, “The Irregulars” (16) and “The Last Czars” (4), that reference hemophilia in royal family members. There were also associations of hemophilia with inbreeding (448), sometimes in the context of royal families. While the effects of these associations have not been studied, they may reflect unawareness of hemophilia’s prevalence across ethnicities, as well as negative perceptions of people with hemophilia.
Misinformation
The spread of misinformation on social media adversely affects public health. Among the 1,000 tweets with the most engagement, tweets with explicit misinformation were rare, with five posts describing ‘miracle cures’ for hemophilia, one post inaccurately describing plasma as the only treatment for hemophilia, and one post exaggerating the severity of hemophilia (Supplemental Table 7). There were also a substantial number of patient-directed informational posts, some from well-known hemophilia advocacy groups. While these posts were not necessarily misinformation, they may provide health advice that is irrelevant or harmful for a specific patient (Supplemental Table 8). Considering these data, healthcare providers should be aware of the misinformation and irrelevant healthcare advice present online, and the potential for their patients to engage with this content.
Discussion
This study investigated the scope of the hemophilia community on Twitter, specifically the types of users involved, major themes and sentiments present in tweets, and the prevalence of misinformation. Among the 250 most active accounts, which originate from 16 countries, the largest category is organizations and individuals participating in hemophilia support and advocacy. These accounts also have the most followers and generate posts with high engagement, and thus exert significant influence on hemophilia-related discourse on Twitter. While PwBDs, providers, and researchers have less representation among the 250 most active accounts, they generate more engagement per post than the average support or advocacy account, suggesting their perspectives are in high demand. These findings highlight the potential for individual PwBDs, providers, and researchers to increase engagement in the hemophilia Twitter community.
We also classified 44.6% of all tweets into at least one of 11 major categories. The largest categories were gene therapies and contaminated hemophilia blood products, with several highly active accounts dedicated to these topics. Other large categories included World Hemophilia Day, research, hemophilia management, and COVID-19. Of note, contaminated blood products represented a higher proportion of tweets from patient users than from other user categories, suggesting a potential disconnect between the concerns of patients and those of other members of the hemophilia community. Likewise, we note that tweets regarding adverse events receive substantial engagement, highlighting the ongoing interest in safety considerations for hemophilia therapies.
World Hemophilia Day appears to be a highly effective mechanism for social media engagement around hemophilia, with marked spikes in the temporal analysis in both years studied. Our results strongly support the continued promotion of this event. Interestingly, despite explicit social media outreach22 by ISTH, ISTH and ASH appear to generate comparatively few tweets mentioning hemophilia; we recommend that tweet authors tag all hemophilia-related tweets with “#hemophilia” to improve searchability. Our results also demonstrate the utility of social media for the rapid dissemination of information to the hemophilia community; this is illustrated by the tweets mentioning COVID-19 as well as the temporal spike around the FDA’s ruling on BioMarin’s Valrox in 2020.
It is unsurprising that hemophilia, the most well-known bleeding disorder, is widely referenced outside of the hemophilia community. While tweets containing explicit misinformation were rare (seven among the 1,000 tweets with the most engagement), our results reveal a high prevalence of negative perceptions of hemophilia, including the false and highly stigmatizing association with inbreeding. We were also surprised by the use of hemophilia in arguments by analogy over divisive social issues. Our findings suggest that negative connotations of those with hemophilia persist among the public, despite highly active awareness campaigns on social media. Educating the public about hemophilia remains an unmet need.
Limitations of this study primarily concern inaccuracies in tweet thematic categorization. While the regular expressions used for classification were selected to be as specific as possible, misclassification of some tweets is unavoidable. Furthermore, although hemophilia is a global phenomenon, we limited our study to English-language tweets and did not examine differences in perspectives on hemophilia-related issues among geographic regions.
In conclusion, the results of our comprehensive analysis of the hemophilia community on Twitter demonstrate the importance of social media for disseminating information associated with hemophilia, highlight effective engagement strategies, and identify the need for better hemophilia education among the public.
Supplementary Material
Acknowledgments
RC and BJSJ designed and performed the research and wrote the paper. KM performed the research and wrote the paper. The authors have no competing interests. BJSJ receives support from the National Hemophilia Foundation and NIH/NHLBI 1K08HL140078. Research reported in this publication was supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under award number TL1TR001880. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Footnotes
Data Availability Statement
The data that support the findings of this study can be obtained using the Twitter API. Restrictions apply to the availability of these data, which were used under an academic license for this study. The Tweet IDs of the posts used in this study and the code needed to replicate the analyses of this study are available at https://github.com/robchiral/Hemophilia-Twitter-Analysis/.
References
- 1. Surani Z, Hirani R, Elias A, et al. Social media usage among health care providers. BMC Research Notes. 2017;10(1):654.
- 2. Mishori R, Singh LO, Levy B, et al. Mapping Physician Twitter Networks: Describing How They Work as a First Step in Understanding Connectivity, Information Flow, and Message Diffusion. J Med Internet Res. 2014;16(4):e107.
- 3. Connell NT, Weyand AC, Barnes GD. Use of Social Media in the Practice of Medicine. The American Journal of Medicine. September 2021.
- 4. Zaidi AU, Glaros AK, Weyand AC. Navigating a new terrain: how Twitter is changing hematologists. Blood Adv. 2021;5(1):277–78.
- 5. Attai DJ, Cowher MS, Al-Hamadani M, et al. Twitter Social Media is an Effective Tool for Breast Cancer Patient Education and Support: Patient-Reported Outcomes by Survey. J Med Internet Res. 2015;17(7):e188.
- 6. Himelboim I, Han JY. Cancer Talk on Twitter: Community Structure and Information Sources in Breast and Prostate Cancer Social Networks. Journal of Health Communication. 2014;19(2):210–25.
- 7. Greene JA, Choudhry NK, Kilabuk E, et al. Online Social Networking by Patients with Diabetes: A Qualitative Evaluation of Communication with Facebook. J Gen Intern Med. 2011;26(3):287–92.
- 8. Crowson MG, Tucci DL, Kaylie D. Hearing loss on social media: Who is winning hearts and minds? Laryngoscope. 2018;128(6):1453–61.
- 9. Chiu Y-C, Hsieh Y-L. Communication online with fellow cancer patients: writing to be remembered, gain strength, and find survivors. J Health Psychol. 2013;18(12):1572–81.
- 10. Sugawara Y, Narimatsu H, Hozawa A, et al. Cancer patients on Twitter: a novel patient community on social media. BMC Research Notes. 2012;5(1):699.
- 11. Pagoto S, Schneider KL, Evans M, et al. Tweeting it off: characteristics of adults who tweet about a weight loss attempt. J Am Med Inform Assoc. 2014;21(6):1032–37.
- 12. Valle CG, Tate DF, Mayer DK, et al. A randomized trial of a Facebook-based physical activity intervention for young adult cancer survivors. J Cancer Surviv. 2013;7(3):355–68.
- 13. Burel G, Farrell T, Alani H. Demographics and topics impact on the co-spread of COVID-19 misinformation and fact-checks on Twitter. Inf Process Manag. 2021;58(6):102732.
- 14. Bird S, Klein E, Loper E. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. 1st edition. Beijing; Cambridge Mass.: O’Reilly Media; 2009.
- 15. Maiya AS. ktrain: A Low-Code Library for Augmented Machine Learning. arXiv:2004.10703 [cs]. July 2020.
- 16. Valdez D, ten Thij M, Bathina K, et al. Social Media Insights Into US Mental Health During the COVID-19 Pandemic: Longitudinal Analysis of Twitter Data. J Med Internet Res. 2020;22(12):e21418.
- 17. Bathina KC, ten Thij M, Valdez D, et al. Declining well-being during the COVID-19 pandemic reveals US social inequities. PLOS ONE. 2021;16(7):e0254114.
- 18. Ahle S. BioMarin Sets High Price Tag for Hemophilia Gene Therapy Candidate. ASH Clinical News. March 2020.
- 19. Burki T. Infected blood inquiry in the UK. The Lancet Infectious Diseases. 2021;21(8):1078–79.
- 20. Cheplygina V, Hermans F, Albers C, et al. Ten simple rules for getting started on Twitter as a scientist. PLOS Computational Biology. 2020;16(2):e1007513.
- 21. Twitter: monthly active users worldwide. Statista.
- 22. Othman M, Cormier M, Barnes GD, et al. Harnessing Twitter to empower scientific engagement and communication: The ISTH 2020 virtual congress experience. Res Pract Thromb Haemost. 2021;5(2):253–60.