Abstract
Background:
The Centers for Disease Control and Prevention (CDC) in the United States initially alerted the public to three COVID-19 signs and symptoms: fever, dry cough, and shortness of breath. Concurrent social media posts reflected a wider range of COVID-19 symptoms than these three. Because social media data have a potential application in the early identification of novel virus symptoms, this study aimed to explore which symptoms were mentioned in COVID-19-related social media posts during the early stages of the pandemic.
Methods:
We collected COVID-19-related tweets posted in English between March 30, 2020 and April 19, 2020 using search terms consisting of COVID-19 synonyms and the three common COVID-19 symptoms suggested by the CDC in March. Only unique tweets were extracted for analysis of symptom terms.
Results:
A total of 36 symptoms were extracted from 30,732 unique tweets. All the symptoms suggested by the CDC for COVID-19 screening in March, April, and May were mentioned in tweets posted during the early stages of the pandemic.
Discussion:
The findings of this study revealed that many COVID-19-related symptoms were mentioned in tweets before the CDC announced them. Monitoring social media data is a promising approach to public health surveillance.
Keywords: COVID-19, epidemiology, social media, symptoms
BACKGROUND
The COVID-19 pandemic highlighted the challenges public health officials face in collecting data on a novel virus. One specific challenge is promptly identifying the common symptoms of that virus. Timely identification of COVID-19 symptoms could slow the spread of the virus by quickly informing the public to isolate and be tested when exhibiting those symptoms. To identify symptoms of a novel virus, a large dataset is necessary to find symptom patterns and clusters within the population in which the epidemic occurs. COVID-19-related symptom data can be collected via self-reported tools such as mobile apps or websites (Chan & Brownstein, 2020); however, this approach may be limited in its ability to collect large volumes of data in a short period of time if people are not aware of the available resources or do not have access to them. Electronic health records (EHRs) are a vast repository of clinical data, including the reported symptoms of those who test positive for COVID-19. However, people infected with the virus must seek medical care in a health care setting to have symptom data collected into an EHR. Also, EHR data can be difficult to extract for research in a timely manner due to the Health Insurance Portability and Accountability Act and other security measures surrounding clinical data.
New approaches such as digital epidemiology (Salathé, 2018) or infodemiology (Eysenbach, 2011) offer another option for tracking patterns of health and disease through digital data, including data from keyword search engines (e.g., Google; Shin et al., 2016; Walker, Hopkins, & Surda, 2020) or public social media data (e.g., Twitter; Kudchadkar & Carroll, 2020; YoussefAgha, Jayawardene, & Lohrmann, 2013). For example, internet searches and tweets containing Middle East respiratory syndrome (MERS)-related keywords correlated strongly with the number of MERS cases (Shin et al., 2016). Moreover, social media can be a timely and low-cost data source to inform symptom patterns and clusters for COVID-19. One study found that the frequency of Google searches for anosmia, or loss of the sense of smell, correlated with the onset of COVID-19 cases in multiple countries (Walker et al., 2020). Another study found that COVID-19 keywords peaked in social media and internet searches 10–14 days before the peak in incidence (Li et al., 2020).
At the beginning of the pandemic, in March 2020, the Centers for Disease Control and Prevention (CDC) reported only three main symptoms of COVID-19: cough, fever, and shortness of breath (Centers for Disease Control & Prevention, 2020a). As the pandemic continued, more symptoms of COVID-19 were identified and added to the CDC’s list of COVID-19 symptoms (Centers for Disease Control & Prevention, 2020b, 2020c). In late April, the CDC added chills, repeated shaking with chills, muscle pain, headache, sore throat, and new loss of taste or smell as symptoms of COVID-19 (Centers for Disease Control & Prevention, 2020b). The CDC added several additional symptoms in May, including fatigue, congestion or runny nose, nausea, vomiting, and diarrhea (Centers for Disease Control & Prevention, 2020c). Since then, the CDC has not added any new symptoms. The continued addition of symptoms for COVID-19 implies that it took several weeks to months to understand which symptoms were related to COVID-19. Because social media data have a potential application in the early identification of novel virus symptoms in digital epidemiology, this study aimed to examine which symptoms were mentioned or discussed in COVID-19-related tweets during the early stages of the pandemic.
METHODS
Design and sample
We sampled English-language social media data posted on the Twitter platform by U.S. users between March 30, 2020 and April 19, 2020. Eligible tweets contained key terms related to COVID-19 (e.g., corona, coronavirus, corovirus, covid, cv-19, and ncov-19) and at least one of the following terms: “fever,” “cough,” or “shortness of breath,” which were the symptoms of COVID-19 suggested by the CDC in March 2020 (Centers for Disease Control & Prevention, 2020a). Retweets, or tweets posted by a second individual that repeated another user’s original tweet, were excluded. Tweets originating from outside of the United States were also excluded from the study. We used R software (version 3.6.3) and the Twitter Application Programming Interface to retrieve tweets that met these criteria.
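The inclusion and exclusion criteria above can be illustrated with a minimal Python sketch. This is not the authors' R code: it assumes the tweets have already been downloaded into dictionaries with hypothetical fields (`text`, `lang`, `country`, `is_retweet`, `date`), and the substring matching and whitespace-insensitive de-duplication are assumptions made for illustration.

```python
from datetime import date

# Search terms taken from the text; how they were matched is an assumption.
COVID_TERMS = ["corona", "coronavirus", "corovirus", "covid", "cv-19", "ncov-19"]
SYMPTOM_TERMS = ["fever", "cough", "shortness of breath"]
START, END = date(2020, 3, 30), date(2020, 4, 19)


def contains_any(text, terms):
    """Case-insensitive substring match against a list of terms."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def is_eligible(tweet):
    """Apply the stated criteria to one tweet (a hypothetical dict layout)."""
    if tweet["is_retweet"]:
        return False
    if tweet["lang"] != "en" or tweet["country"] != "US":
        return False
    if not (START <= tweet["date"] <= END):
        return False
    text = tweet["text"]
    return contains_any(text, COVID_TERMS) and contains_any(text, SYMPTOM_TERMS)


def unique_eligible(tweets):
    """Keep the first copy of each eligible tweet, ignoring case and spacing."""
    seen, kept = set(), []
    for tweet in tweets:
        if not is_eligible(tweet):
            continue
        key = " ".join(tweet["text"].lower().split())
        if key not in seen:
            seen.add(key)
            kept.append(tweet)
    return kept
```

Substring matching is deliberately loose here (for example, "corona" also matches "coronavirus"); a stricter word-boundary match would return fewer tweets.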
Measures and analytic strategy
To extract symptom wordings from unstructured, free-text tweets, we applied the following text analysis procedures: (a) cleaning the tweets; (b) generating n-grams to extract symptom terms; and (c) grouping symptom terms by body system. First, because tweets can include symbols, emojis, and hyperlinks, the data required cleaning to remove non-text content. During the cleaning process we removed non-English words and non-word elements, such as special characters, numbers, emojis, URLs, usernames, punctuation, and extra white space. Then we eliminated “stop words,” which are words that occur commonly but carry little meaning in context (e.g., “and,” “the,” “also,” “to,” “in,” etc.). Next, to efficiently extract symptom terms from a large number of tweets, we generated n-grams. These are consecutive sequences of words, including unigrams, or single words such as “shortness” or “breathing”; bigrams, or groups of two words such as “shortness of” or “difficult breathing”; and trigrams, or groups of three words such as “shortness of breath” or “have difficulty breathing.”
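The cleaning and n-gram steps described above can be sketched as follows. This is an illustrative Python example rather than the R pipeline used in the study; the regular expressions, the tiny stop-word list (only the examples named in the text), and the helper names `clean_tweet`, `ngrams`, and `top_ngrams` are all assumptions.

```python
import re
from collections import Counter

# Only the stop words named in the text; a real stop-word list is much longer.
STOP_WORDS = {"and", "the", "also", "to", "in"}


def clean_tweet(text):
    """Lowercase a tweet, strip non-text content, and drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)                   # usernames
    text = re.sub(r"[^a-z\s]", " ", text)               # numbers, punctuation, emojis
    return [token for token in text.split() if token not in STOP_WORDS]


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(tweets, n, k=1000):
    """Count n-grams across all tweets and return the k most frequent."""
    counts = Counter()
    for text in tweets:
        counts.update(ngrams(clean_tweet(text), n))
    return counts.most_common(k)
```

Calling `top_ngrams(tweets, n=1)`, `top_ngrams(tweets, n=2)`, and `top_ngrams(tweets, n=3)` would produce ranked unigram, bigram, and trigram lists of the kind that reviewers could then screen for symptom terms.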
Next, three RN-prepared nurses reviewed the most frequent words or terms (i.e., the top 1,000) from the lists of unigrams, bigrams, and trigrams to extract symptom-related terms. During the symptom term extraction, we focused on extracting physical symptoms. Mental health symptoms, such as stress and anxiety, were excluded. Lastly, the extracted terms were grouped by body system. Using R software, we calculated the frequency of each symptom mentioned within the tweets, and we used Microsoft Excel to generate graphs illustrating trends in symptom mentions over time.
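A minimal sketch of the frequency step follows, in Python rather than the R used in the study. The lexicon below is hypothetical: the actual term lists were curated by the nurse reviewers and are not given in the text. Counting each symptom at most once per tweet is also an assumption.

```python
from collections import Counter

# Hypothetical lexicon: symptom -> (body system, surface terms to match).
LEXICON = {
    "Cough": ("Respiratory", ["cough"]),
    "Fever": ("Immunologic", ["fever"]),
    "Loss of smell": ("Neurologic", ["loss of smell", "anosmia"]),
}


def count_symptoms(tweets):
    """Count tweets mentioning each symptom, then total the counts by body system."""
    symptom_counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for symptom, (_, terms) in LEXICON.items():
            # Assumption: a symptom is counted once per tweet, however often it appears.
            if any(term in lowered for term in terms):
                symptom_counts[symptom] += 1

    system_counts = Counter()
    for symptom, n in symptom_counts.items():
        system_counts[LEXICON[symptom][0]] += n
    return symptom_counts, system_counts
```

Running the same function on the tweets of each calendar day would give the daily counts needed for trend graphs.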
RESULTS
Description of retrieved tweets
A total of 30,732 unique tweets posted between March 30, 2020 and April 19, 2020 were retrieved. Notably, more tweets were posted before April 10, 2020, and the peak of daily newly diagnosed COVID-19 cases in the United States was on April 5, 2020 (Figure 1). Of the tweets, 24,251 (78.9%) indicated the user’s location or where the tweet was posted. Four states accounted for 42.3% of these tweets: California (n = 3,544, 14.6%), New York (n = 3,151, 13.0%), Texas (n = 2,074, 8.6%), and Florida (n = 1,500, 6.2%). The remaining tweets were from the other states, and no tweets originated from Hawaii.
A total of 36 symptoms, not including “asymptomatic,” were extracted and then grouped into body systems (Table 1). These symptoms were mentioned a total of 41,867 times from the collected 30,732 tweets. Examples of tweets mentioning individual symptoms are presented in Table 1. The most frequently mentioned symptoms by body systems were respiratory (n = 16,504, 39.4%), immunologic (n = 9,405, 22.5%), neurologic (n = 8,498, 20.3%), and gastrointestinal (GI) symptoms (n = 4,892, 11.7%).
Table 1.
| Symptom group | Group n | Symptom | Symptom n | Example tweet |
|---|---|---|---|---|
| Respiratory | 16504 | Cough | 9710 | I was told today that I was presumed positive for COVID-19. Last night what had been severe seasonal allergies seemed to become serious symptoms. A dry, persistent cough that brought up blood, shortness of breath, and diaphoresis. I was pre-authorized to be tested. |
| | | Shortness of breath or difficulty breathing | 4545 | Quarantine Day 2: Breathing is still very labored, energy at all time low. Began to get some smell and taste back, hands and knees still ache. Can move more than twice without coughing hysterically, feels impossible to get sleep. If this isn’t covid idk what is. |
| | | Sneezing | 878 | When I am at the grocery store, I am now scared to cough, sneeze, or even clear my throat, for fear of people thinking I have covid. |
| | | Sore throat | 616 | I had mild coronavirus, sore throat, slight cough, no fever, lump in throat, sinusitis, dizziness over a couple of days |
| | | Congestion or runny nose | 525 | My aunt and uncle are doctors who had Covid-19 w/ mild symptoms (slight fever, runny nose, sore throat). Uncle is still testing positive now w/o symptoms for 3 weeks after onset. Their message is to act like you and everyone around you is infected and STAY HOME! |
| | | Chest pain | 125 | Resting in bed so far today as still quite out of puff, cough, chest pain. The ED consultant said most likely Covid (thankfully ruled out cardiac cause). |
| | | Chest tightness | 105 | I have a friend who has Covid-19. She went to the hospital a few weeks ago with the symptoms fever, headache body ache tight chest. They didn’t test her for corona. They sent her home saying it was just some virus. After feeling a bit better she lost her sense smell and taste. |
| Immunologic | 9405 | Fever | 8644 | I’m pretty sure I had COVID. I suddenly had a high fever (~102.8), extreme body aches, and a dry cough. |
| | | Chills | 581 | Literally think have corona, I’m all achy and was shivering last night and have fever. |
| | | Sweating | 120 | Hard to breathe, sweating, tiredness … I feel like I only feel the symptoms of Covid-19 |
| | | Inflammation | 60 | The range of symptoms caused by Covid which include dry cough, fever and of course the dangerous inflammation of the respiratory system. |
| Neurologic | 8498 | Loss of taste | 3898 | Heard no taste is a cause of coronavirus. 22 yr old Boise woman tested positive for the coronavirus with one primary, obscure symptom - loss of taste and smell. |
| | | Loss of smell (e.g., anosmia) | 2458 | Can you really lose your sense of smell from coronavirus? |
| | | Headache | 1112 | A person who had been informing me about the health of a close friend of hers who had COVID-19 and was in the hospital on life support has now been admitted to the hospital herself with pneumonia, fever, loss of smell and taste, cough, headaches, etc. Waiting on a test. Awful. |
| | | Confusion | 554 | Coronavirus symptoms take new shape. Confusion and pink eye and can be first symptoms. |
| | | Insomnia | 381 | Anyone else find themselves suffering from insomnia during this pandemic? |
| | | Dizziness | 80 | I called out sick from work today with dizziness and mild fever. Work called me later to tell me that I MUST get tested for Covid, that I am not allowed back in the building until they receive my negative results, and that I cannot get sick pay without proof I have it. F**k. |
| | | Tinnitus | 15 | My coronavirus patients being treated with anti-malaria drug its potential side effects include vomiting, skin rashes, and ear ringing. |
| Gastrointestinal | 4892 | Abdominal pain | 3292 | I had all the symptoms of covid, major breathing coughing issues, sore throat, headache stomachache not able to eat. |
| | | Diarrhea | 735 | Last week of January I got super sick with cough, vomiting, diarrhea, respiratory probs, headaches, body aches + fever. Tested neg for strep, flu, and mono. Couldn’t get over it for a month bc of starting a new semester. I wish the covid-19 test was avail back then |
| | | Nausea | 263 | Went on a drive by for a COVID testing after experiencing a scary episode of the flu yesterday, fever, massive headache, nausea and vomiting. Even though I have been socially distant for almost 4 weeks, constantly |
| | | Vomiting | 263 | Hello friends. I have 4 symptoms of COVID. I was washing my hands yesterday and started vomiting in the sink for no reason. No fever. Breathing ok. Throat hurts. Chest feels a tiny bit off but it could be anxiety. Current CDC direction is to wait at least 7 days in q-tine. |
| | | Flatulence | 183 | I used to cough to cover a fart. Now I fart to cover a cough. |
| | | Appetite loss (e.g., anorexia) | 129 | I had all the symptoms of covid, major breathing coughing issues, sore throat, headache stomachache not able to eat. |
| | | Dysphagia | 27 | Hurts to swallow, but no sore throat. My doctor was not concerned. |
| Muscular | 1981 | Fatigue | 1374 | You can be positive for coronavirus with only symptoms like fatigue, conjunctivitis, or chest tightness/pain. |
| | | Muscle or body ache | 477 | Most coronavirus patients are mild with symptoms like muscle aches, fatigue, runny nose, sore throat, or diarrhea. |
| | | Weakness | 130 | Some of the first warning signs for coronavirus can be extreme fatigue, weakness and chills. |
| Skin | 227 | Skin issues (e.g., skin lesions, itching) | 221 | The doctor says my persistent cough and skin lesions are definitely not covid. |
| | | Toe skin problem | 6 | COVID toes present with blue and red around toes. |
| Cardiovascular | 202 | Blood clots related | 163 | Some bad news just now about a distant acquaintance who was finally admitted to hospital with Coronavirus flu-like symptoms, mostly sweating, fever. Don’t know if he’s positive, but they found blood clots around his lungs. Is this a related symptom anyone’s heard of before? |
| | | Arrhythmia | 39 | Potential side effects of chloroquine include fatal heart arrhythmia, vision loss. |
| Eye | 127 | Pink eye (e.g., conjunctivitis) | 97 | Almost certain at this point my boys had Covid-19 in February. We went out of town, my mom watch them for a few days. They all got sick, cough, fever, pink eye. She developed “pneumonia”. |
| | | Eye pain | 18 | I had many of the symptoms of Covid in January- dry cough that made it so I couldn’t talk and could barely breathe. I would start coughing and it just never ended. Fever, body aches, eye pain, extreme tiredness- I couldn’t leave my house for my classes for two weeks. |
| | | Vision change | 12 | Did anyone else notice the side effects from taking hydroxychloroquine? Disturbing dreams, hallucinations, blurry vision. |
| Renal | 31 | Any kidney issues | 31 | Please pray for her friend that is battling for her life in ICU due to Covid 19 with double pneumonia fever is on ventilator laying on her tummy and has been given dialysis to relax her kidneys and lungs. |
| Asymptomatic | 274 | No symptoms | 274 | I’ve seen multiple patients showing no symptoms but spiked a fever and dialysis sends them to the ER. Pt says they feel fine, Covid test comes back positive. Fever, a dry cough and shortness of breath are among the most common, but some people with the coronavirus feel no symptoms at all. |
The three most frequently mentioned symptoms were cough (n = 9,710, 23.2%), fever (n = 8,644, 20.6%), and difficulty breathing (n = 4,545, 10.9%), with cough and fever mentioned frequently each day (Figure 2). Figure 3 presents the additional COVID-19 symptoms suggested by the CDC in late April: chills (n = 581, 1.4%), muscle pain (n = 477, 1.1%), headache (n = 1,112, 2.7%), sore throat (n = 616, 1.5%), loss of taste (n = 3,898, 9.3%), and loss of smell (n = 2,458, 5.9%; Table 1). Among these symptoms, loss of taste and loss of smell were mentioned most frequently (Figure 3). Figure 4 presents the additional COVID-19 symptoms suggested by the CDC in May: fatigue (n = 1,374, 3.3%), diarrhea (n = 735, 1.8%), congestion or runny nose (n = 525, 1.3%), nausea (n = 263, 0.6%), and vomiting (n = 263, 0.6%; Table 1). Most of these were GI symptoms.
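The percentages reported in the Results appear to use the 41,867 total symptom mentions as the denominator. The short Python check below reproduces that arithmetic from the body-system totals given in the text; the variable and function names are hypothetical, and the "asymptomatic" count is left out because the text excludes it from the 36 symptoms.

```python
# Body-system totals as reported in the Results (asymptomatic mentions excluded).
SYSTEM_TOTALS = {
    "Respiratory": 16504,
    "Immunologic": 9405,
    "Neurologic": 8498,
    "Gastrointestinal": 4892,
    "Muscular": 1981,
    "Skin": 227,
    "Cardiovascular": 202,
    "Eye": 127,
    "Renal": 31,
}

TOTAL_MENTIONS = sum(SYSTEM_TOTALS.values())


def share(count, total=TOTAL_MENTIONS):
    """Percentage of all symptom mentions, rounded to one decimal place."""
    return round(100 * count / total, 1)


for system, count in SYSTEM_TOTALS.items():
    print(f"{system}: n = {count:,} ({share(count)}%)")
```

The same `share` function applies to individual symptoms, for example `share(9710)` for cough.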
DISCUSSION
This study was able to extract symptom terms mentioned in COVID-19-related tweets, and the findings show that sign and symptom patterns of COVID-19 emerged in social media posts during the early stages of the COVID-19 pandemic. We were able to retrieve COVID-19-related tweets successfully, using commonly used terms related to COVID-19 and three classic symptoms as relevant markers. The ability to extract additional symptom terms from the sample of tweets suggests that our strategy was sufficient.
More than 30 symptoms were extracted from the sample of tweets posted between late March and April, including all the symptoms of COVID-19 suggested by the CDC (Centers for Disease Control & Prevention, 2020c) until July 2020. This study confirmed that social media data could be useful for supporting the identification of potential symptoms for novel diseases like COVID-19. It also provides information regarding “asymptomatic” cases, in which individuals tested positive yet did not have typical symptoms of COVID-19 (e.g., fever, difficulty breathing, and cough). We are aware that not all the symptoms extracted from the tweets are necessarily symptoms of COVID-19. Further validation or verification is required to confirm which symptoms are manifestations of COVID-19 infection. For example, symptoms such as vision change or tinnitus extracted in this study may not be symptoms of COVID-19 and are more relevant to the side effects of hydroxychloroquine, one potential medicine to treat COVID-19 (National Institute of Health, 2020; Skipper et al., 2020). Although this type of symptom information is not specific in identifying symptoms of COVID-19 per se, this finding implies that potential treatments for COVID-19 could be one of the topics widely discussed in social media posts and followed by the public.
Longitudinal social media data can provide an overview of trending symptoms mentioned by the public and the timing of emerging symptoms (Walker et al., 2020; YoussefAgha et al., 2013). This study analyzed 3 weeks of COVID-19-related tweets and found that symptoms related to COVID-19 were mentioned in tweets before the CDC suggested them. For example, symptoms including chills, muscle pain, headache, sore throat, loss of taste, and loss of smell were mentioned consistently throughout the data collection period. However, the CDC did not suggest these COVID-19 symptoms until late April (Centers for Disease Control & Prevention, 2020b), and it suggested additional symptoms (e.g., fatigue, diarrhea, nausea, and vomiting) in May (Centers for Disease Control & Prevention, 2020c). These additional symptoms were also mentioned consistently throughout our sample period, which ended before May.
Twitter is a rich data source for discussion on emerging infectious diseases, such as COVID-19. The COVID-19-related tweets retrieved in this study covered multiple aspects, such as symptoms experienced by individuals, their family or friends, or celebrities; emotional reactions and concerns related to COVID-19; and COVID-19-related news or information disseminated by governmental organizations. Some posts in this study were from health care providers sharing their experiences of encountering COVID-19 cases; one such tweet is presented in the “Asymptomatic” row of Table 1. This finding aligns with a recent study showing how pediatric health care providers have utilized Twitter to rapidly disseminate and share COVID-19 information and experience using relevant hashtags (Kudchadkar & Carroll, 2020).
Limitations
This study focused only on extracting physical symptoms from retrieved tweets because they were likely to be relevant to COVID-19. However, several psychological symptoms (e.g., stress, anxiety, worrisome thoughts, and anger) were present in the tweets collected. These are important because they demonstrate the public fear resulting from the uncertainty surrounding COVID-19 and the rising demand for clarification or confirmation from experts in the field.
Because the contents of tweets are not limited to users’ personal experiences, it is difficult to directly link tweets to actual symptom prevalence in affected individuals when a large set of tweets is analyzed. Moreover, it may not be possible to verify the symptom experiences of the individuals described in each tweet. Therefore, the frequency of the mentioned symptoms in this study does not represent the symptom prevalence among individuals with COVID-19. However, even if tweets about symptoms are posted by individuals sharing second-hand anecdotes or by health organizations seeking to inform the public about risks and protective measures, social media data will still track with public health events like the COVID-19 pandemic because of their immediacy and nimbleness compared with other forms of data. In summary, symptoms related to COVID-19 were present in social media data at an early stage of the pandemic, before they were confirmed by the CDC. These data are informative but need further verification.
Acknowledgments
Grant support
F31 NR018987 (PI Sarah E. Wawrzynski)
K01 NR016948 (PI Jia-Wen Guo)
REFERENCES
- Centers for Disease Control and Prevention. (2020a, March 30). Symptoms of coronavirus. Retrieved from https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html
- Centers for Disease Control and Prevention. (2020b, April 17). Symptoms of coronavirus. Retrieved from https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html
- Centers for Disease Control and Prevention. (2020c, May 13). Symptoms of coronavirus. Retrieved from https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html
- Chan AT, & Brownstein JS (2020). Putting the public back in public health – Surveying symptoms of Covid-19. The New England Journal of Medicine, 383(7), e45. 10.1056/NEJMp2016259
- Eysenbach G. (2011). Infodemiology and infoveillance: Tracking online health information and cyberbehavior for public health. American Journal of Preventive Medicine, 40(5 Suppl 2), S154–S158. 10.1016/j.amepre.2011.02.006
- Kudchadkar SR, & Carroll CL (2020). Using social media for rapid information dissemination in a pandemic: #PedsICU and coronavirus disease 2019. Pediatric Critical Care Medicine. 10.1097/PCC.0000000000002474
- Li C, Chen LJ, Chen X, Zhang M, Pang CP, & Chen H. (2020). Retrospective analysis of the possibility of predicting the COVID-19 outbreak from Internet searches and social media data, China, 2020. Eurosurveillance, 25(10). 10.2807/1560-7917.ES.2020.25.10.2000199
- National Institute of Health. (2020, June 15). Hydroxychloroquine. Retrieved from https://medlineplus.gov/druginfo/meds/a601240.html
- Salathé M. (2018). Digital epidemiology: What is it, and where is it going? Life Sciences, Society and Policy, 14(1), 1. 10.1186/s40504-017-0065-7
- Shin S-Y, Seo D-W, An J, Kwak H, Kim S-H, Gwack J, & Jo M-W (2016). High correlation of Middle East respiratory syndrome spread with Google search and Twitter trends in Korea. Scientific Reports, 6(1), 32920. 10.1038/srep32920
- Skipper CP, Pastick KA, Engen NW, Bangdiwala AS, Abassi M, Lofgren SM, … Boulware DR (2020). Hydroxychloroquine in nonhospitalized adults with early COVID-19: A randomized trial. Annals of Internal Medicine. 10.7326/m20-4207
- Walker A, Hopkins C, & Surda P. (2020). Use of Google Trends to investigate loss-of-smell-related searches during the COVID-19 outbreak. International Forum of Allergy and Rhinology, 10(7), 839–847. 10.1002/alr.22580
- YoussefAgha AH, Jayawardene WP, & Lohrmann DK (2013). Role of social media in early warning of norovirus outbreaks: A longitudinal twitter-based infoveillance. Proceedings of the International Conference on Data Mining (DMIN). Retrieved from http://worldcomp-proceedings.com/proc/p2013/DMI8027.pdf