Abstract
Purpose
COVID-19 continues to affect millions of individuals worldwide, both in the short and long term. The post-acute complications of SARS-CoV-2 infection, referred to as long COVID, result in diverse symptoms affecting multiple organ systems. Little is known regarding how the symptoms associated with long COVID progress and resolve over time. The Johns Hopkins COVID Long Study aims to prospectively examine the short-term and long-term consequences of COVID-19 in individuals both with and without a history of SARS-CoV-2 infection using self-reported data collected in an online survey.
Participants
16 764 adults with a history of SARS-CoV-2 infection and 799 adults without a history of SARS-CoV-2 infection who completed an online baseline survey.
Findings to date
This cohort profile describes the baseline characteristics of the Johns Hopkins COVID Long Study. Among 16 764 participants with a history of SARS-CoV-2 infection and defined long COVID status, 75% reported a very good or excellent health status prior to infection, 99% reported experiencing at least one COVID-19 symptom during the acute phase of infection, 9.9% reported hospitalisation and 63% were defined as having long COVID using the WHO definition.
Future plans
Analysis of longitudinal data will be used to investigate the progression and resolution of long COVID symptoms over time.
Keywords: COVID-19, INFECTIOUS DISEASES, Chronic Disease, EPIDEMIOLOGIC STUDIES
STRENGTHS AND LIMITATIONS OF THIS STUDY
The Johns Hopkins COVID Long Study (JHCLS) is a large, online, prospective cohort study of adults that collects comprehensive clinical and behavioural data on participants with and without a history of SARS-CoV-2 infection at baseline with an option to participate in longitudinal follow-up every 3–6 months.
Detailed clinical data are collected on COVID-19 diagnosis and treatment, health history and pre-existing health conditions, in addition to validated measurements on physical, mental and cognitive limitations.
The JHCLS comprises participants from 53 US states and territories and includes individuals aged 19–96 years.
Because SARS-CoV-2 tests are not always accessible, eligibility requirements include either a self-reported positive SARS-CoV-2 test or symptoms of COVID-19.
There may be selection and recall bias due to the increased likelihood of participation by individuals with long COVID and the reliance on self-reported clinical data; however, the JHCLS includes a subset of individuals who enrolled within 4 weeks of their initial SARS-CoV-2 infection and had not yet developed long COVID.
Introduction
Since its emergence in 2019, COVID-19 has greatly affected the health and well-being of millions of people worldwide.1 2 Both acute and persistent post-infection complications have been reported by patients3 and COVID-19 is now recognised as a multiorgan disease.4 The WHO defines persistent post-infection complications, referred to as long COVID, as new or continuing symptoms 3 months after initial illness that last at least 2 months and cannot be explained otherwise.5 Despite recent studies suggesting that long COVID may occur in 10–55% of individuals exposed to SARS-CoV-2,6–8 the exact incidence remains unknown. There is also uncertainty in the pathophysiology and symptomatology of long COVID.9 10 With the elevated burden of COVID-19 worldwide, it is important to understand the full range of symptoms and long-term outcomes.2 Moreover, the large number of individuals requiring continued medical care will pose an economic burden on our healthcare system.10
Similar to other infections, SARS-CoV-2 is associated with post-acute infection syndromes resulting in a variety of symptoms.11 Some of the core symptoms associated with long COVID are common to other post-acute infection syndromes as well, including but not limited to fatigue, exertion intolerance and neurocognitive impairment.11 Although the occurrence of post-acute infection syndromes is generally recognised, these syndromes remain largely understudied. Cohort studies composed of those who have had SARS-CoV-2 and those who have not (ie, control population) are critical to addressing the gaps in our knowledge of long COVID and post-acute infections in general.
The presentation of those with long COVID is often marked with multiple diverse symptoms affecting multiple organs; each individual may have their own unique clinical presentation.12 Though age is a major risk factor in COVID-19-related mortality, and despite a preponderance of long COVID among those aged 40–60 years, long COVID is reported across the age spectrum.13 Similarly, long COVID is reported by persons of all genders, race/ethnicities, and those with and without pre-existing comorbidities.13 14 Hence, it is essential that research both identifies and characterises the main clinical and epidemiological features associated with long COVID, including potential targets for intervention.
For these reasons, the Johns Hopkins COVID Long Study (JHCLS) was established to prospectively examine the short-term and long-term consequences of COVID-19 over a 3-year follow-up period. The overall objectives of the JHCLS are to (1) characterise the spectrum of long-term sequelae of SARS-CoV-2 infection; (2) identify individuals at risk of long-term sequelae; and (3) characterise the physical and mental health disability associated with long COVID. To meet study objectives, the cohort includes participants with and without a history of SARS-CoV-2 infection. This cohort profile describes baseline demographic and clinical characteristics of US participants enrolled in the JHCLS.
Cohort description
Study design and participants
The JHCLS launched for participants with a self-reported history of SARS-CoV-2 infection on 2 February 2021, and expanded to include participants without a history of SARS-CoV-2 infection on 2 March 2022. All consenting participants are asked to complete a one-time, short online baseline survey with the option to remain anonymous. At the end of the baseline survey, participants are asked if they agree to be contacted for future COVID-19 studies, such as enrolment into longitudinal follow-up. If they respond yes, they are contacted by email or phone 3–6 months later with information about participating in longitudinal follow-up. If they subsequently consent to participate in longitudinal follow-up, they are emailed a follow-up survey every 3–6 months.
As of 14 February 2023, 20 319 participants with a self-reported history of SARS-CoV-2 infection and 1041 participants without a history of SARS-CoV-2 infection consented to participate in the JHCLS (figure 1). Of the 20 319 participants with a history of infection, 15 478 with a defined long COVID status (76%) completed the baseline survey and 11 924 (59%) consented to be contacted for future studies. Of these, 6327 have enrolled in longitudinal follow-up and completed their first follow-up survey. Of the 1041 participants without a history of infection, 799 (77%) completed the baseline survey and 501 (48%) consented to be contacted for future studies. Of these, 278 have enrolled in longitudinal follow-up and completed their first follow-up survey. At each round of follow-up, participants without a history of infection are asked if they have experienced COVID-19 symptoms or tested positive for SARS-CoV-2 since their last survey completion. If they respond yes, they are transferred to the survey for participants with a history of infection. As of 14 February 2023, 46 of the 278 participants (17%) who completed their first round of longitudinal follow-up have self-reported either a positive SARS-CoV-2 test or symptoms of COVID-19. The median survey completion time is 20 min for the baseline survey and 24 min for the first follow-up survey.
Recruitment
Participants are recruited into the JHCLS using several mechanisms: social media posts, Facebook ad campaigns, direct messaging (eg, emails to health departments), word of mouth and participation in a recruitment registry. For social media recruitment, researchers use study-owned and operated Instagram, Facebook and Twitter accounts. In addition, the team partnered with the Audience Development Team at the Johns Hopkins Bloomberg School of Public Health Communications Department to develop targeted Facebook ad campaigns. The study ran three Facebook ad campaigns, each targeting neighbourhoods in the USA with high SARS-CoV-2 case counts. Two campaigns ran in April 2021, the first targeting neighbourhoods in Detroit, Michigan, and the second in Fayetteville and Hope Mills, North Carolina, and South Fulton and Alpharetta, Georgia. The final campaign ran in July 2021 and targeted neighbourhoods in Houston and San Antonio, Texas; Miami and Jacksonville, Florida; and Los Angeles, California.
The study team also partners with the Johns Hopkins Opportunities for Participant Engagement (HOPE) Registry. The HOPE Registry (http://johnshopkinshope.org/) is a recruitment registry designed to connect individuals with teams conducting COVID-19 research studies at Johns Hopkins University. The JHCLS was officially enrolled into the HOPE Registry in April 2021.
Participant eligibility
To be eligible to participate in the study, participants must be at least 18 years of age. Additionally, to be eligible to complete the survey for participants with a history of SARS-CoV-2 infection, participants must self-report at least one positive SARS-CoV-2 test or symptoms of COVID-19. At the start of the baseline survey, eligible participants are provided with a short, informed consent script that details the purpose of the study and provides details on participation. In order to protect the confidentiality of participants, participants are assigned a unique study identifier number and data are collected anonymously.
Study procedures
The JHCLS baseline survey is self-administered and collects data across nine domains: SARS-CoV-2 testing and COVID-19 symptoms, vaccines and SARS-CoV-2 reinfection, COVID-19 treatments and hospitalisation, pre-existing comorbidities, physical limitations and exercise, sleep quality, mental fatigue, anxiety and demographics (online supplemental table 1). Data from these same domains are collected during longitudinal follow-up as well. All data are collected in REDCap, a Health Insurance Portability and Accountability Act-compliant, secure web application designed to build and manage online surveys and databases.15 16 Most survey questions were adapted from validated measures and assessments. However, certain questions were self-designed for the purpose of meeting study objectives. All self-designed questionnaires are available on the National Institute of Environmental Health Sciences Disaster Research Response Resources Portal (https://tools.niehs.nih.gov/dr2/index.cfm/resource/24278).
SARS-CoV-2 testing and COVID-19 symptoms
To obtain data on COVID-19 history, diagnosis and symptoms, researchers self-designed questions to assess overall health status prior to initial COVID-19 illness, history of SARS-CoV-2 testing and results, initial symptom onset date, symptoms experienced during the acute phase of COVID-19, new/continuing COVID-19 symptoms, impact of each reported symptom on daily activities and self-reported recovery from COVID-19 illness. Participants without a history of infection are asked about health status prior to the COVID-19 pandemic, symptoms experienced in reference to overall general health and self-reported recovery from the effects of the COVID-19 pandemic.
Vaccines and SARS-CoV-2 reinfection
To collect data on vaccination and SARS-CoV-2 reinfection, researchers self-designed questions related to influenza vaccination uptake, COVID-19 vaccination uptake, SARS-CoV-2 antibody testing, participation in COVID-19 treatment trials, SARS-CoV-2 reinfection and self-reported comparison of COVID-19 symptoms experienced during the first reinfection compared with initial illness. Participants without a history of infection are asked questions related to influenza and COVID-19 vaccination uptake.
Treatments and hospitalisation
Researchers self-designed questions to obtain data on medications used to treat initial COVID-19 illness, medications used to treat new/continuing COVID-19 symptoms, COVID-19-related hospitalisation, healthcare utilisation and health-seeking behaviour to treat symptoms. In this section, participants without a history of infection are asked about overall healthcare utilisation.
Pre-existing comorbidities
In order to obtain data on pre-existing health status and comorbidities, researchers self-designed questions to capture current health status, pre-existing health conditions, cancer diagnosis, height, weight, current stress level and stress level prior to the COVID-19 pandemic.
Physical limitations and exercise
To assess physical limitations and exercise, researchers used questions adapted from the Baltimore Longitudinal Study of Aging (BLSA) and the Godin-Shephard Leisure-Time Physical Activity Questionnaire (GSLTPAQ).
The BLSA is a longitudinal study of healthy adults with the aim of understanding how adults adjust to the ageing process, including adjustments in physical activity.17 18 During the baseline survey, participants are asked questions to assess difficulty and level of difficulty in the following domains: mobility (walking a quarter mile/1 mile and going up 10 steps/20 steps) and instrumental activities of daily living (IADL) (light and heavy housework). If a participant reports experiencing difficulty, they are asked to report the level of difficulty (a little, some, a lot or unable to do); conversely, if they do not report difficulty, they are asked to report the level of ease (very easy, somewhat easy or not so easy).17 18 Participants who report difficulty are also asked if they experienced the difficulty prior to their COVID-19 illness or the COVID-19 pandemic. For mobility, if a participant reports any level of difficulty walking a quarter of a mile, they are considered to have a mobility disability. For IADL, if a participant reports any level of difficulty with light housework, they are considered to have an IADL disability.
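The disability rule described above can be sketched in a few lines. This is an illustrative sketch only: the function names, argument names and the use of `None` for "no difficulty reported" are assumptions, not the study's actual variable coding.

```python
# Difficulty levels offered to participants who report difficulty (from the text).
DIFFICULTY_LEVELS = {"a little", "some", "a lot", "unable to do"}


def has_disability(difficulty_level):
    """Return True if any level of difficulty was reported.

    `difficulty_level` is None when the participant reported no difficulty
    (hypothetical coding; the survey's real coding is not given in the text).
    """
    if difficulty_level is None:
        return False
    if difficulty_level not in DIFFICULTY_LEVELS:
        raise ValueError(f"unknown difficulty level: {difficulty_level!r}")
    return True


def classify_disability(quarter_mile_difficulty, light_housework_difficulty):
    """Apply the two rules: quarter-mile walk -> mobility, light housework -> IADL."""
    return {
        "mobility_disability": has_disability(quarter_mile_difficulty),
        "iadl_disability": has_disability(light_housework_difficulty),
    }
```

For example, `classify_disability("a little", None)` flags a mobility disability but no IADL disability.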
The GSLTPAQ was validated for use in healthy adults by measuring the correlation between objective measures of physical condition, maximum oxygen intake during exercise (VO2 max) and body fat percentile, and subjective measures of total leisure time physical activity.19–21 The questionnaire was found to have a test–retest reliability of 0.94, 0.46 and 0.48 for strenuous, moderate and light-intensity exercise, respectively, with the highest correlation shown between VO2 max and strenuous-intensity exercise (Pearson’s r=0.38) and body fat percentile and strenuous-intensity exercise (Pearson’s r=0.21).20
During the baseline survey, participants are asked to report the number of times on average they participate in mild, moderate and strenuous-intensity exercise for longer than 15 min during a typical week. The number of times per week is multiplied by the corresponding metabolic equivalent of task factor (3, 5 and 9 for mild, moderate and strenuous-intensity exercise, respectively) and summed for a total leisure activity score.19 20 A score of ≥24 indicates an active lifestyle, a score of 14–23 indicates a moderately active lifestyle and a score of <14 indicates an insufficiently active/sedentary lifestyle.19 Participants with a history of infection are asked to report the number of times they exercised in each category before and after their COVID-19 illness; participants without a history of infection are asked in reference to before and after the COVID-19 pandemic.
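The leisure activity score described above is a weighted sum, which can be sketched as follows. This is a minimal illustration rather than the study's own code: the function names are hypothetical, and a score of exactly 24 is assumed to fall in the active category.

```python
# Metabolic equivalent of task factors stated in the text.
MET_FACTORS = {"mild": 3, "moderate": 5, "strenuous": 9}


def leisure_activity_score(mild, moderate, strenuous):
    """Weekly leisure activity score from the number of sessions per week."""
    sessions = {"mild": mild, "moderate": moderate, "strenuous": strenuous}
    if any(n < 0 for n in sessions.values()):
        raise ValueError("session counts must be non-negative")
    return sum(MET_FACTORS[kind] * n for kind, n in sessions.items())


def activity_category(score):
    """Map a score to the lifestyle category (boundary at 24 is an assumption)."""
    if score >= 24:
        return "active"
    if score >= 14:
        return "moderately active"
    return "insufficiently active/sedentary"


# Worked example: 3 mild, 2 moderate and 1 strenuous session per week.
example_score = leisure_activity_score(mild=3, moderate=2, strenuous=1)  # 9 + 10 + 9 = 28
example_category = activity_category(example_score)  # "active"
```

Computing the score separately for the "before" and "after" reports would give a per-participant change in activity category.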
Sleep quality
Sleep quality is assessed using questions adapted from the AIDS Linked to the IntraVenous Experience (ALIVE) Study and the Idiopathic Hypersomnia Severity Scale (IHSS). The ALIVE Study is a prospective cohort study designed to characterise the incidence and natural history of HIV infection among injection drug users in Baltimore, Maryland.22 Participants are asked how often they experience a list of five items related to sleep quality over the past 4 weeks. For each item, participants respond based on the following scale: 1 (all of the time), 2 (most of the time), 3 (a good bit of the time), 4 (some of the time), 5 (a little bit of the time) or 6 (none of the time).
The IHSS was validated for use in patients experiencing three major symptoms of idiopathic hypersomnia: excessive daytime sleepiness, prolonged night-time sleep and sleep inertia, and was found to have high internal consistency (Cronbach’s α=0.89) and good content validity.23 24 The scale consists of 14 items; each item is scored separately and the item scores are then summed for a total score ranging from 0 to 50. Higher scores represent more severe/frequent symptoms of idiopathic hypersomnia.23 24 For the purpose of the JHCLS, researchers used four questions from the IHSS for a range of scores from 0 to 14.
Mental fatigue
Mental fatigue is assessed using the Wood Mental Fatigue Inventory (WMFI). The WMFI has been validated for use in patients with myalgic encephalomyelitis/chronic fatigue syndrome and was found to have high internal consistency (Cronbach’s α=0.93) and good test–retest reliability (Pearson’s r=0.887).25 Participants are asked how much they have been bothered by a list of nine items over the past 2 weeks. Each item is scored on the following scale: 0 (not at all), 1 (a little), 2 (somewhat), 3 (quite a lot) or 4 (very much). At the end of the assessment, the scores are summed together for a range of 0–36. Higher scores indicate greater levels of mental fatigue.25
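The WMFI total described above is a simple sum of nine item scores. The sketch below illustrates that arithmetic under the stated item range; the function name and the input validation are assumptions added for illustration.

```python
def wmfi_total(item_scores):
    """Sum nine WMFI item scores, each an integer from 0 (not at all) to 4 (very much)."""
    scores = list(item_scores)
    if len(scores) != 9:
        raise ValueError("the WMFI has nine items")
    if any(s not in (0, 1, 2, 3, 4) for s in scores):
        raise ValueError("each item must be scored 0-4")
    return sum(scores)  # possible range: 0-36
```

A participant answering "somewhat" (2) to every item would score 18, the midpoint of the 0–36 range.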
Anxiety
To assess anxiety, researchers use the Generalized Anxiety Disorder-7 (GAD-7). The GAD-7 has been validated for use in the general population and was found to have both high internal consistency (Cronbach’s α=0.92) and good criterion, construct, factorial and procedural validity.26 Participants are asked how often they have been bothered by a list of seven items over the past 2 weeks. Each item is scored on the following scale: 0 (not at all), 1 (several days), 2 (more than half the days) or 3 (nearly every day). At the end of the assessment, the scores are summed together for a range of 0–21. A score of 0–4 indicates no anxiety disorder, a score of 5–9 indicates a mild anxiety disorder, a score of 10–14 indicates a moderate anxiety disorder and a score of ≥15 indicates a severe anxiety disorder.26
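The GAD-7 scoring described above can be sketched as follows. This is an illustrative sketch, not the study's analysis code: the function names and short category labels are hypothetical, and a total of exactly 15 is assumed to fall in the severe band.

```python
def gad7_total(item_scores):
    """Sum seven GAD-7 item scores, each an integer from 0 to 3."""
    scores = list(item_scores)
    if len(scores) != 7:
        raise ValueError("the GAD-7 has seven items")
    if any(s not in (0, 1, 2, 3) for s in scores):
        raise ValueError("each item must be scored 0-3")
    return sum(scores)  # possible range: 0-21


def gad7_severity(total):
    """Map a GAD-7 total to a severity band (0-4, 5-9, 10-14, 15-21)."""
    if not 0 <= total <= 21:
        raise ValueError("total must be between 0 and 21")
    if total >= 15:
        return "severe"
    if total >= 10:
        return "moderate"
    if total >= 5:
        return "mild"
    return "none"
```

For instance, item responses of `[1, 2, 1, 0, 3, 2, 1]` sum to 10, which falls in the moderate band.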
Demographics
Researchers self-designed survey questions to obtain demographic information, including gender, race/ethnicity, country of residence, year of birth, educational attainment, work activities prior to the COVID-19 pandemic, primary occupation, total household income and total number of dependents.
Patient and public involvement
There was no patient or public involvement in the design, conduct, reporting or dissemination plans of our research. However, patient feedback is routinely discussed and considered. Specifically, patients are encouraged to reach out to study team members with suggestions on ways to improve the survey and the survey has been adjusted several times based on patient suggestions. In addition, study findings are regularly disseminated to patients via newsletters posted on the study website (https://covid-long.com/).
Baseline characteristics of JHCLS participants
Among 16 764 participants with a history of SARS-CoV-2 infection and defined long COVID status, the median age was 43 years, 84% were female, 88% self-reported white race and 8.0% self-reported Hispanic/Latino ethnicity (table 1). In terms of socioeconomic status, 70% of participants self-reported a bachelor’s degree or higher and 72% self-reported an annual household income of greater than or equal to $50 000. A diverse array of pre-existing comorbid conditions was self-reported, including hypertension (15%), depression/anxiety/other mental health conditions (35%), asthma/reactive airway disease/chronic lung disease (16%) and autoimmune disorders (9.6%) (table 2). In addition, the majority of participants (65%) were classified as overweight/obese based on a calculated body mass index (BMI) of ≥25.
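The BMI classification can be sketched from self-reported height and weight using the cut-offs listed in table 2. This is a minimal illustration under stated assumptions: the units (kilograms and metres) and the function names are not specified in the text, and the category bands are treated as half-open intervals so that unrounded values such as 24.95 are still assigned a category.

```python
def body_mass_index(weight_kg, height_m):
    """BMI in kg/m2 (units are an assumption; the survey's units are not stated)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(bmi):
    """Assign the table 2 category, using half-open intervals at 18.5, 25 and 30."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"


def is_overweight_or_obese(bmi):
    """Combined overweight/obese group, ie, BMI of 25 or above."""
    return bmi >= 25
```

For example, a participant reporting 70 kg and 1.75 m has a BMI of roughly 22.9 and is classified as normal weight.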
Table 1.
Characteristic | Study sample with SARS-CoV-2 infection: baseline data*†‡ (n=16 764) | Study sample without SARS-CoV-2 infection: baseline data*§ (n=799) |
Gender | ||
Cisgender man | 2514 (15%) | 104 (13%) |
Cisgender woman | 14 010 (84%) | 674 (85%) |
Transgender man | 24 (0.1%) | 2 (0.3%) |
Transgender woman | 14 (0.1%) | 1 (0.1%) |
Different identity | 181 (1.1%) | 16 (2.0%) |
Missing | 21 | 2 |
Race | ||
White | 14 651 (88%) | 651 (82%) |
Black | 489 (3.0%) | 15 (1.9%) |
Other | 890 (5.4%) | 100 (13%) |
Mixed race | 537 (3.2%) | 24 (3.0%) |
Missing | 197 | 9 |
Hispanic, Latino or Spanish origin | 1323 (8.0%) | 47 (5.9%) |
Missing | 122 | 5 |
Median age, IQR | 43 (34–55) | 42 (32–57) |
Missing | 45 | 3 |
Educational attainment | ||
High school, GED or less | 974 (5.8%) | 30 (3.8%) |
Some college, associates/technical degree | 4150 (25%) | 98 (12%) |
Bachelor’s degree | 5135 (31%) | 241 (30%) |
Postgraduate degree | 6461 (39%) | 430 (54%) |
Missing | 44 | 0 |
Annual household income | ||
<$25 000 | 1588 (11%) | 100 (14%) |
$25 000–34 999 | 957 (6.7%) | 42 (5.8%) |
$35 000–49 999 | 1486 (10%) | 60 (8.3%) |
$50 000–74 999 | 2697 (19%) | 118 (16%) |
$75 000 or greater | 7635 (53%) | 405 (56%) |
Missing | 2401 | 74 |
Region | ||
Northeast | 3159 (19%) | 180 (23%) |
Midwest | 3462 (21%) | 141 (18%) |
South | 6581 (39%) | 281 (35%) |
West | 3538 (21%) | 197 (25%) |
Missing | 24 | 0 |
*Missing data were due to invalid data, ‘don’t know’ responses, ‘refuse to answer’ responses and missing responses. Percentages do not add up to 100% due to missing data. The number of missing data varies due to participants dropping out of the survey at different sections.
†Baseline data in participants with a history of SARS-CoV-2 infection are as of 20 October 2022.
‡Limited to participants with a defined long COVID status. Long COVID status was determined using the WHO definition. 320 participants without a defined long COVID status were excluded: 283 provided an initial SARS-CoV-2 infection date before the first confirmed positive test in the USA (20 January 2020), 28 provided an invalid SARS-CoV-2 infection date, 9 did not provide a SARS-CoV-2 infection date nor report experiencing symptoms of COVID-19.
§Baseline data in participants without a history of SARS-CoV-2 infection are as of 14 February 2023.
GED, General Educational Development.
Table 2.
Characteristic | Study sample with SARS-CoV-2 infection: baseline data*† (n=16 764) | Study sample without SARS-CoV-2 infection: baseline data*‡ (n=799) |
Body mass index (kg/m2) | ||
Underweight (<18.5) | 285 (1.8%) | 13 (1.6%) |
Normal weight (18.5–24.9) | 5517 (34%) | 342 (43%) |
Overweight (25–29.9) | 4643 (29%) | 216 (27%) |
Obese (30 and above) | 5827 (36%) | 225 (28%) |
Missing | 492 | 3 |
Comorbid conditions | ||
Diabetes | 716 (4.4%) | 32 (4.0%) |
Cardiovascular disease/congestive heart failure | 379 (2.3%) | 24 (3.0%) |
Hypertension | 2526 (15%) | 105 (13%) |
Chronic kidney disease | 138 (0.8%) | 6 (0.8%) |
Cancer | 392 (2.4%) | 17 (2.1%) |
Asthma/reactive airway disease/chronic lung disease | 2684 (16%) | 99 (12%) |
Overweight/obese | 4853 (30%) | 187 (23%) |
Autoimmune disorder | 1568 (9.6%) | 65 (8.1%) |
Stroke | 116 (0.7%) | 9 (1.1%) |
Depression/anxiety/other mental health conditions | 5730 (35%) | 286 (36%) |
Missing | 388 | 0 |
Self-rated health status prior to COVID-19§ | ||
Excellent | 6297 (38%) | 283 (35%) |
Very good | 6192 (37%) | 321 (40%) |
Good | 2900 (17%) | 135 (17%) |
Fair | 1242 (7.4%) | 53 (6.6%) |
Poor | 126 (0.8%) | 7 (0.9%) |
Missing | 7 | 0 |
Vaccination status at time of enrolment | ||
None | 6414 (39%) | 22 (2.8%) |
Partial vaccination | 805 (4.9%) | 4 (0.5%) |
Complete first series | 4006 (24%) | 93 (12%) |
≥1 booster | 5208 (32%) | 679 (85%) |
Missing | 331 | |
Timing of initial SARS-CoV-2 infection | ||
January–June 2020 | 3804 (23%) | N/A |
July–December 2020 | 5484 (33%) | N/A |
January–June 2021 | 2216 (13%) | N/A |
July–December 2021 | 2629 (16%) | N/A |
January 2022 or later | 2631 (16%) | N/A |
Missing | 0 | N/A |
Time between initial infection and survey completion in days, median (IQR) | 173 (70–382) | N/A |
Missing | 0 | N/A |
Symptom status at initial COVID-19 illness | ||
Symptomatic | 16 588 (99%) | N/A |
Asymptomatic | 175 (1.0%) | N/A |
Missing | 1 | N/A |
Presenting symptoms at initial COVID-19 illness | ||
Cardiopulmonary | 15 140 (90%) | N/A |
Neuropsychiatric | 14 178 (85%) | N/A |
Systemic | 14 949 (89%) | N/A |
Gastrointestinal | 9175 (55%) | N/A |
Missing | 0 | N/A |
Hospitalisation status at initial COVID-19 illness | ||
Not hospitalised | 14 839 (90%) | N/A |
Hospitalised | 1627 (9.9%) | N/A |
Missing | 298 | N/A |
Long COVID status at survey completion¶ | ||
Has long COVID | 10 518 (63%) | N/A |
Does not have long COVID | 1246 (7.4%) | N/A |
Cannot be determined** | 5000 (30%) | N/A |
*Missing data were due to invalid data, ‘don’t know’ responses, ‘refuse to answer’ responses and missing responses. Percentages do not add up to 100% due to missing data. The number of missing data varies due to participants dropping out of the survey at different sections.
†Baseline data in participants with a history of SARS-CoV-2 infection are as of 20 October 2022.
‡Baseline data in participants without a history of SARS-CoV-2 infection are as of 14 February 2023.
§Participants with a history of SARS-CoV-2 infection were asked for self-rated health status prior to their COVID-19 illness. Participants without a history of SARS-CoV-2 infection were asked for self-rated health status prior to the COVID-19 pandemic.
¶Limited to participants with a defined long COVID status. Long COVID status was determined using the WHO definition. 320 participants without a defined long COVID status were excluded: 283 provided an initial SARS-CoV-2 infection date before the first confirmed positive test in the USA (20 January 2020), 28 provided an invalid SARS-CoV-2 infection date, 9 did not provide a SARS-CoV-2 infection date nor report experiencing symptoms of COVID-19.
**Long COVID status could not be determined because fewer than 12 weeks existed between initial SARS-CoV-2 infection and survey completion.
N/A, not applicable.
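The exclusion and timing rules in the table footnotes can be sketched as a single check on the reported infection date. This is an illustrative sketch only: the function name and return labels are hypothetical, treating an infection date after survey completion as invalid is an assumption, and the symptom criteria that separate "has long COVID" from "does not have long COVID" are not reproduced because the text does not specify how they were coded.

```python
from datetime import date

FIRST_US_POSITIVE_TEST = date(2020, 1, 20)  # first confirmed positive test in the USA
MIN_DAYS_FOR_STATUS = 12 * 7                # 12 weeks between infection and survey


def long_covid_status_availability(infection_date, survey_date):
    """Return whether long COVID status can be assessed for one participant."""
    if infection_date is None:
        return "excluded: no infection date"
    if infection_date < FIRST_US_POSITIVE_TEST:
        return "excluded: date before first confirmed US test"
    if infection_date > survey_date:
        # Assumption: a date after survey completion counts as invalid.
        return "excluded: invalid infection date"
    if (survey_date - infection_date).days < MIN_DAYS_FOR_STATUS:
        return "cannot be determined"
    return "assessable"
```

Under these assumptions, a participant infected on 1 January 2022 who completed the survey on 1 February 2022 (31 days later) would fall in the "cannot be determined" group.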
Prior to COVID-19 illness, 75% of participants reported very good/excellent health status and 8.2% of participants reported fair/poor health status (table 2). During the acute phase of COVID-19 illness, 99% of participants reported experiencing at least one symptom. Of those, 90% reported cardiopulmonary symptoms (eg, new/worsening cough, shortness of breath, rapid heart rate), 89% reported systemic symptoms (eg, fatigue, muscle weakness, fever), 85% reported neuropsychiatric symptoms (eg, headache, dizziness, neuropathy) and 55% reported gastrointestinal symptoms (eg, vomiting, diarrhoea, lack of appetite). Overall, 9.9% of participants self-reported being hospitalised for their COVID-19 illness and 63% were defined as having long COVID based on the WHO definition.
At the time of study enrolment, 39% of participants with a history of infection reported not being vaccinated against SARS-CoV-2 compared with 56% who reported receiving at least a complete first vaccination series (table 2). The median number of days between initial SARS-CoV-2 infection and study enrolment was 173 days. While most participants reported experiencing their initial SARS-CoV-2 infection in 2020 (56%), 29% reported being infected in 2021 and 16% reported being infected in 2022.
Similar characteristics were found in participants without a history of SARS-CoV-2 infection who completed the baseline survey. Among 799 participants, the median age was 42, 85% were female, 82% self-reported white race and 5.9% self-reported Hispanic/Latino ethnicity (table 1). A higher percentage of participants without a history of infection self-reported ‘other’ race (13% compared with 5.4%) which was largely due to a greater number self-reporting Asian/Pacific Islander/Native Hawaiian race. With regard to socioeconomic status, a higher percentage of participants without a history of infection reported a bachelor’s degree or higher (84% compared with 70%) and 72% reported an annual household income of $50 000 or more. Comparable pre-existing comorbid conditions were reported: hypertension (13%), depression/anxiety/other mental health conditions (36%), asthma/reactive airway disease/chronic lung disease (12%) and autoimmune disorders (8.1%) (table 2). Based on calculated BMI, a slightly higher percentage of participants without a history of infection were classified as having a normal BMI (43% compared with 34%).
Of participants without a history of SARS-CoV-2 infection, 75% reported very good/excellent health status and 7.5% reported fair/poor health status prior to the COVID-19 pandemic (table 2). At time of study enrolment, a higher percentage of participants without a history of infection reported at least a complete first vaccination series (97% compared with 56%). It is worth noting that enrolment for participants without a history of infection opened up in March 2022 when vaccinations were more widely available, likely accounting for this difference.
The demographic and clinical characteristics between those who agreed to be contacted for future studies and those who declined were comparable, with the exception of long COVID status (online supplemental table 2). Unsurprisingly, more individuals who fully recovered declined continued participation in the study. However, the number of indeterminate individuals (too early to determine long COVID status) was similar.
Strengths and limitations
The JHCLS is a large, online, prospective cohort study of adults with representation in 53 US states and territories. The baseline survey collects comprehensive clinical and behavioural data, including data related to COVID-19 diagnosis and treatment, health history, pre-existing health conditions, and physical, mental and cognitive limitations, and uses several reliable, validated scales to assess outcomes and exploratory variables. Participants are given the option to complete a one-time, anonymous online survey or to consent to longitudinal follow-up at predefined time intervals (every 3–6 months). In addition, the overall participant burden is minimal.
A major strength of the JHCLS is that a positive SARS-CoV-2 test is not required to be eligible to participate. We recognise that testing is often limited or inaccessible, and thus require either a self-reported positive test or symptoms of COVID-19. In addition, our survey collects data on a wide range of organ systems using several different validated measures. Although early studies focused primarily on the respiratory symptoms associated with initial COVID-19 illness (eg, shortness of breath), we appreciate that SARS-CoV-2 may have notable effects on other organ systems following the acute period of infection as well.
Another strength of the JHCLS is the inclusion of participants without a history of SARS-CoV-2 infection, which provides a natural control group while also allowing for the determination of the incidence of long COVID among those who report a SARS-CoV-2 infection during follow-up. Importantly, both samples are comparable in terms of sociodemographic variables and pre-existing health conditions. We also recognise that many of the heterogeneous symptoms reported as long COVID may reflect all of us collectively living through a pandemic (eg, anxiety, depression). Thus, it is important that we compare those with and without infection to evaluate some of these outcomes during the same time frame (vs retrospective or historical controls).
In addition, there are few longitudinal studies focused on post-acute outcomes of COVID-19. Longitudinal studies provide an opportunity to evaluate change over time in exposures and outcomes. The longitudinal collection of data on new/continuing COVID-19 symptoms at each time point during follow-up will allow for evaluation of resolution and persistence of symptoms over time, as well as the impact of reinfection, vaccination and other health changes. To date, just under 7000 participants have consented to participate in longitudinal follow-up and have completed their first follow-up survey.
The JHCLS also has a few limitations. The reliance on self-reported clinical data, including self-reported SARS-CoV-2 tests, may result in recall and measurement bias. Although a confirmed SARS-CoV-2 test would be preferable, we recognise that tests were not available to everyone and that restriction to only those with a confirmed test would introduce selection bias. A second limitation is the possibility of selection bias because the survey must be completed using a smart device or computer with internet access. This may preclude individuals of lower socioeconomic status from participating. There is also a risk that findings from the JHCLS are not generalisable, as the majority of participants self-reported white race and female gender and are of higher socioeconomic status. To address this, we plan to conduct stratum-specific analyses, which may be more representative of individuals within the same stratum. However, whether a study is representative or not depends not on demographics but on potential effect measure modifiers that may or may not include demographics.27 Additionally, results that may not necessarily be generalisable in the effect estimate may still be generalisable in the direction of effect (eg, protective or increased risk) of an exposure on outcome.27
Additionally, many individuals enrol many months after their acute infection, when they already have long COVID. This may introduce selection bias through an increased likelihood of participation among those with more severe long COVID. However, it is important to note that the JHCLS has a subset of individuals (n=2020) who enrolled during their acute infection (within 4 weeks of infection). The high percentage of participants in our study with long COVID (63%) also likely reflects selection bias towards those willing to participate in COVID-19 research. However, those with and without a history of SARS-CoV-2 infection are similar in their demographic characteristics (table 1). Another limitation is the possibility of recall bias, especially among participants with a history of COVID-19 illness experiencing mental fatigue and/or other cognitive limitations at the time of survey completion.
A final limitation is the use of non-validated instruments to collect COVID-19-related data, including COVID-19 diagnosis, treatments and symptoms. We were limited by the unavailability of validated instruments to capture these domains at the time of study initiation. However, when available, we used validated instruments that targeted several specific domains (eg, anxiety, mental fatigue), and when unavailable, we drew upon experience and validated instruments developed for other infectious diseases to develop questions used by our group and others across multiple COVID-19 studies.
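As an illustration only, the stratified analyses mentioned above could start from a within-stratum proportion of participants meeting the long COVID definition. The sketch below is not the study's analysis code: the function name, the field names (`age_group`, `long_covid`) and the six records are hypothetical and were invented for the example.

```python
from collections import defaultdict

def stratum_specific_prevalence(records, stratum_key, outcome_key):
    """Return the proportion of records with a truthy outcome within each stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [cases, total]
    for record in records:
        stratum = record[stratum_key]
        counts[stratum][1] += 1
        if record[outcome_key]:
            counts[stratum][0] += 1
    return {stratum: cases / total for stratum, (cases, total) in counts.items()}

# Hypothetical records; the field names and values are assumptions, not study data.
records = [
    {"age_group": "18-39", "long_covid": True},
    {"age_group": "18-39", "long_covid": False},
    {"age_group": "40-64", "long_covid": True},
    {"age_group": "40-64", "long_covid": True},
    {"age_group": "40-64", "long_covid": False},
    {"age_group": "65+", "long_covid": False},
]

prevalence = stratum_specific_prevalence(records, "age_group", "long_covid")
for stratum, proportion in sorted(prevalence.items()):
    print(f"{stratum}: {proportion:.2f}")
```

Reporting estimates within each stratum, rather than a single pooled figure, keeps each estimate interpretable for people who resemble that stratum even when the overall sample is demographically unbalanced.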
Future plans
Moving forward, the JHCLS will continue to enrol additional participants with and without a history of SARS-CoV-2 infection and collect data from the baseline and longitudinal surveys. The study team is in the initial stages of analysing the longitudinal data collected thus far, focusing on the progression and resolution of long COVID symptoms over time. In addition, the study team is planning a cluster analysis of both initial and new/continuing COVID-19 symptoms to help address the broad WHO definition of long COVID. We plan to do this by bringing together the rich symptom data we have in our study with data on the impact each reported symptom has on daily functioning. In the future, the study team may apply for funding to answer additional research questions and to continue following the longitudinal participants for a longer period of time.
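The planned cluster analysis is not specified further here, so the following is only a sketch of one possible approach, assuming k-means on per-participant symptom vectors in which each symptom is scored by its impact on daily functioning (0 for absent, up to 3 for severe impact). The choice of k-means, the scoring scale, the symptom columns and the six example participants are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, max_iter=100):
    """Plain k-means with deterministic initialisation (first k distinct points)."""
    centroids = []
    for point in points:
        if list(point) not in centroids:
            centroids.append(list(point))
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("fewer distinct points than clusters")

    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical impact scores per participant; columns (assumed):
# fatigue, brain fog, shortness of breath, cough.
participants = [
    [3, 3, 0, 0],
    [0, 0, 3, 2],
    [2, 3, 0, 1],
    [0, 1, 2, 3],
    [3, 2, 1, 0],
    [1, 0, 3, 3],
]

labels = k_means(participants, k=2)
print(labels)
```

In this toy input the two clusters separate participants whose burden is mainly fatigue and cognitive symptoms from those whose burden is mainly respiratory. A real analysis would need to choose the number of clusters, handle missing responses and assess cluster stability.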
The JHCLS has the potential to impact both our overall understanding of long COVID and our ability to identify subgroups of individuals for targeted interventions. We can also capture real-time changes by SARS-CoV-2 variants (based on calendar time), location (using geospatial data), birth/age cohorts and/or vaccine data.
Collaboration
The JHCLS invites researchers to contact the corresponding author for collaboration opportunities.
Supplementary Material
Acknowledgments
We would like to express our deepest appreciation to our participants for their dedication, unwavering commitment and vulnerability in sharing their stories with us. We would also like to thank the REDCap team at the Johns Hopkins Bloomberg School of Public Health for their continued assistance and guidance. This endeavour would not have been possible without their help. Lastly, we would like to acknowledge and thank our student researchers.
Footnotes
Contributors: BL, PD and SHM conceived the original study concept and design and act as co-principal investigators. BL, PD and SHM are equally responsible for the overall content as guarantors. BL, PD, SHM and EW were responsible for the acquisition of the data. EW prepared the first draft of this manuscript, under the supervision of BL, PD and SHM. EW, ZN, KY, CV, JP, TK, PX, PD, BL and SHM were involved in reviewing the manuscript and contributing to critical revisions. Administrative and technical support was provided by EW, CV, ZN, JP, TK and PX.
Funding: The study is supported by the Johns Hopkins University COVID-19 Research Response Program (N/A) and in part by the Johns Hopkins University Center for AIDS Research (P30AI094189), which is supported by the following NIH co-funding and participating institutes and centres: NIAID, NCI, NICHD, NHLBI, NIDA, NIA, NIGMS, NIDDK and NIMHD.
Disclaimer: The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Competing interests: SHM receives material support from Abbott Laboratories (not related to this study).
Patient and public involvement: Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review: Not commissioned; externally peer reviewed.
Supplemental material: This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
Data availability statement
Data are available upon reasonable request.
Ethics statements
Patient consent for publication
Not applicable.
Ethics approval
This study involves human participants but the Johns Hopkins Bloomberg School of Public Health Institutional Review Board (IRB00014874) exempted this study on 8 January 2021. Participants gave informed consent to participate in the study before taking part.
References
- 1. Bourmistrova NW, Solomon T, Braude P, et al. Long-term effects of COVID-19 on mental health: a systematic review. J Affect Disord 2022;299:118–25. doi:10.1016/j.jad.2021.11.031
- 2. Nalbandian A, Sehgal K, Gupta A, et al. Post-acute COVID-19 syndrome. Nat Med 2021;27:601–15. doi:10.1038/s41591-021-01283-z
- 3. Raman B, Bluemke DA, Lüscher TF, et al. Long COVID: post-acute sequelae of COVID-19 with a cardiovascular focus. Eur Heart J 2022;43:1157–72. doi:10.1093/eurheartj/ehac031
- 4. Montani D, Savale L, Noel N, et al. Post-acute COVID-19 syndrome. Eur Respir Rev 2022;31:210185. doi:10.1183/16000617.0185-2021
- 5. Soriano JB, Murthy S, Marshall JC, et al. A clinical case definition of post-COVID-19 condition by a Delphi consensus. Lancet Infect Dis 2022;22:e102–7. doi:10.1016/S1473-3099(21)00703-9
- 6. Perlis RH, Santillana M, Ognyanova K, et al. Prevalence and correlates of long COVID symptoms among US adults. JAMA Netw Open 2022;5:e2238804. doi:10.1001/jamanetworkopen.2022.38804
- 7. van Kessel SAM, Olde Hartman TC, Lucassen PLBJ, et al. Post-acute and long-COVID-19 symptoms in patients with mild diseases: a systematic review. Fam Pract 2022;39:159–67. doi:10.1093/fampra/cmab076
- 8. Domingo FR, Waddell LA, Cheung AM, et al. Prevalence of long-term effects in individuals diagnosed with COVID-19: an updated living systematic review. Epidemiology [Preprint] 2021. doi:10.1101/2021.06.03.21258317
- 9. Castanares-Zapatero D, Chalon P, Kohn L, et al. Pathophysiology and mechanism of long COVID: a comprehensive review. Ann Med 2022;54:1473–87. doi:10.1080/07853890.2022.2076901
- 10. Rodriguez-Sanchez I, Rodriguez-Mañas L, Laosa O. Long COVID-19: the need for an interdisciplinary approach. Clin Geriatr Med 2022;38:533–44. doi:10.1016/j.cger.2022.03.005
- 11. Choutka J, Jansari V, Hornig M, et al. Unexplained post-acute infection syndromes. Nat Med 2022;28:911–23. doi:10.1038/s41591-022-01810-6
- 12. Ramakrishnan RK, Kashour T, Hamid Q, et al. Unraveling the mystery surrounding post-acute sequelae of COVID-19. Front Immunol 2021;12:686029. doi:10.3389/fimmu.2021.686029
- 13. Pavli A, Theodoridou M, Maltezou HC. Post-COVID syndrome: incidence, clinical spectrum, and challenges for primary healthcare professionals. Arch Med Res 2021;52:575–81. doi:10.1016/j.arcmed.2021.03.010
- 14. Karuna S, Gallardo-Cartagena JA, Theodore D, et al. Post-COVID symptom profiles and duration in a global convalescent COVID-19 observational cohort: correlations with demographics, medical history, acute COVID-19 severity and global region. J Glob Health 2023;13:06020. doi:10.7189/jogh.13.06020
- 15. Harris PA, Taylor R, Thielke R, et al. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009;42:377–81. doi:10.1016/j.jbi.2008.08.010
- 16. Harris PA, Taylor R, Minor BL, et al. The REDCap consortium: building an international community of software platform partners. J Biomed Inform 2019;95:103208. doi:10.1016/j.jbi.2019.103208
- 17. Ferrucci L. The Baltimore longitudinal study of aging (BLSA): a 50-year-long journey and plans for the future. J Gerontol A Biol Sci Med Sci 2008;63:1416–9. doi:10.1093/gerona/63.12.1416
- 18. Stone JL, Norris AH. Activities and attitudes of participants in the Baltimore longitudinal study. J Gerontol 1966;21:575–80. doi:10.1093/geronj/21.4.575
- 19. Godin G. The Godin-Shephard leisure-time physical activity questionnaire. Health Fit J Can 2011;4:18–22. doi:10.14288/hfjc.v4i1.82
- 20. Godin G, Shephard RJ. A simple method to assess exercise behavior in the community. Can J Appl Sport Sci 1985;10:141–6.
- 21. Amireault S, Godin G. The Godin-Shephard leisure-time physical activity questionnaire: validity evidence supporting its use for classifying healthy adults into active and insufficiently active categories. Percept Mot Skills 2015;120:604–22. doi:10.2466/03.27.PMS.120v19x7
- 22. Vlahov D, Anthony JC, Munoz A, et al. The ALIVE study, a longitudinal study of HIV-1 infection in intravenous drug users: description of methods and characteristics of participants. NIDA Res Monogr 1991;109:75–100.
- 23. Rassu AL, Evangelista E, Barateau L, et al. Idiopathic hypersomnia severity scale to better quantify symptoms severity and their consequences in idiopathic hypersomnia. J Clin Sleep Med 2022;18:617–29. doi:10.5664/jcsm.9682
- 24. Dauvilliers Y, Evangelista E, Barateau L, et al. Measurement of symptoms in idiopathic hypersomnia: the idiopathic hypersomnia severity scale. Neurology 2019;92:e1754–62. doi:10.1212/WNL.0000000000007264
- 25. Bentall RP, Wood GC, Marrinan T, et al. A brief mental fatigue questionnaire. Br J Clin Psychol 1993;32:375–9. doi:10.1111/j.2044-8260.1993.tb01070.x
- 26. Spitzer RL, Kroenke K, Williams JBW, et al. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med 2006;166:1092–7. doi:10.1001/archinte.166.10.1092
- 27. Rudolph JE, Zhong Y, Duggal P, et al. Defining representativeness of study samples in medical and population health research. BMJ Med 2023;2:e000399. doi:10.1136/bmjmed-2022-000399