Abstract
Nearly every introductory epidemiology course begins with a focus on person, place, and time, the key components of descriptive epidemiology. And yet in our experience, introductory epidemiology courses were the last time we spent any significant amount of training time focused on descriptive epidemiology. This gave us the impression that descriptive epidemiology does not suffer from bias and is less impactful than causal epidemiology. Descriptive epidemiology may also suffer from a lack of prestige in academia and may be more difficult to fund. We believe this does a disservice to the field and slows progress towards goals of improving population health and ensuring equity in health. The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak and subsequent coronavirus disease 2019 pandemic have highlighted the importance of descriptive epidemiology in responding to serious public health crises. In this commentary, we make the case for renewed focus on the importance of descriptive epidemiology in the epidemiology curriculum using SARS-CoV-2 as a motivating example. The framework for error we use in etiological research can be applied in descriptive research to focus on both systematic and random error. We use the current pandemic to illustrate differences between causal and descriptive epidemiology and areas where descriptive epidemiology can have an important impact.
Keywords: descriptive epidemiology, methods, surveillance, teaching
Abbreviations
- COVID-19: coronavirus disease 2019
- SARS-CoV-2: severe acute respiratory syndrome coronavirus 2
Editor’s note: The opinions expressed in this article are those of the authors and do not necessarily reflect the views of the American Journal of Epidemiology.
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak and subsequent coronavirus disease 2019 (COVID-19) pandemic have highlighted the importance of descriptive epidemiology in responding to serious public health crises. Key descriptive questions whose answers should inform the pandemic response include: Who is being infected and dying, according to age, sex, race, and socioeconomic status (person)? Where are infection rates in local geographies rising and falling (place)? And how are rates of infection and case fatality changing (time) (1–3)?
Given that tools to reduce the risk of SARS-CoV-2 infection (namely, physical distancing) have been available since the beginning of the pandemic, answers to these questions have the potential to allow policy makers to create new policies and/or align existing ones to reduce transmission and improve survival in the communities they serve, to allow targeting of scarce resources (e.g., testing, contact tracers, and vaccines) to where they are most needed, and to hold us accountable for the equitable or inequitable response to the pandemic (e.g., by comparing the distribution of disease with the distribution of testing resources). Of course, descriptive epidemiology alone does not determine policy, which must also account for the economic and political consequences of public health measures; but without descriptive epidemiology on where transmission is occurring in terms of person, place, and time, making good policy decisions is nearly impossible.
We could arguably do a better job of prioritizing descriptive epidemiology in: 1) our research portfolios and 2) the teaching mission of our discipline. Academic epidemiologists often leave descriptive epidemiology to practitioners working in governmental settings (the Centers for Disease Control and Prevention, the National Center for Health Statistics, and others), possibly because of a (reality-based) perception that it is hard to get descriptive studies funded and published. This, in turn, leaves the teaching of descriptive epidemiology to faculty whose research is not focused on this topic and who have minimal experience with the realities of doing descriptive epidemiology. For example, given privacy and confidentiality concerns surrounding surveillance data, academic epidemiologists rarely gain hands-on experience with the messiness of those data, the work of generating descriptive statistics, or the opportunity to apply advanced methods that would improve the utility of those data. It also means that often (though not always) the most rigorous descriptive epidemiology is published in the appendices of technical governmental reports rather than in scientific journals, and may not be easily accessible to academic researchers.
In this commentary, we highlight the need for methodological rigor in descriptive epidemiologic training and analyses. We do so using the SARS-CoV-2 pandemic as a motivating example, although the issues we raise are not specific to the current pandemic.
WHAT IS DESCRIPTIVE EPIDEMIOLOGY?
There is no standard definition of descriptive epidemiology. The Centers for Disease Control and Prevention’s Field Epidemiology Manual says that “this task, called descriptive epidemiology, answers the following questions about disease, injury, or environmental hazard occurrence: What? How much? When? Where? Among whom?” (4, p. 106). The Dictionary of Epidemiology says descriptive studies are “… more concerned with describing associations than with analyzing and explaining causal effects” and refers to “General descriptions concerning the relationship of disease to basic characteristics such as age, gender, ethnicity, occupation, social class, and geographic location” (5, p. 72).
We contend that descriptive epidemiology seeks to characterize the distributions of health, disease, and harmful or beneficial exposures in a well-defined population as they exist, including any meaningful differences in distribution and whether that distribution is changing over time. Descriptive epidemiology also seeks to embed these data in their historical and sociological context, so that we can attempt to understand the ways in which that context contributes to patterns of disease and mortality (see our example below describing the distribution of COVID-19 according to race, given the history of systemic racism). This is not the same as estimating the causal effects of these contexts, because it does not involve estimating the incidence of disease absent the historical context.
A simple framework for descriptive epidemiology
When teaching causal inference, often a framework is used to guide the approach (6). Outlining a full framework for descriptive inference is beyond the scope of this commentary; however, briefly, the first step is asking a clear question (7). The target quantity of interest for descriptive epidemiology is some feature of the underlying true disease distribution in a well-defined target population. As with—and perhaps more so than with—causal questions, asking good descriptive questions requires grounding them in theory about the disease-generating process (8) and defining a clear target population (9).
Accurately answering descriptive questions requires appropriate sampling, valid measurement of the outcome and any covariates, and appropriate data analysis. Because causal inference focuses on a contrast (absolute or relative) of disease distributions across exposure groups, if we can assume that covariates associated with sampling are not effect measure modifiers on the scale of interest, the sampling mechanism can be ignored (10). There may be reasons not to prioritize representative sampling for causal questions (11), for example, to increase precision. However, representative sampling (or stratified sampling with a clear sampling frame) is a key concern in descriptive epidemiology. To understand the validity of a sampling strategy, the target population must be well-defined. Because descriptive epidemiology makes inferences from the sampled population to the target population (and not across exposure groups), lack of a sampling frame or failure to define the target population makes inference nearly impossible. Sampling need not be representative (e.g., we might oversample some groups, then reweight the data to account for the oversampling); however, better understanding of survey sampling methods might improve descriptive epidemiologic studies. Arguably, coursework on survey sampling methods should be a core part of graduate epidemiology curricula.
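To make the reweighting idea concrete, the sketch below (in Python, with entirely hypothetical numbers rather than data from any actual survey) shows how stratum-specific prevalence estimates from a sample that deliberately oversampled one group can be reweighted back to the target population's composition; the naive unweighted average would misstate the population prevalence.

```python
# Minimal sketch with hypothetical numbers: recovering a population prevalence
# estimate from a stratified sample that deliberately oversampled one group.

# Hypothetical target population composition: 80% group A, 20% group B.
population_share = {"A": 0.80, "B": 0.20}

# Hypothetical stratified sample that enrolled groups A and B 50/50,
# with the observed prevalence of the outcome in each sampled stratum.
sample_prevalence = {"A": 0.05, "B": 0.12}

# Naive estimate: treat the sample as if it were representative.
naive = sum(sample_prevalence.values()) / len(sample_prevalence)

# Reweighted estimate: weight each stratum-specific prevalence by its share
# of the target population, not its share of the sample.
weighted = sum(population_share[g] * sample_prevalence[g] for g in population_share)

print(f"Unweighted estimate: {naive:.3f}")     # 0.085, distorted by oversampling
print(f"Reweighted estimate: {weighted:.3f}")  # 0.064, matches the target population
```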
Early COVID-19 work provides an example of how a poorly conceived sampling strategy can lead to poor inference in descriptive epidemiology. Early serological surveys enrolled samples of volunteers interested in knowing their SARS-CoV-2 antibody status (12). These surveys likely oversampled people who had experienced COVID-19–like symptoms during the spring of 2020, and the resulting prevalence estimates were likely much higher than would have been obtained from a random population sample. Here, measurement error also likely biased results: the tests used were not perfect at detecting SARS-CoV-2 antibodies and, because the prevalence of prior infection was low, false positives probably overwhelmed true positives. However, in contrast to causal analyses, confounding bias is not an issue (indeed, confounding bias is not defined for this question).
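The interplay of low prevalence and imperfect specificity can be illustrated with a small worked example. The sketch below (hypothetical numbers, not the Santa Clara data) shows how apparent seroprevalence can be several times the true value when false positives dominate, and applies the standard Rogan–Gladen correction, which back-calculates prevalence from the apparent prevalence, sensitivity, and specificity.

```python
# Minimal sketch with hypothetical numbers: why an imperfect test inflates
# apparent prevalence when true prevalence is low, and the Rogan-Gladen
# correction that recovers prevalence from sensitivity and specificity.

def rogan_gladen(apparent, sensitivity, specificity):
    """Correct an apparent (test-positive) prevalence for misclassification."""
    corrected = (apparent + specificity - 1) / (sensitivity + specificity - 1)
    return max(0.0, min(1.0, corrected))  # keep the estimate in [0, 1]

true_prevalence = 0.01                   # hypothetical low true seroprevalence
sensitivity, specificity = 0.90, 0.97    # hypothetical test characteristics

# Expected fraction testing positive = true positives + false positives.
apparent = true_prevalence * sensitivity + (1 - true_prevalence) * (1 - specificity)

print(f"Apparent prevalence:  {apparent:.4f}")   # ~0.039, roughly 4x too high
print(f"Corrected prevalence: {rogan_gladen(apparent, sensitivity, specificity):.4f}")  # ~0.010
```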
Descriptive questions ask about the distribution of some outcome in a single target population under the exposure conditions that the population actually received (i.e., only the factual exposures, not counterfactual exposures). Yet asking a descriptive question does not preclude thinking about the exposure conditions and historical context that produced the disease distribution and allowing them to guide the study design. For COVID-19, descriptive questions should be informed by theories including those related to infectious disease transmission dynamics (and transmission of respiratory infections specifically), social theory on fundamental causes of disease distribution, life-course theory, and ecosocial theory, to name a few (13). We might ask what proportion of the population has been infected with SARS-CoV-2 (overall and according to sociodemographic factors, place, and time). To answer this question, we need to define our target population of interest. This may be the population of the United States (or some other country) or a state within it, or it could be a community, population, or location in which transmission is expected because of close contact (e.g., schools, restaurants, or churches). We then need to sample appropriately from that population such that our estimates of risk are representative of the population. We need valid measurement techniques (tests for SARS-CoV-2 infection with high sensitivity and specificity) to generate valid estimates of the population prevalence or incidence of disease. And finally, we need to decide according to which characteristics, if any, we will present the data. Ideally this should be done prior to data collection to ensure that valid data collection measures are used and, in some cases, that sampling techniques are modified for populations with small numbers (e.g., oversampling methods). For example, if we understand that systemic racism is a fundamental determinant of health, we might prioritize questions that describe the distribution of COVID-19 according to race. If we understand how different “essential” professions are unequally valued by society, we might prioritize describing the distribution of COVID-19 according to occupation, given that workers in many occupations deemed essential were required to work in close contact with other people, thus increasing their risk for infection, but were offered unequal protections. For example, while the dangers of personal protective equipment shortages for medical professionals were widely publicized, there was no similar national effort to provide high-quality masks to janitorial staff in health-care settings or workers in grocery stores.
Bias and error in descriptive epidemiology
We would like to see more emphasis (in research and teaching) on bias and error in the context of descriptive epidemiology. As with causal epidemiology, descriptive epidemiology needs to contend with random error, preferably summarized using confidence intervals around key parameters, whether they be rates and proportions or descriptive associational measures.
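As a small illustration of quantifying random error in a descriptive estimate, the sketch below (hypothetical counts) computes a 95% Wilson score confidence interval for a prevalence; the same logic applies to any proportion reported in a descriptive analysis.

```python
# Minimal sketch with hypothetical counts: a 95% Wilson score confidence
# interval around an estimated prevalence (a proportion).
import math

def wilson_ci(successes, n, z=1.96):
    """Return the 95% Wilson score interval for a proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical survey: 120 test-positive residents among 2,000 sampled.
lo, hi = wilson_ci(120, 2000)
print(f"Prevalence: {120 / 2000:.3f} (95% CI: {lo:.3f}, {hi:.3f})")
```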
And as with causal epidemiology, descriptive epidemiology also needs to contend with bias and systematic error. Some biases that influence descriptive epidemiology are not the same as, or behave differently from, biases in causal investigations. The most prominent is confounding. Unlike in causal epidemiology, where analytical adjustment to control for confounding is essential, in descriptive epidemiology there are times when analytical adjustment is helpful and others when it is not (7). In our experience, students frequently fail to tailor their study design to their question, overadjusting or citing “unmeasured confounding” as a limitation in their descriptive studies.
Selection bias should also be considered and mitigated in descriptive epidemiology. As discussed above, there are selection mechanisms that do not cause bias in the estimation of causal parameters (another example is selecting all cases and a sample of the study base in case-control studies) but that would preclude estimation of descriptive parameters such as risks or rates.
Another source of bias in descriptive epidemiology is measurement error. Using well-validated tools and measurement techniques that have high sensitivity and specificity to measure important study variables (in particular the outcome and any stratifying variables) is essential for generating valid results that can lead to strong inferences and to positive public health action. Techniques like quantitative bias analysis can be applied to descriptive epidemiology to account for poor measurement, although the approaches have received less attention in this space. Descriptive epidemiology is also affected by nonrandomly missing data on the outcome or key covariates used to stratify risk.
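As one illustration of how quantitative bias analysis could look in a descriptive setting, the sketch below (hypothetical inputs) extends the misclassification correction above into a simple probabilistic bias analysis: sensitivity and specificity are drawn from assumed plausible ranges and propagated into a distribution of bias-adjusted prevalence estimates.

```python
# Minimal sketch with hypothetical inputs: probabilistic bias analysis for
# outcome misclassification in a prevalence estimate. Sensitivity and
# specificity are drawn from assumed plausible ranges rather than fixed values.
import random

random.seed(42)
apparent_prevalence = 0.04    # hypothetical observed test-positive fraction

corrected_draws = []
for _ in range(10_000):
    se = random.uniform(0.85, 0.95)  # assumed plausible range for sensitivity
    sp = random.uniform(0.95, 0.99)  # assumed plausible range for specificity
    corrected = (apparent_prevalence + sp - 1) / (se + sp - 1)
    corrected_draws.append(min(1.0, max(0.0, corrected)))

corrected_draws.sort()
median = corrected_draws[len(corrected_draws) // 2]
lower, upper = corrected_draws[249], corrected_draws[9749]  # 2.5th and 97.5th percentiles
print(f"Bias-adjusted prevalence: {median:.3f} "
      f"(95% simulation interval: {lower:.3f}, {upper:.3f})")
```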
A note on theory in descriptive epidemiology
Descriptive epidemiology is a vital component of public health and crucial to ensuring that health and wellness are attainable by all. History and epidemiologic theory (which, in our experience, are also given limited space in epidemiology curricula), such as critical race public health praxis (14), tell us that the current distributions of health and illness are the result of decades or centuries of injustices and inequities. The current SARS-CoV-2 pandemic is exacerbating most, if not all, of the inequities that we have seen in the past, in the United States and around the world (15). Poorly executed descriptive epidemiology—for example, based on uncritical or unexamined choices about what data to collect, how to classify populations, or what (if any) variables to control for in analyses—has the potential to reinforce narratives and power structures that led to inequities in the first place. In contrast, well-executed, theoretically grounded descriptive epidemiology can provide us with tools and evidence to actively dismantle these structures, provided we are intentional in defining our questions and designing our studies.
Stratification and adjustment in descriptive epidemiology: understanding who, where, and when
When the goal of a study is descriptive, the first question to ask is, “What is the goal of this description, and does it require stratification on, or adjustment for, additional variables?” (7). Descriptive analyses often involve stratifying results by pertinent population descriptors to assess heterogeneity across and within groups, to target resources, or to identify groups at high risk of disease. Stratification can help us identify differences between groups or areas and, as such, is a critical tool for descriptive epidemiology. As described above, our choices about which variables to consider as strata should be driven by theory and context. As with the investigation of effect heterogeneity, blindly stratifying by all available covariates runs the risk of returning spurious associations (16, 17).
There are important reasons descriptive statistics are sometimes more useful after adjustment for some variables (e.g., age) that are known to affect the outcome and whose distribution differs across populations (18). Such adjustment may help us hypothesize about reasons for variability within or across populations and better identify disparities in disease distribution that might otherwise be masked. For example, if we sought to describe differences in COVID-19 mortality between doctors and nurses, we might adjust for differences in age distributions (19) to focus on occupational hazards rather than known age-related hazards. In other instances, however, age adjustment can prevent appropriate targeting of resources (20). This is because adjusting away a major difference between 2 populations, one of which is, on average, older than the other, may make it seem that rates of disease are the same, and that the populations therefore require equivalent resources, when in fact there is more disease in the older population.
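A toy example of direct age standardization (hypothetical rates and age distributions, not real occupational data) makes both points: crude rates answer “where is the burden now?”, while age-standardized rates answer “how do the groups compare once their different age structures are set aside?”

```python
# Minimal sketch with hypothetical numbers: crude vs. directly age-standardized
# mortality rates for two groups with very different age structures.

# Hypothetical age-specific mortality rates per 1,000 person-years.
rates = {
    "group_1": {"<40": 0.2, "40-59": 1.0, "60+": 5.0},
    "group_2": {"<40": 0.3, "40-59": 1.5, "60+": 6.0},
}

# Hypothetical age distributions: group 1 is much older than group 2.
age_distribution = {
    "group_1": {"<40": 0.2, "40-59": 0.4, "60+": 0.4},
    "group_2": {"<40": 0.5, "40-59": 0.4, "60+": 0.1},
}

# An assumed common reference (standard) age distribution.
standard = {"<40": 0.4, "40-59": 0.4, "60+": 0.2}

for group, group_rates in rates.items():
    crude = sum(group_rates[a] * age_distribution[group][a] for a in group_rates)
    standardized = sum(group_rates[a] * standard[a] for a in group_rates)
    print(f"{group}: crude = {crude:.2f}, age-standardized = {standardized:.2f} per 1,000")

# Crude rates: group_1 = 2.44 > group_2 = 1.35 (group 1 carries more burden now).
# Standardized rates: group_2 = 1.92 > group_1 = 1.48 (group 2 has higher
# age-specific rates). Which comparison is relevant depends on the question.
```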
As another example, early descriptions of COVID-19 mortality by race and ethnicity (inappropriately) adjusted for geography, obscuring disparities due to systemic racism (21, 22). In some cases, adjusting for place adjusts away the policies that drove people to live where they do (e.g., redlining) and that influenced other health conditions in residents of those places (e.g., availability of healthy foods and concentration of air pollution). Inappropriate adjustment therefore makes it harder to see the magnitude of disparities. We note here that in a causal context, place could be a confounder in some circumstances or a mediator in others. In descriptive epidemiology, this distinction may be less relevant to whether to adjust; the question of interest should guide this decision. Overall, understanding how adjustment changes the results and, critically, the question would help guide the decision of whether to adjust for the descriptive question at hand. No adjustment should be implemented without a clear rationale for why it is needed and a thorough understanding of its implications for the interpretation of the adjusted results. A robust understanding of the utility of adjustment will require that we stop equating “adjustment” with “confounder control” (which implies a causal question) (18).
The line between descriptive and causal epidemiology
Observing differences in disease distribution across populations often inspires causal hypotheses. This can make it difficult to identify where descriptive epidemiology ends and causal epidemiology begins. Indeed, there is a continuum from descriptive to causal epidemiology and no clear line exists between them. But, in our view, too often this leads to prioritizing causal epidemiology over descriptive epidemiology in our training programs. Even the Centers for Disease Control and Prevention’s own text quickly pivots from “[descriptive epidemiology conveys] the extent and pattern of the public health problem being investigated” to “this information in turn provides important clues to the causes of the disease” (23, p. 1–31) as if descriptive epidemiology is merely a waystation on the path to causal inference. Descriptive epidemiology can be hypothesis generating but has utility on its own.
Part of the confusion between causal and descriptive epidemiology may result from a failure to clearly delineate different types of questions and from teaching analytical methods separately from specific questions being asked of the data. The literature describing different analytical goals and how those goals are linked to analytical strategies is thin, albeit growing (24–28). Most textbooks that discuss adjustment (whether standardization, stratification, or regression) do so in the context of confounder control, without acknowledging there are other possible goals of statistical adjustment (e.g., the “partial adjustment” approach described above) (18). Leading with 1) types of questions, then 2) how to ask a good question within that type, and 3) identifying the appropriate methods for the question is key to getting better information to inform public health action (29).
Failing to have a clear analytical plan for descriptive epidemiology can lead to poor execution and interpretation (30–32) that lands somewhere between descriptive and causal approaches. Such studies are often framed as “risk-factor” analysis (33), which, given there is no clear definition of a “risk factor,” creates more confusion. For example, media coverage of a preprint investigating “risk factors” for COVID-19–related death (34–36) inappropriately interpreted a conditional association between smoking and COVID-19 mortality as “protective” although it was adjusted for causal intermediates and potential colliders (37, 38). “What are rates of COVID-19 among smokers?” could be clearly framed as a descriptive question (and no adjustment would be warranted) or reframed as a causal question (“What is the effect of smoking on COVID-19 risk?”; appropriate confounder control would be required). This anecdote highlights the need to 1) be clear in what our goals are in epidemiologic research (description vs. causation, or perhaps prediction), 2) be cautious and deliberate in how we conduct and interpret descriptive studies (e.g., avoid multivariable models as a default analytical strategy), 3) be clear in how we communicate descriptive epidemiology results, and 4) not shy away from descriptive epidemiology when it can provide useful insights.
CONCLUSION
We as a field need to devote more attention to the teaching and development of methodological guidance on the creation and dissemination of good descriptive epidemiology. Having a course focused solely on descriptive epidemiology would be one way to convey its importance. An alternative would be to weave descriptive epidemiology into all courses but ensure it is given appropriate prominence. We would like to see epidemiology journals encourage publication of methodological and applied papers on descriptive epidemiology, and our training, evaluation, funding, and promotion committees recognize the value of such contributions to our discipline. Finally, we would like to foster academic–public health practice partnerships so that the innovations in descriptive epidemiology born out of necessity in the face of real-world surveillance data and pressing public health problems can be taken up and built upon by multidisciplinary teams.
ACKNOWLEDGMENTS
Author affiliations: Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts, United States (Matthew P. Fox, Eleanor J. Murray); Department of Global Health, Boston University School of Public Health, Boston, Massachusetts, United States (Matthew P. Fox); Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States (Catherine R. Lesko); and Division of Epidemiology, Ohio State College of Public Health, Ohio State University, Columbus, Ohio, United States (Shawnita Sealy-Jefferson).
This work was funded by the National Institutes of Health (grants K01 AA028193 (C.R.L.) and R21HD098733 (E.J.M.)) and Robert Wood Johnson Foundation (grant 77771).
Conflict of interest: none declared.
REFERENCES
- 1. Garg S, Kim L, Whitaker M, et al. Hospitalization rates and characteristics of patients hospitalized with laboratory-confirmed coronavirus disease 2019—COVID-NET, 14 states, March 1–30, 2020. MMWR Morb Mortal Wkly Rep. 2020;69(15):458–464.
- 2. Lauer SA, Grantz KH, Bi Q, et al. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application. Ann Intern Med. 2020;172(9):577–582.
- 3. Verity R, Okell LC, Dorigatti I, et al. Estimates of the severity of coronavirus disease 2019: a model-based analysis. Lancet Infect Dis. 2020;20(6):669–677.
- 4. Fontaine RE. Describing epidemiologic data. In: Rasmussen SA, Goodman RA, eds. The CDC Field Epidemiology Manual. New York, NY: Oxford University Press; 2019:105–134.
- 5. Porta MA, ed. A Dictionary of Epidemiology. 6th ed. New York, NY: Oxford University Press; 2014.
- 6. Balzer L, Petersen M, Laan M. Tutorial for causal inference. In: Buhlmann P, Drineas P, Kane M, Laan M, eds. Handbook of Big Data. London, UK: Chapman & Hall/CRC; 2016.
- 7. Conroy S, Murray EJ. Let the question determine the methods: descriptive epidemiology done right. Br J Cancer. 2020;123(9):1351–1352.
- 8. Hernán M, Hernández-Díaz S, Werler M, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002;155(2):176–184.
- 9. Westreich D, Edwards JK, Lesko CR, et al. Target validity and the hierarchy of study designs. Am J Epidemiol. 2019;188(2):438–443.
- 10. Lesko CR, Buchanan AL, Westreich D, et al. Generalizing study results: a potential outcomes perspective. Epidemiology. 2017;28(4):553–561.
- 11. Rothman KJ, Gallacher JEJ, Hatch EE. Why representativeness should be avoided. Int J Epidemiol. 2013;42(4):1012–1014.
- 12. Bendavid E, Mulaney B, Sood N, et al. COVID-19 antibody seroprevalence in Santa Clara County, California. Int J Epidemiol. 2021;50(2):410–419.
- 13. Krieger N. Got theory? On the 21st C. CE rise of explicit use of epidemiologic theories of disease distribution: a review and ecosocial analysis. Curr Epidemiol Rep. 2014;1:45–56.
- 14. Ford CL, Airhihenbuwa CO. The public health critical race methodology: praxis for antiracism research. Soc Sci Med. 2010;71(8):1390–1398.
- 15. Lopez L 3rd, Hart LH 3rd, Katz M. Racial and ethnic health disparities related to COVID-19. JAMA. 2021;325(8):719–720.
- 16. Lesko CR, Henderson NC, Varadhan R. Considerations when assessing heterogeneity of treatment effect in patient-centered outcomes research. J Clin Epidemiol. 2018;100:22–31.
- 17. Second International Study of Infarct Survival Collaborative Group. Randomized trial of intravenous streptokinase, oral aspirin, both, or neither among 17,187 cases of suspected acute myocardial infarction: ISIS-2. J Am Coll Cardiol. 1988;12(6):A3–A13.
- 18. Kaufman JS. Statistics, adjusted statistics, and maladjusted statistics. Am J Law Med. 2017;43(2–3):193–208.
- 19. Buerhaus P, Auerbach D, Staiger D. Older clinicians and the surge in novel coronavirus disease 2019 (COVID-19). JAMA. 2020;323(18):1777–1778.
- 20. Thurber KA, Thandrayen J, Maddox R, et al. Reflection on modern methods: statistical, policy and ethical implications of using age-standardized health indicators to quantify inequities. Int J Epidemiol. 2022;51(1):324–333.
- 21. Cowger TL, Davis BA, Etkins OS, et al. Comparison of weighted and unweighted population data to assess inequities in coronavirus disease 2019 deaths by race/ethnicity reported by the US Centers for Disease Control and Prevention. JAMA Netw Open. 2020;3(7):e2016933.
- 22. Zalla LC, Martin CL, Edwards JK, et al. A geography of risk: structural racism and coronavirus disease 2019 mortality in the United States. Am J Epidemiol. 2021;190(8):1439–1446.
- 23. Centers for Disease Control and Prevention. Principles of Epidemiology in Public Health Practice, Third Edition: An Introduction to Applied Epidemiology and Biostatistics. Atlanta, GA: Centers for Disease Control and Prevention; 2006. https://www.cdc.gov/csels/dsepd/ss1978/SS1978.pdf. Accessed March 16, 2022.
- 24. Hernán MA, Hsu J, Healy B. A second chance to get causal inference right: a classification of data science tasks. Chance. 2019;32(1):42–49.
- 25. Vittinghoff E, Glidden DV, Shiboski SC, et al. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. Boston, MA: Springer; 2006.
- 26. Lesko CR, Keil AP, Edwards JK. The epidemiologic toolbox: identifying, honing, and using the right tools for the job. Am J Epidemiol. 2020;189(6):511–517.
- 27. Lau B, Duggal P, Ehrhardt S. Epidemiology at a time for unity. Int J Epidemiol. 2018;47(5):1366–1371.
- 28. Lau B, Duggal P, Ehrhardt S, et al. Perspectives on the future of epidemiology: a framework for training. Am J Epidemiol. 2020;189(7):634–639.
- 29. Fox M, Edwards J, Platt R, et al. The critical importance of asking good questions: the role of epidemiology doctoral training programs. Am J Epidemiol. 2020;189(4):261–264.
- 30. Huitfeldt A. Is caviar a risk factor for being a millionaire? BMJ. 2016;355:i6536.
- 31. De Maat MPM, Trion A. C-reactive protein as a risk factor versus risk marker. Curr Opin Lipidol. 2004;15(6):651–657.
- 32. Westreich D, Greenland S. The table 2 fallacy: presenting and interpreting confounder and modifier coefficients. Am J Epidemiol. 2013;177(4):292–298.
- 33. Kannel WB, Dawber TR, Kagan A, et al. Factors of risk in the development of coronary heart disease—six year follow-up experience. The Framingham Study. Ann Intern Med. 1961;55(1):33–50.
- 34. Williamson EJ, Walker AJ, Bhaskaran K, et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature. 2020;584(7821):430–436.
- 35. Williamson E, Walker AJ, Bhaskaran KJ, et al. OpenSAFELY: factors associated with COVID-19-related hospital death in the linked electronic health records of 17 million adult NHS patients [preprint]. medRxiv. 2020.
- 36. Westreich D, Edwards J, Smeden M. Comment on Williamson et al. (OpenSAFELY): the table 2 fallacy in a study of COVID-19 mortality risk factors. Epidemiology. 2021;32(1):e1–e2.
- 37. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37–48.
- 38. Tennant P, Murray E. The quest for timely insights into COVID-19 should not come at the cost of scientific rigor. Epidemiology. 2021;32(1):e2.