Abstract
Social epidemiology is concerned with the health effects of forces that are “above the skin.” Although causal inference should be a key goal for social epidemiology, social epidemiology and quantitative causal inference have been seemingly at odds over the years. This does not have to be the case and, in fact, both fields stand to gain through a closer engagement of social epidemiology with formal causal inference approaches. We discuss the misconceptions that have led to an uneasy relationship between these 2 fields, propose a way forward that illustrates how the 2 areas can come together to inform causal questions, and discuss the implications of this approach. We argue that quantitative causal inference in social epidemiology is an opportunity to do better science that matters, a win-win for both fields.
Keywords: causal inference, quantitative, social epidemiology
Social epidemiology is concerned with the health effects of forces that are “above the skin.” Individual behaviors, interactions with others, characteristics of neighborhood environments, domestic policies, and global trends that may shape the health of populations are within the remit of social epidemiology. Despite its interest in this breadth of exposures, or perhaps because of it, social epidemiology is centrally concerned with 1 of the main goals of other more narrowly defined areas of epidemiology: the identification of potential interventions to improve health. This means that causal inference should be a key task for social epidemiology.
Yet social epidemiology maintains an uneasy relationship with formal approaches to causal inference. Despite attempts to reconcile the interests and goals of social epidemiology with causal inference (1, 2), a full adoption of a quantitative causal framework by social epidemiologists has proven challenging (3). The engagement of causal inference methods in social epidemiology is sometimes portrayed as a fight between 2 sides: methodologists who believe social epidemiology studies are hopeless for causal inference purposes and social epidemiologists who believe the inflexibility of the counterfactual framework for causal inference prevents its widespread use in their field.
Such an artificial divide does little to help us deal with the very real need to think carefully about how social exposures affect the health of populations. Here, we provide a guide for social epidemiologists who are interested in identifying causal effects and contemplating how these effects inform the practice of social epidemiology. We explain the misconceptions we think have led to this state of affairs, propose a framework that readily embraces the full range of interests of social epidemiology, and discuss implications of such a framework.
MISCONCEPTIONS BEHIND THE DIVIDE BETWEEN SOCIAL EPIDEMIOLOGY AND CAUSAL INFERENCE
In our assessment, 3 misconceptions are primarily responsible for leading us astray when considering the role of methods for quantitative causal inference in social epidemiology.
Misconception 1
“Social” exposures are qualitatively different from other exposures and therefore should play by different rules for causal inference. This misconception may partly reflect that, compared with other areas of epidemiology, social epidemiology is at an earlier stage in the process of defining its exposures. Social epidemiology, a relatively young academic field, is still grappling with what constitutes a social exposure and how to think about its causal effects. We argue that the exposures studied in social epidemiology should be subject to the same rules for causal inference as other classes of exposures in epidemiology (4, 5). Importantly, an advantage of formal causal inference is that it helps us identify causal questions that are too vague, whether the exposures are social or not.
Misconception 2
The goal of causal inference is to identify causes. As we discuss in this article, quantitative causal inference helps define and quantify causal effects but is agnostic about the definition of the word “cause” (6). Therefore, social epidemiologists primarily interested in labeling factors as causes need to look elsewhere because a simple answer will not emerge from the quantitative causal inference framework.
Misconception 3
Causal inference requires exposures that can be experimentally manipulated. This misconception leads to a corollary that quantitative causal inference is concerned only with narrow causal questions. The corollary has implications for all of epidemiology but is particularly troubling for social epidemiology, given its ambition to study factors that may not be plausibly manipulable. Fortunately, experimental manipulation is not a prerequisite for meaningful causal inference (6).
A WAY FORWARD
These misconceptions have hampered the adoption of methods for quantitative causal inference in social epidemiology. Aiming to reconcile social epidemiology and causal inference, we propose here a way forward that illustrates how the 2 areas can come together to inform causal questions.
Suppose we have a data set with thousands of individuals. For each individual, the data include demographic, social, and clinical variables measured during their lifetimes. Our goal is to quantify the causal effect of multiple social factors on the risk of cardiovascular disease.
We can think of all these social variables as lying somewhere along a spectrum, as shown in Figure 1 (7). At 1 end of the spectrum, we place variables for which there is perfect consensus about how they could have been experimentally manipulated in individuals similar to those of our population. At the other end, we place variables for which there is absolutely no consensus about how they could have been experimentally manipulated. In reality, few or no variables dwell at either end, but we can reasonably place some variables relative to others along the spectrum.
By way of example, consider 3 variables: income, residential segregation, and race. Of these 3 variables, income is the one closest to the “perfect consensus” extreme of the range and race is the closest to the “no consensus” extreme. Let us suppose we are interested in the causal effect of these variables on cardiovascular disease.
For income, we can envision an experiment in which each household’s income is randomly determined and its members followed for several decades to determine the incidence of cardiovascular disease. We could increase the income of randomly selected households in any number of ways, for example, by supplementing salaries with a 30% bonus during the study period. The findings from such an experiment, if feasible, would quantify the causal effect of income changes on cardiovascular disease. In fact, the experiment is a way to operationalize the meaning of the “causal effect of income on cardiovascular disease.”
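To make the hypothetical income experiment above more tangible, the following Python sketch simulates it. This is an illustration under stated assumptions, not an analysis of real data: the number of households, the 20% risk of cardiovascular disease without the bonus, and the 17% risk with the bonus are invented values, and the function name `simulate_income_trial` is hypothetical.

```python
import random


def simulate_income_trial(n_households=10_000, risk_without_bonus=0.20,
                          risk_with_bonus=0.17, seed=0):
    """Simulate a hypothetical trial that randomizes a 30% income bonus.

    Both risks are invented illustration values, not estimates from any study.
    """
    rng = random.Random(seed)
    households = {0: 0, 1: 0}  # arm -> number of households
    cases = {0: 0, 1: 0}       # arm -> households with cardiovascular disease
    for _ in range(n_households):
        arm = rng.randint(0, 1)  # fair coin: 1 = bonus, 0 = no bonus
        risk = risk_with_bonus if arm == 1 else risk_without_bonus
        households[arm] += 1
        cases[arm] += rng.random() < risk  # True counts as 1
    risk_bonus = cases[1] / households[1]
    risk_control = cases[0] / households[0]
    return {
        "risk_bonus": risk_bonus,
        "risk_control": risk_control,
        # Because the bonus was assigned at random, this contrast
        # estimates the causal effect of the bonus in this simulated world.
        "risk_difference": risk_bonus - risk_control,
    }


if __name__ == "__main__":
    print(simulate_income_trial())
```

In this sketch the risk difference has a causal interpretation only because assignment is random; the specified intervention (a 30% bonus) is what gives the estimated quantity its meaning.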
Conceiving an experiment in which we manipulate residential segregation in the same way as income poses more challenges, but we can probably still imagine ways to do so. For example, we can envision experiments to randomly assign families to neighborhoods with different levels of residential segregation. We could imagine offering, as an inducement for participation, a desirable home, and then randomly assigning families who volunteer for the study. We could then follow these families forward in time to assess incidence of cardiovascular disease. Again, just specifying this hypothetical experiment helps us endow the concept of “the causal effect of residential segregation on cardiovascular disease” with a meaning.
Hardest of all would be envisioning an experiment in which race is randomly determined. It is unrealistic to imagine that we could agree on a sensible procedure to randomly assign race. We might consider placing white newborns with black parents, or vice versa. Of course, this would be ethically indefensible, but it also would not capture much of the causal effect of interest. It is not clear, therefore, that the findings of any (hypothetical) experiment we can imagine would help quantify the causal effect of race on cardiovascular disease, and thus it is not even clear whether we have a definition of the “causal effect of race on cardiovascular disease” in this setting.
Income, residential segregation, and race all have preoccupied social epidemiologists for decades and will certainly continue to do so in the coming decades. They are all social variables of interest. However, these 3 social factors fall in quite different places on the spectrum we have outlined. Recognizing that social factors map differently onto a causal spectrum helps push us to clearer thinking about the nature of social exposures and which ones we may wish to study, and how.
It is also important to note that this discussion applies to other types of exposures as well. Consider, for example, the effect on cardiovascular disease of statin therapy, diet, and chronic inflammation. These 3 exposures are arguably in the same relative position along the spectrum as income, residential segregation, and race. This dispels the notion that social factors are in some way distinct and stand apart from other epidemiologic exposures.
IMPLICATIONS OF THIS APPROACH
Once we recognize that thinking about causal effects of epidemiologic exposures—including social exposures—requires a consideration of their position along the spectrum, 3 implications emerge for social epidemiology.
First, if our goal is to improve health, we should prioritize actionable causal inferences. That is, we need to move to the right on the aforementioned spectrum. For example, we recognize race is a construct that does not lend itself to any reasonable interventions to create healthier populations, whereas residential segregation or income may be plausible targets for interventions. Following this rationale, we may wish to focus on racism (rather than race), an exposure that offers a clearer path to actionable interventions. Therefore, full adoption of a quantitative causal framework by social epidemiologists helps sharpen our focus on what matters most to improve the health of populations. This harkens back to a focus on an epidemiology of consequence (8, 9), that is, a focus on an epidemiology that can guide efforts to improve the health of populations.
Second, thinking of causal effects along the spectrum helps clarify the methods required to estimate these effects. For exposures closer to the right end of the spectrum, a randomized trial or an observational study that closely emulates the trial may be the best approach to estimate the causal effect. For factors more to the left of the spectrum, however, it may be impossible to find observational data that will allow us to emulate a hypothetical target trial (10, 11). Rather, data and subject-matter knowledge may need to be combined to construct simulation models that allow us to run in silico hypothetical experiments of complex interventions (12, 13).
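As a rough illustration of what an in silico experiment might look like, the Python sketch below compares two hypothetical intervention scenarios in a deliberately simple agent-based setup. Everything in it is an assumption made for illustration: agents sit on a ring, each agent's risk depends on its own intervention status and on the share of its neighbors who received the intervention, and the risk values (0.20 baseline, 0.03 direct reduction, 0.04 spillover reduction) are invented. The function name `run_scenario` is hypothetical, and a real simulation model would need to be built from data and subject-matter knowledge.

```python
import random


def run_scenario(coverage, n_agents=2000, n_neighbors=4, seed=1):
    """Return the simulated disease risk when a fraction `coverage`
    of agents receives a hypothetical intervention.

    All parameters are invented for illustration only.
    """
    rng = random.Random(seed)
    # Each agent independently receives the intervention with prob. `coverage`.
    treated = [rng.random() < coverage for _ in range(n_agents)]
    half = n_neighbors // 2
    cases = 0
    for i in range(n_agents):
        # Neighbors are the nearest agents on either side of a ring.
        neighbors = [treated[(i + k) % n_agents]
                     for k in range(-half, half + 1) if k != 0]
        share_treated = sum(neighbors) / len(neighbors)
        # Risk falls with the agent's own status (direct effect) and with
        # the neighbors' status (spillover, an assumed mechanism).
        risk = 0.20 - 0.03 * treated[i] - 0.04 * share_treated
        cases += rng.random() < risk
    return cases / n_agents


if __name__ == "__main__":
    for coverage in (0.0, 0.5, 1.0):
        print(coverage, run_scenario(coverage))
```

Comparing `run_scenario(1.0)` with `run_scenario(0.0)` contrasts two fully specified hypothetical worlds; the answer is only as credible as the assumptions encoded in the model.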
Third, the relative position of exposures along the spectrum, and therefore the degree to which a causal effect is well defined, affects our ability to identify and measure confounders of the effect. The more ill-defined the causal effect is, that is, the more difficult it is to conceptualize hypothetical interventions, the harder the conceptualization of confounding for the effect. For example, when using observational data to emulate a target trial of income changes, residential segregation and race may be thought of as important confounders. However, it is unclear what a confounder for “race” is because the effect of race on, say, cardiovascular disease is an ill-defined concept.
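For a well-defined exposure, the role of a confounder can be shown with a small worked example. The Python sketch below uses an invented table of counts in which a single binary confounder affects both who has the higher income and who develops cardiovascular disease. It assumes, purely for illustration, that this one variable is the only confounder and is measured without error; the counts, the variable coding, and the function names are hypothetical.

```python
# Hypothetical counts: (confounder level, higher income?, people, CVD cases)
TABLE = [
    (0, 1, 3500, 245),
    (0, 0, 1500, 150),
    (1, 1, 1500, 255),
    (1, 0, 3500, 700),
]


def crude_risk_difference(table):
    """Risk difference that ignores the confounder."""
    risk = {}
    for exposed in (1, 0):
        people = sum(row[2] for row in table if row[1] == exposed)
        cases = sum(row[3] for row in table if row[1] == exposed)
        risk[exposed] = cases / people
    return risk[1] - risk[0]


def standardized_risk_difference(table):
    """Risk difference standardized to the confounder distribution
    of the whole study population."""
    total = sum(row[2] for row in table)
    difference = 0.0
    for level in sorted({row[0] for row in table}):
        stratum = [row for row in table if row[0] == level]
        weight = sum(row[2] for row in stratum) / total
        stratum_risk = {row[1]: row[3] / row[2] for row in stratum}
        difference += weight * (stratum_risk[1] - stratum_risk[0])
    return difference


if __name__ == "__main__":
    print(round(crude_risk_difference(TABLE), 3))         # -0.07
    print(round(standardized_risk_difference(TABLE), 3))  # -0.03
```

In these invented data the risk difference within each confounder level is -0.03, but the crude comparison gives -0.07 because the higher-income group is concentrated in the lower-risk level. Standardization recovers -0.03 only because the exposure contrast is well defined and the confounder is assumed known and measured.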
CONCLUSION
An engagement with causal thinking can point to questions in social epidemiology that may have more direct implications for the health of populations, consonant with an epidemiology of consequence (14). This argues for more work on causal thinking in social epidemiology, a rekindling of the decades-old efforts to reconcile the 2 areas that have since fallen somewhat by the wayside as the field has evolved (15, 16).
A marriage between social epidemiology and causal thinking may not be particularly far off. We would argue rather that the questions of social epidemiology may be particularly fertile ground for causal thinking. Quantitative causal inference in social epidemiology is an opportunity to do better science that matters, a win-win for both fields.
ACKNOWLEDGMENTS
Author affiliations: Dean’s Office, Boston University School of Public Health, Boston, Massachusetts (Sandro Galea); and Departments of Epidemiology and Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts; Harvard-MIT Division of Health Sciences and Technology, Boston, Massachusetts (Miguel A. Hernán).
S.G. and M.A.H. contributed equally to the work.
This work was funded by the National Institute of Allergy and Infectious Diseases at the National Institutes of Health (grant R37 AI102634).
We thank Shui Yu for editorial assistance.
This paper was presented as the Cassel Lecture at the 2019 Society for Epidemiologic Research Annual Meeting, June 18–21, 2019, Minneapolis, Minnesota.
Conflict of interest: none declared.
REFERENCES
- 1. Kaufman JS. Commentary: causal inference for social exposures. Annu Rev Public Health. 2019;40(1):7–21.
- 2. Glymour MM, Rudolph KE. Causal inference challenges in social epidemiology: bias, specificity, and imagination. Soc Sci Med. 2016;166:258–265.
- 3. Krieger N, Davey Smith G. The tale wagged by the DAG: broadening the scope of causal inference and explanation for epidemiology. Int J Epidemiol. 2016;45(6):1787–1808.
- 4. Glass TA, Goodman SN, Hernán MA, et al. Causal inference in public health. Annu Rev Public Health. 2013;34(1):61–75.
- 5. Robins JM, Weissman MB. Commentary: counterfactual causation and streetlamps: what is to be done? Int J Epidemiol. 2017;45(6):1830–1835.
- 6. Hernán MA. Does water kill? A call for less casual causal inferences. Ann Epidemiol. 2016;26(10):674–680.
- 7. Robins JM, Greenland S. Causal inference without counterfactuals: comment. J Am Stat Assoc. 2000;95(450):431–435.
- 8. Keyes K, Galea S. What matters most: quantifying an epidemiology of consequence. Ann Epidemiol. 2015;25(5):305–311.
- 9. Hernán MA. Counterpoint: epidemiology to guide decision-making: moving away from practice-free research. Am J Epidemiol. 2015;182(10):834–839.
- 10. Dorn HF. Philosophy of inferences from retrospective studies. Am J Public Health Nations Health. 1953;43(6 Pt 1):677–683.
- 11. Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758–764.
- 12. Hernán MA. Invited commentary: agent-based models for causal inference-reweighting data and theory in epidemiology. Am J Epidemiol. 2015;181(2):103–105.
- 13. Marshall BDL, Galea S. Formalizing the role of agent-based modeling in causal inference and epidemiology. Am J Epidemiol. 2015;181(2):92–99.
- 14. Galea S. An argument for a consequentialist epidemiology. Am J Epidemiol. 2013;178(8):1185–1191.
- 15. Kaufman JS, Cooper RS. Seeking causal explanations in social epidemiology. Am J Epidemiol. 1999;150(2):113–120.
- 16. Kaufman JS, Poole C. Looking back on “causal thinking in the health sciences”. Annu Rev Public Health. 2000;21:101–119.