Author manuscript; available in PMC: 2020 May 1.
Published in final edited form as: Epidemiology. 2019 May;30(3):317–320. doi: 10.1097/EDE.0000000000000986

Is the Smog Lifting? Causal Inference in Environmental Epidemiology

W Dana Flanders 1,2,3, Michael D Garber 1
PMCID: PMC6456370  NIHMSID: NIHMS1520026  PMID: 30789424

We thank the Editor for this opportunity to comment on the accompanying paper by Pearce, Vandenbroucke, and Lawlor [1] (PVL). In their commentary, PVL discuss methods that they argue have been and should be useful for making causal inferences in environmental epidemiology. They also cite part of an ongoing debate in the literature (their references 6-12) that criticizes some causal-inference methods in epidemiology. In this commentary, we first point out many areas of agreement with PVL. We then comment on several issues raised by their commentary. One issue concerns omission of important references, and four issues concern claims that were poorly supported or potentially misleading. We also note how, with consideration of a wider scope of methods and a more nuanced interpretation of recommendations regarding emulation of RCTs, many newer causal inference methods have been and should continue to be quite useful in environmental epidemiology and, in fact, have been used to justify many of the methods PVL suggested.

Six main points of PVL’s paper, at least at heart, are clear, documented, and convincing. In simple terms, these points are: 1) causal inference for certain questions in environmental epidemiology, such as studying the health effects of climate change, is difficult; 2) traditional methods have worked quite well for certain problems in environmental epidemiology; 3) additionally, at least five specific other methods (“extensions”) should be helpful in environmental epidemiology; 4) many methods can be useful in environmental epidemiology; 5) related to 4, the subgroup of methods as defined by PVL (‘RCT-mimicking set of “causal inference” methods’) doesn’t include everything that might be useful; and 6) triangulation methods are one such additional method. The first and second points are supported by examples; the third point is also supported by examples for each method noted. Interestingly, all of the “extensions” have benefited from considerations drawn from the causal inference toolbox, as we elaborate below. The fourth and fifth points are perhaps self-evident, as we know of no subgroup of methods that covers everything or claims to. The last point is essentially an extension of the well-accepted Bradford Hill considerations, with additional considerations about the directions of potential bias. We agree with these six points.

The first issue we highlight concerns the omission of important references that are part of an ongoing debate about the merits of certain aspects of modern causal inference concepts, methods, and tools. While we welcome the largely muted level of criticism in this commentary, PVL cite only one side (their references 6-12) of this debate, omitting many responses to and commentaries on these criticisms [2–9]. To varying degrees, these responses and commentaries have clarified positions, provided strong rebuttals of certain criticisms, explained misunderstandings, pointed out a strawman fallacy, and offered conciliatory remarks.

Our second issue concerns PVL’s unqualified claim that ‘the term “causal inference” is being used to denote a specific set of newly developed methods …’, characterized as the ‘…RCT-mimicking set of “causal inference” methods, in contrast to the broader field of causal inference of which it is a part’. This claim is unsupported and potentially misleading because, in the context of the ongoing debate, it could be taken to suggest that those who have contributed to modern methods of causal inference and who are also involved in the ongoing debate use the term “causal inference” in this way. On the contrary, most, and likely all, of those who have contributed to modern methods of causal inference and who are also involved in the ongoing debate about the merits of certain causal inference methods do not, in general, use the term “causal inference” to refer only to the restricted, narrow subgroup of methods described by PVL as “RCT-mimicking” (see footnote a). In other words, the definition put forth by PVL is consistent with neither the language used by these key contributors to modern methods, nor with the entirety of methods that those contributors have developed, used, and cited in their work.

Our third issue concerns PVL’s claim that: “This [modern causal inference movement, as defined by PVL] proposes that observational studies should mimic key aspects of randomized trials, since this allows them to be rooted in counterfactual reasoning, which is said to formalize the natural way that humans think about causality”. This claim is also potentially misleading because, in view of the ongoing debate [2–9] and without qualification, the claim might be read as implying that all observational studies should mimic key aspects of randomized trials. Worse yet, it might be read as implying that this represents the position of those involved in the debate. These interpretations would be incorrect. We agree that Hernán strongly advocates attempting to emulate randomized experiments as a device to aid study design, a position justified in part because such emulation can help sharpen effect definitions as counterfactual contrasts of better-defined, possibly hypothetical interventions, thereby helping to reduce the vagueness in causal questions [4, 10, 11]. But with a broader reading of Hernán’s work and that of other causal-inference contributors, three important additional observations emerge. First, these causal-inference authors are making a conditional claim: if such emulation is possible, then many advantages will likely accrue. For example, as just noted, the definition of the causal effect of interest may be clarified; additionally, an intervention of potential public health utility might be better identified and evaluated, and certain properties that can contribute to valid causal inference, including exchangeability, consistency, and positivity, might be better defined and more evaluable [2, 10]. See also Daniel et al. and van der Laan [9, 12]. Second, many causal-inference authors note that in some contexts (e.g., system-wide interventions), it may not be possible to conceptualize an RCT that can be usefully emulated by an observational study as part of the causal inference process [2, 6, 13, 14]. Third, these authors note that if one cannot emulate an RCT, then other methods are available and may apply, including those mentioned by PVL as useful in environmental epidemiology, such as instrumental variables, Mendelian randomization, and regression-discontinuity and difference-in-differences designs [2, 15–18].
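
For readers less familiar with the three properties named above (exchangeability, consistency, and positivity), one common way of writing them is sketched here. The notation is an assumption introduced for illustration and is not taken from PVL or from this commentary: A denotes the exposure, Y the observed outcome, Y^a the potential outcome under exposure level a, and L a set of measured covariates. The conditional form of exchangeability is shown; other formulations exist.

```latex
\begin{align*}
\text{Consistency:} \quad & Y = Y^{a} \ \text{ if } A = a, \\
\text{Exchangeability:} \quad & Y^{a} \perp\!\!\!\perp A \mid L \ \text{ for all } a, \\
\text{Positivity:} \quad & \Pr(A = a \mid L = l) > 0 \ \text{ for all } a \text{ and all } l \text{ with } \Pr(L = l) > 0.
\end{align*}
```

Conceptualizing a target trial can make each condition easier to assess, because it forces an explicit statement of the exposure levels a being compared and of the covariates L within which exposure groups are taken to be comparable.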

As an alternative to the restrictive definition of causal-inference methods used by PVL, we suggest, in broad agreement with their overall message, that causal-inference methods correspond to what the phrase suggests and should include any useful, valid method, many of which are already being used in environmental epidemiology. The list would include the methods listed by PVL plus others, some not mentioned or emphasized, such as G-computation, G-estimation, marginal structural models [19–21], probabilistic causal models [22], structural equation models, combining information from multiple sources in a structural equation framework [23], negative control outcomes and exposures [24] with specific applications to air pollution and environmental epidemiology [25–27], directed acyclic graphs (DAGs), simulation modeling, and more. We specifically note the important contribution of causal graphs [28], and in particular DAGs [29]. They were used in the development of methods to detect confounding with examples from environmental epidemiology in mind [25–27]. More widely, they are commonly used to describe causal relationships and might be viewed as providing a language for doing so efficiently. As with good notation and language, DAGs can aid in the thought process [30]. In addition to the noted observational approaches, simulations, projection studies, and agent-based modeling [14, 31, 32] can be useful in environmental epidemiology, although more work remains in delineating and understanding conditions for valid effect estimation [33, 34]. In research on climate change and health, the projection of estimated effects under varying emissions and socioeconomic scenarios has become more common [35–37]. These types of studies are useful for projecting the health burden attributable to climate change, but results are, of course, dependent upon the initial estimate of effects.

Our fourth issue concerns PVL’s statement that: ‘We are not arguing that “causal inference” methods that mimic randomized controlled trials are not useful; for example, they can improve individual studies with individual-level exposures that can be seen as interventions.’ This claim is also potentially misleading because it could be read as implying or suggesting that studies that emulate randomized, controlled trials must involve randomization of individual-level exposures. In fact, the emulation applies more widely: one need only consider group-randomized trials and appreciate that neither randomization nor the potential outcome framework precludes group-level exposure. This wider appreciation is particularly relevant for environmental epidemiology since, as PVL note, environmental exposures often “… affect individuals across entire communities”.

Interestingly, many of the causal inference methods (e.g., “extensions of traditional approaches”) specifically mentioned by PVL have benefited from and have been justified by some of the modern causal inference concepts, methods, and tools. For example, Balke and Pearl [38] used counterfactual models to derive identifiable limits for the magnitude of effect based on analyses of instrumental variables. Although the difference-in-differences design precedes the formulation of the more recent causal-inference concepts, the major identifying assumption of difference in differences (parallel trends) appears now to be generally understood in econometrics as based on a counterfactual approach (see, e.g., Abadie [39] or Lechner [40]). Moreover, the regression-discontinuity design can be based on the assumption that potential outcomes have a continuous distribution at the threshold [41]. Indeed, Imbens and Wooldridge, important contributors to the econometrics literature where instrumental-variable analysis, difference in differences, and regression discontinuity have long been used, stated that “… the Rubin potential outcomes framework is now the dominant framework” [42].
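
To make the counterfactual reading of parallel trends concrete, the following is a minimal sketch in Python of the simplest two-group, two-period difference-in-differences contrast. It is an illustration only, not a method from PVL or from the works cited: the function name `did_estimate` and all data values are hypothetical. The toy data are constructed so that the exposed group's unexposed potential outcomes would have followed the same trend (+3) as the comparison group, with an added effect of 2.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences of group means.

    Under parallel trends, the change in the comparison group stands in for
    the change the exposed group would have had without exposure.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical illustrative data (not from any study).
control_pre = [10.0, 12.0, 11.0]
control_post = [13.0, 15.0, 14.0]   # common trend of +3
treated_pre = [20.0, 22.0, 21.0]
treated_post = [25.0, 27.0, 26.0]   # common trend of +3 plus an effect of 2

estimate = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(estimate)
```

The estimate equals the built-in effect only because the toy data satisfy parallel trends by construction. With real data the assumption concerns unobserved potential outcomes and cannot be verified directly, which is why it is naturally stated in counterfactual terms.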

In summary, we agree with much of PVL’s basic message as contained in at least six simplified main points. However, they omitted entirely the responses to and commentaries on the debate they cited. They also set up a restricted definition of “causal inference”, claiming without qualification that it refers to an RCT-mimicking subset of causal inference methods, and proceeded to make additional, unqualified characterizations about the subset they defined. Further, they omitted discussion of how RCT-emulation can apply to group-level exposures, although they note the importance of this in environmental epidemiology. These omissions, characterizations, and failure to qualify or note the conditional nature of potential advantages if a randomized trial can be emulated are, perhaps, unintentional. Regardless of intentionality, these distractions are unfortunate because the broad message (environmental epidemiology will benefit from use of a variety of causal inference methods) is one with which few would disagree, including VanderWeele, Hernán, Tchetgen Tchetgen, and Robins, who pointed out these benefits more widely [5].

Acknowledgments

We thank Drs. Tyler VanderWeele and Miguel Hernán for their helpful comments.

Biographies

Dr. Flanders is a professor of Epidemiology at Rollins School of Public Health. He teaches epidemiologic methodology, and does research in several areas, including epidemiologic methodology, environmental epidemiology, and cancer epidemiology.

Mr. Garber is a PhD student in Epidemiology at Rollins School of Public Health. He is interested in the effects of the environment on physical activity and injury and in epidemiologic methods.

Footnotes

Potential Conflicts of Interest: Dr. Flanders owns Epidemiologic Research & Methods, LLC, which does consulting work for pharmaceutical companies, environmental laboratories, and attorneys. He knows of no conflicts of interest for this work. Mr. Garber knows of no conflicts of interest.

a. Specifically, Greenland, Hernán, Pearl, Robins, and VanderWeele (alphabetically), all contributors to modern causal inference methods and involved in the ongoing debate, have used causal inference to refer to a much broader range of methods and thus do not generally use the term in the restricted way described by PVL. Here are a few examples. In their book Causal Inference [2], Hernán and Robins discuss the “context in which observational studies cannot often be conceptualized as conditionally randomized experiments…” [emphasis added]. Hernán also writes “Causal inference relies on transparency of assumptions and on triangulation of results from methods that depend on different sets of assumptions” [17]. Greenland coauthors the book Modern Epidemiology [18]. Chapter 2 on causation and causal inference includes an overview of the philosophy of scientific inference, with causal inference as a special case. Also included are the Bradford Hill ‘criteria’. Bareinboim and Pearl [23] tackled the problems of combining different sources of information (data fusion) and addressing biases like confounding and selection bias to make causal inferences. Finally, VanderWeele has coauthored papers using methods such as Mendelian randomization and meta-analysis for causal inference [15]. He also states “Inference to the best explanation is important in causal inference and diverse types of evidence can and should be used” [7]. He explicitly includes under the causal-inference umbrella instrumental variable, regression discontinuity, and difference-in-differences methods [13].

References

  • [1] Pearce N, Vandenbroucke JP, Lawlor DA: Causal Inference in Environmental Epidemiology: Old and New. Epidemiology 2019.
  • [2] Hernán MA, Robins J: Causal Inference. Boca Raton: Chapman and Hall/CRC, 2018.
  • [3] Greenland S: For and against methodologies: some perspectives on recent causal and statistical inference debates. European journal of epidemiology 2017, 32:3–20.
  • [4] Hernán MA: Does water kill? A call for less casual causal inferences. Annals of epidemiology 2016, 26:674–80.
  • [5] VanderWeele TJ, Hernán MA, Tchetgen Tchetgen EJ, Robins JM: Re: Causality and causal inference in epidemiology: the need for a pluralistic approach. International journal of epidemiology 2016, 45:2199–200.
  • [6] Robins JM, Weissman MB: Commentary: Counterfactual causation and streetlamps: what is to be done? International journal of epidemiology 2016, 45:1830–5.
  • [7] VanderWeele TJ: Commentary: On Causes, Causal Inference, and Potential Outcomes. International journal of epidemiology 2016, 45:1809–16.
  • [8] VanderWeele TJ: On well-defined hypothetical interventions in the potential outcomes framework. Epidemiology 2018, 29:e24–e5.
  • [9] Daniel RM, De Stavola BL, Vansteelandt S: Commentary: The formal approach to quantitative causal inference in epidemiology: misguided or misrepresented? International journal of epidemiology 2016, 45:1817–29.
  • [10] Hernán MA, Taubman SL: Does obesity shorten life? The importance of well-defined interventions to answer causal questions. Int J Obes (Lond) 2008, 32 Suppl 3:S8–14.
  • [11] Hernán MA: Invited commentary: Hypothetical interventions to define causal effects - afterthought or prerequisite? Am J Epidemiol 2005, 162:618–20.
  • [12] Petersen ML, van der Laan MJ: Causal models and learning from data: integrating causal modeling and statistical estimation. Epidemiology (Cambridge, Mass) 2014, 25:418.
  • [13] VanderWeele TJ, Mathur MB, Chen Y: Outcome-wide longitudinal designs for causal inference: a new template for empirical studies. arXiv preprint arXiv:1810.10164, 2018.
  • [14] Hernán MA: Invited commentary: agent-based models for causal inference—reweighting data and theory in epidemiology. American journal of epidemiology 2014, 181:103–5.
  • [15] Song Y, Yeung E, Liu A, VanderWeele TJ, Chen L, Lu C, Liu C, Schisterman EF, Ning Y, Zhang C: Pancreatic beta-cell function and type 2 diabetes risk: quantify the causal effect using a Mendelian randomization approach based on meta-analyses. Human molecular genetics 2012, 21:5010–8.
  • [16] Pearl J: Does obesity shorten life? Or is it the soda? On non-manipulable causes. Journal of Causal Inference 2018, 6.
  • [17] Swanson SA, Hernán MA: Commentary: how to report instrumental variable analyses (suggestions welcome). Epidemiology 2013, 24:370–4.
  • [18] Rothman KJ: Modern Epidemiology. Boston: Little, Brown and Co, 1986.
  • [19] Robins J: A new approach to causal inference in mortality studies with sustained exposure periods-application to control of the health worker survivor effect. Mathematical Modeling 1986, 7:1393–515.
  • [20] Robins J, Hernán M, Brumback B: Marginal structural models and causal inference in epidemiology. Epidemiology 2000, 11:761–8.
  • [21] Robins JM, Blevins D, Ritter G, Wulfsohn M: G-estimation of the effect of prophylaxis therapy for Pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology 1992:319–36.
  • [22] Pearl J: Causality. 2nd ed. Cambridge: Cambridge University Press, 2009.
  • [23] Bareinboim E, Pearl J: Causal inference and the data-fusion problem. Proceedings of the National Academy of Sciences 2016, 113:7345–52.
  • [24] Lipsitch M, Tchetgen E, Cohen T: Negative Controls: A Tool for Detecting Confounding and Bias in Observational Studies. Epidemiology 2010, 21:383–8.
  • [25] Flanders WD, Klein M, Darrow LA, Strickland MJ, Sarnat SE, Sarnat JA, Waller LA, Winquist A, Tolbert PE: A Method to Detect Residual Confounding in Spatial and Other Observational Studies. Epidemiology (Cambridge, Mass) 2011, 22:823.
  • [26] Flanders WD, Klein M, Strickland M, Darrow L, Sarnat S, Sarnat J, Waller L, Tolbert P: A method of identifying residual confounding and other violations of model assumptions. Epidemiology 2009, 20:S44–S5.
  • [27] Flanders WD, Klein M, Strickland M, Darrow L, Sarnat S, Sarnat J, Waller L, Winquist A, Tolbert PE: A Method for Detection of Residual Confounding in Time-Series and Other Observational Studies. Epidemiology 2011, 22:59–67.
  • [28] Robins JM, Richardson T: Alternative graphical causal models and the identification of direct effects. Center for Statistics and the Social Sciences, University of Washington. In: Shrout P, Keyes K, Ornstein K, editors. Causality and Psychopathology: Finding the Determinants of Disorders and their Cures, Chapter 6. Oxford University Press, 2011. pp. 1–52.
  • [29] Pearl J: Causal diagrams for empirical research (with discussion). Biometrika 1995, 82:669–710.
  • [30] Iverson KE: Notation as a tool of thought. ACM SIGAPL APL Quote Quad 2007, 35:2–31.
  • [31] Auchincloss AH, Diez Roux AV: A new tool for epidemiology: the usefulness of dynamic-agent models in understanding place effects on health. American journal of epidemiology 2008, 168:1–8.
  • [32] Marshall BD, Galea S: Formalizing the role of agent-based modeling in causal inference and epidemiology. American journal of epidemiology 2014, 181:92–9.
  • [33] Murray EJ, Robins JM, Seage GR, Freedberg KA, Hernán MA: A comparison of agent-based models and the parametric g-formula for causal inference. American journal of epidemiology 2017, 186:131–42.
  • [34] Keyes KM, Tracy M, Mooney SJ, Shev A, Cerdá M: Invited commentary: agent-based models—bias in the face of discovery. American journal of epidemiology 2017, 186:146–8.
  • [35] Ebi KL, Hallegatte S, Kram T, Arnell NW, Carter TR, Edmonds J, Kriegler E, Mathur R, O’Neill BC, Riahi K: A new scenario framework for climate change research: background, process, and future directions. Climatic Change 2014, 122:363–72.
  • [36] Sellers S, Ebi KL: Climate change and health under the shared socioeconomic pathway framework. International journal of environmental research and public health 2017, 15:3.
  • [37] Springmann M, Mason-D’Croz D, Robinson S, Garnett T, Godfray HCJ, Gollin D, Rayner M, Ballon P, Scarborough P: Global and regional health effects of future food production under climate change: a modelling study. The Lancet 2016, 387:1937–46.
  • [38] Balke A, Pearl J: Counterfactual probabilities: Computational methods, bounds and applications. Proceedings of the Tenth international conference on Uncertainty in artificial intelligence: Morgan Kaufmann Publishers Inc., 1994. pp. 46–54.
  • [39] Abadie A: Semiparametric difference-in-differences estimators. The Review of Economic Studies 2005, 72:1–19.
  • [40] Lechner M: The estimation of causal effects by difference-in-difference methods. Foundations and Trends® in Econometrics 2011, 4:165–224.
  • [41] Moscoe E, Bor J, Bärnighausen T: Regression discontinuity designs are underutilized in medicine, epidemiology, and public health: a review of current and best practice. Journal of clinical epidemiology 2015, 68:132–43.
  • [42] Imbens GW, Wooldridge JM: Recent developments in the econometrics of program evaluation. Journal of economic literature 2009, 47:5–86.
