American Journal of Public Health
Editorial
2018 May;108(5):622. doi: 10.2105/AJPH.2018.304379

Data Are Not Enough—Hurray For Causality!

Arnaud Chiolero
PMCID: PMC5888075  PMID: 29617621

Causal inference is of major importance in epidemiology and public health because the determination that an association between an exposure and a health outcome is causal indicates a potential for intervention to improve that outcome.1 In the current issue of AJPH, Hernán (p. 616) argues that “using the term ‘causal’ is necessary to improve the quality of observational research.” How can we move, in observational epidemiology, from the usual prudent “association is not causation” to explicit causal inference?

I think that epidemiology is primarily a scientific practice to inform public health practitioners, policymakers, clinicians, and citizens and to help them make adequate health- and disease-related decisions, with the goal of eventually improving health outcomes. Causality concepts should be in line with this practice, and this is why recent developments in causal thinking in observational epidemiology are inspiring. The key, as argued by Hernán, is first to formulate better causal questions and second to adjust better for confounding.

Until recently, how to formulate adequate causal questions had been neither formalized nor taught in epidemiology. Indeed, training epidemiologists, health scientists, and public health practitioners in causality has often been limited to the study of complex epistemological and philosophical concepts, with no link to data-centered observational research practice beyond the (inadequate) use of the Hill causal criteria. The counterfactual, or potential, outcomes approach, with the increasing use of directed acyclic graphs and new statistical tools such as G-methods for the analysis of observational data,2 is a driving force for improving how we formulate causal questions. Such an approach is highly effective for addressing confounding, selection bias, and mediation within a single framework.3 It is also a powerful tool for differentiating association from causation and for guiding intervention in public health. This is why I agree with Hernán’s call to use the “C-word” more explicitly.
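To make the counterfactual contrast concrete, here is a minimal simulation sketch, not drawn from the editorial or its references: it assumes a single binary confounder and shows how the crude association between an exposure and an outcome differs from the causal risk difference recovered by standardization over the confounder (the simplest form of the G-formula). All variable names and parameter values are illustrative assumptions.

```python
# Illustrative sketch (assumed scenario, not from the editorial): a binary
# confounder L affects both exposure A and outcome Y. The crude association
# overstates the causal effect; standardizing over L recovers it.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

L = rng.binomial(1, 0.4, n)                   # confounder
A = rng.binomial(1, 0.2 + 0.5 * L)            # exposure, more likely when L = 1
Y = rng.binomial(1, 0.1 + 0.1 * A + 0.3 * L)  # outcome; true causal risk difference = 0.1

# Crude (associational) risk difference: E[Y | A=1] - E[Y | A=0]
crude = Y[A == 1].mean() - Y[A == 0].mean()

# Standardized (causal) risk difference via the G-formula:
# sum over l of E[Y | A=a, L=l] * P(L=l)
def standardized_risk(a):
    return sum(Y[(A == a) & (L == l)].mean() * (L == l).mean() for l in (0, 1))

adjusted = standardized_risk(1) - standardized_risk(0)

print(f"crude risk difference:        {crude:.3f}")    # biased upward by L (about 0.25)
print(f"standardized risk difference: {adjusted:.3f}") # close to the true 0.10
```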

However, there are issues to resolve if we want to be sure that using the notion of causality more explicitly will help observational epidemiology and, eventually, inform public health. One issue is the growing use of big data analyses, coupled with overconfidence in complex statistical modeling, which feeds, more than ever, the confusion among causation, prediction, and association.4 Applied to the study of multiple and weak associations, with a tolerance for vague and unspecified causal effects, such analyses lead to numerous findings without any relevance for public health.2 Unfortunately, within data-driven epidemiology, almost all resources are devoted to data management and analysis, leaving no room for causal thinking or for formulating, before running the analyses, research questions that are truly answerable by observational epidemiology. Another issue is that most health scientists lack the training to conduct adequate statistical analyses for causal inference.3 Inadequate and unjustified statistical adjustments are legion in observational studies, leading to false causal claims.5
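As a hypothetical illustration of how an unjustified adjustment can produce a false causal claim, the sketch below (again an assumed scenario with illustrative names and numbers, not taken from the editorial) simulates an exposure with no effect on the outcome and a third variable caused by both. “Adjusting” for that variable, a collider, manufactures an association that could be mistaken for a causal effect.

```python
# Hypothetical sketch of an unjustified adjustment: A has no effect on Y,
# but C is a collider (caused by both A and Y). Conditioning on C induces
# a spurious A-Y association.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

A = rng.binomial(1, 0.5, n)                   # exposure, no effect on Y
Y = rng.binomial(1, 0.3, n)                   # outcome, independent of A
C = rng.binomial(1, 0.1 + 0.4 * A + 0.4 * Y)  # collider used as an "adjustment" variable

crude = Y[A == 1].mean() - Y[A == 0].mean()   # approximately 0: the correct null
within_c1 = Y[(A == 1) & (C == 1)].mean() - Y[(A == 0) & (C == 1)].mean()

print(f"crude risk difference (correct null): {crude:.3f}")
print(f"'adjusted' within C = 1 (spurious):   {within_c1:.3f}")  # clearly nonzero
```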

The credibility of observational epidemiology has suffered from too much confidence in data and in complex statistical modeling. If done properly, putting causality explicitly at the heart of observational epidemiology will help move the field toward actionable skepticism, that is, from a risk factor epidemiology centered on correctly estimating statistical associations to an epidemiology of consequences.6 Public health needs a data-driven and evidence-based decision-making model, and that model has to be explicitly causally justified. Data are not enough—hurray for causality!

Footnotes

See also Galea and Vaughan, p. 602; Hernán, p. 616; Begg and March, p. 620; Ahern, p. 621; Glymour and Hamad, p. 623; Jones and Schooling, p. 624; and Hernán, p. 625.

REFERENCES

1. Glass TA, Goodman SN, Hernán MA, Samet JM. Causal inference in public health. Annu Rev Public Health. 2013;34:61–75. doi: 10.1146/annurev-publhealth-031811-124606.
2. Chiolero A. Counterfactual and interventionist approach to cure risk factor epidemiology. Int J Epidemiol. 2016;45(6):2202–2203. doi: 10.1093/ije/dyw159.
3. Mansournia MA, Etminan M, Danaei G, Kaufman JS, Collins G. Handling time varying confounding in observational research. BMJ. 2017;359:j4587. doi: 10.1136/bmj.j4587.
4. Chiolero A. Big data in epidemiology: too big to fail? Epidemiology. 2013;24(6):938–939. doi: 10.1097/EDE.0b013e31829e46dc.
5. Kaufman JS. Statistics, adjusted statistics, and maladjusted statistics. Am J Law Med. 2017;43(2–3):193–208. doi: 10.1177/0098858817723659.
6. Galea S. An argument for a consequentialist epidemiology. Am J Epidemiol. 2013;178(8):1185–1191. doi: 10.1093/aje/kwt172.
