American Journal of Epidemiology. 2024 Sep 17;194(6):1476–1481. doi: 10.1093/aje/kwae367

An evolved interpretation of Austin Bradford Hill’s causal viewpoints and their influence on epidemiologic methods

Catherine R Lesko, Matthew P Fox
PMCID: PMC12133282  PMID: 39289169

Abstract

In 1965, Sir Austin Bradford Hill articulated nine viewpoints for evaluating whether a body of evidence about the relationship between an exposure and outcome should be interpreted causally. In this commentary, we highlight a selection of the ways in which these viewpoints have had an impact on the field of epidemiology in terms of methods development, study design, and interpretation of results. Additionally, we opine on how the viewpoints relate to our understanding of basic epidemiologic concepts—for example, our choice of absolute or relative measures of effect, our evolving understanding of the role of context in the generalizability of study results, and modern epistemologies for causal inference (ie, the potential outcomes framework and graphical causal models). Hill cautioned his audience that evidence should be weighed according to the policy choice it would inform and the context in which that policy would be implemented. We root our remarks in considerations of the public health impact of our conclusions about the causal nature of an observed relationship.

Keywords: causality, induction, epistemology


Editor’s note: The opinions expressed in this article are those of the authors and do not necessarily reflect the views of the American Journal of Epidemiology.

Introduction

In 1965, Sir Austin Bradford Hill delivered a lecture to the Royal Society of Medicine to answer the question, “Upon what basis should we proceed to [pass from an observation of association to a verdict of causation]?”1, p.295 Hill’s viewpoints are often presented and used as a tool for inductive reasoning: a way of moving from a body of evidence to a general theory about causation, where the theory is whether we should assume a non-null average causal effect, with less concern about the magnitude of that effect or the conditions under which the effect might vary. This exercise is most aptly called “hazard identification,” that is, seeking to make a dichotomous decision about whether some exposure is harmful or not harmful.2 In contrast, evidence synthesis (one example being meta-analysis) or “risk assessment” is a more nuanced process that seeks to combine estimates of effect from multiple sources, assess the quality of the evidence, and then decide how strongly to recommend the policy choice the evidence seeks to inform.

Despite setting up a hazard identification task in his preamble, Hill concluded his lecture by cautioning his audience that evidence should be weighed according to the policy choice it would inform and the context in which that policy would be implemented—more of a risk assessment task, since policy decisions do not rely on hazard identification alone but also on the magnitude of effect (in addition to numerous other factors). In this spirit, we ground our remarks in considerations of the public health impact of our conclusions about the causal nature of an observed relationship. We humbly argue that for decisions about public health policy, a risk assessment approach is more relevant because policy decisions often require considerations of impact-cost trade-offs, for which the true magnitude of an effect matters, as do context and effect heterogeneity (To be fair, hazard identification tasks often take context—eg, prevalence or distribution of an exposure—into account when deciding what exposures to investigate for their harmful potential and when to reconsider the evidence, so this distinction between tasks may not be as clear as implied.)2

But Hill’s viewpoints are not a clear roadmap. Hill was clear he was not presenting a checklist for ascribing causation and that none of his viewpoints were necessary or sufficient for concluding causality. As applying the Hill viewpoints is a subjective exercise, there is often only fair agreement among epidemiologists about whether or not a body of evidence should be interpreted causally.3 So what use do the Hill viewpoints have? And what role should they play nearly six decades later? Hill’s viewpoints and their strengths and limitations have been detailed in multiple other publications,4-7 and we skim over prior arguments here. Our focus is on where we think we can contribute something new to the discussion.

Before we begin, we briefly discuss the level at which we should consider Hill’s viewpoints. Hill is himself vague on this topic, asking: “How in the first place do we detect these relationships between sickness, injury and conditions of work?”1, p.295—which would appear to refer to the question of how individual studies should be conducted and early signals detected. However, he goes on to say, “There are, of course, instances in which we can reasonably answer these questions from the general body of medical knowledge.”1, p.295—which clearly refers to a collection of studies or anecdotal experiences or expert opinions (more than just one study). He goes on to occasionally reference individual studies and at other times to discuss triangulating evidence (eg, the viewpoints of coherence or consistency). Since Hill’s lecture, entire systems (eg, GRADE)8,9 have been designed to evaluate a literature, along with new methods for synthesizing and meta-analyzing results.10-12 We do not review these here. We, like Hill, remain versatile in our application of these viewpoints to discussions of individual studies (eg, how should we design the next study to improve the body of evidence?) and collections of studies (eg, how should we consider a collection of effect estimates from different studies?). However, we also note that often the distinction is not relevant. For example, sensitivity analyses can be applied to individual study results or to a consensus (ie, pooled via meta-analysis) estimate of effect.

Strength

Hill argues that a larger association between an exposure and an outcome is more likely to be causal than not because the unmeasured confounding required to produce a biased, strong association when the truth is no association is likely greater than the confounding required to produce a biased, weak association. However, he is clear that large associations could still result from biases and that small associations might be causal—the same criticisms often levied at this viewpoint.4,6,13 In 1959, Jerome Cornfield estimated how much bias would be required to nullify the observed association between smoking and lung cancer14—this proof is known as Cornfield’s Inequality and has been described as the first quantitative sensitivity analysis.15 This concept has experienced a resurgence in the form of the “E-value,” which makes some strong simplifying assumptions.16,17 While both Hill and Cornfield presented only confounding bias as a source of difference between an association and the true effect, measurement error and selection bias could also mislead about the direction, strength, or existence of an effect.18-22
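The E-value mentioned above has a simple closed form (VanderWeele and Ding17). A minimal sketch of the computation, with illustrative risk ratios:

```python
import math

def e_value(rr: float) -> float:
    """E-value for a risk ratio: the minimum strength of association
    (on the risk ratio scale) that an unmeasured confounder would need
    with both exposure and outcome to fully explain away the estimate."""
    if rr < 1:          # for protective estimates, invert first
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))

# A "strong" observed risk ratio of 9 (roughly the smoking-lung cancer
# range) requires far more confounding to explain away than a weak 1.3.
print(round(e_value(9.0), 2))   # 17.49
print(round(e_value(1.3), 2))   # 1.92
```

Note that this quantifies only confounding; as the text emphasizes, measurement error and selection bias are not captured by this single number.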

What is underappreciated is that Hill’s question (is there a causal relationship?), framed as a hazard identification question, presumes a dichotomy that does not consider whether we have over- or under-estimated the true causal effect—if anything, we may be inclined to over-estimate causal effects by only accepting as true those that are “strong” (Indeed, when statistical significance is used as a marker of strength,23 and only “statistically significant” associations are submitted or accepted for publication, this results in publication bias for individual studies, which can then bias pooled effect estimates.)10 Yet, from a risk assessment perspective, the magnitude of an association and the prevalence of exposure are critical considerations for public health or clinical decision-making. A common exposure with a small effect might lead to more excess cases of disease than a rare exposure with a big effect. Additionally, an over-estimate of the true effect of some exposure might lead to incorrect conclusions about the cost-efficiency of an intervention to alter the distribution of that exposure.
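The trade-off between prevalence and magnitude can be made concrete with hypothetical numbers (the population size, prevalences, risks, and risk ratios below are invented purely for illustration):

```python
def excess_cases(n: int, prevalence: float, baseline_risk: float, rr: float) -> float:
    """Excess (exposure-attributable) cases in a population of size n,
    assuming exposed individuals experience baseline_risk * rr."""
    exposed = n * prevalence
    return exposed * baseline_risk * (rr - 1)

n = 1_000_000
# Common exposure (40% prevalence) with a "weak" effect (RR = 1.3)...
common_weak = excess_cases(n, 0.40, 0.05, 1.3)
# ...versus a rare exposure (1% prevalence) with a "strong" effect (RR = 5.0)
rare_strong = excess_cases(n, 0.01, 0.05, 5.0)
print(round(common_weak), round(rare_strong))  # 6000 2000
```

Under these invented inputs, the common exposure with the "weak" effect accounts for three times as many excess cases, which is exactly the risk assessment consideration a pure hazard identification framing misses.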

More nuanced sensitivity analyses incorporating existing knowledge or theories about the prevalence of a hypothesized confounder and its associations with the exposure and outcome can better quantify the impact of confounding beyond just describing a point on that distribution (where the estimated effect of an exposure on an outcome becomes null).24 Further, while confounding bias was presented by Hill and addressed by Cornfield as a singular alternative non-causal explanation for an association,1,14 selection bias, measurement error, missing data, and issues with the definition of time zero could also explain away a strong observed association5 or alter the magnitude of effect estimates, and their impact should be incorporated into sensitivity analyses.

Consistency

Hill used “consistency” to mean that the association has been observed repeatedly in different people, places, circumstances, and times. This may be interpreted to have parallels with current discussions about the importance of replicability (getting the same results with different data or a different study design and a different team of researchers).25 Hill is explicit that replicating a study using the same study design “will not invariably greatly strengthen the original evidence” (it may just replicate the same biases) and rather that what is needed is “similar results reached in quite different ways”.1, p.297 This last caution about consistency is often lost, with researchers using different data but the same study design or approach (eg, “standard” confounder control through regression adjustment, stratification, or standardization) to replicate prior research studies. Approaching the same question in different ways that rely on different assumptions and have different threats to validity can avoid merely replicating prior biases.26 For example, one might conduct an instrumental variable analysis (if an appropriate instrument exists)27,28 or check for unmeasured confounding with outcome or exposure negative controls.29-31 There are good papers demonstrating the value of integrating evidence from different study designs through triangulation, although these exercises rely on understanding the different sources of bias in different study designs.29,32 While there is limited space in our epidemiology curricula, if we value consistency, we should value teaching different study designs as much as, or more than, teaching an array of analytic methods that all rely on the same assumption of “no unmeasured confounding.”
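The negative-control logic referenced above can be sketched with a toy simulation (everything here is simulated and assumed for the example: an unmeasured confounder that affects both the exposure and a negative-control outcome that the exposure cannot cause):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
u = rng.normal(size=n)                # unmeasured confounder
x = (u + rng.normal(size=n)) > 0      # exposure, influenced by u
y_neg = u + rng.normal(size=n)        # negative-control outcome: x has NO effect

# If x were unconfounded, this crude difference would be near zero;
# a clearly non-null difference flags confounding by u.
diff = y_neg[x].mean() - y_neg[~x].mean()
print(round(diff, 2))
```

Because the exposure has no effect on the negative-control outcome by construction, the non-null difference here can only reflect the shared confounder—the same signature a real negative-control analysis looks for.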

There are epidemiologic considerations that span strength and consistency. While Hill seems mainly to present “strength” as synonymous with “unable to be explained away by confounding,” he does comment that the most relevant scale on which to measure strength is the ratio scale.1, p.296 It has been noted, but not widely acknowledged, that this aside (as well as similar assertions by Cornfield,14 both intended to reconcile the large absolute associations observed between smoking and cardiovascular disease and other conditions that were not believed to be causal) has had a profound impact on subsequent epidemiologic practice. This is true both in terms of the measures of association to which we default and which we see as “etiologic”33,34 and, more recently, in debates about whether some measures of association are more transportable across populations, as if relative measures of effect were physical constants.35-39

Given all this, we indulge in a few thoughts about “strength” and choice of effect measure. Even in the absence of biases, “strength” is an ill-defined property of an effect measure. First, strength is determined by the risk of the outcome in the reference group and the relative prevalence of the component causes, not by some inherent feature of the exposure itself.6,40 For a rare outcome, an effect that is “strong” on the relative scale may be “weak” on the absolute scale, and vice versa for a common outcome.33 Second, there is no agreement on what constitutes a “strong” or “weak” effect, and guidance has not always been internally consistent when classifying the strength of effects on either side of the null.41-43 For example, a protective risk ratio of 0.7 might be seen as “strong,” whereas the inverse measure of effect obtained by switching the coding of the exposure, a causal risk ratio of approximately 1.4, might be seen as only a “moderate” effect. The strength of the risk ratio measure is bounded by the risk of the outcome in the referent group and can be arbitrarily manipulated by changing the definition or coding of the outcome.44
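These scale-dependence points can be seen in a small numerical sketch (all risks below are invented for illustration):

```python
def measures(r0: float, r1: float):
    """Risk ratio and risk difference given risks in the unexposed (r0)
    and the exposed (r1)."""
    return r1 / r0, r1 - r0

# Rare outcome: "strong" on the ratio scale, tiny on the absolute scale
rr_rare, rd_rare = measures(0.001, 0.004)     # RR = 4.0, RD = 0.003
# Common outcome: "weak" ratio, much larger absolute difference
rr_common, rd_common = measures(0.40, 0.52)   # RR = 1.3, RD = 0.12

# Recoding the outcome (death -> survival) changes the apparent "strength"
# of the risk ratio on the very same data:
rr_death = 0.02 / 0.01       # 2.0: looks "strong"
rr_survival = 0.98 / 0.99    # ~0.99: looks nearly null
```

The last pair illustrates the closing sentence above: the risk ratio is bounded by the referent-group risk, so simply flipping the outcome's coding can move an association from "strong" to "negligible."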

As regards “consistency,” the magnitude and potentially the direction of an effect can and often will differ across populations and contexts, because the effect size is a function of the distribution of other causes of the outcome in a population, whether those other causes are effect modifiers or whether they interact causally with the exposure of interest.40,45-47 Because effect heterogeneity and transportability are scale-dependent, the choice of measure of association is (again) relevant. Because we overwhelmingly rely on relative effects33-35,38 and because relative effects are strongest when the baseline risk of some outcome is small (often corresponding to a small absolute change in disease associated with changes in exposure),48 one has to wonder whether we have spent a disproportionate amount of effort studying outcomes without maximizing the public health impact of our research (ie, we might prevent more cases of a common disease that boasts “weaker” effects).

A preference for relative measures of association over absolute measures has persisted, in part, based on “consistency.” Some have argued that relative measures of effect are more homogeneous than absolute measures35,36,38 while others have argued that this observation is a statistical artifact37 and that absolute measures of effect are most clinically and policy-relevant.45,46,49,50 Given nearly all (possibly all) health outcomes are multifactorial, effect measure modification on at least one scale is a mathematical inevitability (although whether or not the modification is clinically meaningful is another matter).36 Thus, the problem of choosing a scale on which to report results is both a mathematical problem (which model fits the data best without a product term?) and a scientific problem (which model best describes the data generation process/nature?).39 The scientific problem is related to recent developments in generalizability and transportability51-54 and the desire to have a measure of effect that is specific to the target population in which the exposure is to be intervened upon, not just in direction but in magnitude. Rather than see the presence of heterogeneity as evidence against “causation,” we might extend the idea of consistency to subgroups (on a particular scale) and argue that a full understanding of “the” causal effects of an exposure includes understanding how those effects vary across populations and individuals55,56 and, going beyond effect heterogeneity, understanding which variables interact causally with the exposure of interest4,40,45,47 (We will conveniently ignore the role of random error in determining whether effects are the same or different across subgroups, or how to define subgroups, as these topics have received in-depth treatments elsewhere.46,50,57 Suffice it to say, we recognize this is a vague goal.)
Understanding how to target interventions when there are resource constraints or what interventions should be paired will improve health faster than blindly applying “universal” interventions and letting social and economic structures allocate resources for us.55,57-59
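The mathematical inevitability of scale-dependent effect modification noted above can be shown with two invented strata:

```python
# Two hypothetical strata sharing a constant risk ratio but differing
# in baseline risk (all numbers invented for illustration).
strata = {"low baseline risk": 0.01, "high baseline risk": 0.20}
rr = 2.0  # homogeneous on the multiplicative scale

for label, r0 in strata.items():
    r1 = r0 * rr
    print(f"{label}: RR = {r1 / r0:.1f}, RD = {r1 - r0:.2f}")
# RR is 2.0 in both strata, but the risk differences are 0.01 and 0.20:
# effect measure modification is present on the additive scale even
# though it is absent on the multiplicative scale.
```

Forcing homogeneity on one scale guarantees heterogeneity on the other whenever baseline risks differ, which is why the choice of scale is a scientific decision and not merely a modeling convenience.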

Experiment

Trials can be helpful but are not a panacea for understanding causation more generally—even when ethical and perfectly conducted, they may not tell us what we would like to know. Often, they are conducted in highly controlled settings that do not mirror the real world (eg, with additional monitoring that is itself an intervention, or with adherence support) and in highly select populations, in conflict with the viewpoint of consistency. Furthermore, they may be unbiased in expectation for an intention-to-treat effect, but if adherence is imperfect, the results will not correspond to the effect of treatment in any individual and will not apply in other settings with different probabilities of adherence. This is because trials test the effect of treatment assignment rather than of the treatment itself. Practically speaking, it is challenging to conduct a trial completely devoid of missing data (eg, due to non-response or loss to follow-up) and measurement error in covariates or the outcome. Therefore, while trials can reduce confounding bias in expectation, other biases may undermine the results.
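The dilution of the intention-to-treat effect by imperfect adherence can be sketched under strong simplifying assumptions (assignment affects the outcome only through treatment received, and no one assigned to control takes treatment; the per-protocol risk difference below is hypothetical):

```python
def itt_risk_difference(pp_rd: float, adherence: float) -> float:
    """Intention-to-treat risk difference implied by a per-protocol risk
    difference (pp_rd), under the strong simplifying assumptions that
    assignment acts only through treatment received and that no one
    assigned to control is treated."""
    return adherence * pp_rd

pp_rd = -0.10  # hypothetical per-protocol risk difference
for adherence in (1.0, 0.8, 0.5):
    print(adherence, itt_risk_difference(pp_rd, adherence))
```

Under these assumptions the same trial implies a different intention-to-treat effect in every setting with a different adherence probability, which is why the trial result does not transport directly.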

Hill seemed to be speaking more about “quasi-experimental” evidence than directly discussing trials. His examples are about observing changes in the frequency of the associated event after some intervention to change the prevalence of the exposure. To challenge a causal hypothesis, we might consider exploring other study designs (eg, quasi-experimental methods like instrumental variable analyses27,28 or rate change methods60) that leverage temporal changes in the exposure distribution and rely on different assumptions than more common individual-level models and regression adjustment for measured covariates.

Caution: context

Hill also reminded his audience that the evidence should be weighed according to the policy choice it would inform and the context in which that policy would be implemented—that in the face of a potentially harmful exposure, we might accept less evidence to conclude causality if removing the exposure would cause minimal disruption, while we might expect more evidence if there are substantial financial or logistic consequences of removing the exposure. This is another “missed lesson” of Hill’s speech61—nearly 60 years later, the COVID-19 pandemic has laid bare the consequences of failing to act, or of acting too quickly, in the face of minimal evidence. For example, we could have used the viewpoints of analogy and biological plausibility (masks have been effective against other respiratory viruses) to err on the side of recommending rather than actively dissuading mask use early on—a situation in which a policy decision was required but sufficient evidence was not available to confidently draw definitive conclusions. Arguably, some individual viewpoints might be adapted to the context and consequences of the public health decision at hand. In the face of a particularly devastating outcome, we might be more interested in a “weak” association than we would be if the outcome of interest were less damaging. If we are studying an exposure–disease pair with a long induction period, we might rely on temporality and biological plausibility to extrapolate effects on proximal outcomes to long-term clinical outcomes. Hill does not explicitly discuss how to weigh policy considerations and consequences against the uncertainty around a possible causal relationship, but cost–benefit analyses or decision theory could provide a way forward.61,62
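One hypothetical way decision theory could weigh such trade-offs (all quantities here are invented and on an arbitrary common scale; a real analysis would require careful elicitation of probabilities, costs, and benefits):

```python
def expected_net_benefit(p_causal: float, benefit_if_causal: float, cost: float) -> float:
    """Expected net benefit of intervening now, given a subjective
    probability that the observed relationship is causal. Purely
    illustrative: quantities are hypothetical and unitless."""
    return p_causal * benefit_if_causal - cost

# A cheap, low-disruption intervention (eg, recommending masks) can be
# worth adopting even at a modest probability of causality...
print(expected_net_benefit(0.3, 100.0, 10.0))   # positive: act
# ...whereas a costly or disruptive one demands stronger evidence.
print(expected_net_benefit(0.3, 100.0, 60.0))   # negative: wait for evidence
```

This is the sense in which the threshold of evidence required to act is not fixed: it moves with the consequences of the decision, exactly as Hill cautioned.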

Impact on the field

Despite their limitations and Hill’s cautions against using his viewpoints as criteria or a checklist, epidemiologists and medical researchers continue to rely on many of the criteria, most prominently strength and consistency. Weed and Gorelic reviewed two series of review papers published between 1985 and 1994 and reported on how authors arrived at causal conclusions.63 They found that when authors specified their causal reasoning, they relied on the 1964 Surgeon General’s report64 or Hill’s 1965 lecture,1 primarily focusing on strength, consistency, biological gradient, and plausibility.63 Temporality, coherence, and analogy were rarely cited.63 In a psychometric experiment in the late 1990s, Holman et al. sent out a survey to members of the Australasian Epidemiologic Society with computer-generated summaries of the “empirical” evidence (a manufactured set of hypothetical research studies) concerning an association between a generic exposure and generic outcome. Participants were asked to indicate whether, based on each summary, they were more likely than not to ascribe the evidence to a causal relationship between the exposure and outcome. The summaries were designed to vary along axes approximately corresponding to Hill’s viewpoints. Strong predictors of ascribing causation to a scenario were strength of association, consistency (the number of studies providing “statistical support” increased the probability of causal attribution up to about 12 studies, after which additional studies had no further effect), evidence of a dose–response relationship, and postulation of a well-defined mechanism of action.3 What constituted “statistical support” was not clearly specified in the prompts, and when presented with P values, there was a discontinuity in the probability of causal attribution at P = 0.05,3 suggesting we are all, as a field, still grappling with how to balance statistical and causal considerations when interpreting evidence.
While these studies are from over 20 years ago and we hope that epidemiologic teaching has progressed since then, we suspect these views remain common.

Across these reviews of practice and empirical experiments, some additional viewpoints were also commonly considered when evaluating causal hypotheses. In Weed and Gorelic’s review, confounding and bias were commonly cited considerations that do not appear in Hill’s original list.1,63 In Holman et al.’s experiment, participants were also more likely to ascribe causality if summaries asserted there were “no known confounders” or identified results as coming from prospective cohort studies more so than from case–control studies. Epidemiologists agreed only 12% more often than by chance when presented with the same scenario.3 While Holman et al.’s experiment was perhaps simplistic and we might expect agreement to be better in more realistic settings, we should be cautious: judging existing evidence, even with the guidelines above, is a subjective exercise and subject to confirmation bias. Agreement about the evidence for causality in a real-world setting cannot always be taken as an indication that the evidence itself is unbiased.

Modern causal epistemologies

Hill’s viewpoints may seem outdated or inadequate in the face of the “causal revolution” and focus on potential outcomes and graphical causal models (directed acyclic graphs; DAGs). Because Hill was focused on causal theories rather than specific causal effects measured in a specific population, with a specific exposure definition, and a specific follow-up period, comparing inference from each of the epistemologies is a bit like comparing apples and oranges. However, there is significant overlap between some of the concepts laid out by Hill and the foundations of causal inference using potential outcomes or DAGs. The causal identification assumptions required to go from the causal estimand to an expression that is a function of the observed data encompass the auxiliary hypotheses that underlie many of Hill’s viewpoints. Furthermore, the use of DAGs allows us to encode our knowledge about biological processes (plausibility and coherence) into an unambiguous summary of our assumptions.65,66 Recent understanding of generalizability, transportability, and selection biases51,53,54,67,68 can help make sense of a set of estimated causal effects that are not consistent across studies and guide us in our efforts to better account for context. Finally, as alluded to by Hill,1 “experiment” need not mean a traditional randomized trial; explicit formulation of causal questions using potential outcomes and DAGs has led to new study designs that leverage instruments (which are viewable on DAGs) or natural experiments or that mimic trials.28,69

One benefit of the potential outcomes framework and DAGs is their ability to systematically aid in the identification of the potential sources of bias for a single study that is part of the evidence base for a causal theory. Another benefit is their utility in explicitly defining confounding, selection bias, and measurement error, which were not articulated in Hill’s viewpoints—aside from the potential of confounding, though not explicitly named, to explain away a large association—but which are clearly considered by modern epidemiologists when evaluating scientific evidence.3,63 Ideally, a better understanding of the results of a single study would lead to better synthesis of the results from multiple studies.

A wholistic view of causation

We have considered how Hill’s viewpoints individually might help us weigh the available evidence for a causal theory or design subsequent studies to fill gaps in the available evidence. But how should we consider causal theories wholistically? How might Hill’s viewpoints aid us in integrating evidence across studies? An assessment of the total evidence about a causal theory should consider all of Hill’s viewpoints together and if there is a paucity of evidence in one area (eg, if the hypothesized effect is “weak”), we might require stronger evidence in another area (eg, we observe a similar, “weak” effect using different study designs that all target the same estimand). We might also incorporate what we know about biases from more modern causal epistemologies into our reasoning. For example, if we observe an inconsistent relationship using different study designs, we might employ plausibility and coherence (DAGs and our “causal knowledge” that comes from other scientific disciplines65) to hypothesize about how the lack of consistency might be explained by versions of treatment or effect modification or unmeasured confounders.

Any conclusions about causation need to come from considerations of multiple types of evidence from different scientific disciplines and from different types of epidemiologic studies. Causal theories continue to evolve and must be re-evaluated regularly to inform urgent action based on available evidence and to incorporate new evidence as it arises. Hill’s causal viewpoints are a useful but imperfect framework for thinking about a body of evidence. It is useful to think about enriching them with an understanding of threats to validity that have been illustrated with modern causal methods—for example, when thinking about the “strength” of an estimated effect and how it may have been produced by confounding, selection, or information bias. Finally, we need to think of the viewpoints outside of the false present/absent dichotomy assumed by Hill. We care about the magnitude and variability of a causal effect, not merely whether it exists or not.

Contributor Information

Catherine R Lesko, Department of Epidemiology, Johns Hopkins University, Baltimore, MD, United States.

Matthew P Fox, Departments of Epidemiology & Global Health, Boston University, Boston, MA, United States.

Funding

This work was supported by the National Institutes of Health (K01 AA028193).

Conflict of interest

None declared.

References

  • 1. Hill AB. The environment and disease: association or causation? Proc R Soc Med. 1965;58(5):295-300. doi:10.1177/003591576505800503
  • 2. Pearce N, Blair A, Vineis P, et al. IARC monographs: 40 years of evaluating carcinogenic hazards to humans. Environ Health Perspect. 2015;123(6):507-514. doi:10.1289/ehp.1409149
  • 3. Holman CDJ, Arnold-Reed DE, de Klerk N, et al. A psychometric experiment in causal inference to estimate evidential weights used by epidemiologists. Epidemiology. 2001;12(2):246-255. doi:10.1097/00001648-200103000-00019
  • 4. Rothman K, Greenland S. Hill’s criteria for causality. In: Encyclopedia of Biostatistics. Vol 3. Chichester, England: John Wiley & Sons; 1998:1920-1924. doi:10.1002/0470011815.b2a03072
  • 5. Höfler M. The Bradford Hill considerations on causality: a counterfactual perspective. Emerg Themes Epidemiol. 2005;2(1):11. doi:10.1186/1742-7622-2-11
  • 6. Rothman KJ, Lash TL, VanderWeele TJ, et al. Modern Epidemiology. 4th ed. Philadelphia, PA: Wolters Kluwer; 2021.
  • 7. Susser M. The logic of Sir Karl Popper and the practice of epidemiology. Am J Epidemiol. 1986;124(5):711-718. doi:10.1093/oxfordjournals.aje.a114446
  • 8. Balshem H, Helfand M, Schünemann HJ, et al. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol. 2011;64(4):401-406. doi:10.1016/j.jclinepi.2010.07.015
  • 9. Schünemann H, Hill S, Guyatt G, et al. The GRADE approach and Bradford Hill’s criteria for causation. J Epidemiol Community Health. 2011;65(5):392-395. doi:10.1136/jech.2010.119933
  • 10. Lin L, Chu H. Quantifying publication bias in meta-analysis. Biometrics. 2018;74(3):785-794. doi:10.1111/biom.12817
  • 11. Blettner M, Sauerbrei W, Schlehofer B, et al. Traditional reviews, meta-analyses and pooled analyses in epidemiology. Int J Epidemiol. 1999;28(1):1-9. doi:10.1093/ije/28.1.1
  • 12. Egger M, Schneider M, Davey Smith G. Spurious precision? Meta-analysis of observational studies. BMJ. 1998;316(7125):140-144. doi:10.1136/bmj.316.7125.140
  • 13. Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia, PA: Wolters Kluwer Health/Lippincott Williams & Wilkins; 2008.
  • 14. Cornfield J, Haenszel W, Hammond EC, et al. Smoking and lung cancer: recent evidence and a discussion of some questions. J Natl Cancer Inst. 1959;22(1):173-203. doi:10.1093/jnci/22.1.173
  • 15. Greenhouse JB. Commentary: Cornfield, epidemiology and causality. Int J Epidemiol. 2009;38(5):1199-1201. doi:10.1093/ije/dyp299
  • 16. Haneuse S, VanderWeele TJ, Arterburn D. Using the E-value to assess the potential effect of unmeasured confounding in observational studies. JAMA. 2019;321(6):602-603. doi:10.1001/jama.2018.21554
  • 17. VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the E-value. Ann Intern Med. 2017;167(4):268-274. doi:10.7326/M16-2607
  • 18. Lash TL, Fox MP, Fink AK. Applying Quantitative Bias Analysis to Epidemiologic Data. New York, NY: Springer Science+Business Media; 2009.
  • 19. Kristensen P. Bias from nondifferential but dependent misclassification of exposure and outcome. Epidemiology. 1992;3(3):210-215. doi:10.1097/00001648-199205000-00005
  • 20. Edwards JK, Cole SR, Westreich D. All your data are always missing: incorporating bias due to measurement error into the potential outcomes framework. Int J Epidemiol. 2015;44(4):1452-1459. doi:10.1093/ije/dyu272
  • 21. Hernán MA, Hernández-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology. 2004;15(5):615-625. doi:10.1097/01.ede.0000135174.63482.43
  • 22. Howe CJ, Cole SR, Lau B, et al. Selection bias due to loss to follow up in cohort studies. Epidemiology. 2016;27(1):91-97. doi:10.1097/EDE.0000000000000409
  • 23. Fedak KM, Bernal A, Capshaw ZA. Applying the Bradford Hill criteria in the 21st century: how data integration has changed causal inference in molecular epidemiology. Emerg Themes Epidemiol. 2015;12(1):14. doi:10.1186/s12982-015-0037-4
  • 24. Sjölander A, Greenland S. Are E-values too optimistic or too pessimistic? Both and neither! Int J Epidemiol. 2022;51(2):355-363. doi:10.1093/ije/dyac018
  • 25. Busija L, Lim K, Szoeke C, et al. Do replicable profiles of multimorbidity exist? Systematic review and synthesis. Eur J Epidemiol. 2019;34(11):1025-1053. doi:10.1007/s10654-019-00568-5
  • 26. Grant WB, Boucher BJ, Al Anouti F, et al. Comparing the evidence from observational studies and randomized controlled trials for nonskeletal health effects of vitamin D. Nutrients. 2022;14(18):3811. doi:10.3390/nu14183811
  • 27. Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. J Am Stat Assoc. 1996;91(434):444-455. doi:10.1080/01621459.1996.10476902
  • 28. Greenland S. An introduction to instrumental variables for epidemiologists. Int J Epidemiol. 2000;29(4):722-729. doi:10.1093/ije/29.4.722
  • 29. Pearce N, Vandenbroucke J, Lawlor DA. Causal inference in environmental epidemiology: old and new. Epidemiology. 2019;30(3):311-316. doi:10.1097/EDE.0000000000000987
  • 30. Lipsitch M, Tchetgen Tchetgen E, Cohen T. Negative controls: a tool for detecting confounding and bias in observational studies. Epidemiology. 2010;21(3):383-388. doi:10.1097/EDE.0b013e3181d61eeb
  • 31. Miao  W, Shi  X, Tchetgen  ET. A Confounding Bridge Approach for Double Negative Control Inference on Causal Effects. 2020; Accessed April 6, 2024. http://arxiv.org/abs/1808.04945
  • 32. Lawlor DA, Tilling K, Davey Smith G. Triangulation in aetiological epidemiology. Int J Epidemiol. 2016;45(6):1866-1886. 10.1093/ije/dyw314
  • 33. Poole C. On the origin of risk relativism. Epidemiology. 2010;21(1):3-9. 10.1097/EDE.0b013e3181c30eba
  • 34. Broadbent A. Risk relativism and physical law. J Epidemiol Community Health. 2015;69(1):92-94. 10.1136/jech-2014-204347
  • 35. Spiegelman D, VanderWeele TJ. Evaluating public health interventions: 6. Modeling ratios or differences? Let the data tell us. Am J Public Health. 2017;107(7):1087-1091. 10.2105/AJPH.2017.303810
  • 36. Ding P, VanderWeele TJ. The differential geometry of homogeneity spaces across effect scales. arXiv preprint arXiv:1510.08534. 2015.
  • 37. Poole C, Shrier I, VanderWeele TJ. Is the risk difference really a more heterogeneous measure? Epidemiology. 2015;26(5):714-718. 10.1097/EDE.0000000000000354
  • 38. Engels EA, Schmid CH, Terrin N, et al. Heterogeneity and statistical significance in meta-analysis: an empirical study of 125 meta-analyses. Stat Med. 2000;19(13):1707-1728.
  • 39. Panagiotou OA, Trikalinos TA. Commentary: on effect measures, heterogeneity, and the laws of nature. Epidemiology. 2015;26(5):710-713. 10.1097/EDE.0000000000000359
  • 40. Rothman KJ. Causes. Am J Epidemiol. 1976;104(6):587-592. 10.1093/oxfordjournals.aje.a112335
  • 41. Gerstman B. Epidemiology Kept Simple: An Introduction to Classic and Modern Epidemiology. John Wiley & Sons; 1998.
  • 42. Lilienfeld DE, Stolley PD. Foundations of Epidemiology. 3rd ed. Oxford University Press; 1994.
  • 43. Oleckno WA. Essential Epidemiology: Principles and Applications. 1st ed. Waveland Press; 2002.
  • 44. Sheps MC. Shall we count the living or the dead? N Engl J Med. 1958;259(25):1210-1214. 10.1056/NEJM195812182592505
  • 45. Greenland S, Poole C. Invariants and noninvariants in the concept of interdependent effects. Scand J Work Environ Health. 1988;14(2):125-129. 10.5271/sjweh.1945
  • 46. VanderWeele TJ, Knol MJ. Interpretation of subgroup analyses in randomized trials: heterogeneity versus secondary interventions. Ann Intern Med. 2011;154(10):680-683. 10.7326/0003-4819-154-10-201105170-00008
  • 47. VanderWeele TJ. On the distinction between interaction and effect modification. Epidemiology. 2009;20(6):863-871. 10.1097/EDE.0b013e3181ba333c
  • 48. Rothman KJ, Poole C. A strengthening programme for weak associations. Int J Epidemiol. 1988;17(4):955-959. 10.1093/ije/17.4.955
  • 49. Poole C. Coffee and myocardial infarction. Epidemiology. 2007;18(4):518-519. 10.1097/EDE.0b013e31806466e5
  • 50. Lesko CR, Henderson NC, Varadhan R. Considerations when assessing heterogeneity of treatment effect in patient-centered outcomes research. J Clin Epidemiol. 2018;100:22-31. 10.1016/j.jclinepi.2018.04.005
  • 51. Cole SR, Stuart EA. Generalizing evidence from randomized clinical trials to target populations: the ACTG 320 trial. Am J Epidemiol. 2010;172(1):107-115. 10.1093/aje/kwq084
  • 52. Dahabreh IJ, Robertson SE, Tchetgen EJ, et al. Generalizing causal inferences from individuals in randomized trials to all trial-eligible individuals. Biometrics. 2019;75(2):685-694. 10.1111/biom.13009
  • 53. Lesko CR, Buchanan AL, Westreich D, et al. Generalizing study results: a potential outcomes perspective. Epidemiology. 2017;28(4):553-561. 10.1097/EDE.0000000000000664
  • 54. Hernán MA, VanderWeele TJ. Compound treatments and transportability of causal inference. Epidemiology. 2011;22(3):368-377. 10.1097/EDE.0b013e3182109296
  • 55. Rose G. Sick individuals and sick populations. Int J Epidemiol. 2001;30(3):427-432. 10.1093/ije/30.3.427
  • 56. Maldonado G, Greenland S. Estimating causal effects. Int J Epidemiol. 2002;31(2):422-438. 10.1093/ije/31.2.422
  • 57. VanderWeele TJ, Luedtke AR, van der Laan MJ. Selecting optimal subgroups for treatment using many covariates. Epidemiology. 2019;30(3):334-341. 10.1097/EDE.0000000000000991
  • 58. Westreich D. From exposures to population interventions: pregnancy and response to HIV therapy. Am J Epidemiol. 2014;179(7):797-806. 10.1093/aje/kwt328
  • 59. Link BG, Phelan J. Social conditions as fundamental causes of disease. J Health Soc Behav. 1995:80-94. 10.2307/2626958
  • 60. van Aalst R, Thommes E, Postma M, et al. On the causal interpretation of rate-change methods: the prior event rate ratio and rate difference. Am J Epidemiol. 2021;190(1):142-149. 10.1093/aje/kwaa122
  • 61. Phillips CV, Goodman KJ. The missed lessons of Sir Austin Bradford Hill. Epidemiol Perspect Innov. 2004;1(1):3. 10.1186/1742-5573-1-3
  • 62. Lindley DV. Understanding Uncertainty. Revised ed. John Wiley & Sons, Inc.; 2014.
  • 63. Weed DL, Gorelic LS. The practice of causal inference in cancer epidemiology. Cancer Epidemiol Biomarkers Prev. 1996;5:303-311.
  • 64. US Department of Health, Education, and Welfare. Smoking and Health. Government Printing Office; 1964.
  • 65. Hernán MA, Hernández-Diaz S, Werler MM, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002;155(2):176-184. 10.1093/aje/155.2.176
  • 66. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37-48. 10.1097/00001648-199901000-00008
  • 67. Westreich D, Edwards JK, Lesko CR, et al. Target validity and the hierarchy of study designs. Am J Epidemiol. 2019;188(2):438-443. 10.1093/aje/kwy228
  • 68. Lu H, Cole SR, Howe CJ. Toward a clearer definition of selection bias when estimating causal effects. Epidemiology. 2022;33(5):699-706. 10.1097/EDE.0000000000001516
  • 69. Shadish WR, Cook TD, Campbell DT. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Houghton Mifflin; 2001.
