Author manuscript; available in PMC: 2019 Jan 17.
Published in final edited form as: J Allergy Clin Immunol. 2016 Nov 7;139(2):448–450. doi: 10.1016/j.jaci.2016.09.030

Weighing the evidence: Bias and confounding in epidemiologic studies in allergy/immunology

Elizabeth C Matsui 1, Corinne A Keet 1
PMCID: PMC6335960  NIHMSID: NIHMS1005331  PMID: 27833026

Although experimental study designs provide the strongest evidence for a causal relationship between a risk factor or treatment and a disease, epidemiologic studies contribute to the causal evidence base, and causality can be established with epidemiologic studies alone. For many questions related to whether an exposure causes a disease, such as the link between smoking and lung disease, it is simply not possible to conduct a randomized controlled trial because it is unethical or impractical. Examples of this type of question include those related to the role of pollution, allergen, chemical, or microbial exposures in causing asthma. A set of criteria for assessing the strength of the evidence for causality was proposed by Bradford Hill in 1965 (Table I),1 and critical public health findings established by epidemiologic studies include the relationships between smoking and lung cancer, sleep position and sudden infant death syndrome (SIDS), and air pollution and cardiovascular and respiratory disease. In addition, even when an epidemiologic study does not conclusively establish causality, such studies can suggest hypotheses, as was the case with the association between earlier peanut introduction and reduced risk of peanut allergy.2 With the broadening availability of large amounts of data from electronic medical records and other sources, observational studies offer the potential to efficiently study questions of major importance to allergy.

TABLE I.

Bradford Hill’s criteria

Criterion | Comments
Strength of association | Stronger associations are more likely to be causal, although some causal relationships are weak.
Consistency | Consistency within a study (eg, across subgroups) and between studies supports a causal association.
Specificity | This criterion is infrequently used because many exposures (eg, smoking) have been shown to have multiple effects.
Temporality | Exposures that cause a disease occur before disease onset.
Dose response or gradient | Typically, higher doses of exposure should have a stronger effect on the outcome if the environmental factor is causal.
Plausibility | Ideally, causal relationships should have biological plausibility.
Coherence | Epidemiologic and laboratory studies showing similar results support a causal relationship.
Experiment | When possible, an experiment (ie, intervention) provides the highest level of evidence for a causal relationship.
Analogy | If there is strong evidence for similar or analogous exposures causing a disease, this provides some weak evidence in support of the exposure under study causing the disease.

Although a full discussion of epidemiologic methods, including study design, is beyond the scope of this article, we focus on some key pitfalls to consider when either designing or interpreting an epidemiologic study, and we will draw on examples from the allergy/immunology literature to illustrate these concepts (Table II). The focus of this article is on the fundamental concepts of epidemiology that are critical to assessing the validity of findings from epidemiologic studies. Those studies that emerge as having little threat to their validity then provide the evidence base for assessing the causal role of an exposure with the framework of the Hill criteria. Although beyond the scope of this article, there are also specific causal inference methods that can be applied to some of the more challenging causal problems in our field.3 In addition, although one topic of particular importance in allergy/immunology is the pitfalls of screening and diagnostic test development, this topic will not be addressed because it is addressed elsewhere by Keet.4 Here we will focus on 2 major pitfalls that can threaten the validity of a study’s results, conclusions, or both: bias and confounding.

TABLE II.

Selected threats to validity in epidemiologic studies

Threat | Definition
Bias | A systematic error in the conduct of a study that leads to an estimate of the association between an exposure and an outcome that is incorrect
Measurement error | Error in measuring either the exposure or outcome that can be nondifferential (error in measuring the exposure without respect to the outcome or vice versa) or differential
Example: misclassification error | A type of measurement error when the outcome or exposure is dichotomous
Selection bias | Bias that arises because subjects are more or less likely to be selected because of a combination of exposure, outcome, or both
Example: healthy worker effect | A type of selection bias that occurs in occupational cohorts when workers with occupationally related diseases are more likely to drop out of the cohort
Confounding | Bias that arises when a factor outside of the studied exposure and outcome is causally related to both the exposure and outcome
Example: confounding by indication | A type of confounding that occurs when studying the association between a medication or treatment and disease development if the medication is taken for symptoms associated with the disease
Reverse causality | A type of bias that occurs when the exposure being studied is actually caused by the disease being studied instead of causing the disease

Conceptually, bias is defined as a systematic error in the conduct of the study (design; measurement of exposure, outcome, or both; or selection or follow-up of participants) that leads to an estimate of the association between the exposure and the outcome that is incorrect. Whether the incorrect estimate of the association is larger or smaller than the size of the true association depends on the nature of the systematic error.

A systematic error in measuring either the exposure or the outcome will lead to bias, and understanding the methods for measuring and/or defining these is critical to understanding whether there is measurement error and therefore a risk of bias. One key question that has been difficult to answer because of measurement error is whether prematurity is a risk factor for asthma. Although measurement error can affect both the exposure and outcome in this question, we will focus on how measurement of asthma could result in a biased estimate of the association between prematurity and asthma. Because there is no diagnostic test for asthma and common symptoms with other causes, such as cough, can be misconstrued as asthma, measuring this outcome is difficult. If the study definition of asthma tends to misclassify those without asthma as having asthma, then the overall incidence of asthma estimated in the study will be higher than the true incidence. Conversely, if the definition tends to misclassify those with asthma as not having asthma, then the overall incidence of asthma estimated in the study will be lower than the true incidence.
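
The effect of an imperfect asthma definition on estimated incidence can be sketched with simple arithmetic. The Python snippet below is only an illustration: the true incidence, sensitivity, and specificity values are hypothetical and are not taken from any study cited here, and the function name `apparent_incidence` is introduced purely for this example.

```python
def apparent_incidence(true_incidence, sensitivity, specificity):
    """Expected proportion classified as having asthma by an imperfect definition."""
    true_positives = true_incidence * sensitivity
    false_positives = (1 - true_incidence) * (1 - specificity)
    return true_positives + false_positives


# Hypothetical true incidence of asthma in the cohort.
TRUE_INCIDENCE = 0.10

# A definition that labels some children without asthma as having asthma.
overcount = apparent_incidence(TRUE_INCIDENCE, sensitivity=1.0, specificity=0.90)

# A definition that misses some children who truly have asthma.
undercount = apparent_incidence(TRUE_INCIDENCE, sensitivity=0.70, specificity=1.0)

print(f"true: {TRUE_INCIDENCE:.2f}")
print(f"imperfect specificity: {overcount:.2f}")
print(f"imperfect sensitivity: {undercount:.2f}")
```

Under these assumed values, imperfect specificity pushes the estimated incidence above the true value and imperfect sensitivity pushes it below.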

This misclassification of asthma also affects the estimate of the association between prematurity and asthma. If the misclassification of asthma happens proportionately between those who were premature and those who were not premature, this would be referred to as “nondifferential misclassification” because the probability of misclassification is not related to the exposure. In nondifferential misclassification the estimate of the association between prematurity and asthma will be “biased to the null”, meaning that the estimate will be attenuated compared with the true association, so that a study might find no association when in fact there is an association.
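
A small numeric sketch can make the attenuation concrete. In the Python example below, all risks and error rates are hypothetical, the same sensitivity and specificity are applied to both groups (the nondifferential case), and the helper names are assumptions made for illustration only.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Expected proportion classified as having asthma, given imperfect classification."""
    return true_risk * sensitivity + (1 - true_risk) * (1 - specificity)


# Hypothetical true risks of asthma (not from any cited study).
TRUE_RISK_PRETERM = 0.20
TRUE_RISK_TERM = 0.10

# Nondifferential: the same error rates apply regardless of prematurity.
SENSITIVITY = 0.80
SPECIFICITY = 0.90

true_rr = TRUE_RISK_PRETERM / TRUE_RISK_TERM

obs_preterm = observed_risk(TRUE_RISK_PRETERM, SENSITIVITY, SPECIFICITY)
obs_term = observed_risk(TRUE_RISK_TERM, SENSITIVITY, SPECIFICITY)
observed_rr = obs_preterm / obs_term

print(f"true risk ratio:     {true_rr:.2f}")
print(f"observed risk ratio: {observed_rr:.2f}")
```

With these assumed inputs, the observed risk ratio falls between 1 and the true risk ratio, which is what "biased to the null" means in practice.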

On the other hand, if those who are premature are more likely to be misclassified as having asthma when in fact they do not, a positive association between prematurity and asthma will be found when there is not truly a positive association. This scenario is referred to as “differential misclassification” and could occur if the definition of asthma included symptoms that are specific to prematurity. One study recently tackled this very issue by analyzing data from a large population-based birth cohort study using different definitions of asthma.5 In this study the researchers found consistent positive associations between prematurity and asthma risk, regardless of the method used to classify asthma, providing evidence to support an association between prematurity and asthma.
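
The differential case can be illustrated in the same way. In the hypothetical Python sketch below, the true risk of asthma is identical in both groups, but the asthma definition is assumed to be less specific among children born prematurely; all values and names are illustrative assumptions, not data from the cited cohort.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Expected proportion classified as having asthma, given imperfect classification."""
    return true_risk * sensitivity + (1 - true_risk) * (1 - specificity)


# Hypothetical: no true association between prematurity and asthma.
TRUE_RISK = 0.10
SENSITIVITY = 0.90

# Differential: more false positives among preterm children than term children.
SPECIFICITY_PRETERM = 0.85
SPECIFICITY_TERM = 0.95

true_rr = TRUE_RISK / TRUE_RISK

obs_preterm = observed_risk(TRUE_RISK, SENSITIVITY, SPECIFICITY_PRETERM)
obs_term = observed_risk(TRUE_RISK, SENSITIVITY, SPECIFICITY_TERM)
observed_rr = obs_preterm / obs_term

print(f"true risk ratio:     {true_rr:.2f}")
print(f"observed risk ratio: {observed_rr:.2f}")
```

Here the observed risk ratio exceeds 1 even though the true risk ratio is exactly 1, so the apparent association is produced entirely by the classification error.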

Another common type of bias is selection bias. There are many flavors of selection bias, but generally speaking, selection bias occurs when those in the target population who have both the risk factor and outcome (or neither the risk factor nor the outcome) are more likely to be enrolled in the study or remain in the study than others. Sometimes this can occur inadvertently, such as when a study of the effect of a proposed risk factor on a disease is advertised as such, thereby disproportionately attracting those with both the risk factor and the disease, resulting in an association in the study population when there is no association in the target population. However, other times, the selection bias is a direct result of the study design, such as in case-control studies with biased selection of control subjects.
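
How unequal enrollment can create an association is easy to see with a hypothetical 2 × 2 table. In the Python sketch below, the target population has no association between risk factor and disease (odds ratio of 1), and people with both the risk factor and the disease are assumed to be three times as likely to enroll as everyone else. The counts, enrollment probabilities, and function names are all illustrative assumptions.

```python
def odds_ratio(exp_dis, exp_nodis, unexp_dis, unexp_nodis):
    """Odds ratio from the four cells of a 2 x 2 table."""
    return (exp_dis * unexp_nodis) / (exp_nodis * unexp_dis)


# Hypothetical target population: no association (odds ratio = 1).
target = {
    "exposed_diseased": 100,
    "exposed_healthy": 900,
    "unexposed_diseased": 100,
    "unexposed_healthy": 900,
}

# Assumed enrollment probabilities: those with both the risk factor and
# the disease are more likely to respond to the study advertisement.
enroll_probability = {
    "exposed_diseased": 0.60,
    "exposed_healthy": 0.20,
    "unexposed_diseased": 0.20,
    "unexposed_healthy": 0.20,
}

# Expected composition of the study population.
study = {cell: target[cell] * enroll_probability[cell] for cell in target}

target_or = odds_ratio(*(target[c] for c in target))
study_or = odds_ratio(*(study[c] for c in target))

print(f"odds ratio in target population: {target_or:.2f}")
print(f"odds ratio in study population:  {study_or:.2f}")
```

Under these assumptions the study population shows an odds ratio of about 3 even though none exists in the target population.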

One example is a case-control study undertaken to test the hypothesis that soy formula or soy milk exposure is a risk factor for the development of peanut allergy.6 The authors identified patients with peanut allergy and control subjects who were likely at very low risk of food allergy and ascertained soy formula and soy milk exposure in both groups and found that soy formula or soy milk exposure was more common among those with peanut allergy than the control subjects without food allergy. However, because trials of non–cow’s milk–based formulas are common among infants with food allergy, it is quite plausible that soy formula or soy milk exposure was simply a marker for a child with food allergy generally and not peanut allergy specifically.7 Use of a control group composed of children who had food allergies, but not peanut allergy, would have helped to address this study design issue, and was done in a recent study that found no association between soy exposure and risk of peanut allergy.8

In prospective studies selection bias can also occur if loss to follow-up occurs disproportionately by risk factor and outcome. For example, in occupational cohorts high levels of exposure to an occupational allergen might appear to protect against the development of the disease, when in reality, workers with high levels of exposure leave work because of the adverse health effects of the exposure, leaving a disproportionate number of healthy workers who have high levels of exposure to the allergen. This type of selection bias in an occupational cohort is called a healthy worker effect and leads to the erroneous conclusion that high levels of exposure are not harmful. Careful follow-up of all workers that includes assessment of health status at the time of departure and reasons for departure help mitigate this type of selection bias.9
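
The healthy worker effect can be sketched with hypothetical numbers. In the Python example below, high exposure truly doubles the risk of disease, but affected workers in the high-exposure group are assumed to leave before health status is assessed far more often than other workers. Every count and departure probability is an assumption made for illustration.

```python
def risk(diseased, healthy):
    """Proportion of workers with disease."""
    return diseased / (diseased + healthy)


# Hypothetical full cohort: high exposure doubles the risk of disease.
HIGH = {"diseased": 200, "healthy": 800}
LOW = {"diseased": 100, "healthy": 900}
true_rr = risk(**HIGH) / risk(**LOW)

# Assumed probabilities of leaving work before assessment.
LEAVE_IF_SICK_AND_HIGH = 0.70  # affected, highly exposed workers leave often
LEAVE_OTHERWISE = 0.10         # background turnover for everyone else

# Expected numbers still employed when health status is assessed.
high_remaining = {
    "diseased": HIGH["diseased"] * (1 - LEAVE_IF_SICK_AND_HIGH),
    "healthy": HIGH["healthy"] * (1 - LEAVE_OTHERWISE),
}
low_remaining = {
    "diseased": LOW["diseased"] * (1 - LEAVE_OTHERWISE),
    "healthy": LOW["healthy"] * (1 - LEAVE_OTHERWISE),
}
observed_rr = risk(**high_remaining) / risk(**low_remaining)

print(f"true risk ratio:                   {true_rr:.2f}")
print(f"risk ratio among remaining workers: {observed_rr:.2f}")
```

With these assumed departure rates, the risk ratio among remaining workers drops below 1, so a harmful exposure appears protective when those who left are not followed.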

Confounding is generally a concern when the hypothesized causal factor being studied is associated with another factor that is believed to be causally related to the outcome of interest. A study might seek to determine whether mouse allergen exposure causes asthma exacerbations, but if mouse allergen exposure is associated also with cockroach allergen exposure, which is a known risk factor for asthma exacerbations, part or all of the associations between mouse allergen exposure and asthma exacerbations might be explained by cockroach allergen exposure.10 In such a case, it would be important for the researchers to adjust for cockroach allergen to estimate the association between mouse allergen exposure and asthma morbidity that is independent of cockroach allergen exposure.
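
One common way to adjust for a confounder is to stratify on it and pool the stratum-specific estimates, for example with a Mantel-Haenszel risk ratio. This is offered as a generic illustration, not as the method used in the cited study. In the hypothetical Python sketch below, mouse allergen exposure has no effect on exacerbations within either cockroach allergen stratum, yet the crude (unstratified) comparison suggests a doubling of risk because high mouse exposure clusters with high cockroach exposure. All counts and names are invented for the example.

```python
# Each stratum: exposed = high mouse allergen, unexposed = low mouse allergen.
# cases = children with an asthma exacerbation; total = children in the group.
strata = {
    "high cockroach allergen": {
        "exposed_cases": 80, "exposed_total": 200,
        "unexposed_cases": 20, "unexposed_total": 50,
    },
    "low cockroach allergen": {
        "exposed_cases": 5, "exposed_total": 50,
        "unexposed_cases": 20, "unexposed_total": 200,
    },
}


def crude_risk_ratio(strata):
    """Risk ratio ignoring the stratifying variable (tables collapsed)."""
    exp_cases = sum(s["exposed_cases"] for s in strata.values())
    exp_total = sum(s["exposed_total"] for s in strata.values())
    unexp_cases = sum(s["unexposed_cases"] for s in strata.values())
    unexp_total = sum(s["unexposed_total"] for s in strata.values())
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for s in strata.values():
        n = s["exposed_total"] + s["unexposed_total"]
        numerator += s["exposed_cases"] * s["unexposed_total"] / n
        denominator += s["unexposed_cases"] * s["exposed_total"] / n
    return numerator / denominator


crude_rr = crude_risk_ratio(strata)
adjusted_rr = mantel_haenszel_risk_ratio(strata)

print(f"crude risk ratio:    {crude_rr:.3f}")
print(f"adjusted risk ratio: {adjusted_rr:.3f}")
```

In this constructed example the crude risk ratio is well above 1 while the adjusted risk ratio is 1, meaning the entire crude association is explained by cockroach allergen exposure. Real data would rarely separate this cleanly, and regression adjustment is a frequently used alternative.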

Confounding by indication can be a less obvious type of confounding but is a common concern in studies evaluating the effects of exposure to a medication or treatment on the development of a disease. It is a concern when the medication or treatment would be used to treat symptoms that are a precursor to the disease, so that medication use would be more common among those who have a known risk factor for the disease than those who do not. One commonly used medication, acetaminophen, has been hypothesized to increase the risk of asthma among young children,11 but respiratory tract infections in early childhood, which are a known risk factor for asthma, would also be expected to increase the likelihood of receiving acetaminophen. Therefore the question is whether associations that have been reported between acetaminophen and asthma risk are explained by early-life respiratory tract infections,12 which are often accompanied by fever, and are an indication for treatment with acetaminophen. Ultimately, it might not be possible to settle the concern of confounding by indication without a randomized controlled trial, but the unresolved threat of confounding by indication should lead to a cautious interpretation of the epidemiologic findings to date.

Reverse causality occurs when the risk factor being evaluated is a known manifestation or effect of the disease. For example, several observational studies have noted that the presence of a furred pet is associated with a reduced risk of allergic disease, but it is quite plausible that those who are not allergic to a furred animal are more likely to have a furred pet. The challenge is to determine which “direction the arrow goes”: Does having a furred pet protect against allergy, or is it that those who are not allergic are more likely to have a furred pet? This ambiguity can persist even in a prospective cohort study in which one can determine whether the acquisition of the pet occurred early in life, before a child’s allergic status is established, because parents who are not allergic, and who therefore pass on a “nonallergic” genetic background to their children, might be more likely to have a furred pet.13 Because a randomized controlled trial of pet acquisition during infancy is not likely to be feasible, it will be important for epidemiologic studies to account for parental allergy to understand the effects of early-life pet exposure that are independent of parental allergy. In addition, epidemiologic studies, combined with laboratory-based experiments, should evaluate a biologically plausible mechanism, such as how the microbes associated with furred pets modulate the infant’s immunologic phenotype in a way that protects against allergy, because this would provide critically important evidence in favor of or against a causal role of early-life pet exposure in reducing the risk of allergic disease.

Epidemiologic studies play a critical role in evaluating exposures as causes of allergic diseases and are the primary scientific tool used to establish causality when experimental study designs are not possible. Even when randomized controlled trials are possible, epidemiologic studies provide the preliminary evidence to support (or not) devoting resources and subjecting participants to the time commitment and potential risks of enrolling in a randomized controlled trial. Understanding how to assess the potential effect of bias and confounding on a study’s results and interpretation helps to ensure that epidemiologic studies remain a sharp tool for assessing the causal role of exposures in allergic diseases.

Acknowledgments

Supported by the National Institute of Environmental Health Sciences (R01ES023447 and R01ES026170) and the National Institute of Allergy and Infectious Diseases (1K23AI103187, 1R21AI107085, K24AI114769, and U01AI08328).

Disclosure of potential conflict of interest: E. C. Matsui has received a grant from the National Institutes of Health (NIH). C. A. Keet has received a grant from the NIH and has consultant arrangements with the Environmental Defense Fund.

REFERENCES

  • 1. Hill AB. The environment and disease: association or causation? Proc R Soc Med 1965;58:295–300.
  • 2. Du Toit G, Katz Y, Sasieni P, Mesher D, Maleki SJ, Fisher HR, et al. Early consumption of peanuts in infancy is associated with a low prevalence of peanut allergy. J Allergy Clin Immunol 2008;122:984–91.
  • 3. Hackstadt AJ, Matsui EC, Williams DL, Diette GB, Breysse PN, Butz AM, et al. Inference for environmental intervention studies using principal stratification. Stat Med 2014;33:4919–33.
  • 4. Keet CA. A call to improve standards for reporting of diagnostic test research in allergy. J Allergy Clin Immunol 2016;137:1761–3.
  • 5. He H, Butz A, Keet CA, Minkovitz CS, Hong X, Caruso DM, et al. Preterm birth with childhood asthma: the role of degree of prematurity and asthma definitions. Am J Respir Crit Care Med 2015;192:520–3.
  • 6. Lack G, Fox D, Northstone K, Golding J. Avon Longitudinal Study of Parents and Children Study Team. Factors associated with the development of peanut allergy in childhood. N Engl J Med 2003;348:977–85.
  • 7. Matsui EC, Wood RA. Peanut allergy. N Engl J Med 2003;349:301–3; author reply 301–3.
  • 8. Koplin J, Dharmage SC, Gurrin L, Osborne N, Tang ML, Lowe AJ, et al. Soy consumption is not a risk factor for peanut sensitization. J Allergy Clin Immunol 2008;121:1455–9.
  • 9. Peng RD, Paigen B, Eggleston PA, Hagberg KA, Krevans M, Curtin-Brosnan J, et al. Both the variability and level of mouse allergen exposure influence the phenotype of the immune response in workers at a mouse facility. J Allergy Clin Immunol 2011;128:390–6.e7.
  • 10. Ahluwalia SK, Peng RD, Breysse PN, Diette GB, Curtin-Brosnan J, Aloe C, et al. Mouse allergen is the major allergen of public health relevance in Baltimore City. J Allergy Clin Immunol 2013;132:830–5.e1–2.
  • 11. Etminan M, Sadatsafavi M, Jafari S, Doyle-Waters M, Aminzadeh K, Fitzgerald JM. Acetaminophen use and the risk of asthma in children and adults: a systematic review and metaanalysis. Chest 2009;136:1316–23.
  • 12. Sordillo JE, Scirica CV, Rifas-Shiman SL, Gillman MW, Bunyavanich S, Camargo CA Jr, et al. Prenatal and infant exposure to acetaminophen and ibuprofen and the risk for wheeze and asthma in children. J Allergy Clin Immunol 2015;135:441–8.
  • 13. Almqvist C, Egmar AC, van Hage-Hamsten M, Berglind N, Pershagen G, Nordvall SL, et al. Heredity, pet ownership, and confounding control in a population-based birth cohort. J Allergy Clin Immunol 2003;111:800–6.
