Abstract
Longitudinal studies are often viewed as the “gold standard” of observational epidemiologic research. Establishing a temporal association is a necessary criterion to identify causal relations. However, when covariates in the causal system vary over time, establishing a temporal association is not straightforward. Appropriate analytical methods may be necessary to avoid confounding and reverse causality. These issues come to light in 2 studies of breastfeeding described in the articles by Al-Sahab et al. (Am J Epidemiol. 2011;173(9):971–977) and Kramer et al. (Am J Epidemiol. 2011;173(9):978–983) in this issue of the Journal. Breastfeeding occurs over multiple time points and is a behavior that is affected by multiple factors, many of which themselves vary over time. This creates a complex causal system that requires careful scrutiny. The methods presented here may be applicable to a wide range of studies that involve time-varying exposures and time-varying confounders.
Keywords: breast feeding, causality, confounding
The topic of breastfeeding has engendered debate in the public discourse as well as in epidemiologic research. Many studies have tried to clarify causal relations among maternal characteristics, breastfeeding practices, and pediatric outcomes including overweight/obesity, child immunity, and risk of chronic disease, while confronting a number of important methodological challenges in the process. In this issue of the American Journal of Epidemiology, 2 groups provide contributions to the literature of studies of breastfeeding: one evaluating breastfeeding as a cause of menarcheal timing and the other evaluating breastfeeding practice as an effect of infant growth trajectories (1, 2). Taken together, these studies highlight issues related to temporality to which studies of breastfeeding are particularly prone, but which may impact epidemiologic investigations in general and impact our ability to assess causation.
Al-Sahab et al. (1) report an analysis of duration of exclusive breastfeeding and age at menarche and find longer duration to be associated with older age at menarche. Given the importance of age at menarche for a wide range of adverse health outcomes, the potential impact from breastfeeding has important implications. The authors used multivariate modeling to adjust for a number of maternal (socioeconomic status, parity, age, body mass index, age at menarche) and infant (birth weight and length, gestational age) characteristics to address confounding, and they note that possible mechanisms for this observation are uncertain.
Kramer et al. (2) report results of analyses aimed at addressing the bidirectional relation between breastfeeding and infant weight gain. Reverse causality has been proposed as an explanation for findings from previous studies of breastfeeding and infant obesity (3). Data from the Promotion of Breastfeeding Intervention Trial (PROBIT) study, which includes longitudinal assessment of breastfeeding and childhood growth, provide an opportunity to assess directionality of relations. Infant weight gain, as described in the study by Kramer et al. (2), is a dynamic process and appears to affect breastfeeding practices, supporting the reverse causal link between infant weight and breastfeeding that has been proposed. Furthermore, novel methods using marginal structural models have been proposed in a previous publication from this group (4).
In trying to draw inference from epidemiologic studies of breastfeeding, the papers by Al-Sahab et al. (1) and Kramer et al. (2) raise important questions of confounding, temporal ordering, and statistical modeling that may be relevant to a wide range of epidemiologic investigations. In order to address confounding, one must answer the question, how would the age at menarche of girls who were breastfed differ had they not been breastfed? This is a simplified counterfactual setting in which breastfeeding is presented as all or nothing and does not take into account other aspects of the exposure, such as duration, intensity, or exclusivity. Using observational data, one must use statistical analyses to adjust for factors that skew this counterfactual comparison. For example, maternal body mass index may affect both breastfeeding practices and pediatric outcomes and is a potential source of confounding.
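The counterfactual question above can be written compactly. As a sketch in notation that does not appear in either article, let A indicate breastfeeding (1 = breastfed, 0 = not breastfed) and let Y^{a} denote the age at menarche a girl would have had under breastfeeding status a:

```latex
\begin{align*}
\text{target contrast:} \quad
  & E\left[Y^{a=1} \mid A = 1\right] - E\left[Y^{a=0} \mid A = 1\right] \\
\text{observed contrast:} \quad
  & E\left[Y \mid A = 1\right] - E\left[Y \mid A = 0\right]
\end{align*}
```

Assuming consistency (the observed Y equals Y^{a} for the exposure actually received), the two contrasts agree only if E[Y^{a=0} | A = 1] = E[Y^{a=0} | A = 0], that is, if girls who were not breastfed are a fair stand-in for what breastfed girls would have experienced without breastfeeding. Adjustment for a factor such as maternal body mass index aims to make this hold within levels of that factor.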
Temporality has long been recognized as a critical component for causal inference (5). Correct temporal ordering is a sine qua non criterion for a factor to be considered causal; that is, by definition, cause precedes effect, issues of detection and measurement notwithstanding. A link between confounding and reversals of temporal order has been nicely described and illustrated (6, 7). When an exposure of interest (e.g., breastfeeding) is affected by uncontrolled confounding by undiagnosed preclinical disease (e.g., obesity), it may lead to reverse causality that, as discussed, impacts our ability to determine causal effects (6).
The nature of infant and child weight gain adds complexity to studies of the relation between breastfeeding and outcomes such as age at menarche. The directed acyclic graphs in Figure 1 show the causal relations among these variables and unknown covariates. As noted in both papers (1, 2), the causal relation between breastfeeding and infant weight has been unclear. Al-Sahab et al. (1) consider weight gain to be on the causal pathway between breastfeeding and menarche (clearly visualized in Figure 1A, though apparent in each of the models); Kramer et al. (2) consider weight gain to be a predictor of breastfeeding, first evident in Figure 1B with the addition of a time-varying weight variable.
RELATION OF ISSUES TO THE ARTICLES BY AL-SAHAB ET AL. AND KRAMER ET AL.
Both groups are correct if the directed acyclic graphs in Figure 1, B–D, hold, that is, where infant weight is a variable that is simultaneously a determinant of breastfeeding practices and a causal intermediate between breastfeeding and menarcheal age. Weight gain that postdates cessation of breastfeeding can reasonably be considered to be on the causal path between breastfeeding and menarche. In this circumstance where weight gain is a causal intermediate, adjustment for weight gain is inappropriate. Inclusion of weight gain in models of breastfeeding and menarche is considered overadjustment (8) and may also lead to other biases (9).
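The consequence of adjusting for a causal intermediate can be illustrated with a small simulation. The following is a minimal sketch under assumed conditions, not an analysis of either study: the variable names, effect sizes, and the linear model are hypothetical, and breastfeeding is assumed to affect age at menarche only through later weight gain.

```python
import numpy as np


def ols_coefficients(y, *covariates):
    """Least-squares coefficients of y on an intercept plus the given covariates."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


rng = np.random.default_rng(0)
n = 200_000

# Hypothetical causal system: breastfed -> later_weight_gain -> age_at_menarche,
# with no direct arrow from breastfeeding to menarche (all values are assumptions).
breastfed = rng.binomial(1, 0.5, n).astype(float)
later_weight_gain = -0.5 * breastfed + rng.normal(size=n)
age_at_menarche = 12.5 - 0.4 * later_weight_gain + rng.normal(size=n)

# Total effect implied by the assumed system: (-0.5) * (-0.4) = 0.2 years.
total_effect = ols_coefficients(age_at_menarche, breastfed)[1]

# Adjusting for the intermediate blocks the only pathway and hides the effect.
overadjusted = ols_coefficients(age_at_menarche, breastfed, later_weight_gain)[1]

print(f"unadjusted estimate:   {total_effect:.3f}")
print(f"adjusted for mediator: {overadjusted:.3f}")
```

In this assumed system the unadjusted estimate is close to the total effect, whereas the estimate adjusted for weight gain is close to zero even though breastfeeding does affect the outcome.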
The results presented by Kramer et al. (2) suggest that infant weight gain affects decisions regarding breastfeeding. If infant growth affects breastfeeding decisions and is also related to age at menarche, then infant growth exemplifies the type of unmeasured confounding that gives rise to reverse causality discussed by Robins (6). That is, because of infant weight gain, girls who were breastfed for a longer duration also have a later age at menarche. Given the proximity of the hazard ratio estimate (0.94, 95% CI: 0.90, 0.98) to the null and the potential effects of confounding, this possibility cannot be ignored.
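A simulation can show how an association of this kind may arise without any effect of breastfeeding. The sketch below is illustrative only: the direction and size of every relation are arbitrary assumptions, the outcome is modeled with linear regression rather than the hazard model used in the article, and breastfeeding duration is given no effect on age at menarche.

```python
import numpy as np


def ols_coefficients(y, *covariates):
    """Least-squares coefficients of y on an intercept plus the given covariates."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


rng = np.random.default_rng(2)
n = 200_000

# Hypothetical system: infant weight gain shortens breastfeeding duration and
# lowers age at menarche; duration itself has NO effect on the outcome.
infant_weight_gain = rng.normal(size=n)                            # standardized
bf_duration = 6.0 - 1.0 * infant_weight_gain + rng.normal(0, 2, n)   # months
age_at_menarche = 12.5 - 0.3 * infant_weight_gain + rng.normal(size=n)

# Crude slope: Cov(Y, D) / Var(D) = 0.3 / 5 = 0.06 years per month, all of it bias.
crude = ols_coefficients(age_at_menarche, bf_duration)[1]

# Conditioning on the common cause removes the spurious association.
adjusted = ols_coefficients(age_at_menarche, bf_duration, infant_weight_gain)[1]

print(f"crude slope:    {crude:.3f}")
print(f"adjusted slope: {adjusted:.3f}")
```

Under these assumptions, a small but precisely estimated crude association appears solely because infant weight gain drives both variables, which is why a statistically significant estimate close to the null deserves caution.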
Additionally, as shown by U in the directed acyclic graph in Figure 1D, infant weight gain is not the only factor that may act as an unmeasured cause of both breastfeeding and age at menarche. Al-Sahab et al. (1) have correctly noted potential confounding by a number of factors that they included in their adjusted models. Others may be postulated as well, especially when one considers the possibility that infant weight gain may affect breastfeeding practices, even if only weakly. The causal effect of infant weight gain on breastfeeding opens the possibility of confounding by all factors that affect both infant growth and age at menarche. Confounding due to this type of causal system represents a large potential source of bias for estimates (7). Caution is particularly warranted in drawing inference from statistically significant but small effect estimates in light of this potential confounding.
APPROPRIATE STATISTICAL MODELS—G-ESTIMATION
When the dimension of time is added to confounding in this manner (e.g., when infant weight gain may affect breastfeeding practices that may in turn affect infant weight gain at a later time point), conventional analysis may be affected by time-varying or time-dependent confounding (10). The circumstances of time-varying and time-modified confounding present special challenges to investigators and demand use of appropriate statistical methodology and attention to study design. The implications of decisions to adjust or not adjust vary by causal system. Considering obesity and related disease risk, Flanders and Augestad (7) have noted the potential impact on naïve estimates because disease may cause changes in weight. In the same context, Robins (6) has demonstrated use of G-estimation to derive correct estimates, where conventional approaches will lead to incorrect inference. When interest is in a true intermediate, then the convention to avoid adjustment makes sense to avoid bias due to overadjustment (8). However, even a variable that seems to fit the role of intermediate may act differently once time is taken into account and the full causal process is considered. Specifically, if the variable serving as intermediate is both affected by and affects other variables of interest, the situation is more complex, as has been addressed by Robins (6). Under a set of assumptions including no unmeasured confounding, consistency, positivity, and proper model specification, longitudinal data can be used to derive weights that are used to estimate causal parameters (11).
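The weighting approach mentioned above can be sketched for two time points. This example uses inverse probability weighting for a marginal structural model (11) rather than G-estimation, and everything in it is a hypothetical construction: early breastfeeding (`a0`) is randomized, it changes the chance of rapid infant weight gain (`l1`), and weight gain then influences both continued breastfeeding (`a1`) and age at menarche (`y`). All variables are binary except the outcome, so the weights can be estimated from cell frequencies.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Assumed data-generating process (all numbers are illustrative).
a0 = rng.binomial(1, 0.5, n)                        # early breastfeeding
l1 = rng.binomial(1, np.where(a0 == 1, 0.3, 0.6))   # rapid infant weight gain
a1 = rng.binomial(1, 0.3 + 0.4 * a0 - 0.2 * l1)     # continued breastfeeding
y = 12.5 + 0.1 * a0 + 0.1 * a1 - 0.5 * l1 + rng.normal(size=n)
# Implied joint effects: a0 -> 0.1 + (-0.5)(0.3 - 0.6) = 0.25, a1 -> 0.1.


def cell_prob(target, *strata):
    """Empirical P(target = 1) within each stratum of binary variables."""
    key = np.zeros(len(target), dtype=int)
    for s in strata:
        key = 2 * key + s
    sums = np.bincount(key, weights=target, minlength=2 ** len(strata))
    counts = np.bincount(key, minlength=2 ** len(strata))
    return (sums / counts)[key]


def wls(y, covariates, w=None):
    """Weighted least squares of y on an intercept plus covariates."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    root_w = np.ones(len(y)) if w is None else np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
    return beta


# Stabilized weights for a1; a0 is randomized here, so its weight is 1.
p_num = cell_prob(a1, a0)        # P(a1 = 1 | a0)
p_den = cell_prob(a1, a0, l1)    # P(a1 = 1 | a0, l1)
sw = np.where(a1 == 1, p_num / p_den, (1 - p_num) / (1 - p_den))

crude = wls(y, [a0, a1])         # l1 confounds a1: biased for a1
adjusted = wls(y, [a0, a1, l1])  # l1 mediates a0: biased for a0
msm = wls(y, [a0, a1], sw)       # weighted model targets both joint effects

for name, b in [("crude", crude), ("adjusted", adjusted), ("weighted", msm)]:
    print(f"{name:>9}: a0 = {b[1]:.3f}, a1 = {b[2]:.3f}")
```

In this constructed example neither conventional analysis recovers both effects: leaving weight gain out confounds the estimate for later breastfeeding, and including it removes the part of the early breastfeeding effect that acts through weight gain. The weighted analysis recovers both only because the simulation satisfies no unmeasured confounding, positivity, and correct model specification by design.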
In order to reach appropriate inferences from data that include a confounding intermediate variable, efforts must be made at the stages of study design and data collection. For application of G-estimation and related models, it is not sufficient to measure only point exposure and confounders, even while maintaining temporal ordering. Longitudinally collected data on time-dependent exposure and time-dependent confounders are essential. In the context of breastfeeding and infant growth, researchers must consider factors including the following: those that affect the lactation decision at baseline; those that may affect duration of breastfeeding; and those that would be affected by prior breastfeeding decisions, such as infant crying, other signs of hunger, and supplementation (2).
Defining the variables in a causal system is critical for all epidemiologic studies and can be challenging in light of varying hypotheses among investigators regarding the true underlying causal relations. Each relevant variable must be classified either as strictly a mediator or as both an intermediate variable and a confounder in order to determine the appropriate analysis. The dimension of time adds complexity to causal systems, as the role of variables may be time varying. The studies of Al-Sahab et al. (1) and Kramer et al. (2) provide excellent examples of such a situation with the variable “weight.” Al-Sahab et al. (1) suggest that weight in childhood is on the causal pathway between breastfeeding and age at menarche. In this case, a traditional adjustment would be inappropriate. Kramer et al. (2) postulate that weight in infancy can affect breastfeeding practices. If true, infant weight would confound the association between breastfeeding and age at menarche.
The articles by Al-Sahab et al. (1) and Kramer et al. (2) introduce important issues facing studies of breastfeeding-related outcomes and epidemiologic investigations in general. Although we strive for parsimony in statistical models, the analytical approach must be sufficiently complex to address potential biases, such as those that arise from a failure to consider temporality. Study design, data collection, and statistical analysis must all be taken into account in order to evaluate relations between breastfeeding and outcomes. Doing so is important to inform public health policy and to educate women of child-bearing age. In addition to the established short-term benefits of breastfeeding (12), emerging biologic hypotheses suggest that there are additional long-term effects. These data can be routinely collected and, with the use of previously described methods, the answers are within reach.
Acknowledgments
Author affiliations: Epidemiology Branch, Division of Epidemiology, Statistics, and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, Maryland (Enrique Schisterman, Katherine Bowers); and Department of Public Health, Division of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, Massachusetts (Brian Whitcomb).
Conflict of interest: none declared.
References
1. Al-Sahab B, Adair L, Hamadeh M, et al. Impact of breastfeeding duration on age at menarche. Am J Epidemiol. 2011;173(9):971–977. doi: 10.1093/aje/kwq496.
2. Kramer MS, Moodie EEM, Dahhou M, et al. Breastfeeding and infant size: evidence of reverse causality. Am J Epidemiol. 2011;173(9):978–983. doi: 10.1093/aje/kwq495.
3. Hediger ML, Overpeck MD, Kuczmarski RJ, et al. Association between infant breastfeeding and overweight in young children. JAMA. 2001;285(19):2453–2460. doi: 10.1001/jama.285.19.2453.
4. Moodie E, Platt R, Kramer M. Estimating response-maximized decision rules with applications to breastfeeding. J Am Stat Assoc. 2009;104(485):155–165.
5. Hill AB. The environment and disease: association or causation? Proc R Soc Med. 1965;58:295–300. doi: 10.1177/003591576505800503.
6. Robins JM. Causal models for estimating the effects of weight gain on mortality. Int J Obes (Lond). 2008;32(suppl 3):S15–S41. doi: 10.1038/ijo.2008.83.
7. Flanders WD, Augestad LB. Adjusting for reverse causality in the relationship between obesity and mortality. Int J Obes (Lond). 2008;32(suppl 3):S42–S46. doi: 10.1038/ijo.2008.84.
8. Schisterman EF, Cole SR, Platt RW. Overadjustment bias and unnecessary adjustment in epidemiologic studies. Epidemiology. 2009;20(4):488–495. doi: 10.1097/EDE.0b013e3181a819a1.
9. Cole SR, Platt RW, Schisterman EF, et al. Illustrating bias due to conditioning on a collider. Int J Epidemiol. 2010;39(2):417–420. doi: 10.1093/ije/dyp334.
10. Platt RW, Schisterman EF, Cole SR. Time-modified confounding. Am J Epidemiol. 2009;170(6):687–694. doi: 10.1093/aje/kwp175.
11. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11(5):550–560. doi: 10.1097/00001648-200009000-00011.
12. Godfrey JR, Lawrence RA. Toward optimal health: the maternal benefits of breastfeeding. J Womens Health (Larchmt). 2010;19(9):1597–1602. doi: 10.1089/jwh.2010.2290.