International Journal of Epidemiology. 2020 Jun 12;49(4):1314–1315. doi: 10.1093/ije/dyaa084

Commentary: Compositional data call for complex interventions

Alexander Breskin, Eleanor J Murray
PMCID: PMC7660156  PMID: 32529230

In their article,1 Arnold et al. describe the unique complexities of causal inference in the setting of compositional exposures. To interpret an association causally, one must make untestable assumptions.2 Perhaps the most fundamental of these is causal consistency, which links the world of observations to the world of potential outcomes.3–5 One way in which causal consistency can be violated is when there are relevant variations in the treatment or intervention of interest.4 Thus, specifying a clear and well-defined intervention on the exposure of interest is the best way to justify the consistency assumption.6 The case of compositional exposures presented by Arnold et al. illustrates how insufficiently well-defined interventions can lead to conflicting or misleading interpretations of causal effects, even when those interventions could be assessed in a randomized trial. The authors describe two potential causal estimands of interest in the setting of compositional exposures, and briefly note that 'in certain situations these effects may not be sufficiently "well-defined"'.1 Although well-defined interventions are not the focus of their paper, we believe this issue deserves further attention.

The first estimand proposed by Arnold et al. is the 'total effect', which represents the causal effect of changing a single component of the exposure and, as a consequence, changing the overall size of the composite total. Because the total effect is obtained by changing a specific component, well-defined interventions are simple to conceptualize but may be challenging to implement in practice. For example, imagine a randomized trial in which individuals are assigned to increase their intake of protein by 5 g per day while keeping all other aspects of their diet constant. Because many individuals do not have a constant habitual diet and may have difficulty with a higher total caloric intake, increasing protein consumption may lead to unintentional changes in other diet components, potentially violating causal consistency and complicating the estimation and interpretation of the total effect.
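
As a sketch in counterfactual notation (our own notation, not that of Arnold et al.), suppose the composition has components X1, …, Xk with total T = X1 + … + Xk and outcome Y. The total effect of adding an amount δ to component X1 can then be written as a contrast in which the remaining components are held at fixed values and the composite total is allowed to increase:

% Total effect (sketch): add delta to X_1 while holding X_2, ..., X_k fixed,
% so the composite total T increases by delta.
\mathrm{TE}(\delta) = \mathbb{E}\bigl[Y^{x_1+\delta,\, x_2,\, \dots,\, x_k}\bigr] - \mathbb{E}\bigl[Y^{x_1,\, x_2,\, \dots,\, x_k}\bigr]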

The second proposed estimand is the 'relative effect', or the effect of changes to a component of interest while the total size of the composite variable is held constant. For the composite total to remain fixed, one or more other components must necessarily also change. As a result, the effect of the exposure component of interest is in some sense mixed with the effects of the changes in the levels of the other components. The challenge of specifying a well-defined intervention for estimating the relative effect will depend on the number of components that can be intervened upon: the more components available, the more detail will be required to ensure consistency. For example, a relative effect of fat consumption could correspond to a trial in which individuals are told to increase their intake of fat by 5 g per day and to maintain a constant caloric intake by decreasing their intake of carbohydrates by an equivalent number of calories. Such an intervention will be possible to assess only if participants are able to make these targeted changes to their diets.
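
Continuing the sketch (again in our notation, and assuming purely for illustration that the compensation occurs entirely through a single component X2), the relative effect pairs the increase in X1 with an equal decrease in X2 so that the composite total is unchanged:

% Relative effect (sketch): add delta to X_1 and subtract delta from X_2,
% leaving the composite total T unchanged; which component(s) absorb the
% change must be specified for the intervention to be well defined.
\mathrm{RE}(\delta) = \mathbb{E}\bigl[Y^{x_1+\delta,\, x_2-\delta,\, x_3,\, \dots,\, x_k}\bigr] - \mathbb{E}\bigl[Y^{x_1,\, x_2,\, x_3,\, \dots,\, x_k}\bigr]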

To better understand the challenges for interpretation in the absence of sufficiently well-defined interventions, consider a third scenario for estimating causal effects from compositional data, which we refer to as the 'weighted average compositional effect' (WACE): the causal effect of changing one component of a composite variable without specifying the required changes to the other components or to the composite total.

The WACE can be validly estimated in a trial in which individuals are simply told to increase their fat consumption without being given guidance or encouragement on the other components of their diets. In such a trial, for some individuals, an increase in fat consumption will lead to a decrease in protein and carbohydrate consumption, whereas for others the other components of their diets will not change or may even increase.

Such a trial would be simple to design and conduct, and thus the WACE may seem to be an attractive alternative to the total and relative effects. However, the results of a trial aimed at estimating this effect would be challenging to interpret because the causal estimand is, as the name implies, a weighted average over all of the changes to the other components and to the total value of the composite variable that occurred among the trial participants, with weights equal to the frequency of those changes in the trial sample.
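
The dependence on the trial sample can be made explicit with a further sketch (again in our notation, under the assumption that each participant's response follows one of a set of 'compensation patterns' c describing how the other components, and hence the total, change): the WACE is a weighted average of pattern-specific effects, with weights p_c equal to the frequency of each pattern in the trial sample.

% WACE (sketch): add delta to X_1 without specifying the other components;
% each compensation pattern c describes how X_2, ..., X_k (and hence T) respond,
% and p_c is the frequency of pattern c in the trial sample.
\mathrm{WACE}(\delta) = \sum_{c} p_c \Bigl\{ \mathbb{E}\bigl[Y^{x_1+\delta,\, c}\bigr] - \mathbb{E}\bigl[Y^{x_1,\, x_2,\, \dots,\, x_k}\bigr] \Bigr\}, \qquad \sum_{c} p_c = 1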

Unless rich data are collected over follow-up, both the nature and the frequency of these changes are likely to be unknown. Unfortunately, whenever the distribution of these changes differs between the trial sample and the target population, the value of the WACE can be expected to differ as well, even when an internally valid effect estimate is obtained from a randomized controlled trial. In an observational study, the weighted average effect presents even more challenges: without an understanding of all the ways in which the other components change, it becomes difficult to claim that all confounders can be identified and controlled.

To conclude, in their description of causal inference in the setting of compositional data, Arnold et al. provide an opportunity to explore the implications of different interventions on the same exposure. Although the call for well-defined interventions has been described as overly restrictive and even politically conservative,7 we believe the contrary. By aiding interpretation, improving the plausibility of causal assumptions and informing identification strategies, well-defined interventions lead to robust, translatable causal inference that can be used to guide practice and policy. Such information is needed to design and choose among interventions to reduce disparities and improve health outcomes. Estimands that do not correspond to even theoretical well-defined interventions may not translate into actionable effects in the real world, and can potentially lead to misleading evidence and poorly informed decisions.

Funding

EM was partially supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (R21HD098733).

Conflict of interest

None declared.

References

  • 1. Arnold K, Berrie L, Tennant P  et al.  A causal inference perspective on the analysis of compositional data. Int J Epidemiol  2020;49:1307–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Hernán MA, Robins JM.  Causal Inference: What IF. Boca Raton, FL: Chapman & Hall/CRC, 2020. [Google Scholar]
  • 3. Cole SR, Frangakis CE.  The consistency statement in causal inference. Epidemiology  2009;20:3–5. [DOI] [PubMed] [Google Scholar]
  • 4. VanderWeele TJ.  Concerning the consistency assumption in causal inference. Epidemiology  2009;20:880–83. [DOI] [PubMed] [Google Scholar]
  • 5. Pearl J.  On the consistency rule in causal inference: axiom, definition, assumption, or theorem?  Epidemiology  2010;21:872–75. [DOI] [PubMed] [Google Scholar]
  • 6. Hernán MA.  Does water kill? A call for less casual causal inferences. Ann Epidemiol  2016;26:674–80. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Schwartz S, Prins SJ, Campbell UB  et al.  Is the “well-defined intervention assumption” politically conservative?  Soc Sci Med  2016;166:254–57. [DOI] [PMC free article] [PubMed] [Google Scholar]
