Doubly robust estimators are an approach used for estimating causal effects, usually based on fitting 2 statistical models (1). As the initial motivating example, Scharfstein et al. defined a robust estimator of the causal effect of some exposure X on an outcome Y using models for both Y and X; they demonstrated that such an estimator is consistent if “at least one of the [fitted] models… is correct” (2, p. 1142). Such estimators were later termed “doubly robust” (3, p. 6). Here, we demonstrate that using common (but incorrect) intuition about what makes a model “correct” or “incorrect” can turn doubly robust estimators into estimators that are inconsistent if at least one of the fitted models is wrong. We introduce and resolve this double-robust paradox, demonstrating what must be meant by “correct model.”
We illustrate this double-robust paradox using 2 scenarios for binary, random variables X, Y, and Z measured in a target population. We wish to estimate the average causal effect of X on Y, given by E[Y^(x=1)] − E[Y^(x=0)], where the potential outcome Y^x is the value of Y we would have observed had X been set to x. Doubly robust estimation proceeds with a statistical model for Y, given by f(Y | X, W_y; β), and a statistical model for X, given by g(X | W_x; α), where covariate sets W_y and W_x are both either empty (∅) or contain Z, and where W_y may (or may not) equal W_x; β and α are the parameters that index those models. If Z or ∅ is sufficient to control confounding between X and Y, a doubly robust approach fitted in this way can consistently estimate the average causal risk difference.
Given uncertainty about whether Z or ∅ is sufficient to control confounding, the doubly robust property seems advantageous. In such cases, we may be tempted to use Z in one model and ∅ in the other. For example, we could fit regression models E[Y | X, Z; β] = β_0 + β_1 X + β_2 Z + β_3 XZ and Pr(X = 1; α) = α_0. Our regression model for Y contains X, Z, and their product XZ as regressors, while the “model” for X contains a single parameter α_0 for Pr(X = 1).
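As an illustrative sketch (not part of the original argument), the specification above can be implemented as an augmented inverse-probability-weighted (AIPW) doubly robust estimator of the risk difference. The simulation below assumes a hypothetical data-generating process in which Z is a binary confounder; all parameter values and variable names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical data-generating process in which Z is a confounder (Z -> X, Z -> Y).
z = rng.binomial(1, 0.5, n)
x = rng.binomial(1, 0.3 + 0.4 * z)            # exposure depends on Z
y = rng.binomial(1, 0.2 + 0.2 * x + 0.3 * z)  # true risk difference for X is 0.2

# Outcome model: saturated in X and Z (stratum-specific means of Y).
m = {(xi, zi): y[(x == xi) & (z == zi)].mean() for xi in (0, 1) for zi in (0, 1)}
m1 = np.where(z == 1, m[(1, 1)], m[(1, 0)])   # fitted E[Y | X = 1, Z]
m0 = np.where(z == 1, m[(0, 1)], m[(0, 0)])   # fitted E[Y | X = 0, Z]

# Exposure "model" omitting Z: a single parameter, Pr(X = 1).
e = x.mean()

# AIPW risk difference combining the two fitted models.
psi1 = np.mean(x * (y - m1) / e + m1)
psi0 = np.mean((1 - x) * (y - m0) / (1 - e) + m0)
rd = psi1 - psi0
print(round(rd, 3))  # near 0.2: the correct outcome model rescues the estimator
```

Because the outcome model is saturated in X and Z, it is correctly specified in the statistical sense, and the estimator recovers the true risk difference even though the exposure model ignores the confounder.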
First, suppose Z is a confounder. The model for Y is “correct” because 1) it adjusts for Z and 2) it is saturated. The model for X is “incorrect” because it omits the confounder Z. (We emphasize that “correct” and “incorrect” have common-use definitions rather than technical definitions, a key distinction explained below.) Due to the doubly robust property, we might anticipate that our estimator would consistently estimate the average causal effect of X on Y.
Now suppose Z is instead a mediator. In this case, the model for X is “correct” because, to identify the average causal effect of X on Y, we should not generally adjust for variables affected by exposure (4). The model for Y is “incorrect” for this same reason. Under the common-use definition of correct model specification, our doubly robust estimator will (again) appear to be consistent for the average causal effect. Therein lies the problem; for a doubly robust estimator to give us a correct answer regardless of whether Z is a mediator or a confounder, the statistical model would necessarily yield causal information about X and Z. If the common-use definition were correct, we could, in principle, compare doubly robust estimators with other estimators to test whether Z was a mediator or a confounder.
Seen another way, if Z is a mediator and Z is not a modifier on the additive scale, one model is “correct” for the controlled direct risk difference E[Y^(x=1,z)] − E[Y^(x=0,z)], the effect we would observe if we could manipulate X while also setting the mediator Z to z. The doubly robust estimator in the fully saturated case estimates the quantity ψ = Σ_z {E[Y | X = 1, Z = z] − E[Y | X = 0, Z = z]} Pr(Z = z); this holds regardless of whether Z is a confounder or a mediator. Under the common-use definition of correct model specification, our doubly robust estimator would be consistent for both E[Y^(x=1)] − E[Y^(x=0)] and E[Y^(x=1,z)] − E[Y^(x=0,z)] under the same model specification. If Z is truly a mediator, these estimands cannot, in general, be equal, meaning that the doubly robust property has failed. We term this contradiction the “double-robust paradox.”
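A small simulation can make the paradox concrete. In the hypothetical sketch below, Z is generated as a mediator (X affects Z, Z affects Y) with no additive X–Z interaction, so the controlled direct risk difference (0.2) and the total effect (0.32) differ; the numbers are invented for illustration. The doubly robust estimator pairs the saturated outcome model with the marginal exposure model described in the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical process in which Z is a mediator (X -> Z -> Y), with no
# additive X-Z interaction, so the controlled direct effect is 0.2.
x = rng.binomial(1, 0.5, n)
z = rng.binomial(1, 0.2 + 0.4 * x)            # mediator depends on X
y = rng.binomial(1, 0.1 + 0.2 * x + 0.3 * z)  # total effect = 0.2 + 0.3*0.4 = 0.32

# Saturated outcome model (stratum means) plus marginal exposure "model".
m = {(xi, zi): y[(x == xi) & (z == zi)].mean() for xi in (0, 1) for zi in (0, 1)}
m1 = np.where(z == 1, m[(1, 1)], m[(1, 0)])
m0 = np.where(z == 1, m[(0, 1)], m[(0, 0)])
e = x.mean()

# Augmented inverse-probability-weighted (AIPW) risk difference.
rd = np.mean(x * (y - m1) / e + m1) - np.mean((1 - x) * (y - m0) / (1 - e) + m0)
print(round(rd, 2))  # near the controlled direct effect 0.2, not the total effect 0.32
```

Nothing in the fitting procedure signals that Z was a mediator: the estimator converges to the standardized, stratum-weighted risk difference either way, and only external causal knowledge determines which estimand that quantity represents.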
This apparent paradox is resolved by observing that the common-use definition of correct model specification is, itself, incorrect. The error results from bestowing a causal interpretation on the doubly robust estimate without considering causal identification conditions. Through the lens of causal directed acyclic graphs, we cannot infer the direction of the edges between X, Y, and Z without external information (5). That is, numerical values in a single data set do not tell us how variables affect each other (6). Only with external information, such as temporal ordering or biological plausibility, can we draw accurate causal inferences.
Under appropriate causal identification conditions, this quantity ψ identifies an effect of X under the observed distribution of Z. Thus, ψ can be a marginal effect if Z is a confounder; a controlled direct effect if Z is a mediator and Z is not a modifier on the additive scale; or something else that may not have a causal interpretation (7). For our scenario, a doubly robust method will estimate ψ if Z is in either the model for Y or the model for X, meaning that we can make incorrect causal inferences by misspecifying (common usage) either model. Thus, if we mistake a mediator for a confounder, what was a doubly robust estimator becomes a “double-jeopardy” estimator: Estimators are inconsistent if at least one of the fitted models is wrong.
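The claim that either placement of Z yields the same quantity can be checked with the roles reversed: in the hypothetical sketch below (Z again a mediator, with invented parameter values), Z now enters the exposure model while the outcome model ignores Z. The augmented inverse-probability-weighted (AIPW) estimator still converges to the same standardized risk difference.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Z is again a mediator (X -> Z -> Y); total effect 0.32, controlled direct 0.2.
x = rng.binomial(1, 0.5, n)
z = rng.binomial(1, 0.2 + 0.4 * x)
y = rng.binomial(1, 0.1 + 0.2 * x + 0.3 * z)

# Roles reversed: Z now sits in the exposure model, while the outcome
# model omits Z entirely (marginal means of Y within levels of X).
e = np.where(z == 1, x[z == 1].mean(), x[z == 0].mean())  # fitted Pr(X = 1 | Z)
m1 = np.full(n, y[x == 1].mean())
m0 = np.full(n, y[x == 0].mean())

# AIPW risk difference with the reversed specification.
rd = np.mean(x * (y - m1) / e + m1) - np.mean((1 - x) * (y - m0) / (1 - e) + m0)
print(round(rd, 2))  # again near 0.2: either placement of Z yields the same quantity
```

Because the same number emerges whichever model contains Z, misspecifying (in the common-use sense) either model leads to the same incorrect causal inference when Z is mistaken for a confounder.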
The paradox is resolved by noting that a “correctly specified” model refers to a statistical model that contains the true density of the data; to estimate causal effects, we must also understand identification conditions for the potential outcome of interest, be it Y^x or Y^(x,z). We propose that definitions of “model specification” should avoid conflating statistical and causal concepts by clearly separating the two. Pearl notes that causal assumptions reflect concepts “that cannot be realized by imposing statistical assumptions” (8, p. 39). Further clarification of what distinguishes causal concepts from statistical concepts is needed: We are not aware of a general set of rules that distinguish between causal mistakes that are amenable to double robustness and causal mistakes that result in double jeopardy.
Acknowledgments
This work was supported by the National Institutes of Health (grants DP2-HD-08-4070, T32 HD057833-07, and R01-AI-100654).
Conflict of interest: none declared.
References
- 1. Robins JM, Rotnitzky A, Zhao LP. Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc. 1994;89(427):846–866.
- 2. Scharfstein DO, Rotnitzky A, Robins JM. Rejoinder. [Re: “Adjusting for nonignorable drop-out using semiparametric nonresponse models”]. J Am Stat Assoc. 1999;94(448):1135–1146.
- 3. Robins JM. Robust estimation in sequentially ignorable missing data and causal inference models. In: Proceedings of the American Statistical Association Section on Bayesian Statistical Science. Alexandria, VA: American Statistical Association; 1999:6–10.
- 4. Rosenbaum PR. The consequences of adjustment for a concomitant variable that has been affected by the treatment. J R Stat Soc Ser A. 1984;147(5):656–666.
- 5. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37–48.
- 6. Robins JM. Data, design, and background knowledge in etiologic inference. Epidemiology. 2001;12(3):313–320.
- 7. Hubbard AE. Direct effects and effect among the treated. In: Van der Laan MJ, Rose S, eds. Targeted Learning: Causal Inference for Observational and Experimental Data. 1st ed. (Springer Series in Statistics). New York, NY: Springer Science & Business Media; 2011:133–143.
- 8. Pearl J. Causality: Models, Reasoning, and Inference. New York, NY: Cambridge University Press; 2000.