Author manuscript; available in PMC: 2014 Feb 1.
Published in final edited form as: Eur J Epidemiol. 2013 Feb 1;28(2):113–117. doi: 10.1007/s10654-013-9770-6

Unmeasured confounding and hazard scales: sensitivity analysis for total, direct, and indirect effects

Tyler J VanderWeele 1
PMCID: PMC3606287  NIHMSID: NIHMS441215  PMID: 23371044

Mediation Analysis Using Additive Hazards Versus Proportional Hazards Models

In their paper, Nordahl et al. [1] use an additive hazards model approach to examine potential behavioral mediators of the relationship between education and coronary heart disease. They also compare and contrast their approach and results with those obtained using a proportional hazards model.

As noted by Nordahl et al. [1], and also elsewhere [2,3,4], the standard approach to using the proportional hazards model for mediation, namely adding a mediator as a covariate to the model and then seeing whether the coefficient for the exposure changes, is subject to some important limitations. As discussed in greater detail below, such an approach makes strong confounding assumptions and also a number of modeling assumptions. The approach is also sometimes problematic because the hazard ratio is not in general a ‘collapsible’ measure: hazard ratios from models that include different sets of covariates are not directly comparable to one another. This creates problems for the conventional approach (sometimes referred to as the “difference method”), which compares the exposure coefficient with and without the mediator in the model. Lange and Hansen [2] argued on these grounds that the proportional hazards model should not be used in this way to assess mediation. However, VanderWeele [3] showed that, although these objections apply generally to the proportional hazards model, the approach can still be used provided the outcome is relatively rare. With a rare outcome, the hazard ratio is approximately collapsible, and the “difference method” of assessing mediation is then valid provided the confounding and modeling assumptions hold [3]. One of the advantages of the additive hazards model approach used by Nordahl et al. [1] is that mediation can be assessed without the outcome being rare.

In the actual application of Nordahl et al. [1], the outcome, coronary heart disease, is relatively rare and thus either the proportional hazards model or the additive hazards model approach could potentially be used. The two models make somewhat different modeling assumptions: the additive hazards model assumes that the hazard is linear in the covariates. The proportional hazards model assumes that the log of the hazard ratio is linear in the covariates. In their most basic form, when used to assess mediation, both models would also assume no interaction between the exposure and the mediator; however, this assumption can be relaxed in assessing mediation both with the additive hazards model [2] and with the proportional hazards model [3]. Indeed, allowing for exposure-mediator interaction is one of the advantages of the new methods for direct and indirect effects that have been developing in the causal inference literature [2–10]. In their analysis, however, Nordahl et al. report not finding strong evidence of interaction.
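To make the contrast concrete, the two basic no-interaction forms might be written as follows. The notation is assumed here for illustration and is not taken from Nordahl et al.: exposure a, mediator m, covariates c, baseline hazard λ0(t), and time-constant coefficients, which is itself a simplification.

```latex
\begin{align*}
\text{Additive hazards:}\quad
  & \lambda(t \mid a, m, c) = \lambda_0(t) + \beta_a a + \beta_m m + \beta_c^{\top} c \\
\text{Proportional hazards:}\quad
  & \lambda(t \mid a, m, c) = \lambda_0(t)\,
    \exp\!\left(\theta_a a + \theta_m m + \theta_c^{\top} c\right)
\end{align*}
```

Relaxing the no-interaction assumption would correspond to adding a product term in a·m to either linear predictor.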

Nordahl et al. [1] report and compare results from both the additive hazards model and the proportional hazards model using three mediators: smoking, physical activity and BMI. They compare short to long periods of education and medium to long periods of education. They consider results separately for men and women. For smoking as a mediator, comparing short to long education and using an additive hazards model, they report 6% proportion mediated for women and 8% for men; the corresponding estimates from the proportional hazards model are 6% and 6% respectively. For physical activity as a mediator, comparing short to long education and using an additive hazards model, they report 0% proportion mediated for women and 1% for men; the corresponding estimates from the proportional hazards model are 0% and 2% respectively. For BMI as a mediator, comparing short to long education and using an additive hazards model, they report 14% proportion mediated for women, and 21% for men; the corresponding estimates from the proportional hazards model are 15% and 23% respectively. The results in all cases are nearly identical when comparing the additive hazards and proportional hazards models. The use of both the additive hazards and proportional hazards models suggests that the results are not particularly dependent on the modeling assumptions made in these approaches in their application.

Sensitivity Analysis for Unmeasured Confounding for Survival Data

However, the analysis of mediation makes not only statistical modeling assumptions but also strong assumptions about confounding. The confounding assumptions that are required to draw valid conclusions about mediation and direct and indirect effects are in general much stronger than those required to draw valid conclusions about total effects. As noted in Nordahl et al. [1], and elsewhere [2–10], to obtain valid estimates of direct and indirect effects one in general must assume that the measured covariates suffice to control for (i) exposure-outcome confounding, (ii) mediator-outcome confounding, and (iii) exposure-mediator confounding, and one must further assume (iv) none of the mediator-outcome confounders are affected by the exposure [2,3,6,10]. These are strong assumptions. In many settings they will be violated. For example, in the study of Nordahl et al., when considering smoking as a mediator, these assumptions would require that control had been made for confounders of (i) the education-heart disease relationship, (ii) the smoking-heart disease relationship, and (iii) the education-smoking relationship, and further that (iv) none of the smoking-heart disease confounders are affected by education. Nordahl et al. note that country of birth might be an unmeasured confounder in their study; it might for example affect both the mediator, smoking, and the outcome, coronary heart disease.

With observational data it will not in general be possible to control for all confounding variables. However, sensitivity analysis can be used to assess the extent to which an unmeasured confounding variable might change estimates of effects. Such techniques are well established for total effects [11–15] and have begun to develop for direct and indirect effects [8,9]. Most of the literature on sensitivity analysis concerns binary or continuous outcomes. One way to proceed with sensitivity analysis with time-to-event data would be to take whatever survival analysis model is being employed and to use this to compute survival probabilities at a fixed point in time, e.g. 1-year survival or 5-year survival. In comparing survival rates for two different exposure levels, one could then use these survival probabilities and apply sensitivity analysis techniques for a binary outcome on either the difference or ratio scale [11–15]. Often, however, as in the paper by Nordahl et al. [1], hazard scales are used, e.g. the hazard difference or the hazard ratio. Sensitivity analysis using hazard ratio scales can be more difficult due to the non-collapsibility of hazards as mentioned above. However, as before, when the outcome is rare, this non-collapsibility is essentially a non-issue and progress can be made. In the remainder of this commentary, a number of sensitivity analysis techniques are presented that can be used for total, direct and indirect effects on the hazard difference or hazard ratio scale when the outcome is rare. The results presented here take techniques for binary and continuous outcomes [8,15] and extend their applicability to time-to-event outcomes. Formal derivations are given in the online supplement to this commentary. Nordahl et al. [1] give an example of such sensitivity analysis in their paper using ideas presented in VanderWeele [8]. The development here gives the formal results justifying their approach for the hazard scale.
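The fixed-time conversion mentioned above can be sketched in a few lines. The sketch assumes a constant hazard in each exposure group, which is a simplification made only for illustration (in practice the survival probabilities would come from whatever model was fitted), and all function names and numbers are hypothetical. It also illustrates why the rare-outcome case is convenient: with a low baseline hazard the 5-year risk ratio stays close to the hazard ratio, whereas with a common outcome the two diverge.

```python
import math


def risk_by_time(hazard, t):
    """Cumulative risk by time t, assuming a constant hazard (illustrative only)."""
    return 1.0 - math.exp(-hazard * t)


def risk_ratio_from_hazards(baseline_hazard, hazard_ratio, t):
    """Risk ratio at time t implied by a baseline hazard and a hazard ratio."""
    risk_unexposed = risk_by_time(baseline_hazard, t)
    risk_exposed = risk_by_time(baseline_hazard * hazard_ratio, t)
    return risk_exposed / risk_unexposed


# Hypothetical inputs: hazard ratio 1.5, follow-up of 5 years.
rare = risk_ratio_from_hazards(baseline_hazard=0.002, hazard_ratio=1.5, t=5)
common = risk_ratio_from_hazards(baseline_hazard=0.2, hazard_ratio=1.5, t=5)

print(f"rare outcome:   5-year risk ratio = {rare:.3f}")
print(f"common outcome: 5-year risk ratio = {common:.3f}")
```

The resulting risks (or risk ratios) are binary-outcome quantities, so the binary-outcome sensitivity analysis techniques cited in the text could then be applied to them.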

Sensitivity Analysis for Total Effects for Hazard Differences and Hazard Ratios

First, consider total effects on the hazard difference scale. Suppose we obtain estimates of such effects, e.g. using an additive hazards model, controlling for our measured covariates which we will denote by C. Suppose now that there is also an unmeasured covariate U and that we would have controlled for confounding for the effect of the exposure on the outcome if we had controlled for C and U but not simply by controlling for C alone. Under some simplifying assumptions, we can proceed with a very easy-to-use sensitivity analysis technique to assess what the estimate would be that we would have obtained had we been able to adjust for U as well. Specifically we will assume that the unmeasured variable U is binary and that the effect of U on the outcome on the hazard difference scale is the same for both exposure groups (i.e. no interaction between the effects of U and the exposure on the additive scale). These assumptions can be relaxed and a more general approach is presented in the online supplement. We will also assume here, and in all of the results in this commentary, that the outcome is relatively rare (e.g. as in the Nordahl et al. study). Under these assumptions we can carry out sensitivity analysis by specifying two sensitivity analysis parameters. We need to specify the effect of U on the outcome on the hazard difference scale, conditional on the exposure and the covariates; we will call this parameter γ. We also need to specify the difference in the prevalence of U amongst the exposed and the unexposed; we will call this parameter δ. We can then calculate what we might call a bias factor by taking the product γδ. As shown in the online supplement, to obtain a “corrected” estimate of the effect on the hazard difference scale (i.e. what we would have obtained had we controlled for U as well), we can simply take our estimate from the observed data, controlling only for C, and subtract the bias factor γδ from the estimate; under the simplifying assumptions above we can also obtain a corrected confidence interval by subtracting the bias factor γδ from both limits of the confidence interval. We could then vary the sensitivity analysis parameters according to values that were thought plausible or as informed by external information or expert opinion to see how sensitive estimates were to the possibility of unmeasured confounding.
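As a minimal sketch of the correction just described, the bias factor γδ is subtracted from the point estimate and from both confidence limits. The function name and all numeric values below are hypothetical, chosen only to illustrate the arithmetic; they are not results from Nordahl et al.

```python
def correct_hazard_difference(estimate, ci_lower, ci_upper, gamma, delta):
    """Subtract the bias factor gamma * delta from a hazard-difference
    estimate and from both limits of its confidence interval.

    gamma: assumed effect of U on the outcome (hazard difference scale),
           conditional on exposure and measured covariates.
    delta: assumed difference in prevalence of U, exposed minus unexposed.
    """
    bias = gamma * delta
    return estimate - bias, ci_lower - bias, ci_upper - bias


# Hypothetical observed estimate and 95% CI (events per person-year).
est, lo, hi = correct_hazard_difference(
    estimate=0.010, ci_lower=0.004, ci_upper=0.016, gamma=0.02, delta=0.1
)
print(f"corrected estimate: {est:.4f} (95% CI {lo:.4f} to {hi:.4f})")
```

Repeating the call over a range of plausible γ and δ values gives the sensitivity analysis proper.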

A similar approach can be carried out on the hazard ratio scale. For the hazard ratio scale we will again assume a rare outcome and a binary unmeasured confounder U but now we will assume that the effect of U on the outcome on the hazard ratio scale does not vary across exposure groups (i.e. no interaction between the effects of U and the exposure on the hazard ratio scale). Under these simplifying assumptions we will specify three parameters to carry out the sensitivity analysis. We will now let γ denote the effect of U on the outcome on the hazard ratio scale, conditional on the exposure and the covariates. And we will let π1 and π0 denote the prevalence of U amongst the exposed and unexposed respectively, conditional on measured covariates C. Once we have specified these parameters we can obtain a bias factor on the hazard ratio scale by the formula: [1+(γ−1) π1]/ [1+(γ−1) π0]. We can then simply take our estimate of the hazard ratio from the observed data, controlling only for C, and now divide the estimate by this bias factor to obtain a corrected estimate (what we would have obtained had we controlled for U as well). Under the simplifying assumptions above we can also obtain a corrected confidence interval by dividing both limits of the confidence interval by the bias factor. A similar formula for total effects for the hazard ratio scale, but using different assumptions, was obtained by Lin et al. [16] but the assumptions they made were not plausible in the context of U being an unmeasured confounder [17,18] as they effectively assume that either the unmeasured covariate U or the measured covariates C do not affect exposure (in which case the variable would not be a confounder). The simplifying assumption here that the effect of U on the outcome on the hazard ratio scale is the same in both exposure groups might be reasonable in some settings; more general results are also given in the online supplement.
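The hazard ratio correction can be sketched in the same way. The bias factor follows the formula given in the text; the function names and the numbers in the example are hypothetical and serve only to show the calculation.

```python
def hr_bias_factor(gamma, pi1, pi0):
    """Bias factor [1 + (gamma - 1) * pi1] / [1 + (gamma - 1) * pi0].

    gamma: assumed hazard ratio relating U to the outcome.
    pi1, pi0: assumed prevalence of U among the exposed and the unexposed.
    """
    return (1 + (gamma - 1) * pi1) / (1 + (gamma - 1) * pi0)


def correct_hazard_ratio(estimate, ci_lower, ci_upper, gamma, pi1, pi0):
    """Divide the estimate and both confidence limits by the bias factor."""
    bias = hr_bias_factor(gamma, pi1, pi0)
    return estimate / bias, ci_lower / bias, ci_upper / bias


# Hypothetical observed hazard ratio and 95% CI.
bias = hr_bias_factor(gamma=2.0, pi1=0.5, pi0=0.25)
hr, lo, hi = correct_hazard_ratio(1.8, 1.2, 2.7, gamma=2.0, pi1=0.5, pi0=0.25)
print(f"bias factor: {bias:.3f}")
print(f"corrected HR: {hr:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

Note that when π1 equals π0, or when γ equals 1, the bias factor is 1 and the estimate is left unchanged, as expected when U is not associated with both exposure and outcome.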

The two formulas presented above for the hazard difference and hazard ratio scales are exactly the same as those presented on the additive scale for binary or continuous outcomes and on the ratio scale for binary outcomes, respectively [12,15]. As shown in the online supplement, the formulas are justified and the analogues hold under the assumption that the time-to-event outcome is relatively rare.

Sensitivity Analysis for Direct and Indirect Effects for Hazard Differences and Hazard Ratios

In the paper by Nordahl et al. [1], interest is not simply in the total effects of the education exposure on the coronary heart disease outcome but rather in assessing mediation and direct and indirect effects. One notion of a direct effect, sometimes called a “controlled direct effect” [6], is the measure of the effect of the exposure on the outcome when the mediator is fixed to some particular level. For example, in the context of the Nordahl et al. [1] study with smoking as a mediator, we might want to consider how large the effect of education on coronary heart disease would be if we could intervene to prevent smoking for everyone. For such controlled direct effects we need to control for both exposure-outcome and mediator-outcome confounding. Suppose that there were an unmeasured mediator-outcome confounder U but we controlled only for measured covariates C. Suppose that if we had controlled for both C and U we would have controlled for exposure-outcome and mediator-outcome confounding. We can once again use sensitivity analysis to examine how such an unmeasured confounder might change estimates of the controlled direct effect on the hazard difference or hazard ratio scale (e.g. estimates obtained using an additive hazards or a proportional hazards model including the exposure, mediator and measured covariates C in the model).

For the hazard difference scale, we again specify two parameters. We will again assume the outcome is rare and we assume a binary unmeasured confounder U such that the effect of U on the outcome on the hazard difference scale is the same for both exposure groups. The approach is very similar to that for total effects but the interpretation of the parameters is somewhat different. Suppose controlling only for measured covariates C we obtained (e.g. using an additive hazards model) an estimate of the hazard difference of the exposure on the outcome with the mediator fixed to value m. For the first parameter, we specify the effect of U on the outcome on the hazard difference scale, conditional on the exposure, the mediator and the covariates; note this is the effect of the unmeasured confounder U on the outcome not through the mediator M; we will call this parameter γm. We also need to specify the difference between the prevalence of U amongst the exposed conditional on M=m and the prevalence of U amongst the unexposed, again conditional on M=m; we will call this parameter δm. See VanderWeele [8] for further discussion of the interpretation of these prevalences of the unmeasured confounder U conditional on the mediator value M=m. We can then compute a bias factor for the controlled direct effect on the hazard difference scale by taking the product γmδm. We can obtain a corrected estimate of the controlled direct effect on the hazard difference scale by subtracting the bias factor from the estimate and, under the simplifying assumptions above, we can also obtain a corrected confidence interval by subtracting the bias factor from both limits of the confidence interval.
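Because γm and δm are not known, the correction is typically examined over a range of values. The following sketch tabulates the corrected controlled direct effect over a small grid and also reports the δm that would reduce the estimate exactly to zero for a given γm. The grid values, the observed estimate, and the helper names are all hypothetical; the "nullifying" value is simple arithmetic on the bias factor rather than a result taken from the commentary.

```python
from itertools import product


def corrected_cde_hazard_difference(estimate, gamma_m, delta_m):
    """Controlled direct effect estimate minus the bias factor gamma_m * delta_m."""
    return estimate - gamma_m * delta_m


def delta_to_nullify(estimate, gamma_m):
    """delta_m at which the corrected estimate equals zero, for a given gamma_m.

    delta_m is a difference in prevalences, so a returned value outside
    [-1, 1] means no such confounder could fully explain the estimate.
    """
    return estimate / gamma_m


estimate = 0.006  # hypothetical observed hazard difference with M fixed at m
gammas = [0.005, 0.010, 0.020]
deltas = [0.1, 0.2, 0.3]

grid = {
    (g, d): corrected_cde_hazard_difference(estimate, g, d)
    for g, d in product(gammas, deltas)
}
for (g, d), value in grid.items():
    print(f"gamma_m={g:.3f}  delta_m={d:.1f}  corrected={value:+.4f}")

for g in gammas:
    print(f"gamma_m={g:.3f}  delta_m needed to nullify={delta_to_nullify(estimate, g):.2f}")
```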

For the hazard ratio scale, we assume again a rare outcome and that U is a binary unmeasured mediator-outcome confounder; we now assume that the effect of U on the outcome on the hazard ratio scale is the same for both exposure groups. Suppose, controlling only for measured covariates C, we had obtained (e.g. using a proportional hazards model) an estimate of the hazard ratio for the effect of the exposure on the outcome with the mediator fixed to value m. We specify three sensitivity analysis parameters. We now let γm denote the effect of U on the outcome on the hazard ratio scale, conditional on the exposure, the mediator and the covariates; again this is the effect of U on the outcome not through the mediator. We let π1m and π0m denote the prevalence of U amongst the exposed conditional on M=m and the prevalence of U among the unexposed, again conditional on M=m. We can obtain a bias factor on the hazard ratio scale by the formula: [1+(γm −1) π1m]/ [1+(γm −1) π0m]. We can take our estimate of the hazard ratio from the observed data, controlling only for C, and divide the estimate by this bias factor to obtain a corrected estimate and, under the simplifying assumptions above, we can also obtain a corrected confidence interval by dividing both limits of the confidence interval by the bias factor as well. More general expressions for controlled direct effects for both the hazard difference and hazard ratio scales, which do not require the simplifying assumptions, are given in the online supplement. The expressions given here are analogous to those given in VanderWeele [8] for binary and continuous variables. As shown in the online supplement, they are applicable to time-to-event outcomes as well when the outcome is rare.
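A worked example may help fix the arithmetic. All numbers here are hypothetical and chosen only for illustration: an observed controlled direct effect hazard ratio of 1.40 (95% CI 1.10 to 1.78), with assumed sensitivity parameters γm = 1.5, π1m = 0.4 and π0m = 0.2.

```latex
\begin{align*}
B &= \frac{1 + (\gamma_m - 1)\,\pi_{1m}}{1 + (\gamma_m - 1)\,\pi_{0m}}
   = \frac{1 + 0.5 \times 0.4}{1 + 0.5 \times 0.2}
   = \frac{1.20}{1.10} \approx 1.091, \\[4pt]
\mathrm{HR}_{\text{corrected}} &= \frac{1.40}{B} \approx 1.283, \qquad
\text{95\% CI: } \left(\frac{1.10}{B},\ \frac{1.78}{B}\right) \approx (1.008,\ 1.632).
\end{align*}
```

Under these assumed values the corrected interval still excludes 1, but only narrowly; a somewhat stronger confounder would move the lower limit below 1.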

Using an additive hazard model, Nordahl et al. [1] estimate quantities on the hazard difference scale that are sometimes called “natural direct and indirect effects” [6]. These natural direct and indirect effects have the property that they sum up to the total effect. Under the confounding assumptions mentioned above that the measured covariates suffice to control for (i) exposure-outcome confounding, (ii) mediator-outcome confounding, and (iii) exposure-mediator confounding, and that (iv) none of the mediator-outcome confounders are affected by the exposure, these effects can be estimated from the data [2,3,6–10]. Under the further assumption that the exposure and mediator do not interact in their effects on the outcome the controlled direct effect will equal the natural direct effect; and the natural indirect effect will equal the total effect minus the controlled direct effect [3,7]. If we were concerned about unmeasured mediator-outcome confounding but were willing to assume absence of exposure-mediator interaction then we could apply the techniques described above for controlled direct effects, under the assumptions described above, and use them for the natural direct effects. We could also then use the opposite of these formulas for the natural indirect effects, i.e. for the natural indirect effect on the hazard difference scale we could add the bias factor γmδm to the natural indirect effect estimate (whereas we would subtract this from the natural direct effect) and for the natural indirect effect on the hazard ratio scale we could multiply the natural indirect effect estimate by [1+(γm −1) π1m]/ [1+(γm −1) π0m] (whereas for the natural direct effect we would divide the estimate by this expression). More general sensitivity analysis techniques for natural direct and indirect effects for the hazard difference scale which do not assume the absence of exposure-mediator interaction and which do not make the simplifying assumptions above are given in the online supplement.
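The opposite-direction corrections for natural direct and indirect effects can be sketched as below, under the no-interaction and simplifying assumptions described above. Function names and numbers are hypothetical. One consequence visible in the arithmetic is that the sum of the two effects on the hazard difference scale, and their product on the hazard ratio scale, is unchanged by the correction: the bias is moved between the direct and indirect components.

```python
def hr_bias_factor(gamma_m, pi1m, pi0m):
    """Bias factor [1 + (gamma_m - 1) * pi1m] / [1 + (gamma_m - 1) * pi0m]."""
    return (1 + (gamma_m - 1) * pi1m) / (1 + (gamma_m - 1) * pi0m)


def correct_natural_effects_hd(nde, nie, gamma_m, delta_m):
    """Hazard difference scale: subtract the bias from the direct effect
    and add it to the indirect effect."""
    bias = gamma_m * delta_m
    return nde - bias, nie + bias


def correct_natural_effects_hr(nde, nie, gamma_m, pi1m, pi0m):
    """Hazard ratio scale: divide the direct effect by the bias factor
    and multiply the indirect effect by it."""
    bias = hr_bias_factor(gamma_m, pi1m, pi0m)
    return nde / bias, nie * bias


# Hypothetical estimates on each scale.
nde_hd, nie_hd = correct_natural_effects_hd(0.008, 0.002, gamma_m=0.01, delta_m=0.1)
nde_hr, nie_hr = correct_natural_effects_hr(1.32, 1.10, gamma_m=2.0, pi1m=0.5, pi0m=0.25)

print(f"hazard difference: NDE={nde_hd:.4f}, NIE={nie_hd:.4f}, sum={nde_hd + nie_hd:.4f}")
print(f"hazard ratio:      NDE={nde_hr:.3f}, NIE={nie_hr:.3f}, product={nde_hr * nie_hr:.3f}")
```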

Conclusion

The results given here will be of use to investigators when considering the role of unmeasured confounding in the analysis of total, direct and indirect effect estimates with time-to-event data on either the hazard difference or hazard ratio scale. The central limitation of the techniques described here is that they are only applicable to time-to-event outcomes that are relatively rare. While the rare outcome requirement may be reasonable in some settings, such as that in Nordahl et al. [1], in many cases the outcome may be common and the results here inapplicable. The results here are analogous to those for binary and continuous outcomes but the rare outcome assumption was needed to derive the analogues for hazard scales. Future research could consider alternative sensitivity analysis techniques for time-to-event data on the hazard difference and hazard ratio scales which do not require a rare outcome. See Tchetgen Tchetgen [19] for one such approach. As noted above, with a common time-to-event outcome, estimates could also be converted into e.g. 1-year or 5-year survival probabilities and sensitivity analysis techniques for a binary outcome could be employed with the survival probabilities. Techniques to conduct sensitivity analysis on a hazard scale with a common outcome still need further development.

Supplementary Material

10654_2013_9770_MOESM1_ESM

References

  • 1. Nordahl H, Rod NH, Frederiksen BL, Andersen I, Lange T, Diderichsen F, Prescott E, Overvad K, Osler M. Education and risk of coronary heart disease: Assessment of mediation by behavioral risk factors using the additive hazards model. European Journal of Epidemiology. doi: 10.1007/s10654-012-9745-z. in press.
  • 2. Lange T, Hansen JV. Direct and indirect effects in a survival context. Epidemiology. 2011;22:575–581. doi: 10.1097/EDE.0b013e31821c680c.
  • 3. VanderWeele TJ. Causal mediation analysis with survival data. Epidemiology. 2011;22:582–585. doi: 10.1097/EDE.0b013e31821db37e.
  • 4. Martinussen T, Vansteelandt S, Gerster M, von Bornemann Hjelmborg J. Estimation of direct effects for survival data by using the Aalen additive hazards model. Journal of the Royal Statistical Society, Series B. 2011;73:773–788.
  • 5. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3:143–155. doi: 10.1097/00001648-199203000-00013.
  • 6. Pearl J. Direct and Indirect Effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence; Morgan Kaufmann; San Francisco, CA. 2001. pp. 411–420.
  • 7. VanderWeele TJ, Vansteelandt S. Odds ratios for mediation analysis with a dichotomous outcome. American Journal of Epidemiology. 2010;172:1339–1348. doi: 10.1093/aje/kwq332.
  • 8. VanderWeele TJ. Bias formulas for sensitivity analysis for direct and indirect effects. Epidemiology. 2010;21:540–551. doi: 10.1097/EDE.0b013e3181df191c.
  • 9. Imai K, Keele L, Tingley D. A General Approach to Causal Mediation Analysis. Psychological Methods. 2010;15(4):309–334. doi: 10.1037/a0020761.
  • 10. Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods. doi: 10.1037/a0031034. in press.
  • 11. Cornfield J, Haenszel W, Hammond EC, Lilienfeld AM, Shimkin MB, Wynder LL. Smoking and lung cancer: Recent evidence and a discussion of some questions. Journal of the National Cancer Institute. 1959;22:173–203.
  • 12. Schlesselman JJ. Assessing effects of confounding variables. American Journal of Epidemiology. 1978;108:3–8.
  • 13. Rosenbaum PR, Rubin DB. Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. Journal of the Royal Statistical Society, Series B. 1983;45:212–218.
  • 14. Rothman KJ, Greenland S, Lash TL, editors. Modern epidemiology. 3rd Edition. Philadelphia: Lippincott; 2008.
  • 15. VanderWeele TJ, Arah OA. Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments and confounders. Epidemiology. 2011;22:42–52. doi: 10.1097/EDE.0b013e3181f74493.
  • 16. Lin DY, Psaty BM, Kronmal RA. Assessing the sensitivity of regression results to unmeasured confounders in observational studies. Biometrics. 1998;54:948–963.
  • 17. Hernán MA, Robins JM. Letter to the editor of Biometrics. Biometrics. 1999;55:1316–1317.
  • 18. VanderWeele TJ. Sensitivity analysis: distributional assumptions and confounding assumptions. Biometrics. 2008;64:645–649. doi: 10.1111/j.1541-0420.2008.01024.x.
  • 19. Tchetgen Tchetgen EJ. On causal mediation analysis with a survival outcome. International Journal of Biostatistics. 2011;7(Article 33):1–38. doi: 10.2202/1557-4679.1351.
