Abstract
A substitution estimator can be used to predict how shifts in population exposures might change health. We illustrated this method by estimating how an upper limit on alcohol outlet density might alter binge drinking in the New York Social Environment Study (n = 4000), and provided statistical code and sample data.
The largest differences in binge drinking were for an upper limit of 70 outlets per square mile; there was a −0.7% difference in binge drinking prevalence for New York City overall (95% confidence interval [CI] = −0.2%, −1.3%) and a −2.4% difference in binge drinking prevalence for the subset of communities the intervention modified (95% CI = −0.5%, −4.0%).
A substitution estimator is a flexible tool for estimating population intervention parameters and improving the translation of public health research results to practitioners.
The production of population health is inextricably linked to the social and physical environments that shape our health-related exposures. Recognizing this, in recent years community randomized trials and nonrandomized interventions have aimed to alter community characteristics to improve population health.1 For example, on the basis of evidence that community gender norms are related to HIV risk behaviors, community trials have aimed to alter norms to reduce HIV incidence.2 Similarly, research has identified aspects of the community food environment that are related to obesity, and a variety of strategies are being implemented to increase access to and consumption of healthier foods.3 In his seminal work, The Strategy of Preventive Medicine, Geoffrey Rose was among the first to articulate the importance of shifting health distributions rather than targeting only high-risk individuals.4 This work has contributed to the framing of theoretical models of health that incorporate determinants at multiple levels of influence.5–7
To effectively promote community health in both trial and programmatic settings, we need to understand the comparative impact of different possible interventions on community- or population-level factors. Traditional analytic methods, such as regression, although useful for identifying which community characteristics are related to specific aspects of health, fall short if we are interested in gauging the magnitude of a change in health that might be expected from different plausible shifts in community exposures. Estimation of these magnitudes is important both for deciding between intervention efforts and for anticipating the plausible magnitudes of change once interventions are under way; these are often far more modest than a regression coefficient would indicate.
Methods have been developed in recent years that are well suited to the examination of the magnitudes of changes in health that might be expected from specific changes in population-level exposures. These methods were originally motivated by problems posed by complex confounding.8,9 However, they are also useful tools for answering questions such as those Rose has highlighted about how shifting exposures have the potential to change distributions of health.
Most health researchers are familiar with population-attributable risk, a measure that aims to quantify the health effects of removing an exposure entirely from the population. These newer methods generalize measures like population-attributable risk to any alteration of the exposure that corresponds to a change we might aim to achieve with intervention. For example, it might not be feasible to remove an exposure entirely, but reducing it by a specified amount might be a reasonable goal. A researcher might also wonder whether reducing the exposure for all places is best or whether focusing efforts on certain areas would be better. It may also be important to consider and incorporate some variability in the extent to which the reduction goal is achieved. These newer methods can be used to estimate how specific alterations in the exposure, including incremental or decremental and targeted or stochastic alterations, relate to changes in health in the population. Thus far, explanations and illustrations of these methods have largely involved clinical applications and been limited to the epidemiology journals.10–15
We have explained the utility and implementation of 1 analytic method that can address questions about specific shifts in population exposures and how they relate to health. To illustrate this method, we estimated the relations between different alterations in alcohol outlet density and binge drinking in New York City (NYC) on the basis of overall relations previously reported in the Journal.16 We have provided computer programming code to implement analyses in R (R Foundation for Statistical Computing, Vienna, Austria) and Stata (StataCorp LP, College Station, TX) as well as a simple simulated data set to facilitate broader use of the method.
METHODS
The data for this illustration come from the New York Social Environment Study. Briefly, the New York Social Environment Study is a multilevel study of 4000 adults conducted in 2005 and designed to examine economic, social, and structural characteristics of neighborhoods and their relations with substance use and mental health in NYC. The neighborhood units for this analysis were the 59 community districts in NYC (Ahern and Galea17 and Ahern et al.18,19 provide more details).
Measures
The questionnaire captured demographic and socioeconomic characteristics, and we acquired neighborhood median household income data from the 2000 US Census. We acquired alcohol license data from New York State, and we calculated the density of off-premises alcohol outlets per square mile.
We assessed drinking behavior using binge drinking questions from the National Institute on Alcohol Abuse and Alcoholism20 and the alcohol module from the World Mental Health Comprehensive International Diagnostic Interview.21,22 The outcome was binge drinking monthly or more frequently. This was determined by the National Institute on Alcohol Abuse and Alcoholism binge drinking questions that assessed the number of occasions in the past 12 months when women consumed 4 or more and men consumed 5 or more drinks in a 2-hour period. We used the World Mental Health Comprehensive International Diagnostic Interview alcohol measures, which include a retrospectively recalled history of alcohol use, to capture the history of drinking before residence in the current neighborhood.
We used the weights that capture the ratio of the persons in the household to telephone lines in the household to account for the probability of selection for interview. We adjusted individual characteristics that were conceptually considered confounders (i.e., age, race/ethnicity, gender, marital status, place of birth, education, income, employment, years lived in the current neighborhood, interview language, and history of drinking) and neighborhood median income in multivariable analysis. To handle missing data, we considered both retention of respondents with missing values for covariates by use of indicator variables and application of multiple imputation. Results did not vary between the 2 ways of handling missingness, so we used the missingness indicator approach in all analyses for simplicity.
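The weighting and missing-indicator steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the function names, the fill value of 0, and the example numbers are all hypothetical.

```python
def selection_weight(persons_in_household, telephone_lines):
    """Weight = persons in the household divided by telephone lines in the household."""
    return persons_in_household / telephone_lines


def with_missing_indicator(values, fill=0.0):
    """Recode a covariate as (filled value, missing flag) pairs.

    None marks a missing value. The fill value is arbitrary (an assumption here),
    because the accompanying indicator absorbs it in the regression.
    """
    return [(fill, 1.0) if v is None else (float(v), 0.0) for v in values]


def weighted_mean(values, weights):
    """Weighted average, for example of predicted probabilities."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up example: one income value is missing.
print(with_missing_indicator([40.0, None, 75.0]))
print(weighted_mean([1.0, 0.0], [selection_weight(3, 1), selection_weight(2, 2)]))
```

Both columns of each pair (the filled value and the flag) would enter the regression as covariates, which keeps respondents with missing values in the analysis.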
Analysis Approach
Overall, this approach uses a regression model as a tool to estimate how alterations in the exposure might change the outcome and is a type of substitution estimator.8,23 We have outlined the steps of the approach, and the appendix (available as a supplement to the online version of this article at http://www.ajph.org) contains statistical code and simulated sample data for implementing similar analyses in R and Stata, with annotation to connect each coding step with the analysis steps.
Step 1: Fit a multivariable regression. The analysis approach starts with an estimated multivariable regression of the outcome as a function of the exposure and all confounders. In this application, we used generalized additive models with locally weighted scatterplot smoothing (family binomial, logit-link) to model binge drinking as a function of alcohol outlet density and the confounders and to capture potential nonlinearity in the relation of outlet density with binge drinking.16,24
Step 2: Alter the exposure values to the pattern of interest. The researcher temporarily alters the exposure values in the data set to the exposure pattern of interest, keeping the confounders fixed at their actual values.
Step 3: Predict outcomes under the new exposure pattern. The regression that was fit on the actual data is used to predict the outcomes for individuals in the population had they experienced different levels of the exposure. To make these predictions, the altered exposure value and actual covariate values of each individual are each multiplied by the corresponding regression coefficient and then summed, along with the regression intercept, to generate a predicted outcome for each individual. If smoothing terms were included, as in the applied example, the smoothing function from the fitted model would be applied to the altered exposure value or actual covariate value. If needed, the predicted outcome for each individual can then be converted back to the true scale for the outcome; for example, the summed log-odds from a logistic model would be converted back to the probability of the outcome with the inverse logit transformation, probability = 1/(1 + exp(−log-odds)). Typically, in statistical software this can all be done with a simple prediction command (appendix available as a supplement to the online version of this article at http://www.ajph.org). In this application, this process provides, for each individual, a predicted probability of binge drinking under a level of alcohol outlet density that may differ from the level actually experienced, while retaining the contribution of that individual's actual covariates.
Step 4: Take the average to estimate the population level of the outcome. For each exposure pattern of interest, predicted individual outcomes are then averaged, providing a population-level estimate of the outcome associated with the exposure pattern. If a variety of exposure patterns is of interest, Steps 2–4 should be followed for each exposure alteration that is of interest. In this application, this provides an estimate of the prevalence of binge drinking under a new pattern of alcohol outlet density.
Step 5: Compare estimates of the outcome under different exposure patterns. Depending on the research question, a comparison of the outcome under different exposure patterns may be the parameter of interest. For example, the researcher may want to compare the outcome under the observed exposure to the outcome under an altered exposure pattern or compare 2 altered exposure patterns. This comparison is done by taking the difference (or ratio) between the estimated population levels of the outcome that are of interest from Step 4.
Step 6: Calculate confidence intervals (CIs). Although caveats apply, CIs can be estimated using a nonparametric bootstrap.25 A bootstrap approach uses resampling (with replacement) of the independent units from the study data to approximate sampling variability that would have occurred in repeated samples from the source population.
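The six steps can be sketched end to end in Python using only the standard library. This is an illustration under stated assumptions, not the code from the appendix. A plain logistic regression fit by Newton's method stands in for the generalized additive model. The data are simulated with hypothetical coefficients. Sampling weights are omitted. Communities are taken as the independent units for the bootstrap, which is an assumption of this sketch.

```python
import math
import random


def expit(z):
    """Inverse logit: convert log-odds to a probability (numerically stable)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(A, b):
    """Solve a small linear system A x = b by Gaussian elimination with pivoting."""
    p = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, p):
            f = M[r][c] / M[c][c]
            for k in range(c, p + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * p
    for r in reversed(range(p)):
        x[r] = (M[r][p] - sum(M[r][k] * x[k] for k in range(r + 1, p))) / M[r][r]
    return x


def fit_logistic(rows, iters=8):
    """Step 1: logistic regression of binge on (intercept, density, male) by Newton's method."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iters):
        grad = [0.0] * 3
        hess = [[0.0] * 3 for _ in range(3)]
        for density, male, binge in rows:
            x = (1.0, density, male)
            mu = expit(sum(b * v for b, v in zip(beta, x)))
            for j in range(3):
                grad[j] += (binge - mu) * x[j]
                for k in range(3):
                    hess[j][k] += mu * (1.0 - mu) * x[j] * x[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


def mean_predicted(beta, rows, delta=None):
    """Steps 2-4: alter the exposure, predict each probability, and average."""
    total = 0.0
    for density, male, _ in rows:
        altered = density if delta is None else min(density, delta)  # Step 2
        total += expit(beta[0] + beta[1] * altered + beta[2] * male)  # Step 3
    return total / len(rows)  # Step 4


def estimate_difference(rows, delta):
    """Step 5: prevalence under the upper limit minus prevalence as observed."""
    beta = fit_logistic(rows)
    return mean_predicted(beta, rows, delta) - mean_predicted(beta, rows)


def bootstrap_ci(data, delta, reps, rng):
    """Step 6: resample whole communities with replacement, refit, and re-estimate."""
    ids = list(data)
    stats = []
    for _ in range(reps):
        sample = [row for c in rng.choices(ids, k=len(ids)) for row in data[c]]
        stats.append(estimate_difference(sample, delta))
    stats.sort()
    return stats[int(0.025 * reps)], stats[int(0.975 * reps) - 1]


def simulate(n_communities, per_community, rng):
    """Hypothetical data: one outlet density per community, shared by its residents."""
    data = {}
    for c in range(n_communities):
        density = rng.uniform(5.0, 150.0)  # outlets per square mile (made up)
        people = []
        for _ in range(per_community):
            male = float(rng.random() < 0.5)
            p = expit(-2.5 + 0.015 * density + 0.5 * male)  # made-up coefficients
            people.append((density, male, float(rng.random() < p)))
        data[c] = people
    return data


rng = random.Random(2016)
data = simulate(30, 10, rng)
rows = [row for people in data.values() for row in people]
diff = estimate_difference(rows, delta=70.0)
low, high = bootstrap_ci(data, 70.0, 100, rng)
print(f"difference in prevalence: {diff:+.3f} (95% CI {low:+.3f}, {high:+.3f})")
```

The confounders enter `mean_predicted` at their observed values and only the exposure is altered, which is what makes this a substitution estimator. A smoother outcome model could replace `fit_logistic` without changing the remaining steps.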
Application of Analysis Approach to Study Data
We applied this general approach to estimate the difference in the prevalence of binge drinking for 2 different alcohol outlet exposure patterns and for 2 different populations of interest. On the basis of previous work,16 the overall association between outlet density and binge drinking indicated a positive and nonlinear relation, with a steeper slope at higher outlet densities.
One interesting intervention could be a hypothetical new policy that restricts outlet density to an upper limit of δ, such that any community currently above that density would be reduced to that upper limit.26 The first alcohol outlet exposure pattern we estimated involves setting this upper limit deterministically; for example, any community with more than 100 outlets per square mile would be altered so that the density was set to 100. The second alcohol outlet pattern we considered involves setting this upper limit with stochastic variation around the upper limit,27 more closely reflecting the likely reality that not all communities could be reduced exactly to the limit but might achieve reductions with some variation around the limit.
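The two exposure patterns can be sketched as follows. The deterministic version follows the description directly. The stochastic version is only one possible reading: the Gaussian noise, its standard deviation, and the truncation rules are assumptions made for illustration, and the community labels and densities are made up.

```python
import random


def cap_deterministic(density_by_cd, delta):
    """Set every community above delta exactly to delta; leave the rest as observed."""
    return {cd: min(d, delta) for cd, d in density_by_cd.items()}


def cap_stochastic(density_by_cd, delta, sd, rng):
    """Communities above delta land near the limit rather than exactly on it.

    Assumed for illustration: Gaussian noise with standard deviation sd, truncated
    so that no community ends above its observed density or below zero.
    """
    altered = {}
    for cd, d in density_by_cd.items():
        if d > delta:
            altered[cd] = min(d, max(0.0, rng.gauss(delta, sd)))
        else:
            altered[cd] = d
    return altered


observed = {"CD-A": 30.0, "CD-B": 85.0, "CD-C": 120.0, "CD-D": 100.0}  # made-up values
print(cap_deterministic(observed, 100.0))
# {'CD-A': 30.0, 'CD-B': 85.0, 'CD-C': 100.0, 'CD-D': 100.0}
print(cap_stochastic(observed, 100.0, sd=10.0, rng=random.Random(1)))
```

Because the exposure is a community characteristic, every resident of a community receives the same altered value. Repeating the stochastic draw with different seeds shows how much the estimates vary with which communities turn out more or less successful.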
With respect to populations of interest, we might be interested in the difference in binge drinking associated with these outlet density alterations both for NYC overall, which is our first population of interest, and for the communities that actually experience some change in outlet density, our second population of interest. The association for NYC overall is analogous to the population-attributable risk, whereas the association for the subset of communities that experience a change in density is analogous to the attributable risk among the exposed.28 Technical details on these parameters are provided in the appendix (available as a supplement to the online version of this article at http://www.ajph.org).
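For the deterministic upper limit, the two parameters can be written out as below. The notation is an assumption of this sketch rather than the appendix's: Y_i is the outcome, A_i the observed outlet density of person i's community, W_i the confounders, and Q-hat the fitted regression prediction. The averages are unweighted; with sampling weights they would become weighted averages.

```latex
% Assumed notation: \hat{Q}(a, w) estimates P(Y = 1 \mid A = a, W = w),
% A_i^{\delta} = \min(A_i, \delta), and n_{\delta} = \#\{ i : A_i > \delta \}.
\begin{align*}
\hat{\psi}_{\text{overall}}(\delta)
  &= \frac{1}{n} \sum_{i=1}^{n}
     \left[ \hat{Q}(A_i^{\delta}, W_i) - \hat{Q}(A_i, W_i) \right], \\
\hat{\psi}_{\text{modified}}(\delta)
  &= \frac{1}{n_{\delta}} \sum_{i :\, A_i > \delta}
     \left[ \hat{Q}(\delta, W_i) - \hat{Q}(A_i, W_i) \right], \\
\hat{\psi}_{\text{overall}}(\delta)
  &= \frac{n_{\delta}}{n} \, \hat{\psi}_{\text{modified}}(\delta).
\end{align*}
```

The last line follows because individuals in unmodified communities contribute a difference of exactly zero. In this unweighted form, the overall estimate is the modified-subset estimate scaled by the proportion of the population affected.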
RESULTS
The results of the deterministic outlet density reduction to an upper limit for NYC overall are presented in Figure 1. The x-axis indicates the value of the upper limit on alcohol outlet density for each estimate, or the δ. Thus each data point corresponds to the difference in binge drinking (the value on the left y-axis) associated with an outlet density reduction to the x-axis value for any neighborhood originally above that value. For example, the largest difference in binge drinking is found for a maximum density (x-axis value) of 70 outlets per square mile. This indicates that if all neighborhoods with an outlet density above 70 instead had a density of 70 (and the rest remained with density as observed), we estimate a −0.7% difference in the prevalence of binge drinking across the total population of NYC (95% CI = −0.2%, −1.3%).
FIGURE 1—
Estimated Difference in Binge Drinking for a Deterministic Outlet Density Reduction to an Upper Limit of δ for New York City (NYC) Overall: New York Social Environment Study; New York, NY; 2005
Note. CD = community districts.
The second y-axis (on the right) corresponds to the dashed curve that crosses the figure and indicates the proportion of the NYC population that is affected by the intervention. For example, we can see that at the point where we estimate a −0.7% difference in binge drinking, about 35% of the population would be affected by the hypothetical intervention (i.e., 35% live in a community that has experienced an outlet reduction), and thus 65% of the population would experience no change. Setting the upper outlet density limit at levels even lower than 70 indicates no additional reductions in binge drinking because the overall relation between outlet density and binge drinking flattens.16
Our second population of interest was the subset of communities in which outlet density was modified by the hypothetical intervention. We estimated the difference in binge drinking that would be expected for only the subset of communities that experience an outlet reduction, rather than for NYC overall. These results are presented in Figure 2. As would be expected, the largest differences in binge drinking occur at upper limits on outlet density similar to those for NYC overall, but the magnitude of the difference is larger. For example, if all communities with a density above 70 outlets per square mile instead had a density of 70, we estimate a −2.4% difference in the prevalence of binge drinking in the subset of communities modified by the intervention (95% CI = −0.5%, −4.0%).
FIGURE 2—
Estimated Difference in Binge Drinking for a Deterministic Outlet Density Reduction to an Upper Limit of δ for the Community Districts (CDs) Modified by the Intervention: New York Social Environment Study; New York, NY; 2005
The appendix (available as a supplement to the online version of this article at http://www.ajph.org) shows the estimates of differences in the prevalence in binge drinking associated with an upper limit on outlet density set with stochastic variation for NYC overall and for the subset of communities modified by the intervention. The overall magnitudes and patterns of results are similar, but the estimates vary for each intervention, reflecting differences in which communities were assigned more or less successful interventions as part of the stochastic process. When communities with higher observed densities and larger populations are more successful, this translates to a larger difference in the prevalence of binge drinking. As would be expected, there is more variation in the estimates when greater stochastic variation is introduced into the hypothetical intervention in outlet density.
DISCUSSION
We have demonstrated how to estimate the potential health impacts of hypothetical interventions on community-level exposures using a substitution estimator approach. In this illustration, we have estimated population-level differences in the prevalence of binge drinking under altered patterns of alcohol outlet density. We estimated differences in binge drinking prevalence both for NYC overall and for the subset of communities modified by the intervention—2 estimates that would likely be of interest in gauging whether the health effects of an intervention under consideration would make a meaningful difference in the health of the population.
When applying this approach for actual intervention planning, careful consideration of how much modification in the exposure would be feasible is a critical first step in defining the exposure difference that should be used for estimation. If there is uncertainty, introducing stochastic variation into the exposure difference or estimation for a range of possible degrees of exposure difference is a helpful way to quantify the range of outcomes that might be expected under differing degrees of success. In general, this substitution estimator approach can be a useful and flexible tool for gauging how realistic and specific alterations of exposure might be expected to correspond with magnitudes of change in health.8,27 Our aim in explaining and illustrating this method is to facilitate the estimation of these parameters in public health research.
Although our previous work indicated a strong overall relation between alcohol outlet density and binge drinking,16 the parameters estimated using the current approach indicate more modest differences in binge drinking prevalence associated with hypothetical outlet density changes. This is not surprising because the overall association estimates differences in binge drinking across the full range of outlet density values observed—effectively the difference between the extremes of exposure values.
By contrast, the hypothetical intervention parameters estimate differences in binge drinking prevalence associated with more modest and realistic changes in outlet density that do not cover the range of observed values and do not result in alterations for all communities. This smaller magnitude is a general property of measures of population effects of exposure shifts such as population-attributable risk28 and has also been noted in other work on these hypothetical intervention parameters.11,12
It is particular to this exposure–outcome relationship that decreasing outlet density for communities at a density greater than 70 outlets per square mile is the hypothetical intervention with the most potential benefit for the population. However, had we found that the benefits increased linearly with decreasing outlet densities, then we might have estimated the difference in binge drinking associated with reducing the average density in all places by the same amount, consistent with the more common formulation of Rose’s exposure shifts.4
As with any other estimation method, this approach relies on assumptions for the interpretation of the estimated differences in health as unbiased estimates of the actual impact of future interventions.9,29 These assumptions may seem technical, but they are important because they provide the critical link between the estimated association and the causal effect. Making this link is known as assessing identifiability of the intervention effects. These assumptions apply to all empirical analytic approaches, including standard regression, but have typically been elaborated only in the literature on newer estimation approaches, often called “causal inference” methods.30,31
Some of these assumptions are familiar to many researchers and include the requirement that the exposure occurred before the outcome (known as temporality) and that all confounders have been controlled (known as the randomization assumption or exchangeability). Another important but less familiar assumption is that within every confounder subgroup, all exposure values of interest must be experienced (known as the positivity assumption). As a simple example, if there were an exposure that men never experienced, it would not be possible to estimate the relationship between exposure and outcome for men, and attempting to do so would violate positivity. The details of all 5 assumptions and discussion of ways to assess and consider whether they have been met can be found in the appendix (available as a supplement to the online version of this article at http://www.ajph.org).
Because a causal interpretation relies on assumptions, it is important to consider how researchers should proceed when, as in many instances, not all assumptions are met. When there are major concerns that the assumptions have not been met, such as unclear temporal ordering or key confounders that have not been measured, the parameter should be interpreted as a statistical association (as opposed to a causal effect); this is how we interpret results of more standard analysis approaches when there are concerns about residual sources of bias, and the same applies to these parameters. The estimation of different types of parameters does not change the need to be clear about the limitations of the observed data and to specify which assumptions are met and which are not. One advantage of systematic consideration of these assumptions, which is not typical in public health research, is that it elucidates how future research will need to be strengthened to move closer to a causal interpretation of results.
Other methods can be used to flexibly estimate parameters that correspond to hypothetical changes in the exposure. Inverse probability of treatment weighting methods have also been used for this purpose,32 although these estimators can lack robustness.33 An exciting new development in substitution estimators is targeted maximum likelihood estimation, an augmented form of the substitution estimator we have explained. Targeted maximum likelihood estimation uses a model for the outcome, as well as the probability of treatment (also known as the propensity score), and can incorporate more flexible modeling approaches and thereby avoid relying on parametric assumptions about model form.33 Research on the relative performance of different estimators is available elsewhere.34,35
Naturally, our substitution estimation approach and other related methods do not solve all the challenges with estimating the potential consequences of interventions. Estimating the health consequences of exposure changes does not tell you how to alter the exposure, and different approaches to alteration might have different implications for health.36 For example, in our application, an upper limit on alcohol outlet density could be reached by not allowing any new licenses in a community, leading to attrition over time, or by active closures. These 2 approaches might have distinct results.
In addition, the substitution estimation approach allows the consideration of 1 health outcome at a time and thus does not easily lead to estimates of the net effects of exposure alterations on overall health. The consideration of broader outcomes, such as overall rates of morbidity and mortality, could be helpful in this regard. The creation of summaries of health is, however, challenging and assumption laden because the researcher subjectively determines the scope of outcomes included and the weight given to different outcomes.37
Our illustration considered only 1 exposure; however, the method can be used to estimate alterations in multiple exposures at once, as long as the positivity assumption (that at least some individuals in each covariate subgroup experience each of the exposure combinations of interest) is met. In estimating differences in the outcome associated with alterations in the exposure, it is important not to extrapolate beyond the exposure values in the observed data. For example, a researcher might be interested in the possible reductions in binge drinking associated with the removal of all alcohol outlets. However, the lowest density of outlets observed in our study was 5 outlets per square mile, and estimation for densities lower than that value would involve extrapolation.
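A crude, range-based check for extrapolation and positivity problems can be sketched as follows. It flags confounder subgroups in which someone would be moved down to the upper limit although no one in that subgroup was observed at a density that low. The function name, the subgroup labels, and the densities are hypothetical, and a real assessment would look at more than the observed range.

```python
from collections import defaultdict


def unsupported_subgroups(rows, delta):
    """rows are (density, subgroup) pairs.

    Return the subgroups where at least one person lies above delta (and so would
    be altered) but the lowest observed density in the subgroup is still above
    delta, so predictions at delta would rest on extrapolation.
    """
    lowest = defaultdict(lambda: float("inf"))
    highest = defaultdict(lambda: float("-inf"))
    for density, group in rows:
        lowest[group] = min(lowest[group], density)
        highest[group] = max(highest[group], density)
    return sorted(g for g in lowest if highest[g] > delta and lowest[g] > delta)


# Made-up data: the second subgroup was never observed below 90 outlets per square mile.
rows = [(5.0, "men"), (80.0, "men"), (120.0, "men"), (90.0, "women"), (130.0, "women")]
print(unsupported_subgroups(rows, 70.0))   # ['women']
print(unsupported_subgroups(rows, 100.0))  # []
print(unsupported_subgroups(rows, 0.0))    # ['men', 'women']
```

The last call mirrors the example in the text: an upper limit of zero outlets lies below every observed density, so every subgroup is flagged.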
Parameters for specific alterations in exposures provide a helpful complement to overall associations, such as risk differences and risk ratios, because they quantify what magnitude of difference in the outcome could be expected from reasonable alterations to the exposure. More widespread use of this approach would improve the translation of research results to practitioners and provide a more realistic sense of the magnitudes of changes in the outcome that could be expected from interventions on the exposure under study for different populations of interest.
ACKNOWLEDGMENTS
Funding for this work was provided in part by the National Institutes of Health (grants R01 DA 017642 and DP2 HD 080350).
HUMAN PARTICIPANT PROTECTION
The study protocol was approved by the institutional review boards of the New York Academy of Medicine, the University of Michigan, and the University of California, Berkeley.
Footnotes
See also Galea and Vaughan, p. 1901.
REFERENCES
1. Diez Roux AV, Mair C. Neighborhoods and health. Ann N Y Acad Sci. 2010;1186:125–145. doi: 10.1111/j.1749-6632.2009.05333.x.
2. Pettifor A, Lippman SA, Selin AM et al. A cluster randomized–controlled trial of a community mobilization intervention to change gender norms and reduce HIV risk in rural South Africa: study design and intervention. BMC Public Health. 2015;15:752. doi: 10.1186/s12889-015-2048-z.
3. Bowen DJ, Barrington WE, Beresford SA. Identifying the effects of environmental and policy change interventions on healthy eating. Annu Rev Public Health. 2015;36:289–306. doi: 10.1146/annurev-publhealth-032013-182516.
4. Rose GA. The Strategy of Preventive Medicine. Oxford, UK: Oxford University Press; 1992.
5. Krieger N. Theories for social epidemiology in the 21st century: an ecosocial perspective. Int J Epidemiol. 2001;30(4):668–677. doi: 10.1093/ije/30.4.668.
6. Susser M, Susser E. Choosing a future for epidemiology: II. From black box to Chinese boxes and eco-epidemiology. Am J Public Health. 1996;86(5):674–677. doi: 10.2105/ajph.86.5.674.
7. McMichael AJ. Prisoners of the proximate: loosening the constraints on epidemiology in an age of change. Am J Epidemiol. 1999;149(10):887–897. doi: 10.1093/oxfordjournals.aje.a009732.
8. Robins J. A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Math Mod. 1986;7(9–12):1393–1512.
9. van der Laan MJ, Robins JM. Unified Methods for Censored Longitudinal Data and Causality. New York, NY: Springer; 2003.
10. Snowden JM, Rose S, Mortimer KM. Implementation of G-computation on a simulated data set: demonstration of a causal inference technique. Am J Epidemiol. 2011;173(7):731–738. doi: 10.1093/aje/kwq472.
11. Slama R, Siroux V. On influencing population means. Epidemiology. 2012;23(3):501–503. doi: 10.1097/EDE.0b013e31824da303.
12. Westreich D. From exposures to population interventions: pregnancy and response to HIV therapy. Am J Epidemiol. 2014;179(7):797–806. doi: 10.1093/aje/kwt328.
13. Keil AP, Edwards JK, Richardson DB, Naimi AI, Cole SR. The parametric g-formula for time-to-event data: intuition and a worked example. Epidemiology. 2014;25(6):889–897. doi: 10.1097/EDE.0000000000000160.
14. Fleischer NL, Fernald LC, Hubbard AE. Estimating the potential impacts of intervention from observational data: methods for estimating causal attributable risk in a cross-sectional analysis of depressive symptoms in Latin America. J Epidemiol Community Health. 2010;64(1):16–21. doi: 10.1136/jech.2008.085985.
15. Ahern J, Hubbard A, Galea S. Estimating the effects of potential public health interventions on population disease burden: a step-by-step illustration of causal inference methods. Am J Epidemiol. 2009;169(9):1140–1147. doi: 10.1093/aje/kwp015.
16. Ahern J, Margerison-Zilko C, Hubbard A, Galea S. Alcohol outlets and binge drinking in urban neighborhoods: the implications of nonlinearity for intervention and policy. Am J Public Health. 2013;103(4):e81–e87. doi: 10.2105/AJPH.2012.301203.
17. Ahern J, Galea S. Collective efficacy and major depression in urban neighborhoods. Am J Epidemiol. 2011;173(12):1453–1462. doi: 10.1093/aje/kwr030.
18. Ahern J, Galea S, Hubbard A, Midanik L, Syme SL. “Culture of drinking” and individual problems with alcohol use. Am J Epidemiol. 2008;167(9):1041–1049. doi: 10.1093/aje/kwn022.
19. Ahern J, Galea S, Hubbard A, Syme SL. Neighborhood smoking norms modify the relation between collective efficacy and smoking behavior. Drug Alcohol Depend. 2009;100(1–2):138–145. doi: 10.1016/j.drugalcdep.2008.09.012.
20. National Institute on Alcohol Abuse and Alcoholism. Recommended alcohol questions. Available at: https://www.niaaa.nih.gov/research/guidelines-and-resources/recommended-alcohol-questions. Accessed January 29, 2016.
21. Kessler RC, Ustün TB. The World Mental Health (WMH) Survey Initiative Version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI). Int J Methods Psychiatr Res. 2004;13(2):93–121. doi: 10.1002/mpr.168.
22. Kessler RC, Abelson J, Demler O et al. Clinical calibration of DSM-IV diagnoses in the World Mental Health (WMH) version of the World Health Organization (WHO) Composite International Diagnostic Interview (WMHCIDI). Int J Methods Psychiatr Res. 2004;13(2):122–139. doi: 10.1002/mpr.169.
23. Greenland S, Drescher K. Maximum likelihood estimation of the attributable fraction from logistic models. Biometrics. 1993;49(3):865–872.
24. Hastie TJ, Tibshirani RJ. Generalized Additive Models. New York, NY: Chapman & Hall; 1990.
25. Efron B, Tibshirani R. Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Stat Sci. 1986;1(1):54–77.
26. Hubbard AE, Laan MJ. Population intervention models in causal inference. Biometrika. 2008;95(1):35–47. doi: 10.1093/biomet/asm097.
27. Díaz I, van der Laan MJ. Assessing the causal effect of policies: an example using stochastic interventions. Int J Biostat. 2013;9(2):161–174. doi: 10.1515/ijb-2013-0014.
28. Szklo M, Nieto FJ. Epidemiology: Beyond the Basics. Burlington, MA: Jones and Bartlett; 2014. Measuring associations between exposures and outcomes; pp. 77–103.
29. VanderWeele TJ. Ignorability and stability assumptions in neighborhood effects research. Stat Med. 2008;27(11):1934–1943. doi: 10.1002/sim.3139.
30. Pearl J. Causality: Models, Reasoning and Inference. 2nd ed. New York, NY: Cambridge University Press; 2009.
31. Rubin DB. Estimating causal effects of treatment in randomized and nonrandomized studies. J Educ Psychol. 1974;66(5):688–701.
32. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11(5):550–560. doi: 10.1097/00001648-200009000-00011.
33. van der Laan MJ, Rose S. Targeted Learning: Causal Inference for Observational and Experimental Data. New York, NY: Springer; 2011.
34. Colson KE, Rudolph KE, Zimmerman SC et al. Optimizing matching and analysis combinations for estimating causal effects. Sci Rep. 2016;6:23222. doi: 10.1038/srep23222.
35. Porter KE, Gruber S, van der Laan MJ, Sekhon JS. The relative performance of targeted maximum likelihood estimators. Int J Biostat. 2011;7(1):article 31. doi: 10.2202/1557-4679.1308.
36. Hernán MA. Invited commentary: hypothetical interventions to define causal effects–afterthought or prerequisite? Am J Epidemiol. 2005;162(7):618–620. doi: 10.1093/aje/kwi255.
37. Kaufman JS. Making causal inferences about macrosocial factors as a basis for public health policies. In: Galea S, editor. Macrosocial Determinants of Population Health. New York, NY: Springer; 2007. pp. 355–373.


