Abstract
Background
In 2012, the EPA enacted more stringent National Ambient Air Quality Standards for fine particulate matter (PM2.5). Few studies have characterized the health effects of air pollution levels lower than the most recent NAAQS for long-term exposure to PM2.5 (now set at 12 μg/m3).
Methods
We constructed a cohort of 32,119 Medicare beneficiaries residing in 5,138 U.S. ZIP codes who were interviewed as part of the Medicare Current Beneficiary Survey (MCBS) between 2002 and 2010. We considered four outcomes: death, all-cause hospitalizations, hospitalizations for circulatory diseases, and hospitalizations for respiratory diseases.
Results
We found that increasing exposure to PM2.5 from levels lower than 12 μg/m3 to levels higher than 12 μg/m3 causally increases all-cause admission and circulatory admission hazard rates by 7% (95% CI 3–10%) and 6% (95% CI 2–9%), respectively. When we restricted the analysis to enrollees with exposure always lower than 12 μg/m3, we found that increasing exposure from levels lower than 8 μg/m3 to levels higher than 8 μg/m3 would increase all-cause, circulatory and respiratory admission hazard rates by 15% (95% CI 8–23%), 18% (95% CI 10–27%) and 21% (95% CI 9–34%), respectively.
Conclusions
Using a nationally representative sample of Medicare enrollees, we found that changes in exposure to PM2.5, even at levels always below the standards, lead to significant increases in hospital admissions for all-cause, cardiovascular and respiratory diseases. The robustness of our results to inclusion of many additional individual level potential confounders adds validity to studies of air pollution that rely entirely on administrative data.
INTRODUCTION
To protect public health and welfare against the dangers of air pollution, the U.S. Environmental Protection Agency (EPA) establishes National Ambient Air Quality Standards (NAAQS). In response to mounting evidence demonstrating the harmful effects of exposure to fine particulate matter, in 2012 the EPA enacted more stringent NAAQS for fine particulate matter (PM2.5). As air pollution standards decrease, regulatory actions are becoming increasingly expensive with the annual cost of implementation and compliance with the NAAQS reaching tens of billions of dollars1–2. While there are massive benefits to reduced air pollution levels3–4 that far exceed their costs, research examining the public health benefits of cleaner air will be subjected to immense scrutiny due to the potential costs associated with more stringent regulatory policy. Despite a substantial amount of epidemiological literature on the health effects of long-term exposure to air pollution,5–13 few studies have characterized the health effects of air pollution at levels in accordance with or lower than the most recent NAAQS for long-term exposure to PM2.5 (now set at 12 μg/m3). From this point forward, when we refer to the NAAQS, we will be referring to the long-term standards for PM2.5. Recent studies14–15 have found positive associations between short-term exposure to air pollution and mortality, while another study16 found a protective effect of short-term PM2.5 on COPD exacerbation. Positive associations between long-term exposure to concentrations of PM2.5 mostly below 12 μg/m3 and mortality were recently reported in a Canadian cohort17. Additionally, there has been little scientific literature examining the effects of air pollution in smaller cities, towns, and rural areas and areas with sparse monitoring. As air pollution levels decrease, studies are needed to determine if further reductions will lead to substantial improvements in health.
In addition, traditional observational cohorts have modeled the outcome as a function of exposure and confounders. Provided that the confounder model is correctly specified (including no omitted confounders), such studies provide causal estimates of the effect of exposure, conditional on the covariates. More recent causal modeling approaches model exposure as a function of covariates, and conditional on the exposure model being correctly specified, can provide marginal estimates of the causal effects of exposure on outcome. Often this can be advantageous because many predictors of health (e.g. alcohol consumption) are not causes of air pollution, but are indirectly associated with it through a common cause, such as socio-economic status. It may be easier to model the effect of income on exposure than the effect of alcohol on cardiovascular disease. We have applied one such causal modeling approach to our data.
In this study, we build upon the existing literature in several ways: 1) We use inverse probability weighting (IPW), enabling us to estimate: a) the “causal” effects of increasing PM2.5 levels from below 12 μg/m3 to above 12 μg/m3, and b) the “causal” effects of increasing PM2.5 from below 8 μg/m3 to above 8 μg/m3 but always below 12 μg/m3; 2) We use estimates of fine particulate matter (PM2.5) on a 1km by 1km grid to compute exposure at the ZIP code level; 3) We use open cohort data from Medicare claims data, which allows us to enroll new individuals each year and examine the health effects over time as air pollution levels continue to decline; 4) We link Medicare claims data to data from the Medicare Current Beneficiary Survey (MCBS),18 which provides information on an extensive list of individual level behavioral risk factors and allows us to control for important confounders such as BMI and smoking habits; 5) We assess the sensitivity of our estimates of causal effects with respect to several modeling assumptions including: a) restriction of our study population to individuals already exposed to low pollution levels (< 12 μg/m3), and most importantly b) inclusion/exclusion of a large set of individual level behavioral risk factors (such as smoking and BMI) when we consider methods for confounding adjustment. Assessing the robustness of causal effects of air pollution to the lack of adjustment for these individual level behavioral risk factors is very important as these factors are generally hard to measure and only available from cohort studies.
METHODS
Cohort Creation
Medicare-MCBS cohort
We consider all Medicare fee-for-service enrollees who reside in the continental US, and participated in the Medicare Current Beneficiary Survey (MCBS) from 2002 to 2010. This allows us to construct an open cohort of N=32,119 Medicare beneficiaries residing in 5,138 unique ZIP codes. The MCBS is a representative survey of the Medicare population. It is designed as a rotating panel, where every MCBS participant is interviewed three times a year for a maximum of four consecutive years. For the purposes of this study, we only retain one interview per year, leading to a total of 68,789 unique patient years. We define the reference date to be the last interview date in a given year. Figure 2 shows the timeline and study design.
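The rule of keeping one interview per year, with the last interview date of the year as the reference date, can be sketched as follows. This is a minimal illustration only: the function name and the `(beneficiary_id, interview_date)` layout are hypothetical and do not reflect the actual MCBS file structure.

```python
from datetime import date

def reference_dates(interviews):
    """Return {(beneficiary_id, year): last interview date in that year}.

    `interviews` is an iterable of (beneficiary_id, interview_date) pairs.
    The field layout is hypothetical, not the MCBS file format.
    """
    latest = {}
    for beneficiary_id, interview_date in interviews:
        key = (beneficiary_id, interview_date.year)
        # Keep only the most recent interview seen so far for this person-year.
        if key not in latest or interview_date > latest[key]:
            latest[key] = interview_date
    return latest

# Toy data: beneficiary "A" is interviewed three times in 2004 and once in 2005.
interviews = [
    ("A", date(2004, 2, 10)),
    ("A", date(2004, 6, 15)),
    ("A", date(2004, 10, 3)),
    ("A", date(2005, 3, 1)),
    ("B", date(2004, 5, 20)),
]
person_years = reference_dates(interviews)
```

Each key of `person_years` corresponds to one patient year, so the toy input above yields three patient years from five interviews.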
We exclude patients not enrolled in Medicare for the entire look back period and outcome observation period. Specifically, we exclude patients who are not yet enrolled in Medicare or who are enrolled in a Health Maintenance Organization (HMO). We also exclude patients who reside in US outlying territories. Details regarding inclusion/exclusion criteria are described in Figure 1.
Low Pollution Cohort (LPC)
We create a ‘low pollution cohort’ that only includes those individuals from the full cohort whose exposure to PM2.5 is lower than 12 μg/m3 during the two-year period prior to the reference date. This reduces the number of unique subjects included in the cohort from 32,119 to 18,144. The purpose of constructing the ‘low pollution cohort’ is to assess if there is evidence of a causal effect of air pollution on health outcomes even among individuals with exposure levels that are already below the annual NAAQS. In particular, we will use this cohort to examine if there exists a further reduction in risk for subjects exposed to PM2.5 less than 8 μg/m3, which has been identified by previous work as a level with low risk19.
Study Design
Exposure to PM2.5
To estimate daily levels of PM2.5 for the entire study period (2002–2010) and for every ZIP code included in the study, we applied a previously developed exposure prediction model.20 This model integrates satellite-based aerosol optical depth measurements, chemical transport model simulations, meteorological variables, land-use terms and other auxiliary variables. We used a neural network to train this hybrid model against monitored PM2.5. Neural networks account for nonlinearity and interactions between variables, thus improving model performance. We used the trained neural network to estimate daily PM2.5 on a 1km×1km grid for the entire continental US. We then estimated each individual’s exposure to PM2.5 by averaging PM2.5 levels across space (from the 1km×1km grid to ZIP code of residence) and across time (for the 2 years prior to the reference date). See Figure 3. In previous work,20 we reported a ten-fold cross-validated R2 of 0.84 for daily measurements at the monitoring sites, for the period 2000 to 2012 and for the entire continental US. This indicates high correlation between predicted and monitored PM2.5. This correlation is anticipated to be even higher when we aggregate these values across time (day to year) and across space (1km×1km grid cells to ZIP code). For further details of the exposure assessment refer to Di et al.20
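The two-stage averaging (grid cells to ZIP code, then days to a two-year mean) might look like the sketch below. It assumes an unweighted mean over the grid cells assigned to a ZIP code and a 730-day window ending the day before the reference date; both choices, and all function names, are illustrative assumptions rather than the exposure pipeline of Di et al.

```python
from datetime import date, timedelta

def zip_daily_mean(grid_values):
    """Average the daily PM2.5 predictions of the grid cells in one ZIP code.

    An unweighted mean is an assumption made for illustration.
    """
    return sum(grid_values) / len(grid_values)

def two_year_exposure(daily_zip_pm25, reference_date, window_days=730):
    """Mean ZIP-level PM2.5 over the window ending the day before reference_date.

    daily_zip_pm25 maps date -> ZIP-level daily mean (ug/m3).
    Days absent from the mapping are simply skipped.
    """
    start = reference_date - timedelta(days=window_days)
    values = [v for d, v in daily_zip_pm25.items() if start <= d < reference_date]
    if not values:
        raise ValueError("no exposure data in the look-back window")
    return sum(values) / len(values)

# Toy series: 365 days around 10 ug/m3 followed by 365 days around 12 ug/m3.
reference = date(2006, 1, 1)
start = reference - timedelta(days=730)
daily = {}
for i in range(730):
    cells = [9.0, 10.0, 11.0] if i < 365 else [11.0, 12.0, 13.0]
    daily[start + timedelta(days=i)] = zip_daily_mean(cells)

exposure = two_year_exposure(daily, reference)
```

With this toy series the two-year exposure sits midway between the two yearly levels, which would place the subject below the 12 μg/m3 threshold.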
Outcome Observation Period
We identify a one-year follow up period from the reference date to ascertain health outcomes from the claims data (MedPAR Part A). We consider: 1) all-cause mortality; 2) all-cause hospitalizations; 3) hospitalizations with a coded circulatory disease [ICD9: 390–459]; 4) hospitalizations with a coded respiratory disease [ICD9: 460–519]. Diagnoses, procedures and outcomes are defined according to the highest level of the ICD9 hierarchy.
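A minimal sketch of how a diagnosis code could be mapped to the circulatory (390–459) and respiratory (460–519) ranges is given below. It assumes codes are strings whose first three characters give the ICD-9 category, with or without a decimal point; the function name is hypothetical, and which diagnosis fields of a claim are examined is not specified here.

```python
def icd9_category(code):
    """Classify an ICD-9 diagnosis code by its three-digit category.

    Returns "circulatory" for 390-459, "respiratory" for 460-519,
    and "other" for everything else (including V and E codes).
    """
    head = code.strip().upper().split(".")[0][:3]
    # V and E codes, and malformed codes, fall outside the numeric ranges.
    if len(head) != 3 or not head.isdigit():
        return "other"
    category = int(head)
    if 390 <= category <= 459:
        return "circulatory"
    if 460 <= category <= 519:
        return "respiratory"
    return "other"

examples = {c: icd9_category(c) for c in ["4280", "486", "459.9", "25000", "V5789"]}
```

An all-cause hospitalization needs no such mapping, since any admission in the follow-up period counts regardless of the coded diagnosis.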
Potential confounders
Data extracted from multiple sources (listed below) provide information on a total of 122 potentially confounding factors. Table S1 in the supplemental material summarizes the mean and standard deviation of all variables and outcomes in the study, separately for exposure higher and lower than 12 μg/m3, respectively.
MCBS Data
For each enrollee in the MCBS-Medicare cohort, we extract an extensive list of potential confounders from the MCBS data that is collected at the reference date. These include: patients’ functional status (e.g., if they have difficulty walking), their behavioral risk factors (e.g., smoking status), and their detailed demographics (e.g., marital status and level of education) among others (p=73).
Look Back Period
We extract information from Medicare claims data on individual level co-morbidity during the one-year look back period. Specifically, from Medicare Part A, we construct several binary variables encoding the presence or absence of a number of procedures during hospitalization (e.g., operations on the digestive system) (p=27). Basic patient demographics (e.g., age, race, gender, mailing ZIP code) are collected from the Master Beneficiary Summary and the Denominator files (p=9).
ZIP Code Level Data
Finally, we gather ZIP code level data including urbanization score as estimated by the US Department of Agriculture (USDA) (p=3), and socio economic variables from the US Census (p=10).21
Main Analysis
Throughout, we rely on three key assumptions necessary for making causal statements: the stable unit treatment value assumption (SUTVA), positivity, and no unmeasured confounding. SUTVA22 assumes that the outcome of a given observational unit is not affected by the treatment assignment (i.e. exposure to high versus low pollution levels) received by another unit. Positivity states that all experimental units have a positive probability of receiving each level of treatment (i.e. exposure to high or low levels of air pollution). We assess this assumption by examining propensity score overlap in Supplementary Materials Figure S1 and find that it is reasonable. Finally, no unmeasured confounding implies that our full set of available covariates (p=122) is adequate to adjust for confounding. This assumption is not testable, but we argue that it is unlikely that there exist covariates that are uncorrelated with the p=122 observed covariates and that can lead to confounding bias.
We applied inverse probability weighting (IPW)23–26 to the full cohort and to the low pollution cohort (LPC) to estimate the causal hazard rate ratio, which can be interpreted as the hazard of mortality (or hospitalization) at any time t had all subjects been exposed to PM2.5 levels higher than 12 μg/m3 (in the LPC: higher than 8 μg/m3, but always lower than 12 μg/m3) divided by the hazard of mortality (or hospitalization) at time t had all subjects been exposed to PM2.5 levels lower than 12 μg/m3 (in the LPC: lower than 8 μg/m3). The estimation of causal effects using IPW involves two steps: 1) estimation of the inverse probability weights, denoted swi, and 2) fitting a Cox proportional hazards model26 to the observations weighted by swi. Specifically:
Step 1: Inverse Probability Weighting
Let Ti represent the exposure for subject i, dichotomized as ti. More specifically, we set ti = 0 when Ti < 12 and ti = 1 when Ti > 12 in the full cohort, and ti = 0 when Ti < 8 and ti = 1 when 8 < Ti < 12 in the LPC. We denote by Ci the full set (p=122) of individual level and ZIP code level covariates. For each subject we estimate swi as:
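A commonly used stabilized form of such weights, written in the notation above, is sketched here. It is the textbook definition and should be read as an assumption about the exact expression intended: the numerator is the marginal probability of the subject's observed binary exposure level, and the denominator is the same probability conditional on the covariates Ci (the propensity score).

```latex
sw_i =
\begin{cases}
\dfrac{\Pr(t = 1)}{\Pr(t = 1 \mid C_i)} & \text{if } t_i = 1, \\[2ex]
\dfrac{\Pr(t = 0)}{\Pr(t = 0 \mid C_i)} & \text{if } t_i = 0.
\end{cases}
```

Under this form the weights average to approximately one, which tends to make them more stable than unstabilized weights of the form 1 / Pr(t = ti | Ci).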
IPW should produce a weighted sample in which the distribution of covariates is balanced with respect to the binary exposure ti, and hence allow a causal estimate of the effect of exposure.
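The balancing property can be illustrated on a toy example. The sketch below takes estimated propensity scores as given (how they are estimated from the covariates is not shown and would be a modeling choice), computes standard stabilized weights, and compares the exposed-minus-unexposed difference in a covariate mean before and after weighting. All names and the toy data are hypothetical.

```python
def stabilized_weights(exposed, propensity):
    """Stabilized inverse probability weights.

    exposed: list of 0/1 exposure indicators.
    propensity: estimated P(exposed = 1 | covariates) for each subject.
    """
    p_exposed = sum(exposed) / len(exposed)  # marginal exposure probability
    weights = []
    for t, e in zip(exposed, propensity):
        if t == 1:
            weights.append(p_exposed / e)
        else:
            weights.append((1 - p_exposed) / (1 - e))
    return weights

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def mean_difference(covariate, exposed, weights):
    """Weighted covariate mean among the exposed minus among the unexposed."""
    x1 = [x for x, t in zip(covariate, exposed) if t == 1]
    w1 = [w for w, t in zip(weights, exposed) if t == 1]
    x0 = [x for x, t in zip(covariate, exposed) if t == 0]
    w0 = [w for w, t in zip(weights, exposed) if t == 0]
    return weighted_mean(x1, w1) - weighted_mean(x0, w0)

# Toy data: a binary covariate that makes high exposure more likely.
covariate = [1, 1, 1, 1, 0, 0, 0, 0]
exposed = [1, 1, 1, 0, 1, 0, 0, 0]
propensity = [0.75 if x == 1 else 0.25 for x in covariate]

sw = stabilized_weights(exposed, propensity)
raw_diff = mean_difference(covariate, exposed, [1.0] * len(exposed))
weighted_diff = mean_difference(covariate, exposed, sw)
```

In this toy sample the unweighted groups differ in the covariate, while the weighted difference vanishes, which is the sense in which the weighted sample is balanced.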
Step 2: Cox proportional hazards model (CPHM)
We then fit a Cox proportional hazards model to the data, where every individual observation is weighted by swi. The left tail and the right tail of the weights are truncated at the 10th and 90th percentiles of the distribution of the standardized weights, to mitigate the effect of excessively large or small weights.25,28 Time to event is calculated as the time from the reference date until death or until the first all-cause, circulatory, or respiratory hospitalization (see Figure 2). Time to death is censored at the end of the one-year outcome observation period. Time to hospitalization is censored at the end of the one-year outcome observation period or at death, whichever comes first. We calculate 95% confidence intervals based on robust, sandwich variance estimators29 to take into account within-subject correlation induced by repeated measures, the standardized weights, and correlation between subjects living in the same ZIP code.
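Two pieces of this step lend themselves to a short sketch: truncating the weights and deriving censored follow-up times for a hospitalization outcome. The code below reads "truncated" as resetting weights beyond the cut-offs to the cut-off values (rather than dropping observations) and uses a 365-day window; both are assumptions, and the function names are hypothetical. Fitting the weighted Cox model itself is not shown.

```python
from datetime import date, timedelta

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

def truncate_weights(weights, lower_q=10, upper_q=90):
    """Reset weights below/above the chosen percentiles to those cut-offs."""
    ordered = sorted(weights)
    low = percentile(ordered, lower_q)
    high = percentile(ordered, upper_q)
    return [min(max(w, low), high) for w in weights]

def follow_up(reference_date, event_date=None, death_date=None, window_days=365):
    """Return (days_at_risk, event_observed) for a hospitalization outcome.

    Follow-up is censored at the end of the window or at death,
    whichever comes first.
    """
    end = reference_date + timedelta(days=window_days)
    censor = min(end, death_date) if death_date is not None else end
    if event_date is not None and reference_date <= event_date <= censor:
        return (event_date - reference_date).days, True
    return (censor - reference_date).days, False

weights = [0.1, 0.5, 0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 1.5, 2.0, 9.0]
truncated = truncate_weights(weights)

ref = date(2005, 1, 1)
hospitalized = follow_up(ref, event_date=date(2005, 3, 2))
died_first = follow_up(ref, death_date=date(2005, 2, 1))
no_event = follow_up(ref)
```

In the toy weights, the extreme values 0.1 and 9.0 are pulled in to the 10th and 90th percentile cut-offs while the remaining weights are unchanged.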
To measure the potential public health impact of lowering pollution levels below 12 μg/m3, we will calculate the number of events attributable to a change in long-term exposure to PM2.5 from below 12 μg/m3 to above 12 μg/m3. We will use the formula A = N * (1 – (1/HR)) where HR is the hazard ratio comparing exposure above and below 12 μg/m3, N is the number of events in the Medicare population, and A is the number of events attributable to an increase in PM2.5 from below to above 12 μg/m3.
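The attributable-events formula can be applied directly. In the sketch below, the hazard ratio and its confidence limits are the all-cause admission estimates reported for the full cohort, while the event count of 1,000,000 is a purely hypothetical N chosen for illustration, not the Medicare total.

```python
def attributable_events(n_events, hazard_ratio):
    """A = N * (1 - 1/HR): events attributable to exposure above the threshold."""
    return n_events * (1 - 1 / hazard_ratio)

# Hypothetical event count; the Medicare-wide N is not reproduced here.
n_hypothetical = 1_000_000

point = attributable_events(n_hypothetical, 1.07)   # HR for all-cause admissions
lower = attributable_events(n_hypothetical, 1.03)   # lower 95% confidence limit
upper = attributable_events(n_hypothetical, 1.10)   # upper 95% confidence limit
attributable_fraction = 1 - 1 / 1.07
```

For a hazard ratio of 1.07 the attributable fraction is about 6.5%, so the attributable count scales linearly with whatever event total N is supplied.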
Sensitivity Analyses
We conducted several sensitivity analyses, summarized in Table S2 in the supplementary material. First, to directly compare our results to the American Cancer Society Cohort (ACS) and the Harvard Six Cities Studies,5,6,30–32 we analyze the data using a standard Cox proportional hazards model with continuous exposure and adjustment for confounding by including all the available covariates as linear terms into the model (SA1 in the supplementary material, Figure S2 and Table S3). Second, we perform a Wald test to assess if there is evidence of the non-linearity of the exposure-response function (SA2 in the supplementary material, Table S4), and we plot the resulting nonlinear exposure-response curves (SA2 in the supplementary material, Figure S3). Third, we run the analyses restricting to subjects living in areas with long-term exposure to PM2.5 less than 12 μg/m3, though we use as an exposure a binary indicator of being below 10 μg/m3 instead of 8 μg/m3 as done in the main analysis (SA3 in the supplementary material, Figure S4 and Table S5). Finally, we investigate the sensitivity of the results to the exclusion of the behavioral risk factors extracted from MCBS data (e.g. smoking, BMI, etc.) from the confounding adjustment.
RESULTS
Table 1 summarizes the main characteristics of the MCBS-Medicare cohort (for both the full and low pollution cohorts) in comparison to the characteristics of the cohorts from the two original landmark studies – the ACS and Six Cities studies5,6,30–32. Please note that in our study, the average level of PM2.5 (equal to 12 μg/m3) is substantially lower than what was observed in the Harvard Six Cities Study and in the ACS Cohort (16.4 and 17.7 μg/m3, respectively).
Table 1.
Characteristic | MCBS-Medicare Full Cohort | MCBS-Medicare Low Pollution Cohort (Cohort with annual PM2.5 < 12 μg/m3) | American Cancer Society Cohort (Pope et al 1995, 2002)6,12 | Harvard Six Cities Study Cohort (Dockery et al NEJM 1993, Laden 2006)5,31 |
---|---|---|---|---|
Number of individuals | 32,119 | 18,144 | ~293,000 | ~8,000 |
Mean age at enrollment | 72.0 | 72.3 | 58.6 | 49.7 |
Number of years of follow up from interview date | 1 | 1 | 18 | 24 |
Study period | 2002–2010 | 2002–2010 | 1982–2000 | 1974–1998 |
Time period where exposure was measured | 2000–2010 | 2000–2010 | 1979–1983, 1999–2000 | 1979–1988, 1990–1998 |
Spatial resolution for exposure assessment | ZIP codes (N=5,138) | ZIP codes (N=3,079) | Counties (N=50) | Cities (N=6) |
PM2.5 (mean, IQR) during the study period (μg/m3) | 12 (3.41) | 10.18 (2.46) | 17.7 (3.7) | 16.4 (5.6) |
No of confounders | 122 | 122 | ~50 | ~40 |
Figure 3 shows the average PM2.5 exposure in the 5,138 ZIP codes (1,067 unique counties) where MCBS enrollees resided in 2002. During the 1 year follow up period from the reference date, 4.95% died, 22.2% had one or more hospitalizations, 19% were hospitalized at least once with a circulatory disease and 9.7% were hospitalized at least once for a respiratory disease.
Table 2 summarizes the results of IPW applied to both the full cohort and the LPC. We found that increasing long-term exposure to PM2.5 from levels lower than 12 μg/m3 to levels higher than 12 μg/m3 causally increases all-cause admission and circulatory admission hazard rates by 7% (95% CI 3–10%) and 6% (95% CI 2–9%), respectively. This implies that the total number of all-cause admissions and circulatory admissions from 2002 to 2010 in Medicare attributable to an increase in long-term average PM2.5 levels from below 12 μg/m3 to above 12 μg/m3 is estimated to be 5,861,028 and 1,417,962, respectively. We did not find evidence of a statistically significant increase in mortality or respiratory admissions. We also found that in the LPC increasing PM2.5 levels from below 8 μg/m3 to above 8 μg/m3 (but always lower than 12 μg/m3) causally increases all-cause, circulatory and respiratory admission hazard rates by 15% (95% CI 8–23%), 18% (95% CI 10–27%) and 21% (95% CI 9–34%), respectively, and all these effects were statistically significant. We did not find evidence of a statistically significant increase in mortality in the LPC.
Table 2.
Outcome, hazard ratio (95% CI) | Full cohort, Threshold = 12 μg/m3, N = 32,119, person years = 68,789 | Low pollution cohort (Cohort with annual PM2.5 < 12 μg/m3), Threshold = 8 μg/m3, N = 18,144, person years = 34,429 |
---|---|---|
All-cause mortality | 0.97 (0.90, 1.04) | 1.11 (0.97, 1.28) |
All-cause hospitalization | 1.07 (1.03, 1.10) | 1.15 (1.08, 1.23) |
Circulatory hospitalization | 1.06 (1.02, 1.09) | 1.18 (1.10, 1.27) |
Respiratory hospitalization | 1.03 (0.98, 1.08) | 1.21 (1.09, 1.34) |
Figure 4 illustrates the sensitivity of the results summarized in Table 2 with respect to omission of all the MCBS variables when estimating swi. Each panel summarizes the results for a different outcome (all-cause hospitalization, circulatory hospitalization, death, respiratory hospitalization). Within each panel, we illustrate the results for both the full cohort and LPC. Estimates in red are obtained when we use the entire set of available potential confounders to adjust for confounding (122 potential confounders). Estimates in blue (claims only) are obtained when we exclude the MCBS variables from the confounding adjustment (p=122–73=49). The fact that the blue and red estimates largely overlap indicates that our conclusions are robust to the exclusion of the MCBS variables from the confounding variables used for the adjustment.
More generally, results from the sensitivity analyses (SA1, SA2, SA3) mentioned in the Methods section and reported in the supplementary material suggest that our estimates are largely robust across different statistical methodologies, model misspecification and confounder exclusion. Importantly, as summarized in the supplemental material, our analyses using a standard Cox proportional hazards model with continuous exposure also found significant effects for hospitalizations. The exposure response curves for all-cause, circulatory, and respiratory hospitalizations indicate a slightly larger effect at low levels of PM2.5, though only circulatory hospitalizations had a nonlinear curve that was significantly different than the simpler, linear association.
DISCUSSION
Samet (NEJM 2011)33 wrote: As the NAAQS have been reset at lower and lower concentrations, the gaps between acceptable concentrations and irreducible background levels have narrowed, raising the question of how much lower the limits can be pushed. […] In promulgating the NAAQS for these pollutants, the administrator must weigh the public health burden against the uncertainty of the scientific evidence related to lower concentrations, keeping in mind the Clean Air Act’s requirement for an adequate margin of safety.
We have combined several sources of data and constructed the MCBS-Medicare cohort to address the following three questions: 1) Does increasing the level of PM2.5 from below 12 μg/m3 to above 12 μg/m3 causally increase deaths and hospitalizations; 2) Among individuals with exposure levels below 12 μg/m3, does increasing the level of PM2.5 from below 8 μg/m3 to above 8 μg/m3 causally increase deaths and hospitalizations; and 3) Does exclusion of individual level behavioral risk factors significantly affect our estimates?
The Harvard Six Cities Study5,31 and the ACS Study6,12 are two landmark epidemiological cohort studies that had an enormous impact on our understanding of the health effects of air pollution. However, these studies have limited statistical power to detect the effects of low levels of air pollution, particularly because most of their subjects reside in urban areas where pollution levels tend to be higher. The Six Cities Study5,31 and the ACS study6,12 are also limited by the fact that they are “closed” cohort studies in the sense that they do not allow enrollment of new individuals into the cohort. As such, these studies are less able to estimate the health effects of recent air pollution, nor can they track health effects over time. To overcome this challenge, more recent epidemiological studies have leveraged “open” cohort data, such as Medicare claims, which permits new enrollees to enter the cohort each year. Our study leverages Medicare claims data combined with data on individual level behavioral risk factors, an important factor missing in previous studies. Including individual level behavioral risk factors in our analysis is very important as these factors are generally hard to measure and are only available from cohort studies. To our knowledge, this is the first epidemiological study that estimates the effects of low levels of air pollution using claims data augmented with individual level behavioral risk factors, thus overcoming the common criticism that studies that rely entirely on claims data are myopic to important potential confounders.
Our study uses inverse probability weighting (IPW), enabling us to estimate “causal” effects. The results are consistent with existing literature on the adverse health effects of long-term exposure to PM2.5. We found robust evidence that increasing long-term exposure to PM2.5 (two years average) from levels lower than 12 μg/m3 to levels higher than 12 μg/m3 causally increases all-cause admissions and circulatory admission hazard rates; and among individuals with exposure levels below 12 μg/m3, exposure to PM2.5 levels above 8 μg/m3 increases all-cause, circulatory and respiratory admission hazard rates. We also found evidence that the marginal benefit is increasing at lower concentrations: in the low pollution cohort, an increase of PM2.5 from below 8 μg/m3 to above 8 μg/m3 led to a 15% increase in hospitalization rate, whereas in the full cohort an increase of PM2.5 from below 12 μg/m3 to above 12 μg/m3 led to a 7% increase in hospitalization rate. This evidence is consistent with our previous work.34 Future analyses, which will include the whole Medicare population, will be able to rely on much larger statistical power to test this hypothesis.
Our study has several strengths that can be leveraged in future studies. Previous studies assign each subject an average exposure aggregated at the county or at the larger metropolitan area level, which is a coarse indicator of a subject’s exposure to air pollution that lends itself to exposure measurement error.35,36 For this study, we estimate exposure on a 1km by 1km grid to compute exposure at the ZIP code level. These estimates, obtained from previous work,20,37–40 allow us to directly study the effects of low levels of pollution with an unprecedented scale of spatial resolution. Importantly, we also investigated the sensitivity of the results when we exclude from the confounding adjustment all of the behavioral risk factors (p=73) measured in the MCBS (e.g. smoking, BMI, etc.) and found that the results do not change. This finding indicates that claims data combined with ZIP code level data on risk factors and socioeconomic data is sufficient to rigorously estimate the health effects of air pollution when using ZIP code level exposure data. Thus, expensive and potentially time consuming collection of a large set of individual level behavioral risk factors, although potentially useful for exploring susceptibility and effect modification, is not critical to adjust for confounding bias. Furthermore, the results of this analysis add validity to air pollution epidemiological investigations that rely entirely on administrative and therefore publicly available data.
Despite the robustness of our results, our study has certain limitations that will be important to address in future studies. Our study population is significantly smaller than the population included in the ACS study (see Table 1). To increase our sample size, we included all individuals that had an MCBS interview at any point during the study period 2002 to 2010, thus restricting the follow up period to only one year. The limited sample size and limited follow up period might be the reason why we did not find a significant effect for mortality: only 4.95% of subjects died during follow up, versus 22.2% who were hospitalized. Another limitation of our study is that we analyzed the data assuming that exposure is binary and time invariant. These are strong assumptions, but they allow for simple interpretation of the results and for visual inspection of the balance across covariates before and after stratifying on the estimated propensity score, thus substantially increasing the level of confidence in our results with respect to proper adjustment for confounding.
As more data becomes available, future studies will be able to repeat these analyses routinely and with a longer follow-up period. In addition, because our cohort is open in the sense that it allows for new enrollment every year (US elderly > 65 who enter fee-for-service Medicare), our approach allows for continued monitoring of the health effects as air pollution continues to decline. Our analyses can be repeated every few years as new claims data becomes available to track the effectiveness of regulatory actions and mitigation strategies over time. Also, unlike more traditional closed cohort prospective studies, this study utilizes publicly available data, which permits other entities with access to the Medicare claims data to reproduce our results as a validity check.
Results from this study have important implications for policymakers. With data from 5,138 unique ZIP codes, spanning 1,067 unique counties over a period of nine years and measuring 122 potential confounders, this work provides very compelling evidence that compliance with the annual NAAQS and even further reductions in PM2.5 below the current NAAQS will continue to be beneficial. The number of cases avoided as a result of compliance is large compared to most public health measures and sound policy decisions will lead to significant improvements in public health.
Supplementary Material
Acknowledgments
We would like to thank Christine Choirat who assisted with data analysis and Leigh Melanson for providing editorial assistance.
Source of Funding: This work was supported by the National Institutes of Health (Grants P01 CA134294, P50 MD010428, R01 GM111339, R01 ES024332, R01 ES026217, R21 ES024012, R35 CA197449); the Environmental Protection Agency (Grant 83587201-0); and the Health Effects Institute (Grant 4953-RFA14-3/16-4).
Footnotes
Competing financial interests declaration: The authors declare they have no actual or potential competing financial interests.
Sharing Health Outcomes: Medicare and Medicaid health outcomes data will be stored on a highly secure server under the supervision of Dr. Wang. To allow for our analytical health outcomes datasets to be replicated by researchers outside of our team, we will provide: 1) the list of Medicare and Medicaid files that we used; 2) SAS macros to efficiently process raw data files; and 3) simulated health outcomes datasets that represent hypothetical patients and illustrate our data formatting conventions for our tools to be used by other research groups.
Sharing the Linked National Data using Dataverse: Instead of posting data on a private web server or developing ad-hoc data management solutions, all non-sensitive datasets (e.g., EPA AQS pollution data), simulated health outcomes datasets, replication instructions, and links to open-source software will be made publicly available.
References
- 1.Regulatory Impact Analysis for the Final Revisions to the National Ambient Air Quality Standards for Particulate Matter. 474 Leaves in Various Folations: Illustrations (some Color), Charts, Maps, n.d. Web.
- 2.U.S. Environmental Protection Agency. Benefits and Costs of the Clean Air Act from 1990 to 2020: Summary Report. BiblioGov. 2012 [Google Scholar]
- 3.Portney Paul R. Policy watch: economics and the Clean Air Act. The Journal of Economic Perspectives. 1990;4:4. 173–181. [Google Scholar]
- 4.U.S. Environmental Protection Agency. The benefits and costs of the clean air act, 1990 to 2020 2011 [Google Scholar]
- 5. Dockery DW, Pope CA, Xu X, Spengler JD, Ware JH, Fay ME, et al. An association between air pollution and mortality in six US cities. N Engl J Med. 1993 Dec 9;329(24):1753–9. doi: 10.1056/NEJM199312093292401.
- 6. Pope CA, Burnett RT, Thun MJ, Calle EE, Krewski D, Ito K, et al. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA. 2002 Mar 6;287(9):1132–41. doi: 10.1001/jama.287.9.1132.
- 7. Samet JM, Zeger SL, Dominici F, Curriero F, Coursac I, Dockery DW, et al. The National Morbidity, Mortality, and Air Pollution Study. Part II: Morbidity and mortality from air pollution in the United States. Res Rep Health Eff Inst. 2000 Jun;94(Pt 2):5–79.
- 8. Samet JM, Dominici F, Curriero FC, Coursac I, Zeger SL. Fine particulate air pollution and mortality in 20 US cities, 1987–1994. N Engl J Med. 2000 Dec 14;343(24):1742–9. doi: 10.1056/NEJM200012143432401.
- 9. Dominici F, Peng RD, Bell ML, Pham L, McDermott A, Zeger SL, et al. Fine particulate air pollution and hospital admission for cardiovascular and respiratory diseases. JAMA. 2006 Mar 8;295(10):1127–34. doi: 10.1001/jama.295.10.1127.
- 10. Dominici F, Sheppard L, Clyde M. Health Effects of Air Pollution: A Statistical Review. International Statistical Review. 2003;71:243–276.
- 11. Dockery DW, Pope CA. Acute respiratory effects of particulate air pollution. Annu Rev Public Health. 1994;15:107–32. doi: 10.1146/annurev.pu.15.050194.000543.
- 12. Pope CA, Thun MJ, Namboodiri MM, Dockery DW, Evans JS, Speizer FE, et al. Particulate air pollution as a predictor of mortality in a prospective study of U.S. adults. Am J Respir Crit Care Med. 1995 Mar;151(3 Pt 1):669–74. doi: 10.1164/ajrccm/151.3_Pt_1.669.
- 13. Breysse PN, Delfino RJ, Dominici F, Elder ACP, Frampton MW, Froines JR, et al. US EPA particulate matter research centers: summary of research results for 2005–2011. Air Quality, Atmosphere and Health. 2013;6(2):333–55.
- 14. Vedal Sverre, et al. Air pollution and daily mortality in a city with low levels of pollution. Environmental health perspectives. 2003;111(1):45. doi: 10.1289/ehp.5276.
- 15. Schwartz Joel, Bind Marie-Abele, Koutrakis Petros. Estimating Causal Effects of Local Air Pollution on Daily Deaths: Effect of Low Levels. Environmental health perspectives. 2016. doi: 10.1289/EHP232.
- 16. DeVries Rebecca, Kriebel David, Sama Susan. Low level air pollution and exacerbation of existing copd: a case crossover analysis. Environmental Health. 2016;15(1):98. doi: 10.1186/s12940-016-0179-z.
- 17. Crouse Dan L, et al. Ambient PM2.5, O3, and NO2 exposures and associations with mortality over 16 years of follow-up in the Canadian Census Health and Environment Cohort (CanCHEC). Environmental health perspectives. 2015;123(11):1180. doi: 10.1289/ehp.1409276.
- 18. Medicare Current Beneficiary Survey. Centers for Medicare & Medicaid Services. N.p., n.d. Web. 16 Mar. 2016.
- 19. Burnett Richard T, et al. An integrated risk function for estimating the global burden of disease attributable to ambient fine particulate matter exposure. Diss. University of British Columbia; 2015.
- 20. Di Qian, et al. Assessing PM2.5 Exposures with High Spatiotemporal Resolution across the Continental United States. Environmental science & technology. 2016;50(9):4712–4721. doi: 10.1021/acs.est.5b06121.
- 21. “Census.gov.” Census.gov. N.p., n.d. Web. 23 Jan 2016.
- 22. Little Roderick J, Rubin Donald B. Causal effects in clinical and epidemiological studies via potential outcomes: concepts and analytical approaches. Annual review of public health. 2000;21(1):121–145. doi: 10.1146/annurev.publhealth.21.1.121.
- 23. Hernán MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology. 2000 Sep;11(5):561–70. doi: 10.1097/00001648-200009000-00012.
- 24. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000 Sep;11(5):550–60. doi: 10.1097/00001648-200009000-00011.
- 25. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models. Am J Epidemiol. 2008 Sep 15;168(6):656–64. doi: 10.1093/aje/kwn164.
- 26. Hernán MA, Robins JM. Estimating causal effects from epidemiological data. J Epidemiol Community Health. 2006 Jul;60(7):578–86. doi: 10.1136/jech.2004.029496.
- 27. Cox DR, Oakes D. Analysis of survival data. Book 21. CRC Press; 1984.
- 28. Lee BK, Lessler J, Stuart EA. Weight trimming and propensity score weighting. PLoS One. 2011 Mar 31;6(3):e18174. doi: 10.1371/journal.pone.0018174.
- 29. Liang K, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73(1):13–22.
- 30. Pope CA, Dockery DW. Health effects of fine particulate air pollution: lines that connect. J Air Waste Manag Assoc. 2006 Jun;56(6):709–42. doi: 10.1080/10473289.2006.10464485.
- 31. Laden F, Schwartz J, Speizer FE, Dockery DW. Reduction in fine particulate air pollution and mortality: Extended follow-up of the Harvard Six Cities study. Am J Respir Crit Care Med. 2006 Mar 15;173(6):667–72. doi: 10.1164/rccm.200503-443OC.
- 32. Health Effects Institute. Reanalysis of the Harvard Six Cities Study and the American Cancer Society Study of Particulate Air Pollution and Mortality: A Special Report of the Institute’s Particle Epidemiology Reanalysis Project. Health Effects Institute; Cambridge MA: 2000.
- 33. Samet JM. The Clean Air Act and health–a clearer view from 2011. N Engl J Med. 2011 Jul 21;365(3):198–201. doi: 10.1056/NEJMp1103332.
- 34. Shi L, Zanobetti A, Kloog I, Coull BA, Koutrakis P, Melly SJ, et al. Low-Concentration PM2.5 and Mortality: Estimating Acute and Chronic Effects in a Population-Based Study. Environ Health Perspect. 2016 Jan;124(1):46–52. doi: 10.1289/ehp.1409111.
- 35. Alexeeff SE, Schwartz J, Kloog I, Chudnovsky A, Koutrakis P, Coull BA. Consequences of kriging and land use regression for PM2.5 predictions in epidemiologic analyses: insights into spatial variability using high-resolution satellite data. J Expo Sci Environ Epidemiol. 2015 Mar-Apr;25(2):138–44. doi: 10.1038/jes.2014.40.
- 36. Gryparis A, Paciorek CJ, Zeka A, Schwartz J, Coull BA. Measurement error caused by spatial misalignment in environmental epidemiology. Biostatistics. 2009 Apr;10(2):258–74. doi: 10.1093/biostatistics/kxn033.
- 37. Kloog I, Koutrakis P, Coull BA, Lee HJ, Schwartz J. Assessing temporally and spatially resolved PM2.5 exposures for epidemiological studies using satellite aerosol optical depth measurements. Atmos Environ. 2011;45(35):6267–6275.
- 38. Kloog I, Chudnovsky A, Just AC, Nordio F, Koutrakis P, Coull BA, et al. A new hybrid spatio-temporal model for estimating daily multi-year PM2.5 concentrations across northeastern USA using high resolution aerosol optical depth data. Atmos Environ. 2014;95:581–590. doi: 10.1016/j.atmosenv.2014.07.014.
- 39. Lee M, Kloog I, Chudnovsky A, Lyapustin A, Wang Y, Melly S, et al. Spatiotemporal prediction of fine particulate matter using high-resolution satellite images in the Southeastern US 2003–2011. J Expo Sci Environ Epidemiol. 2016 Jun;26(4):377–84. doi: 10.1038/jes.2015.41.
- 40. Kloog I, Nordio F, Coull BA, Schwartz J. Incorporating local land use regression and satellite aerosol optical depth in a hybrid model of spatiotemporal PM2.5 exposures in the Mid-Atlantic states. Environ Sci Technol. 2012 Nov 6;46(21):11913–21. doi: 10.1021/es302673e.