Abstract
Epigenetic mechanisms may underlie air pollution-health outcome associations. We estimated gaseous air pollutant-DNA methylation (DNAm) associations using twelve subpopulations within Women’s Health Initiative (WHI) and Atherosclerosis Risk in Communities (ARIC) cohorts (n=8397; mean age 61.3 years; 83% female; 46% African-American, 46% European-American, 8% Hispanic/Latino). We used geocoded participant address-specific mean ambient carbon monoxide (CO), nitrogen oxides (NO2; NOx), ozone (O3), and sulfur dioxide (SO2) concentrations estimated over the 2-, 7-, 28-, and 365-day periods before collection of blood samples used to generate Illumina 450k array leukocyte DNAm measurements. We estimated methylome-wide, subpopulation- and race/ethnicity-stratified pollutant-DNAm associations in multi-level, linear mixed-effects models adjusted for sociodemographic, behavioral, meteorological, and technical covariates. We combined stratum-specific estimates in inverse variance-weighted meta-analyses and characterized significant associations (false discovery rate; FDR<0.05) at Cytosine-phosphate-Guanine (CpG) sites without among-strata heterogeneity (PCochran’s Q>0.05). We attempted replication in the Cooperative Health Research in Region of Augsburg (KORA) study and Normative Aging Study (NAS). We observed a −0.3 (95% CI: −0.4, −0.2) unit decrease in percent DNAm per interquartile range (IQR, 7.3 ppb) increase in 28-day mean NO2 concentration at cg01885635 (chromosome 3; regulatory region 290 bp upstream from ZNF621; FDR=0.03). At intragenic sites cg21849932 (chromosome 20; LIME1; intron 3) and cg05353869 (chromosome 11; KLHL35; exon 2), we observed a −0.3 (95% CI: −0.4, −0.2) unit decrease (FDR=0.04) and a 1.2 (95% CI: 0.7, 1.7) unit increase (FDR=0.04), respectively, in percent DNAm per IQR (17.6 ppb) increase in 7-day mean ozone concentration. Results were not fully replicated in KORA and NAS. 
We identified three CpG sites potentially susceptible to gaseous air pollution-induced DNAm changes near genes relevant for cardiovascular and lung disease. Further harmonized investigations with a range of gaseous pollutants and averaging durations are needed to determine the effect of gaseous air pollutants on DNA methylation and ultimately gene expression.
Keywords: Air pollution, DNA methylation, Epigenetics, Epigenome-wide association study, Gaseous pollutants
1. Introduction
DNA methylation (DNAm) at Cytosine-phosphate-Guanine (CpG) sites is a physiological process that can impact gene expression. Although DNAm is stable over time and can be heritable, it can also be affected by exposure to environmental pollutants. Many environmentally induced changes in DNAm may initially be small, but they can accumulate over time (Baccarelli et al., 2009). Further, experimental studies in humans (Bellavia et al., 2013) and repeated-measures occupational studies (Tarantini et al., 2009; Fan et al., 2014) have demonstrated changes in DNAm patterns after short-term exposures to air pollutants. In particular, exposure to air pollution has been associated with atypical global methylation of DNA in blood samples (De Prins et al., 2013; Madrigano et al., 2011) and atypical DNAm near specific candidate genes relevant for cardiovascular and respiratory health (Bind et al., 2014; Sofer et al., 2013; Chi et al., 2016).
Since DNAm has recently been proposed as an epigenetic mechanism by which air pollution influences health (Baccarelli et al., 2012), identifying epigenetic changes related to gaseous air pollutant exposures may inform our understanding of relevant mechanistic pathways. For example, exposure to US criteria gaseous air pollutants, including carbon monoxide (CO), oxides of nitrogen (NO2; NOx), ozone (O3), and sulfur dioxide (SO2), may be pathophysiologically linked to cardiovascular disease (CVD) through inflammatory, oxidative, and autonomic mechanisms (Franklin et al., 2015) susceptible to DNAm-induced changes. DNAm in blood cells has also been implicated as an effect measure modifier of air pollution-health associations (Lepeule et al., 2014; Fu et al., 2012; Bind et al., 2012). Thus, identifying CpGs differentially methylated in response to gaseous air pollutant exposure may provide insight into the mechanisms linking gaseous air pollutants to CVD and other health outcomes.
The literature examining associations between air pollution and methylome-wide DNAm has largely focused on long-term exposure to particulate matter (PM) and NO2 air pollution (Panni et al., 2016; de F. C. Lichtenfels et al., 2018; Sayols-Baixeras et al., 2019; Plusquin et al., 2017; Lee et al., 2019; Gondalia et al., 2019); however, other pollutants and varying durations of exposure may affect methylation patterns relevant for health outcomes like CVD (Franklin et al., 2015). Existing DNAm-air pollution research is also geographically and sociodemographically limited, with much of it completed in Boston, Massachusetts and in Europe (Bind et al., 2014; Bind et al., 2012; Panni et al., 2016; de F. C. Lichtenfels et al., 2018; Sayols-Baixeras et al., 2019; Plusquin et al., 2017) among individuals of European descent. To address these gaps in research, we leveraged data from two multi-ethnic and geographically diverse populations, the Women’s Health Initiative (WHI) and the Atherosclerosis Risk in Communities Study (ARIC), to examine methylome-wide associations with short- and long-term CO, NO2, NOx, O3, and SO2 exposure.
2. Material and methods
2.1. Study design
We conducted subpopulation- and race/ethnicity-stratified, methylome-wide discovery analyses of gaseous pollutant-DNAm associations within WHI and ARIC subpopulations (N=8,397) and completed replication analyses within the Cooperative Health Research in the Region of Augsburg study (KORA; N=2,141) and the Normative Aging Study (NAS; N=773).
2.2. Study Populations
The WHI is a large, prospective study of postmenopausal women enrolled between 1993 and 1998 at forty clinical centers in the US (The Women’s Health Initiative Study Group, 1998). Women were enrolled in either the clinical trials (CT; N=68,132) or observational study (OS; N=93,676) cohorts. In the present analyses, we included three ancillary study subpopulations of WHI participants with available genome-wide DNAm assessed in peripheral blood leukocytes (Figure 1):
Figure 1.

Diagram of discovery study populations and data analysis flow (AA=African American, ARIC=Atherosclerosis Risk in Communities, AS311=Ancillary Study 311, BAA23=Broad Agency Award 23, CHD=Coronary Heart Disease, CpG=Cytosine-phosphate-Guanine site, CT=Clinical Trials, DNAm=DNA Methylation, EA=European American, EMPC=Epigenetic Mechanisms of Particulate Matter-Mediated Cardiovascular Disease Risk, MN=Minnesota, MS=Mississippi, N=Number, NC=North Carolina, OS=Observational Studies, WHI=Women’s Health Initiative)
1. Epigenetic Mechanisms of Particulate Matter-Mediated Cardiovascular Disease Risk (EMPC) was based on an exam site- and race/ethnicity-stratified, minority oversample of WHI CT participants randomly selected from the screening, third annual, and sixth annual follow-up visits with contemporaneous core analyte data and an address in the contiguous 48 US states (N=2,200). A small proportion of EMPC participants had DNAm re-measured at the 3rd or 6th annual visits (N=200) and during the Long Life Study 14–19 years after the screening visit (N=43).
2. Broad Agency Award 23 (BAA23) is a case-control study of incident coronary heart disease among WHI CT (N = 1,546) and OS (N = 442) participants. DNAm was assessed in peripheral blood leukocytes at the screening visit, i.e. before disease diagnosis.
3. Ancillary Study 311 (AS311) is a matched case-control study of incident bladder cancer among WHI CT (N = 405) and OS (N = 455) participants (Jordahl et al., 2018). Bladder cancer cases were matched to controls by enrollment year, number of follow-up days, age at diagnosis (±2 years), and DNA extraction method. DNAm was assessed in peripheral blood leukocytes at the screening visit, i.e. before disease diagnosis.
In all WHI subpopulations, analyses were restricted to racial/ethnic groups including ≥100 individuals.
ARIC is a community-based, prospective cohort study of atherosclerosis in four US communities that began in 1987–1989 (The ARIC Investigators, 1989). Two sub-studies generated DNAm data for ARIC participants (Figure 1).
4. All African American participants from Forsyth County, NC and Jackson, MS at the second or third visit (N=2,751).
5. European Americans from Forsyth County, NC or Minneapolis, MN suburbs who were part of a randomly sampled ancillary study of brain magnetic resonance imaging at the third visit (N=1,139). Peripheral blood leukocyte DNAm was assayed at the second (1990–1992) or third (1993–1995) follow-up visits.
Formal replication efforts in the KORA population-based cohort from the region of Augsburg, Southern Germany included data from S3 and S4 participants of European ancestry at follow-ups F3 (N=459; years=2004–2005) and F4 (N=1,682; years 2006–2008) (Holle et al., 2005; Wichmann et al., 2005). Similar replication efforts in the NAS cohort of community-dwelling elderly male veterans of European ancestry from Boston, MA involved participants at up to four follow-up exams (N=773 participants at up to 1522 visits; years=1999–2009; mean age 73 [range: 55–92]) (Bell et al., 1996). These participants were initially recruited by the US Veterans Administration in 1963 as healthy, residentially-stable participants aged 25–75 to study the healthy aging process.
This analysis was approved by the IRB at the University of North Carolina at Chapel Hill and all participants provided written informed consent at their local WHI or ARIC clinic with de-identified data provided through data sharing agreements.
2.3. Epigenome-wide DNA methylation
Participants provided fasting blood from which peripheral blood leukocytes were isolated and DNA was extracted. DNAm at up to 485,577 CpG sites was measured using the Illumina 450K Infinium Methylation BeadChip (Illumina Inc.; San Diego, CA) and quantitatively represented by the methylation β value, the proportion of DNA methylated at each CpG site (=methylated signal divided by the sum of methylated and unmethylated signals). DNAm was Beta-MIxture Quantile (BMIQ)-normalized to correct for differences otherwise attributable to the design of Type I and II probes (Wu et al., 2014) and batch-corrected using random intercepts for plate and chip and a fixed effect for row in all subpopulations except for WHI EMPC, which used empirical Bayes methods (Johnson et al., 2007) to adjust for plate.
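The methylation β value defined above can be written out as a short calculation. This is a minimal sketch only: the function name and the signal intensities are hypothetical, and it follows the definition in the text (some array pipelines add a small constant to the denominator, which is omitted here).

```python
def methylation_beta(methylated: float, unmethylated: float) -> float:
    """Proportion of the total probe signal that is methylated at one CpG site."""
    total = methylated + unmethylated
    if total <= 0:
        raise ValueError("total signal intensity must be positive")
    return methylated / total


# Hypothetical signal intensities, for illustration only.
beta = methylation_beta(3000.0, 1000.0)
percent_dnam = 100.0 * beta  # the percent DNAm scale used when reporting results
```

On this scale, a 1-unit change in percent DNAm corresponds to a change of 0.01 in β.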
2.4. Air Pollution Concentrations
Primary exposures were gaseous air pollutants (in ppb) regulated by the U.S. Environmental Protection Agency (EPA) under the Clean Air Act according to National Ambient Air Quality Standards (NAAQS): CO, NOx, NO2, O3, and SO2. Measurements encompassed a range of short- and long-term concentration averages to provide insight into potential DNAm-mediated acute and chronic effects of gaseous pollutants on various measures of CVD. National-scale, log-normal ordinary kriging (Liao et al., 2006) was used to estimate geocoded participant address-specific mean pollutant concentrations averaged over the 2-, 7-, 28-, and 365-day periods before the blood draw used for DNAm quantification. The models were based on daily (or for O3, maximum 8-hour) mean concentrations from the US Environmental Protection Agency Air Quality System. Cross-validation statistics for these models are available in Supplemental Table 1.
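The averaging step can be sketched as follows, assuming a daily series of address-specific concentration estimates is already available. The function name, the data layout, and the choice to end each window on the day before the blood draw are assumptions made for illustration; the kriging itself is not reproduced.

```python
from datetime import date, timedelta
from statistics import mean


def window_means(daily, draw_date, windows=(2, 7, 28, 365)):
    """Mean concentration (ppb) over the n days immediately before draw_date.

    daily maps a calendar date to that day's estimated concentration.
    Days absent from daily are skipped; a window with no data yields None.
    """
    result = {}
    for n in windows:
        days = [draw_date - timedelta(days=k) for k in range(1, n + 1)]
        values = [daily[d] for d in days if d in daily]
        result[n] = mean(values) if values else None
    return result


# Hypothetical series: 20 ppb on the two days before the draw, 10 ppb earlier.
draw = date(1995, 6, 1)
daily = {draw - timedelta(days=k): (20.0 if k <= 2 else 10.0) for k in range(1, 366)}
means = window_means(daily, draw)
```

In this toy series the 2-day mean reflects only the recent spike, while the 365-day mean is dominated by the background level, which is why the two can carry different information.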
2.5. Covariates
Covariates included:
Study design factors: WHI clinical trial arm, case-control status, matching factors for case-control studies, study center
Sociodemographic characteristics: age at blood draw, sex, race (analysis stratification variable), education (<high school vs. ≥high school), neighborhood socioeconomic status (Diez Roux et al., 2001)
Behavioral and health characteristics: smoking status and alcohol use (never, former, current), physical activity (total energy expenditure in MET-hours/week), body mass index (BMI; weight in kilograms/height in meters²)
Meteorological covariates: dew point in °Celsius, temperature in °Celsius, barometric pressure in kPa (all expressed as means estimated over the air pollution exposure averaging duration from National Climatic Data Center stations within 50 km of the participant’s home)
Season of blood draw: spring, summer, fall, winter
Estimated leukocyte proportions: CD8+ T cell, CD4+ T cell, B cell, natural killer cell, monocyte, granulocyte (imputed using the Houseman method and reference sample data (Houseman et al., 2012) for most populations). Among ARIC African American participants, 175 had complete data on measured leukocyte proportions (lymphocytes, monocytes, neutrophils, eosinophils, and basophils). These 175 measures were used as a reference sample to impute cell type proportions in the remainder using the Houseman method (Demerath et al., 2015).
Ancestry principal components: PCs (Price et al., 2006) as available (unavailable for AS311 participants)
2.6. Missing Data
Eight individuals did not have DNAm data and were excluded. To impute missing covariate and exposure data for the remaining individuals (Supplemental Tables 2 and 3), we generated ten imputed datasets for each study subpopulation with non-missing covariates and exposures using a fully conditional specification multiple imputation method implemented in SAS 9.4 (Cary, NC) (Figure 1). We applied logistic and discriminant functions to impute binary and categorical variables and predictive means matching using the k-nearest neighbor method (k=5) to impute continuous variables (see Supplemental Table 4 for a complete list of variables included in the imputation).
2.7. Statistical Analyses
To estimate gaseous air pollutant-DNAm associations, we used study subpopulation- and race/ethnicity-stratified linear mixed models (LMM) with a harmonized covariate adjustment strategy and pooled estimates across all ten imputed datasets (Figure 1) (van Buuren et al., 2011). The specification of random mixed model components varied by study subpopulation, depending on study design and repeated measure availability. Specifically, analyses of WHI EMPC’s repeated DNAm measures involved a three-level LMM while those of WHI BAA23 and WHI AS311 involved a two-level, cross-sectional LMM, and those of ARIC involved a one-level, cross-sectional LMM. In WHI EMPC, BAA23, and AS311, the models included a center-level random intercept and slope to account for within-site correlations whereas models in ARIC controlled for two centers as a fixed effect. See Supplemental Table 4 for the complete stratum-specific adjustment strategy. We fit the stratified models with maximum likelihood estimation implemented in the Julia v0.6 MixedModels package.
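Pooling one coefficient across the ten imputed datasets can be sketched with Rubin's rules, the standard approach for combining multiply imputed estimates. That these exact rules were applied here is an assumption, the function name and input values are hypothetical, and the degrees-of-freedom adjustment is left out for brevity.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Combine per-imputation estimates and standard errors (Rubin's rules)."""
    m = len(estimates)
    if m < 2:
        raise ValueError("need at least two imputed datasets")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical coefficients from three imputed datasets, for illustration only.
pooled_est, pooled_se = pool_rubin([-0.30, -0.32, -0.28], [0.05, 0.05, 0.05])
```

The pooled standard error is larger than any single-imputation standard error because it also carries the uncertainty introduced by imputing the missing values.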
We then used Cochran’s Q test statistics to examine homogeneity of the pooled study- and race/ethnicity-specific gaseous air pollutant-DNAm associations. After excluding heterogeneous associations (PCochran’s Q <0.05), we combined stratum-specific estimates in fixed-effects, inverse variance-weighted meta-analyses and ranked results by FDR adjusted p-values (Figure 1). We used FDR<0.05 to identify methylome-wide significant associations and reported a ranked list of associations. We also focused on associations for CpG sites ranked in the top 5 of this list with associations for ≥ 2 pollutants or averaging durations. We reported estimates of the unit increase or decrease in percent DNAm (95% confidence interval [CI]) per interquartile range (IQR) increase in gaseous air pollution concentration.
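The meta-analytic steps above can be sketched for a single CpG site as follows. All names and input values are hypothetical. The Benjamini-Hochberg procedure is assumed as the FDR method, since the text does not name one, and the chi-square tail probability is computed from its closed form so that the sketch needs only the standard library.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2
    if df % 2 == 0:
        a, sf = 1.0, exp(-h)          # closed form for 2 degrees of freedom
    else:
        a, sf = 0.5, erfc(sqrt(h))    # closed form for 1 degree of freedom
    while a < df / 2:                 # each step adds 2 degrees of freedom
        sf += h ** a * exp(-h) / gamma(a + 1)
        a += 1.0
    return sf


def fixed_effects_meta(estimates, std_errors):
    """Inverse variance-weighted estimate, its SE, and the Cochran's Q P-value."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return pooled, pooled_se, chi2_sf(q, len(estimates) - 1)


def benjamini_hochberg(p_values):
    """FDR-adjusted P-values, returned in the order of the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):      # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical stratum-specific estimates for one CpG site.
pooled, pooled_se, p_het = fixed_effects_meta([-0.30, -0.25, -0.35], [0.10, 0.08, 0.12])
keep_site = p_het >= 0.05             # heterogeneous sites are excluded first
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In the analysis described here, this would be repeated for every CpG site and every pollutant-averaging duration combination, with the FDR adjustment applied across CpG sites.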
2.8. Replication Association Analyses
KORA and NAS collaborators carried out harmonized replication analyses of the gaseous pollutant-, averaging duration-, and CpG site-specific associations identified as significant in the discovery analyses. For replication analyses, we used a Bonferroni significance threshold corrected for the number of CpG sites (N) carried from discovery into replication (P<0.05 ÷ N) and considered sites that met this threshold with directionally consistent estimates between discovery and replication analyses to be replicated.
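The replication rule can be expressed as a small check. This is a sketch under stated assumptions: the function name is hypothetical and the numbers in the example calls are illustrative rather than study results.

```python
def is_replicated(discovery_est, replication_est, replication_p, n_sites, alpha=0.05):
    """Apply the Bonferroni threshold (alpha / n_sites) and a direction check."""
    threshold = alpha / n_sites
    same_direction = (discovery_est > 0) == (replication_est > 0)
    return replication_p < threshold and same_direction


# Illustrative values only; n_sites is the number of CpG sites carried forward.
weak_signal = is_replicated(-0.3, -0.1, 0.04, n_sites=3)      # P above threshold
wrong_direction = is_replicated(-0.3, 0.5, 0.001, n_sites=3)  # opposite sign
clear_signal = is_replicated(1.2, 0.9, 0.001, n_sites=3)      # passes both checks
```

Both conditions must hold, so a site with a very small replication P-value still fails if its estimate has the opposite sign to the discovery estimate.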
2.9. Quality Control
We followed established protocols for excluding low quality participant samples and CpG sites (Supplemental Table 5). To conform with LMM distribution assumptions and filter out variation due to single nucleotide polymorphisms, we identified and removed CpG sites exhibiting a multi-modal DNAm distribution by finding gaps in their density function using the gaphunter function of the minfi R package (Andrews et al., 2016). We also evaluated results by 1) graphing observed, CpG site-specific, −log10 transformed P-values against expected values from a theoretical χ² distribution in quantile-quantile (Q-Q) plots; 2) estimating the genomic inflation factor (λ) (Devlin et al., 1999); and 3) graphing CpG site-specific, −log10 transformed P-values against corresponding CpG positions in Manhattan plots.
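The genomic inflation factor is commonly computed as the median observed 1-degree-of-freedom χ² statistic divided by the median expected under the null. A minimal sketch of that common definition follows; the function name and the input P-values are hypothetical, and two-sided P-values are assumed.

```python
from statistics import NormalDist, median

# Median of a 1-df chi-square distribution (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Ratio of the median observed chi-square statistic to its null median."""
    norm = NormalDist()
    # A two-sided P-value maps to a 1-df chi-square statistic via z squared.
    chi2 = [norm.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced P-values mimic a null distribution, so lambda should be near 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_like)
```

Values of λ well above 1 indicate more small P-values than expected by chance, which can signal residual confounding or technical artifacts.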
2.10. Functional Annotation
We functionally annotated significant CpG sites using the UCSC Genome Browser (Feb. 2009 GRCh37/hg19) (Kuhn et al., 2013) with data from the Encyclopedia of DNA elements (ENCODE) (Rosenbloom et al., 2012) and Roadmap Epigenomics Project (Bernstein et al., 2010).
3. Results
The study included data from twelve subpopulations in WHI and ARIC (Table 1). Forty-six percent of the 8,397 participants were African American, 46% were European American, and 8% were Hispanic/Latino (from WHI only). Seventeen percent of participants were male (from ARIC only). The overall mean (SD) age was 61.3 (7.4) years with subpopulation-specific mean ages ranging from 56.6 to 67.8 years. Air pollution exposures varied among subpopulations (Table 1, Supplemental Table 3). For example, the overall mean (IQR) 28-day NO2 concentration was 18.0 (7.3) ppb but ranged from 14.9 (3.6) ppb in ARIC African Americans to 21.5 (9.4) ppb in WHI BAA23 CT African Americans (Table 1).
Table 1.
Participant Characteristics by ARIC and WHI Subpopulation
| Study | Subpopulation | Cohort | Race/Ethnicity | N | % Female | Age (years), Mean (SD) | Maximum N CpGs | 28-day NO2 (ppb), Mean (IQR)a |
|---|---|---|---|---|---|---|---|---|
| ARIC | | | AA | 2,664 | 63 | 56.6 (5.9) | 463,431 | 14.9 (3.6) |
| ARIC | | | EA | 1,100 | 58 | 59.9 (5.4) | 462,543 | 18.6 (3.5) |
| WHI | AS311 | CT | EA | 351 | 100 | 64.7 (7.1) | 461,136 | 20.0 (7.7) |
| WHI | AS311 | OS | EA | 395 | 100 | 66.2 (6.9) | 461,136 | 19.9 (8.5) |
| WHI | BAA23 | CT | AA | 371 | 100 | 61.8 (6.3) | 461,014 | 21.5 (9.4) |
| WHI | BAA23 | CT | EA | 926 | 100 | 67.8 (6.2) | 461,014 | 19.4 (8.3) |
| WHI | BAA23 | CT | HL | 220 | 100 | 60.7 (6.4) | 461,014 | 18.8 (11.8) |
| WHI | BAA23 | OS | AA | 259 | 100 | 62.8 (6.8) | 461,014 | 21.4 (9.4) |
| WHI | BAA23 | OS | HL | 174 | 100 | 62.8 (7.3) | 461,014 | 19.8 (10.7) |
| WHI | EMPCb | CT | AA | 553 | 100 | 62.7 (6.9) | 463,916 | 21.2 (9.4) |
| WHI | EMPCb | CT | EA | 1,072 | 100 | 64.6 (7.1) | 463,916 | 18.6 (8.0) |
| WHI | EMPCb | CT | HL | 312 | 100 | 61.5 (6.1) | 463,916 | 18.1 (11.6) |
| Overall | | | AA: 46%; EA: 46%; HL: 8% | 8,397 | 83 | 61.3 (7.4) | 463,916 | 18.0 (7.3) |
Abbreviations: AA, African American; ARIC, Atherosclerosis Risk in Communities Study; AS311, Ancillary Study 311; BAA23, Broad Agency Award 23; CpG, cytosine-phosphate-guanine site; CT, clinical trial; EA, European American; EMPC, Epigenetic Mechanisms of Particulate Matter-Mediated Cardiovascular Disease Risk; HL, Hispanic/Latino; IQR, interquartile range; N, number; NO2, nitrogen dioxide; OS, observational study; ppb, parts per billion; SD, standard deviation; WHI, Women’s Health Initiative
a Median imputed mean (IQR) from 10 imputed datasets
b At the 1st visit. DNAm data were also available among 185 EMPC participants in visit years 3 or 6 and 43 EMPC participants in study years 14–19
Discovery analyses showed minimal evidence of inflation across pollutants and averaging durations [median (range) λ = 0.98, (0.90–1.25); Figure 2] and identified three CpG sites significant at FDR<0.05 (Table 2, Figure 3). These sites demonstrated consistency in the direction and magnitude of the change in %DNAm across subpopulations (Figure 4) and across additional exposure averaging durations for the same pollutant (see Supplemental Table 6 for ranked list).
Figure 2.

Quantile-quantile (QQ) plot of observed vs. expected −log10 p-value for each CpG site from multi-ethnic, fixed-effects meta-analyses of 2-, 7-, 28-, and 365-day mean gaseous air pollutant concentrations.
Table 2.
Gaseous air pollutant-DNAm associations significant at FDR<0.05 among ARIC and WHI Subpopulations
| Top Rank | Chr | CpG Site | Position | Gene | Pollutant | Averaging Duration (days) | Change in %DNAm (95% CI)a | FDR | N | P Cochran’s Q |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 3 | cg01885635 | 40566085; intergenic | ZNF621 | NO2 | 28 | −0.3 (−0.4, −0.2) | 0.03b | 8613 | 0.9 |
| 2 | 20 | cg21849932 | 62369462; intron 3 | LIME1 | O3 | 7 | −0.3 (−0.4, −0.2) | 0.04b | 8623 | 0.5 |
| 3 | 11 | cg05353869 | 75139544; exon 2 | KLHL35 | O3 | 7 | 1.2 (0.7, 1.7) | 0.04b | 8621 | 0.8 |
Abbreviations: ARIC, Atherosclerosis Risk in Communities; CpG, Cytosine-phosphate-Guanine; Chr, chromosome; DNAm, DNA methylation; FDR, false discovery rate; N, number; NO2, nitrogen dioxide; O3, ozone; WHI, Women’s Health Initiative
a Per IQR (ppb) increase in gaseous air pollutant concentration (NO2 28-day=7.3; O3 7-day=17.6)
b FDR<0.05
Figure 3.

Manhattan plot of −log10 P-value vs. chromosomal position of each CpG site from multi-ethnic, fixed-effects meta-analyses of 2-, 7-, 28-, and 365-day mean gaseous air pollutant concentrations. Sites meeting the methylome-wide significance thresholds (FDR < 0.05) were circled instead of plotting all 20 horizontal reference lines identifying the significance thresholds calculated separately for each combination of 5 pollutants and 4 averaging durations.
Figure 4.

Unit change in %DNAm (95% CI) A) per IQR (7.3 ppb) increase in 28-day NO2 at cg01885635 (PCochran’s Q=0.9), B) per IQR (17.6 ppb) increase in 7-day O3 at cg21849932 (PCochran’s Q=0.5), C) per IQR (17.6 ppb) increase in 7-day O3 at cg05353869 (PCochran’s Q=0.8)
At the intergenic site cg01885635 (chromosome 3; 290 bp upstream from ZNF621), we observed a −0.3 (95% CI: −0.4, −0.2) unit decrease in percent DNAm per IQR (7.3 ppb) increase in 28-day mean NO2 concentration (FDR=0.03; Table 2, Figure 4A). At intragenic sites cg21849932 (chromosome 20; LIME1; intron 3) and cg05353869 (chromosome 11; KLHL35; exon 2), we also observed a −0.3 (95% CI: −0.4, −0.2) unit decrease (FDR=0.04, Table 2, Figure 4B) and a 1.2 (95% CI: 0.7, 1.7) unit increase (FDR=0.04, Table 2, Figure 4C), respectively, in percent DNAm per IQR (17.6 ppb) increase in 7-day mean O3 concentration.
The highest ranked pollutant-DNAm associations also had suggestive associations at other pollutant averaging durations (Supplemental Table 6). Additionally, at a third intragenic site, cg15008743 (chromosome 19; ZNF83), we note five non-FDR significant associations with 28- and 365-day mean NOx and 2-, 7-, and 28-day mean NO2 concentrations (Supplemental Table 6, Supplemental Figure 1).
The NAS, KORA F3, and KORA F4 populations have been described previously (Panni et al., 2016; Ward-Caviness et al., 2016) and basic sociodemographic characteristics are listed in Supplemental Table 7. The 28-day mean NO2-DNAm association at cg01885635 meta-analyzed across NAS, KORA F3, and KORA F4 was significant, corresponding to a 0.5 (95% CI: 0.3, 0.7) unit increase in percent DNAm per IQR increase. However, this estimate was directionally inconsistent with the −0.3 (95% CI: −0.4, −0.2) unit decrease in percent DNAm observed in the meta-analyzed WHI and ARIC populations. Examination of sex as a potential explanation for the observed difference in direction of association among the predominately female WHI and ARIC discovery populations and predominately male NAS and KORA replication populations did not explain the differences, although the discovery population was limited to 17% male, thereby limiting power. KORA could not supply replication analyses for the remaining sites due to unavailable O3 exposure data for this cohort. In NAS, the association between DNAm at cg21849932 and 7-day O3 was precisely estimated [−0.1 (95% CI: −0.2, 0.0) per IQR (13.1 ppb) increase] and consistent with that in WHI and ARIC [−0.3 (95% CI: −0.4, −0.2) per IQR (17.6 ppb) increase]. However, the association was not significant (P=0.04) at the replication threshold of P=0.02. Seven-day O3-DNAm associations did not replicate at cg05353869 in NAS (P = 0.6).
4. Discussion
This methylome-wide study of demographically and geographically diverse participants identified medium duration NO2- and O3-associated changes in DNAm at three CpG sites. The results of these multi-ethnic meta-analyses were significant, precisely estimated, and largely homogeneous across study subpopulation and racial/ethnic strata. Nevertheless, further investigation is needed to determine the effects of gaseous air pollutants on DNAm and resulting gene expression given that our results were not replicable in NAS and KORA, in part due to limited availability of ozone exposure data.
The most significant association in the present study was between 28-day NO2 exposure and decreased DNAm at a CpG site on chromosome 3. This CpG site is upstream from ZNF621 in a region that is enriched with regulatory elements including transcription factor binding, histone protein modification, and DNase hypersensitivity sites (Supplemental Figure 2) (Rosenbloom et al., 2012). ZNF621 encodes Zinc Finger Protein 621, one of many such proteins involved in e.g. transcription, apoptosis, and protein packaging (Laity et al., 2001), which is universally expressed in lung, coronary artery, and other tissues affected by air pollution and atherosclerosis (Rosenbloom et al., 2012). A study investigating loci associated with type 2 diabetes (T2D) in populations of European and African ancestry found that one such locus increased T2D risk via cis-regulation of ZNF621 expression in adipose tissue (Lau et al., 2017). As multiple studies have implicated air pollution as a risk factor for T2D (Eze et al., 2015; Park et al., 2014; Balti et al., 2014), altered DNAm in a regulatory region of ZNF621 may be a mechanistic pathway underlying this association.
Air pollution-associated changes in DNAm at the two remaining FDR-significant sites may also have biological links to cardiorespiratory disease via LIME1 and KLHL35. LIME1 encodes the Lck-interacting transmembrane adaptor 1 protein, which propagates T and B cell receptor signals (Hur et al., 2003; Hořejší et al., 2004). It is expressed in liver, spleen, brain, and whole blood (Kuhn et al., 2013) as well as hematopoietic stem cells and lung tissue (Hur et al., 2003). In particular, LIME1 has a role in T-cell activation and resulting IL-2 expression (Hur et al., 2003). Observed changes in DNAm within the third intron of LIME1 therefore suggest a plausible link between air pollution exposures, local/systemic inflammatory responses to these pollutants, and their established cardiorespiratory effects, although they also may reflect among-participant variation in minor lymphocyte proportions not captured by the Houseman method (Houseman et al., 2012). KLHL35 encodes Kelch-like protein 35. Kelch-like proteins are involved in posttranslational transfer of ubiquitins to other proteins (Dhanoa et al., 2013), thereby targeting them for intracellular translocation and/or degradation (Mukhopadhyay et al., 2007). Although KLHL35 is expressed in a wide variety of tissues, including the testes, brain, and arteries (Kuhn et al., 2013), its specific function is unknown. Chromatin interaction with the promoter of KLHL35 has been suggested as a mechanism by which the single nucleotide polymorphism rs590121, a susceptibility variant for coronary artery disease (Howson et al., 2017), functions (van der Harst et al., 2018).
Other CpG sites that may be of interest for cardiometabolic and respiratory disease were not statistically significant according to a priori FDR-based significance thresholds, but appeared within our top ranked list. For example, cg15008743 appeared as part of five NO2/NOx-DNAm associations in our top ranked list. This CpG lies within ZNF83, which encodes another zinc finger protein (Laity et al., 2001). ZNF83 is expressed in tissues including the female reproductive tract, brain, thyroid, and pituitary (Kuhn et al., 2013). Other highly ranked but non-FDR significant associations included those near genes that have frequently been implicated in cardiometabolic disease (e.g. CPT1A (Schlaepfer et al., 2020) and TXNIP (Parikh et al., 2007)).
We were unable to replicate the top hits for 28-day NO2 in NAS and KORA and 7-day O3 in NAS. Potentially important differences exist between the NAS and KORA replication populations and the WHI and ARIC discovery populations that may help explain the lack of replication. While NAS is a population of elderly, white men in one U.S. city and KORA of residents in one region of Germany, WHI and ARIC included a racially/ethnically diverse population of mostly women from throughout the U.S. In addition, the standard deviation of the 28-day NO2 concentration in the more geographically restricted NAS and KORA cohorts was approximately half that among WHI subgroups, despite similar means across populations. Also, the 7-day O3 concentration was lower in NAS than in WHI and ARIC (23 ppb vs 40 ppb).
Various air pollutants may have different effects on biologic processes over acute and chronic exposures (Franklin et al., 2015). Our results highlighted shorter-term (7- and 28-day average) NO2 and ozone concentrations as potentially important for DNAm. Because availability of data for some of these exposures was limited in the planned replication analyses in KORA and NAS, we also attempted to look up results of analyses in several recently published methylome-wide association studies involving gaseous air pollutants (de F.C. Lichtenfels et al., 2018; Sayols-Baixeras et al., 2019; Lee et al., 2019). However, none of the published studies included O3 or shorter-term NO2 exposures that would have enabled a relatively well-harmonized comparison of the top hits identified by the present study. Comparison of our non-significant results for longer duration (28- and 365-day) NO2 exposures with these published studies showed a similarly inverse and significant annual mean NO2-DNAm association at cg01885635 in the LifeLines cohort in the Netherlands (de F.C. Lichtenfels et al., 2018). However, the annual (Lee et al., 2019) and 10-year (Sayols-Baixeras et al., 2019) mean NO2-DNAm associations in other studies were neither significant nor inverse.
4.1. Limitations
This study may be limited by the effects of exposure measurement error, reduced power inherent in multiple testing, and inability to fully replicate results. Although twelve demographically, geographically, and environmentally diverse participant subpopulations contributed to this study, we were still able to harmonize exposure estimation, variable definitions, and quality control across them. We also used national-scale air pollution exposure estimation strategies to generate consistent estimates of ambient exposures, yet these potentially underestimate the true magnitude of personal air pollutant-outcome associations (Holliday et al., 2014). We used FDR to account for multiple testing across CpG sites, but we did not correct for multiple testing of several gaseous air pollutants and averaging durations as exposure variables. The KORA and NAS replication populations were limited to individuals of European ancestry from single geographic sites with more homogenous gaseous air pollution exposures than those among the racially diverse WHI and ARIC participants from across the U.S., but they were the only identified studies with shorter-term air pollution exposures. Ozone exposure data were unavailable in KORA, limiting replication attempts for two CpG sites to the NAS population. We were unable to compare results with those of shorter-term NO2 or O3 exposure in other studies due to lack of data, although we observed similar results with longer-term NO2 in one study. Finally, if our results regarding effects of gaseous air pollutants on DNAm are replicable in future studies, further examination of the effects of these DNAm changes on gene expression will be necessary to understand their potential impact on disease.
5. Conclusions
Gaseous air pollutants may be associated with DNA methylation of cardiovascular disease-relevant gene regions. Further harmonized replication efforts in similarly diverse cohorts with a range of shorter-term pollutant exposures are warranted.
Funding sources
The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services (contract numbers HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700004I and HHSN268201700005I). The authors thank the staff and participants of the ARIC study for their important contributions. ARIC DNAm funding was also provided by the American Recovery and Reinvestment Act of 2009 5RC2HL102419 and by R01-NS087541. Data from the ARIC study are available on request at https://www2.cscc.unc.edu/aric/distribution-agreements.
The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, and HHSN271201100004C. WHI-AS311 was supported by American Cancer Society award 125299-RSG-13-100-01-CCE, WHI-BAA23 by NHLBI 60442456 BAA23, and WHI-EMPC by R01-ES020836. The authors thank the WHI investigators and staff for their dedication and the study participants for making the program possible. A listing of WHI investigators can be found at https://www.whi.org/researchers/Documents%20%20Write%20a%20Paper/WHI%20Investigator%20Long%20List.pdf. Data from the WHI are available on request at https://www.whi.org/researchers/SitePages/Write%20a%20Paper.aspx.
The KORA study was initiated and financed by the Helmholtz Zentrum München-German Research Center for Environmental Health, which is funded by the German Federal Ministry of Education and Research (BMBF) and by the State of Bavaria. Furthermore, KORA research was supported within the Munich Center of Health Sciences (MC-Health), Ludwig-Maximilians-Universität, as part of LMUinnovativ. The KORA-Study Group consists of A. Peters (speaker), L. Schwettmann, R. Leidl, M. Heier, B. Linkohr, H. Grallert, C. Gieger, J. Linseisen, and their co-workers, who are responsible for the design and conduct of the KORA studies.
NAS co-authors were supported by the National Institute of Environmental Health Sciences (grants ES000002, R01ES015172, 5R01ES027747-02, P30ES009089, R01ES021733, R01ES025225). The VA Normative Aging Study is supported by the Cooperative Studies Program/Epidemiology Research and Information Center of the U.S. Department of Veterans Affairs and is a component of the Massachusetts Veterans Epidemiology Research and Information Center in Boston, MA.
The Lifelines Epigenetics cohort was funded by consortium grant number 4.1.13.007 of the Lung foundation Netherlands. The Lifelines initiative has been made possible by funds from FES (Fonds Economische Structuurversterking), SNN (Samenwerkingsverband Noord Nederland) and REP (Ruimtelijk Economisch Programma).
This project was also supported by NIEHS National Research Service Award T32-ES007018 and Building Interdisciplinary Research Careers in Women’s Health K12HD043446 (KMH), NHLBI National Research Service Award T32-HL007055 (RG), NCI grant R25-CA094880 (KJ), NIEHS grant R01-ES017794 (EAW), NHGRI grant R01HG010297 (KEN), and the Basic Science Research Program, National Research Foundation of Korea funded by the Ministry of Science, ICT & Future Planning (2013R1A1A1057961) (ML). The authors declare they have no actual or potential competing financial interests.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
- Andrews SV, Ladd-Acosta C, Feinberg AP, Hansen KD, Fallin MD, 2016. “Gap hunting” to characterize clustered probe signals in Illumina methylation array data. Epigenetics & Chromatin. 9(56).
- Baccarelli A, Bollati V, 2009. Epigenetics and Environmental Chemicals. Curr. Opin. Pediatr 21, 243–251.
- Baccarelli A, Ghosh S, 2012. Environmental exposures, epigenetics and cardiovascular disease. Curr. Opin. Clin. Nutr. Metab. Care 15(4), 323–329.
- Balti E, Echouffo-Tcheugui J, Yako Y, Kengne A, 2014. Air pollution and risk of type 2 diabetes mellitus: a systematic review and meta-analysis. Diab. Res. Clin. Pract 106, 161–172.
- Bell B, Rose C, Damon A, 1966. The Veterans Administration longitudinal study of healthy aging. Gerontologist. 6(4), 179–184.
- Bellavia A, Urch B, Speck M, et al., 2013. DNA Hypomethylation, Ambient Particulate Matter, and Increased Blood Pressure: Findings From Controlled Human Exposure Experiments. J. Am. Heart. Assoc 2(3).
- Bernstein BE, Stamatoyannopoulos JA, Costello JF, et al., 2010. The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol 28(10), 1045–1048.
- Bind MA, Baccarelli A, Zanobetti A, et al., 2012. Air pollution and markers of coagulation, inflammation, and endothelial function: associations and epigene-environment interactions in an elderly cohort. Epidemiol. 23(2), 332–340.
- Bind MA, Lepeule J, Zanobetti A, et al., 2014. Air pollution and gene-specific methylation in the Normative Aging Study: association, effect modification, and mediation analysis. Epigenetics. 9(3), 448–458.
- Chi GC, Liu Y, MacDonald JW, et al., 2016. Long-term outdoor air pollution and DNA methylation in circulating monocytes: results from the Multi-Ethnic Study of Atherosclerosis (MESA). Environ. Health 15(1), 119.
- de F. C. Lichtenfels AJ, van der Plaat DA, de Jong K, et al., 2018. Long-term Air Pollution Exposure, Genome-wide DNA Methylation and Lung Function in the LifeLines Cohort Study. Environ. Health Perspect 126(2), 027004.
- De Prins S, Koppen G, Jacobs G, et al., 2013. Influence of ambient air pollution on global DNA methylation in healthy adults: a seasonal follow-up. Environ. Int 59, 418–424.
- Demerath EW, Guan W, Grove ML, et al., 2015. Epigenome-wide association study (EWAS) of BMI, BMI change and waist circumference in African American adults identifies multiple replicated loci. Hum. Mol. Genet 24(15), 4464–4479.
- Devlin B, Roeder K, 1999. Genomic control for association studies. Biometrics. 55, 997–1004.
- Dhanoa B, Cogliati T, Satish A, Bruford E, Friedman J, 2013. Update on the Kelch-like (KLHL) gene family. Hum. Genomics 7, 13.
- Diez Roux AV, Merkin SS, Arnett D, et al., 2001. Neighborhood of Residence and Incidence of Coronary Heart Disease. N. Engl. J. Med 345(2), 99–106.
- Eze I, Hemkens L, Bucher H, et al., 2015. Association between ambient air pollution and diabetes mellitus in Europe and North America: systematic review and meta-analysis. Environ. Health Perspect 123(5), 381–389.
- Fan T, Fang S, Cavallari J, et al., 2014. Heart rate variability and DNA methylation levels are altered after short-term metal fume exposure among occupational welders: a repeated-measures panel study. BMC Public Health. 14(1).
- Franklin BA, Brook R, Pope CA III, 2015. Air pollution and cardiovascular disease. Curr Probl Cardiol. 40(5), 207–238.
- Fu A, Leaderer BP, Gent JF, Leaderer D, Zhu Y, 2012. An environmental epigenetic study of ADRB2 5’-UTR methylation and childhood asthma severity. Clin Exp Allergy. 42(11), 1575–1581.
- Gondalia R, Baldassari A, Holliday KM, et al., 2019. Methylome-wide association study provides evidence of particulate matter air pollution-associated DNA methylation. Environ. Int 132, 104723.
- Holle R, Happich M, Lowel H, Wichmann H, 2005. KORA—a research platform for population based health research. Gesundheitswesen. 67(suppl 1), S19–S25.
- Holliday KM, Avery C, Poole C, et al., 2014. Estimating Personal Exposures from Ambient Air Pollution Measures Using Meta-Analysis to Assess Measurement Error. Epidemiol. 25, 35–43.
- Hořejší V, Zhang W, Schraven B, 2004. Transmembrane Adaptor Proteins: Organizers of Immunoreceptor Signalling. Nat. Rev. Immunol 4, 603–616.
- Houseman E, Accomando W, Koestler D, et al., 2012. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinform. 13(86).
- Howson JMM, Zhao W, Barnes DR, et al., 2017. Fifteen new risk loci for coronary artery disease highlight arterial-wall-specific mechanisms. Nat. Genet 49, 1113–1119.
- Hur EM, Son M, Lee OH, et al., 2003. LIME, a Novel Transmembrane Adaptor Protein, Associates with p56lck and Mediates T Cell Activation. J. Exp. Med 198(10), 1463–1473.
- Johnson WE, Li C, 2007. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 8(1), 118–127.
- Jordahl KM, Randolph TW, Song X, et al., 2018. Genome-Wide DNA Methylation in Prediagnostic Blood and Bladder Cancer Risk in the Women’s Health Initiative. Cancer Epidemiol. Biomarkers Prev 27(6), 689–695.
- Kuhn RM, Haussler D, Kent WJ, 2013. The UCSC genome browser and associated tools. Brief Bioinform. 14(2), 144–161.
- Laity J, Lee B, Wright P, 2001. Zinc finger proteins: new insights into structural and functional diversity. Curr. Opin. Structural Biol 11(1), 39–46.
- Lau W, Andrew T, Maniatis N, 2017. High-Resolution Genetic Maps Identify Multiple Type 2 Diabetes Loci at Regulatory Hotspots in African Americans and Europeans. Am. J. Hum. Genet 100, 803–816.
- Lee MK, Xu CJ, Carnes MU, et al., 2019. Genome-wide DNA methylation and long-term ambient air pollution exposure in Korean adults. Clin. Epigenetics 11(37).
- Lepeule J, Bind MA, Baccarelli AA, et al., 2014. Epigenetic influences on associations between air pollutants and lung function in elderly men: the normative aging study. Environ. Health Perspect 122(6), 566–572.
- Liao D, Peuquet D, Duan Y, et al., 2006. GIS approaches for the estimation of residential-level ambient PM concentrations. Environ Health Perspect. 114(9), 1374–1380.
- Madrigano J, Baccarelli A, Mittleman MA, et al., 2011. Prolonged exposure to particulate pollution, genes associated with glutathione pathways, and DNA methylation in a cohort of older men. Environ. Health Perspect 119(7), 977–982.
- Mukhopadhyay D, Riezman H, 2007. Proteasome-Independent Functions of Ubiquitin in Endocytosis and Signaling. Science. 315(5809), 201–205.
- Panni T, Mehta AJ, Schwartz JD, et al., 2016. Genome-Wide Analysis of DNA Methylation and Fine Particulate Matter Air Pollution in Three Study Populations: KORA F3, KORA F4, and the Normative Aging Study. Environ. Health Perspect 124(7), 983–990.
- Parikh H, Carlsson E, Chutkow W, et al., 2007. TXNIP Regulates Peripheral Glucose Metabolism in Humans. PLOS Med. 4(5), e158.
- Park S, Wang W, 2014. Ambient Air Pollution and Type 2 Diabetes: A Systematic Review of Epidemiologic Research. Curr. Environ. Health Rep 1(3), 275–286.
- Plusquin M, Guida F, Polidoro S, et al., 2017. DNA methylation and exposure to ambient air pollution in two prospective cohorts. Environ. Int 108, 127–136.
- Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D, 2006. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet 38, 904–909.
- Rosenbloom KR, Dreszer TR, Long JC, et al., 2012. ENCODE whole-genome data in the UCSC Genome Browser: update 2012. Nucleic Acids Res. 40(Database issue), D912–917.
- Sayols-Baixeras S, Fernández-Sanlés A, Prats-Uribe A, et al., 2019. Association between long-term air pollution exposure and DNA methylation: The REGICOR study. Environ Res. 176.
- Schlaepfer I, Joshi M, 2020. CPT1A-mediated Fat Oxidation, Mechanisms, and Therapeutic Potential. Endocrinology. 161(2), 1–14.
- Sofer T, Baccarelli A, Cantone L, et al., 2013. Exposure to airborne particulate matter is associated with methylation pattern in the asthma pathway. Epigenomics. 5(2), 147–154.
- Tarantini L, Bonzini M, Apostoli P, et al., 2009. Effects of particulate matter on genomic DNA methylation content and iNOS promoter methylation. Environ. Health Perspect 117(2), 217–222.
- The ARIC Investigators, 1989. The Atherosclerosis Risk in Communities (ARIC) Study: Design and Objectives. Am. J. Epidemiol 129, 687–702.
- The Women’s Health Initiative Study Group, 1998. Design of the Women’s Health Initiative clinical trial and observational study. Control. Clin. Trials 19, 61–109.
- van Buuren S, Groothuis-Oudshoorn K, 2011. Mice: Multivariate Imputation by Chained Equations in R. J. Stat. Softw 45(3), 1–67.
- van der Harst P, Verweij N, 2018. The Identification of 64 Novel Genetic Loci Provides an Expanded View on the Genetic Architecture of Coronary Artery Disease. Circ. Res 122(3), 433–443.
- Ward-Caviness C, Nwanaji-Enwerem JC, Wolf K, Wahl S, Colicino E, Trevisi L, et al., 2016. Long-term exposure to air pollution is associated with biological aging. Oncotarget. 7(46), 74510–74525.
- Wichmann HE, Gieger C, Illig T, 2005. KORA-gen—resource for population genetics, controls and a broad spectrum of disease phenotypes. Gesundheitswesen. 67(suppl 1), S26–S30.
- Wu M, Joubert B, Kuan P, et al., 2014. A systematic assessment of normalization approaches for the Infinium 450K methylation platform. Epigenetics. 9(2), 318–329.
