This cohort study assesses the interaction of cigarette smoking and polygenic risk score in association with reduced lung function among participants in the UK Biobank cohort.
Key Points
Question
Does cigarette smoking interact with genetic risk on the percent of forced vital capacity exhaled in the first second (FEV1/FVC)?
Findings
In this UK Biobank cohort study of 319 730 UK citizens, FEV1/FVC was associated with polygenic risk score-by-smoking interactions, and smoking was detrimental across all categories of estimated genetic risk, although it was worse for those with the highest estimated genetic risks. For every reported 20 pack-years of smoking, individuals in the top decile compared with the bottom decile of genetic risk showed nearly twice the reduction in FEV1/FVC.
Meaning
These findings suggest that elucidating mechanisms for the interaction between smoking and genetic risk could yield greater insight into chronic obstructive pulmonary disease pathogenesis.
Abstract
Importance
The risk of airflow limitation and chronic obstructive pulmonary disease (COPD) is influenced by combinations of cigarette smoking and genetic susceptibility, yet it remains unclear whether gene-by-smoking interactions are associated with quantitative measures of lung function.
Objective
To assess the interaction of cigarette smoking and polygenic risk score in association with reduced lung function.
Design, Setting, and Participants
This UK Biobank cohort study included UK citizens of European ancestry aged 40 to 69 years with genetic and spirometry data passing quality control metrics. Data were analyzed from July 2020 to March 2021.
Exposures
PRS of combined forced expiratory volume in 1 second (FEV1) and percent of forced vital capacity exhaled in the first second (FEV1/FVC), self-reported pack-years of smoking, ever- vs never-smoking status, and current- vs former- or never-smoking status.
Main Outcomes and Measures
FEV1/FVC was the primary outcome. Interactions were tested with models that included the main effects of PRS, different smoking variables, and their cross-product terms. The association between pack-years of smoking and FEV1/FVC was compared for those in the highest vs lowest decile of estimated genetic risk for low lung function.
Results
We included 319 730 individuals, of whom 24 915 (8%) had moderate-to-severe COPD, and 44.4% were men. Participants had a mean (SD) age of 56.5 (8.02) years. The PRS and pack-years were significantly associated with lower FEV1/FVC (PRS: β, −0.03; 95% CI, −0.031 to −0.03; pack-years: β, −0.0064; 95% CI, −0.0064 to −0.0063), as was their interaction term (β, −0.0028; 95% CI, −0.0029 to −0.0026). A stepwise increment in estimated effect sizes for these interaction terms was observed per 10 pack-years of smoking exposure. The interaction estimates for PRS with the 11 to 20, 31 to 40, and more than 50 pack-years categories were β (interaction), −0.0038 (95% CI, −0.0046 to −0.0031); −0.013 (95% CI, −0.014 to −0.012); and −0.017 (95% CI, −0.019 to −0.016), respectively. There was evidence of significant interaction of PRS with ever- vs never-smoking status (β [interaction], −0.0064; 95% CI, −0.0068 to −0.0060) and current vs not-current smoking (β [interaction], −0.0091; 95% CI, −0.0097 to −0.0084). For any given level of pack-years of smoking exposure, FEV1/FVC was significantly lower for individuals in the tenth decile (ie, highest risk) than the first decile (ie, lowest risk) of genetic risk. For every 20 pack-years of smoking, those in the tenth decile compared with the first decile of genetic risk showed nearly a 2-fold reduction in FEV1/FVC.
Conclusions and Relevance
COPD is characterized by diminished lung function, and our analyses suggest there is substantial interaction between genome-wide PRS and smoking exposures. While smoking was associated with decreased lung function across all genetic risk categories, the associations were strongest in individuals with higher estimated genetic risk.
Introduction
Chronic obstructive pulmonary disease (COPD) is characterized by airflow obstruction, traditionally defined by a low percent of forced vital capacity exhaled in the first second (FEV1/FVC), and cigarette smoking is the greatest environmental risk factor.1,2 Only a minority of smokers develop COPD,3,4 and genetic factors are thought to account for some of this variation in susceptibility, with approximately 40% of the variability in spirometric measures of pulmonary function attributed to genetic variation.5,6,7 Therefore, it has long been thought that airflow obstruction may develop partially as the result of gene-by-smoking interactions.
Despite the important contribution of both smoking and genetic factors to lung function, compelling evidence for gene-by-smoking interactions has been limited. Genome-wide interaction studies have identified a handful of spirometric- and COPD-associated loci that appear to interact with smoking status,8,9,10,11,12,13,14 suggesting at least a portion of the variability in spirometric measures of lung function may be attributable to gene-by-smoking interactions. A major challenge of identifying gene-by-smoking interactions on lung function and risk to COPD is that individual genetic variants tend to be of small effect size and account for a low degree of phenotypic variability in lung function, diminishing the power to detect gene-by-smoking interactions.
Pooling individual genome-wide association studies (GWAS) variants into a single genetic risk score can account for a greater proportion of phenotypic variability,15,16,17,18,19,20 and should improve power to detect interactions. Genetic risk scores have been used to investigate gene-by-environment interactions in psychiatric21 and cardiovascular diseases.22 Aschard et al23 were unable to detect individual single nucleotide variation (SNV, formerly single-nucleotide polymorphism [SNP])-by-smoking interactions for FEV1/FVC for 26 variants identified as significant in a genome-wide joint meta-analysis of SNV-by-smoking associations of pulmonary function14; however, when the authors summed these variants to create a genetic risk score, they found evidence of interaction between the genetic risk score and ever-smoking status.23 By contrast, Shrine et al19 performed the largest GWAS of lung function to date, developed a genetic risk score including estimated effects of 279 variants showing significant effects on lung function, and reported no evidence of interaction between this genetic risk score and ever-smoking status, although the authors did observe an interaction of the genetic risk score with ever-smoking status on moderate-to-severe COPD. We used a polygenic risk score (PRS) based on GWASs of FEV1 and FEV1/FVC19 that was previously constructed for COPD and explained more of the variability in lung function than the 279-variant risk score used by Shrine et al19 (approximately 30% vs less than 10%).20
We hypothesized that multiple measures of smoking exposure would significantly interact with this genome-wide PRS on FEV1/FVC (ie, because it is associated with lower lung function) in the UK Biobank population-based cohort. We chose FEV1/FVC as the primary outcome because it is a measure used to define COPD according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria, and the ratio, as a continuous measure, is inversely associated with COPD-related events.2
Methods
This cohort study followed Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines. All participants provided written informed consent, and study protocols were approved by North West Multi-centre Research Ethics Committee and ethical procedures were controlled by the UK Biobank Ethics Advisory Committee.
Study Population
We included participants from the UK Biobank, a cohort recruiting more than 500 000 individuals from the UK aged 40 to 69 years from 2006 to 2010.24 Participants were excluded if spirometry or genetic data did not meet quality control standards; further details on the impact of these inclusion and exclusion criteria are shown in eFigure 1 in the Supplement. Quality control of spirometric data has been previously described.18,19,24 Briefly, to determine lung function, FEV1 and FVC were derived from the spirometry volume-time series data at the time of study enrollment, as previously reported.19
Genotyping was performed as previously described,19 using Axiom UK Biobank Lung Exome Variant Evaluation array and Axiom Biobank array (Affymetrix) and imputed to the Haplotype Reference Consortium version 1.1 panel (accepting imputation accuracy r² > 0.5). We dropped variants with minor allele frequency < 0.01 and those showing deviation from Hardy-Weinberg equilibrium (P < 1 × 10⁻⁶). We used only participants of European ancestry based on a combination of self-reported ethnicity and k-means clustering of principal components of genetic ancestry, as previously reported.19
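The variant-level filters described above can be sketched as a simple predicate. This is a minimal illustration, not the pipeline used in the study: the `Variant` record, its field names, and the example values are hypothetical, and only the three thresholds (imputation r² > 0.5, minor allele frequency of at least 0.01, Hardy-Weinberg P of at least 1 × 10⁻⁶) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-variant QC summary; field names are illustrative only."""
    variant_id: str
    imputation_r2: float  # imputation accuracy (r^2)
    maf: float            # minor allele frequency
    hwe_p: float          # P value from a Hardy-Weinberg equilibrium test


def passes_qc(v, min_r2=0.5, min_maf=0.01, min_hwe_p=1e-6):
    """Keep a variant only if it clears all three thresholds stated in the text."""
    return v.imputation_r2 > min_r2 and v.maf >= min_maf and v.hwe_p >= min_hwe_p


# Made-up variants, one per failure mode, to show the filter's behavior.
variants = [
    Variant("var_a", 0.98, 0.25, 0.40),   # passes all filters
    Variant("var_b", 0.45, 0.10, 0.50),   # fails imputation accuracy
    Variant("var_c", 0.90, 0.005, 0.30),  # fails minor allele frequency
    Variant("var_d", 0.95, 0.20, 1e-9),   # fails Hardy-Weinberg equilibrium
]
kept = [v.variant_id for v in variants if passes_qc(v)]
print(kept)
```

In practice such filters are applied by dedicated genetics tooling on the full imputed data set; the sketch only makes the inclusion logic explicit.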
Overview of Study Design
The primary outcome was the FEV1/FVC ratio, as clinical COPD is characterized by airflow obstruction (FEV1/FVC < 0.7), and severity graded based on decrements in FEV1% predicted.1,2 We first assessed whether 3 measures of smoking exposure interacted with PRS on quantitative measures of FEV1/FVC. We then considered the joint associations of smoking exposures and being in the highest (tenth) decile vs lowest (first) decile of the PRS (ie, highest vs lowest categories of estimated genetic risk). We examined norms of reaction for the association between pack-years of smoking and FEV1/FVC for those in the highest (tenth) decile compared with the lowest (first) decile and middle (fifth) decile of estimated genetic risk.
Smoking Exposures
We examined 3 measures of cigarette smoking exposure: pack-years of smoking, ever- vs never-smoking status, and current smoker vs former- and never-smoking status. All smoking information was obtained by self-report. Pack-years of smoking was examined as continuous and categorical (ie, pack-year categories: ≤10, 10.1-20, 20.1-30, 30.1-40, 40.1-50, and >50, where the reference group is ≤10 pack-years) variables. The category ever-smokers included individuals reporting current smoking, smoking most days, smoking occasionally, or former smoking. Never smokers included those who smoked fewer than 100 cigarettes in their lifetime. Current smokers included those who reported current smoking, and former smokers included noncurrent smokers who smoked 100 or more cigarettes in their lifetime.
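As a rough illustration of how these exposure variables might be coded, the sketch below bins pack-years into the stated categories and derives the two binary indicators. The function names and the three-level `status` input are assumptions made for the example, and values falling between listed boundaries (eg, 10.05) are assigned to the next category up, which the text does not specify.

```python
def pack_year_category(pack_years):
    """Return the pack-year category label; <=10 is the reference group."""
    if pack_years < 0:
        raise ValueError("pack-years cannot be negative")
    upper_bounds = [
        (10, "<=10"),
        (20, "10.1-20"),
        (30, "20.1-30"),
        (40, "30.1-40"),
        (50, "40.1-50"),
    ]
    for upper, label in upper_bounds:
        if pack_years <= upper:
            return label
    return ">50"


def smoking_indicators(status):
    """Map a hypothetical self-reported status to the two binary exposures.

    status is assumed to be one of "never", "former", or "current".
    """
    if status not in ("never", "former", "current"):
        raise ValueError("unknown smoking status: " + status)
    return {
        "ever_smoker": status != "never",        # ever vs never
        "current_smoker": status == "current",   # current vs former or never
    }


print(pack_year_category(0), pack_year_category(25), pack_year_category(62))
print(smoking_indicators("former"))
```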
Polygenic Risk Score for Lung Function
A polygenic risk score (PRS) for lung function was calculated as previously described (eMethods in the Supplement).20 Briefly, this PRS was based on GWAS results for FEV1 and FEV1/FVC in UK Biobank and SpiroMeta,19 and was developed using a penalized regression framework accounting for linkage disequilibrium.25 PRSs were calculated for FEV1 and FEV1/FVC and then summed into a composite PRS, which was scaled and centered. The PRS was oriented such that a higher PRS was associated with lower FEV1 and FEV1/FVC.
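The final step, summing the two trait-specific scores and then scaling and centering, can be sketched as follows. The input values are made up, the inputs are assumed to be already oriented so that higher values correspond to lower lung function, and the use of the sample SD is an assumption; the penalized regression step that produces the per-trait scores is not shown.

```python
from statistics import mean, stdev


def standardize(values):
    """Center to mean 0 and scale to sample SD 1."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def composite_prs(prs_fev1, prs_ratio):
    """Sum per-person FEV1 and FEV1/FVC scores, then scale and center."""
    summed = [a + b for a, b in zip(prs_fev1, prs_ratio)]
    return standardize(summed)


# Made-up scores for 5 hypothetical participants.
prs_fev1 = [0.12, -0.40, 0.85, -0.10, 0.33]
prs_ratio = [0.05, -0.22, 0.61, 0.18, -0.47]
composite = composite_prs(prs_fev1, prs_ratio)
print([round(v, 3) for v in composite])
```

Standardizing the composite means that regression coefficients for the PRS can be read as the change in FEV1/FVC per 1-SD increase in the score.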
Statistical Analyses
All analyses were done in R version 4.0.3 (R Project for Statistical Computing). The normality of continuous variables was assessed by visual inspection of histograms. Results are reported as mean (SD) or median (IQR), as appropriate. Differences in continuous variables were assessed with t tests or Wilcoxon tests, and categorical variables were compared by analysis of variance or Kruskal-Wallis tests, as appropriate. We used α = .05 as the a priori level of statistical significance. All hypothesis tests were 2-sided, and data were analyzed from July 2020 to March 2021.
Interaction Analyses
We performed multivariable linear regressions of FEV1/FVC on the main associations of the combined PRS, smoking exposure, and cross-product interaction terms. We included covariates age, age × age, sex, height, genotyping array, and the first 10 principal components of genetic ancestry in the linear regression model. Age was scaled and centered before squaring. We also performed stratified analyses among those in the lowest and highest deciles of the PRS, separately for never- and ever-smokers.
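A minimal sketch of such a model is shown below: an ordinary least-squares fit with a PRS × pack-years cross-product term, solved through the normal equations in plain Python. It is illustrative only. The coefficients and data are invented and noise free, so the fit simply recovers them, and only age and age × age are included among the covariates for brevity; sex, height, genotyping array, and the principal components are omitted.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(prs, pack_years, age_z):
    """Intercept, main effects, cross-product term, then age and age x age."""
    return [1.0, prs, pack_years, prs * pack_years, age_z, age_z ** 2]


# Hypothetical coefficients, chosen only to generate toy data.
names = ["intercept", "PRS", "pack-years", "PRS x pack-years", "age", "age x age"]
true_coefs = [0.78, -0.03, -0.0006, -0.0003, -0.01, -0.002]

rows, y = [], []
for prs in (-1.5, -0.5, 0.5, 1.5):
    for pack_years in (0.0, 10.0, 20.0, 40.0):
        for age_z in (-1.0, 0.0, 1.0):
            row = design_row(prs, pack_years, age_z)
            rows.append(row)
            y.append(sum(c * v for c, v in zip(true_coefs, row)))

estimates = fit_ols(rows, y)
for name, est in zip(names, estimates):
    print(f"{name}: {est:.4f}")
```

A negative coefficient on the cross-product term means that the decline in FEV1/FVC per pack-year steepens as the PRS increases, which is the pattern the interaction tests are designed to detect.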
Gene-by-environment interaction has been defined as a deviation from either an additive or multiplicative model. Therefore, we additionally examined the joint effects of smoking and PRS to assess a departure of the observed joint effect from the expected effect under an additive model. We focused on comparing the first (lowest risk) decile and tenth (highest risk) decile of PRS as previously done.20 We created a categorical variable with mutually exclusive strata formed by the cross-classification of smoking and PRS (tenth decile vs first decile). The reference category was the group with the lowest relative smoking exposure (eg, pack-years ≤ 10) in the first PRS decile. We then constructed multivariable linear regression models to evaluate the effects of this categorical variable on FEV1/FVC, adjusting for age, age × age, sex, height, genotyping array, and the first 10 principal components of genetic ancestry. The expected association for those in the highest decile with the highest smoking exposure was estimated under an additive model and calculated by summing the estimated effect sizes for 2 groups: the lowest decile with the highest smoking exposure and the highest decile with the lowest smoking exposure.
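The additive expectation amounts to simple arithmetic on the stratum coefficients, each estimated relative to the shared reference group. The sketch below uses hypothetical coefficient values and hypothetical function names; it does not reproduce any estimate from the study.

```python
def expected_additive(beta_smoking_only, beta_genetic_only):
    """Expected joint effect if the two exposures combine additively."""
    return beta_smoking_only + beta_genetic_only


def departure_from_additivity(beta_joint, beta_smoking_only, beta_genetic_only):
    """Observed joint effect minus the additive expectation."""
    return beta_joint - expected_additive(beta_smoking_only, beta_genetic_only)


# Hypothetical stratum coefficients, all relative to the reference group
# (first PRS decile with the lowest smoking exposure).
beta_smoking_only = -0.020  # first decile, highest smoking exposure
beta_genetic_only = -0.015  # tenth decile, lowest smoking exposure
beta_joint = -0.045         # tenth decile, highest smoking exposure

expected = expected_additive(beta_smoking_only, beta_genetic_only)
departure = departure_from_additivity(beta_joint, beta_smoking_only, beta_genetic_only)
print(f"expected under additivity: {expected:.3f}")
print(f"departure from additivity: {departure:.3f}")
```

A departure below zero indicates a joint reduction in FEV1/FVC larger than the sum of the separate reductions, that is, a greater-than-additive effect.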
Norms of Reaction
A norm of reaction describes the association between a phenotype and environmental exposure for a given genotype.26 We assessed norms of reaction for pack-years of smoking and FEV1/FVC for those in the lowest (first) decile, middle (fifth) decile, and highest (tenth) decile of estimated genetic risk. We plotted pack-years of smoking vs FEV1/FVC, stratifying by lowest, middle, and highest deciles of genetic risk. We then compared the slopes of the lines for the lowest and highest deciles of genetic risk with an analysis of covariance (ANCOVA) using the rstatix R package (R Project for Statistical Computing).27 For clinical interpretability, we fit multivariable linear regression models to assess the association of 20 pack-years of smoking with FEV1/FVC for those in the highest, middle, and lowest deciles of estimated genetic risk, adjusting for the covariates described above.
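To make the idea concrete, the sketch below fits a separate unadjusted line of FEV1/FVC on pack-years within each PRS decile and rescales each slope to a change per 20 pack-years. The data are invented and exactly linear, and the sketch omits both the covariate adjustment and the formal ANCOVA comparison of slopes described above.

```python
def ols_slope(x, y):
    """Slope of the simple least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Made-up, noise-free data: the highest-risk decile starts lower and
# declines faster with smoking than the lowest-risk decile.
pack_years = [0.0, 10.0, 20.0, 30.0, 40.0]
ratio_by_decile = {
    "lowest (first)": [0.78 - 0.0004 * p for p in pack_years],
    "highest (tenth)": [0.74 - 0.0008 * p for p in pack_years],
}

change_per_20 = {
    decile: 20 * ols_slope(pack_years, ratios)
    for decile, ratios in ratio_by_decile.items()
}
for decile, change in change_per_20.items():
    print(f"{decile} decile: {change:.4f} change in FEV1/FVC per 20 pack-years")
```

Parallel lines would indicate no interaction on the additive scale; diverging lines, as in this toy example, correspond to a nonzero PRS × pack-years term.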
As sensitivity analyses, we repeated these analyses in ever-smokers and in a data set excluding all related individuals; to select unrelated individuals, we removed at least 1 individual from each related pair with a kinship coefficient greater than 0.0625, favoring the inclusion of COPD cases. We also transformed reported pack-years of smoking (ie, log, scaling and centering, and rank normalization) and measures of PRS (adding a quadratic term, ie, PRS²) to ensure that the effects of interaction terms were not because of misspecification of the main effects of smoking or PRS. To ensure the robustness of our results to the normality of the outcome, we repeated our analyses after log-transforming FEV1/FVC.
Results
Characteristics of Study Participants
We included 319 730 participants; 24 915 participants met criteria for moderate-to-severe COPD (GOLD spirometry grades 2 to 4)1; 38 713 had preserved ratio with impaired spirometry (PRISm);28 and 256 102 met criteria for GOLD spirometry grades 0 and 1. Participants had a mean (SD) age of 56.5 (8.02) years, and 141 864 (44.4%) were male. Characteristics of study participants are shown in Table 1.
Table 1. Characteristics of Study Participants.
Characteristics | Overall, No. (%) |
---|---|
No. | 319 730 |
Age, mean (SD) | 56.45 (8.02) |
Sex | |
Female | 177 866 (55.6) |
Male | 141 864 (44.4) |
Pack-years of smoking, median (IQR) | 0 (0-11.00) |
Smoking status | |
Former/never | 287 445 (89.9) |
Current | 32 242 (10.1) |
Ever^a | 146 679 (45.9) |
FEV1% predicted, mean (SD) | 92.17 (16.01) |
FEV1/FVC, mean (SD) | 0.76 (0.06) |
Decile of polygenic risk score | |
Lowest | 31 973 (10.0) |
Highest | 31 973 (10.0) |
Abbreviations: FEV1, forced expiratory volume in 1 second; FVC, forced vital capacity.
^a 43 individuals were ever-smokers and did not provide any details regarding current or former smoking.
Interaction of a Polygenic Risk Score With Smoking
The PRS was weakly correlated with pack-years of smoking (r = 0.041; P < .001) (eFigure 2 in the Supplement). The association between PRS and FEV1/FVC stratified by pack-years of smoking categories is illustrated in Figure 1. The PRS was associated with lower FEV1/FVC across all pack-years of smoking categories, and the magnitude of the association of PRS with reduced FEV1/FVC increased with higher pack-years of smoking. In multivariable analyses, PRS (β = −0.0304; 95% CI, −0.0307 to −0.0302) and pack-years of smoking categories were associated with FEV1/FVC (P < .001), and estimated effect sizes increased with each incremental category of smoking exposure (Table 2). The same incremental trend for interaction terms between PRS and pack-years of smoking categories was observed (all tests for interaction yielded P < .001). Considering pack-years of smoking as a continuous variable (eTable 1 in the Supplement), the PRS (β = −0.03; 95% CI, −0.031 to −0.03) and pack-years (β = −0.0064; 95% CI, −0.0064 to −0.0063) were each significantly associated with lower FEV1/FVC, and the cross-product interaction term was also associated with FEV1/FVC (β [interaction] = −0.0028; 95% CI, −0.0029 to −0.0026; P < .001). We also performed transformations of pack-years of smoking, PRS, or FEV1/FVC, and the PRS × pack-years interaction term was significant in each analysis (all P < .001) (eTable 2 to eTable 4 in the Supplement).
Table 2. Regression of FEV1/FVC on Main Associations of PRS and Pack-Years of Smoking and Their Cross-Product Term, Adjusting for Covariates^a.
Variable | β (95% CI) | P value |
---|---|---|
PRS | −0.03 (−0.031 to −0.03) | <.001 |
Pack-years | ||
11-20 | −0.011 (−0.012 to −0.01) | <.001 |
21-30 | −0.014 (−0.015 to −0.014) | <.001 |
31-40 | −0.026 (−0.027 to −0.025) | <.001 |
41-50 | −0.034 (−0.035 to −0.033) | <.001 |
>50 | −0.04 (−0.041 to −0.038) | <.001 |
PRS × pack-years category | ||
PRS × 11-20 | −0.0038 (−0.0046 to −0.0031) | <.001 |
PRS × 21-30 | −0.0068 (−0.0077 to −0.006) | <.001 |
PRS × 31-40 | −0.013 (−0.014 to −0.012) | <.001 |
PRS × 41-50 | −0.015 (−0.017 to −0.014) | <.001 |
PRS × >50 | −0.017 (−0.019 to −0.016) | <.001 |
Abbreviations: FEV1, forced expiratory volume in 1 second; FVC, forced vital capacity; PRS, polygenic risk score.
^a Covariates include age, age × age, sex, height, genotyping array, and principal components of genetic ancestry. Pack-years of smoking is included as a categorical variable with 10 or fewer pack-years as the reference group.
The association between the PRS and FEV1/FVC stratified by ever-smoking vs never-smoking and current-smoking vs former- or never-smoking statuses is shown in eFigure 3A and eFigure 3B in the Supplement, respectively. Ever-smoking and the PRS × ever-smoking status interaction term were significantly associated with FEV1/FVC (both P < .001) (eTable 5 in the Supplement). Similarly, current smoking status and the PRS × current-smoking status interaction term were significantly associated with FEV1/FVC (both P < .001) (eTable 6 in the Supplement). In stratified analyses, we observed similar results between PRS and smoking exposures (eTable 7 and eFigure 4 in the Supplement). Additionally, ever-smoking status and the PRS × ever-smoking status interaction term were significantly associated with FEV1/FVC in the lowest (β [interaction] = −0.0033; 95% CI, −0.0058 to −0.00085; P = .0082) and highest (β [interaction] = −0.0095; 95% CI, −0.013 to −0.0056; P < .001) deciles of estimated genetic risk (eTable 7 in the Supplement).
Being in the lowest decile of estimated genetic risk and having more than 50 pack-years of smoking exposure (β = −0.022; 95% CI, −0.026 to −0.018; P < .001) had a similar estimated effect size as being in the highest decile of genetic risk and having 11 to 20 pack-years of smoking exposure (β = −0.024; 95% CI, −0.026 to −0.023; P < .001). The joint effects of pack-years of smoking categories and PRS are shown in Figure 2. We observed a greater effect size of being in the highest decile of genetic risk and having more than 50 pack-years of smoking exposure (β = −0.051; 95% CI, −0.054 to −0.047) than would be expected under an additive model, consistent with an interaction between the PRS and pack-years of smoking. We observed a similar association for those in the highest genetic risk decile who were current smokers (eTable 8 in the Supplement), but a nonsignificant difference for ever-smokers in the highest genetic risk decile (eTable 9 in the Supplement).
Norms of Reaction for Highest vs Lowest Estimated Genetic Risk Deciles
In Figure 3, we show different norms of reaction for the associations of pack-years of smoking on FEV1/FVC among those in the highest (tenth), middle (fifth), and lowest (first) deciles of estimated genetic risk. For any given level of pack-years of smoking, those in the highest PRS decile had lower FEV1/FVC compared with those in the lowest decile of PRS. Analysis of covariance confirmed that the slopes of the lines for the highest and lowest decile of PRS are significantly different (P < .001) (Figure 3). We observed similar results in ever-smokers (eFigure 5 in the Supplement). For every 20 pack-years of smoking, those in the first (ie, lowest risk) decile had a change of β = −0.0084 (95% CI, −0.0091 to −0.0076) in FEV1/FVC, while those in the tenth (ie, highest risk) decile of estimated genetic risk had a change of β = −0.017 (95% CI, −0.019 to −0.016), representing an approximately 2-fold reduction in FEV1/FVC for every 20 pack-years of smoking for those in the highest compared with the lowest decile of estimated genetic risk.
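The "approximately 2-fold" figure follows directly from the two reported coefficients, as the short check below shows. The interval comparison is only a rough descriptive check and is not the formal test of slopes reported above; the variable names are illustrative.

```python
# Reported change in FEV1/FVC per 20 pack-years, with 95% CIs (from the text).
lowest_decile = {"beta": -0.0084, "ci": (-0.0091, -0.0076)}
highest_decile = {"beta": -0.017, "ci": (-0.019, -0.016)}

# Ratio of the highest-risk to the lowest-risk decile estimate.
fold = highest_decile["beta"] / lowest_decile["beta"]
print(round(fold, 2))  # 2.02

# Rough check: the upper bound for the highest decile lies below the
# lower bound for the lowest decile, so the intervals do not overlap.
intervals_separate = highest_decile["ci"][1] < lowest_decile["ci"][0]
print(intervals_separate)  # True
```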
Discussion
In this study of more than 300 000 UK Biobank participants, we found that 3 measures of smoking exposure interacted with PRS on the quantitative measure of lung function (FEV1/FVC). As expected, smoking was detrimental to lung function across all categories of estimated genetic risk. However, for any given level of pack-years of smoking exposure, those at the highest genetic risk showed lower FEV1/FVC than those with the lowest estimated genetic risk. Furthermore, the outcomes associated with heavy smoking and being in the highest decile of estimated genetic risk were greater than would be expected based on the additive effects of both risk factors. These results support the idea that diminished pulmonary function (ie, a measure of airflow obstruction) is, at least partially, due to gene-by-smoking interactions, and those in higher genetic risk categories are more susceptible to the deleterious effects of smoking.
Compared with previous studies, our study included more participants, leveraged a more effective measure of the genetic predisposition for low lung function (ie, PRS), examined 3 different measures of smoking exposure (ie, pack-years, ever-smoking, current smoking), and examined norms of reactions for those in the highest decile compared with the lowest decile of estimated genetic risk. Our findings are consistent with Aschard et al23 who reported an interaction between ever-smoking status and a genetic risk score for FEV1/FVC based on 26 different variants. In a family-based study of the rs28929474 variant (Z allele) in SERPINA1, which leads to alpha-1 antitrypsin deficiency (AATD) and greatly increased risk of emphysema, there was a significant genotype-by-smoking interaction on FEV1.29 A strong smoking interaction of rs28929474 heterozygote status with ever-smoking on lung function and COPD has been reported in UK Biobank.30 GWAS have confirmed the interaction of smoking with rs28929474 on risk of COPD,11 and identified gene-by-smoking interactions at several other loci.8,9,10,11,12,13,14 Recently, a study of incident COPD found evidence for a gene-by-smoking interaction, further supporting our findings and suggesting that the relationship of the PRS to other related phenotypes or outcomes may also interact with smoking or other measures.31 In addition, our study quantifies this interaction with respect to lung function rather than COPD status, provides finer resolution regarding level of smoking exposure and genetic interactions, demonstrates robust interactions despite a range of model specifications, and provides a clinical framework for considering estimated genetic risk and susceptibility to the damaging effects of cigarette smoke (ie, the association between 20 pack-years of smoking and FEV1/FVC reduction).
By contrast, Shrine et al19 constructed a genetic risk score from 279 variants associated with lung function but did not observe any evidence of interaction with ever-smoking status on FEV1/FVC; further, the authors reported an interaction with moderate-to-severe COPD status in the direction opposite to that expected. A prior GWAS selected approximately 50 000 individuals with low, average, and high FEV1 and reported no gene-smoking interactions at genome-wide significance.32 The reasons for disparate findings between these studies and the current study are unclear but may represent the effects of different loci with varying degrees of the interaction effect.4,19,20,33 While the PRS used in the current study was derived from the Shrine et al19 GWAS results, it was not restricted to variants reaching genome-wide significance and consequently included many more variants (approximately 2.5 million); this more refined level of estimated genetic risk may have provided the power to detect gene-by-smoking interactions. The particular PRS used in these analyses likely influences the findings of studies evaluating gene-by-smoking interactions.
Smoking was associated with poor outcomes even among those with low estimated genetic risk, and the effects were greater for those with high estimated genetic risk. For any given level of pack-years of smoking, those in the highest decile had lower FEV1/FVC compared with those in the lowest decile of estimated genetic risk. These findings are in contrast to observations in cardiovascular disease, where the association between smoking and coronary heart disease was greater for those in the lowest compared with the highest tertile of a PRS.22 This difference may reflect that many individuals can develop coronary disease in the absence of cigarette smoking and that smoking is a greater risk factor for those with low polygenic risk for coronary disease. Meanwhile, airflow obstruction primarily occurs in the setting of cigarette smoking exposure. Current smokers with high estimated genetic risk demonstrated a greater reduction in lung function than expected, but this association was not observed in individuals who had ever smoked, suggesting gene-by-smoking interactions may be dose-dependent and that current smoking has a direct effect on lung function. Furthermore, those with low estimated genetic risk and high smoking exposure had similar risk for low FEV1/FVC as those with high genetic risk and low smoking exposure. Taken together, these results emphasize that abstaining from smoking is crucial to preventing obstructive lung disease regardless of an individual’s estimated genetic risk and that those in the highest risk groups might benefit from intensive smoking cessation measures with respect to the phenotypes examined in this study.
The PRS used in our study was based on a prior GWAS of lung function.19 In this GWAS, along with a GWAS of COPD status, significant variants are involved in the pathways related to lung growth, as well as elastic fiber and extracellular matrix, ciliogenesis, and transforming growth factor-β.18,19 We have previously shown PRS is associated with lung structure, such as emphysema and airway measures, as well as reduced lung function growth patterns that can lead to spirometric COPD in early adulthood.20,33 Thus, the interactions with this PRS could reflect interaction with a number of different biologic processes influenced by these genetic variants. Our results suggest that the PRS includes variants that represent biological pathways by which smoking exerts deleterious effects. Some of these variants may act to confer resilience34,35 or susceptibility to the effects of cigarette smoke. Further research to elucidate the potential contributions of key variants used in the PRS and their biological mechanisms underlying the interaction between genetics and smoking on lung function is needed, which could be facilitated by functional studies, examination of other related phenotypes, and multiomics follow-up studies. For example, the effect of occupational exposures on FEV1 was modified by rs9931086 in SLC38A8, and network analyses suggested inflammatory processes involving CTLA-4, HDAC, and PPAR-α may provide mechanistic links for the observed interaction36; however, this was a small study that needs replication.
Strengths of this study include use of a large volunteer cohort, using the most powerful measure for genetic risk for low lung function available to date (ie, a genome-wide PRS), and comparing individuals at extremes of estimated genetic risk. Our study finds that cigarette smoking has a detrimental association with lung function across all levels of genetic risk but has a particularly deleterious effect on those at highest genetic risk of reduced lung function. Polygenic risk scores can contribute to the assessment of COPD risk at all levels of smoking exposure. Knowledge of a person’s genetic risk could allow for earlier diagnosis at lower levels of smoking exposure. Our results also provide a framework for identifying those most susceptible to the harmful effects of smoking who could be targeted for individualized and public health smoking cessation or prevention programs. The effectiveness of targeted genetically informed smoking cessation interventions is unclear, although there is evidence that knowledge of genetic risk for AATD can increase smoking cessation.37 Finally, further investigation into the biological mechanisms by which high genetic risk groups exhibit greater susceptibility to cigarette smoking exposure may identify targets for personalized therapeutics. Clinical use of the PRS will depend on dissecting biological mechanisms of susceptibility to the harmful effects of smoking.
Limitations
Limitations inherent to the study design include that the UK Biobank is a single cohort observed in cross-section. Examining the effects of gene-by-smoking interactions on incident COPD should be pursued. We were not able to model the time-varying effects of smoking exposure. We used self-reported smoking measures, which are prone to recall bias. We considered only smoking exposure to study gene-by-environment interactions on lung function. Besides cigarette smoking, other environmental risk factors, such as occupational exposure and air pollution, should be considered in future interaction studies. We could not examine the effects of gene-by-smoking interactions on lung function in early life. A substantial number of COPD cases occur among individuals whose lung function fails to reach optimal levels in early adulthood.38 Future studies can be done to investigate whether similar gene-by-smoking interactions occur among those who develop airflow obstruction at an early age. The PRS was partially developed using samples from UK Biobank, leading to overfitting of the PRS with respect to spirometric measures; while this issue should not affect interaction assessments, these results should ideally be replicated in future studies. However, the strong effect sizes and robustness to stratified and transformed analyses do lend confidence to our results. We included European-ancestry participants only because the PRS was derived solely from Europeans. Identification of causal variants and genetic prediction in single ancestry populations demonstrate limited portability to multiancestry populations.39,40 Thus, the generalizability of the findings of our study to other populations is not certain. The 279 lung function variants from Shrine et al19 were curated to ensure variants for smoking behavior were excluded, but the PRS used in the current study included approximately 2.5 million variants and was not similarly curated.
Including variants that are causal for smoking behavior could bias the interaction term.41 We observed a very weak correlation between this PRS and smoking exposure in UK Biobank. This correlation might be driven by our large sample size; alternatively, because we did not exclude smoking-related regions, some smoking-related genetic variants could be included in the PRS. Previously, no correlation with smoking was observed in case-control cohorts,20 suggesting that the PRS used in the current study largely reflects the genetics of lung function.
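For readers less familiar with the terminology, the following is a minimal sketch of what an interaction term and a PRS-smoking correlation check look like in practice. It is not the analysis code used in this study: the data are simulated, the effect sizes are hypothetical values chosen only for illustration, covariates are omitted, and the function names `simulate_cohort` and `fit_interaction_model` are invented for this example. The model regresses a simulated FEV1/FVC on a standardized PRS, pack-years of smoking, and their cross-product.

```python
import numpy as np


def simulate_cohort(n=50_000, seed=0):
    """Simulate a standardized PRS, pack-years, and an FEV1/FVC-like outcome.

    All effect sizes below are hypothetical and chosen only for illustration.
    PRS and pack-years are drawn independently, so no gene-environment
    dependence is built into the simulated data.
    """
    rng = np.random.default_rng(seed)
    prs = rng.standard_normal(n)                       # higher = higher genetic risk
    pack_years = rng.gamma(shape=1.5, scale=10.0, size=n)
    noise = rng.normal(0.0, 0.05, size=n)
    ratio = (0.77 - 0.010 * prs - 0.0010 * pack_years
             - 0.0003 * prs * pack_years + noise)
    return prs, pack_years, ratio


def fit_interaction_model(prs, pack_years, outcome):
    """Least-squares fit of outcome ~ PRS + pack-years + PRS x pack-years.

    Returns coefficients in the order:
    intercept, PRS, pack-years, interaction (cross-product term).
    """
    prs = np.asarray(prs, dtype=float)
    pack_years = np.asarray(pack_years, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    design = np.column_stack(
        [np.ones_like(prs), prs, pack_years, prs * pack_years]
    )
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefs


if __name__ == "__main__":
    prs, pack_years, ratio = simulate_cohort()
    intercept, b_prs, b_smoke, b_inter = fit_interaction_model(
        prs, pack_years, ratio
    )
    print(f"interaction coefficient: {b_inter:.5f}")
    # Dependence between PRS and exposure is the condition under which
    # the interaction estimate could be biased, so it is worth checking.
    print(f"PRS-smoking correlation: {np.corrcoef(prs, pack_years)[0, 1]:.4f}")
```

Under this sketch, the estimated change in the outcome per 20 pack-years at a given PRS value is `20 * (b_smoke + b_inter * prs)`, which is how a difference in smoking-related decline between high and low genetic risk groups would appear in such a model.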
Conclusions
In conclusion, diminished FEV1/FVC and airflow obstruction, which are characteristic of COPD, may be partially attributable to gene-by-smoking interactions. As expected, smoking was harmful across all genetic risk groups but worse for those in the highest decile of estimated genetic risk. Large-scale replication and further investigations into mechanisms of interaction are needed.
References
- 1. Vogelmeier CF, Criner GJ, Martinez FJ, et al. Global strategy for the diagnosis, management, and prevention of chronic obstructive lung disease 2017 report: GOLD executive summary. Am J Respir Crit Care Med. 2017;195(5):557-582. doi: 10.1164/rccm.201701-0218PP
- 2. Bhatt SP, Balte PP, Schwartz JE, et al. Discriminative accuracy of FEV1:FVC thresholds for COPD-related hospitalization and mortality. JAMA. 2019;321(24):2438-2447. doi: 10.1001/jama.2019.7233
- 3. Rennard SI, Vestbo J. COPD: the dangerous underestimate of 15%. Lancet. 2006;367(9518):1216-1219. doi: 10.1016/S0140-6736(06)68516-4
- 4. Fletcher C, Peto R, Tinker C, Speizer FE. The Natural History of Chronic Bronchitis and Emphysema: An Eight-Year Study of Early Chronic Obstructive Lung Disease in Working Men in London. Oxford University Press; 1976.
- 5. Wilk JB, Chen TH, Gottlieb DJ, et al. A genome-wide association study of pulmonary function measures in the Framingham Heart Study. PLoS Genet. 2009;5(3):e1000429. doi: 10.1371/journal.pgen.1000429
- 6. Palmer LJ, Knuiman MW, Divitini ML, et al. Familial aggregation and heritability of adult lung function: results from the Busselton Health Study. Eur Respir J. 2001;17(4):696-702. Accessed November 19, 2017. https://www.ncbi.nlm.nih.gov/pubmed/11401066. doi: 10.1183/09031936.01.17406960
- 7. Zhou JJ, Cho MH, Castaldi PJ, Hersh CP, Silverman EK, Laird NM. Heritability of chronic obstructive pulmonary disease and related phenotypes in smokers. Am J Respir Crit Care Med. 2013;188(8):941-947. doi: 10.1164/rccm.201302-0263OC
- 8. de Jong K, Boezen HM, ten Hacken NHT, Postma DS, Vonk JM; LifeLines cohort study. GST-omega genes interact with environmental tobacco smoke on adult level of lung function. Respir Res. 2013;14(83). doi: 10.1186/1465-9921-14-83
- 9. de Jong K, Vonk JM, Imboden M, et al. Genes and pathways underlying susceptibility to impaired lung function in the context of environmental tobacco smoke exposure. Respir Res. 2017;18(142). doi: 10.1186/s12931-017-0625-7
- 10. Kim S, Kim H, Cho N, et al. Identification of FAM13A gene associated with the ratio of FEV1 to FVC in Korean population by genome-wide association studies including gene-environment interactions. J Hum Genet. 2015;60(3):139-145. doi: 10.1038/jhg.2014.118
- 11. Kim W, Prokopenko D, Sakornsakolpat P, et al. Genome-wide gene-by-smoking interaction study of chronic obstructive pulmonary disease. Am J Epidemiol. 2021;190(5):875-885. doi: 10.1093/aje/kwaa227
- 12. Park B, Koo S-MM, An J, et al. Genome-wide assessment of gene-by-smoking interactions in COPD. Sci Rep. 2018;8(1):9319. doi: 10.1038/s41598-018-27463-5
- 13. Park B, An J, Kim W, et al. Effect of 6p21 region on lung function is modified by smoking: a genome-wide interaction study. Sci Rep. 2020;10(13075). doi: 10.1038/s41598-020-70092-0
- 14. Hancock DB, Soler Artigas M, Gharib SA, et al. Genome-wide joint meta-analysis of SNP and SNP-by-smoking interaction identifies novel loci for pulmonary function. PLoS Genet. 2012;8(12):e1003098. doi: 10.1371/journal.pgen.1003098
- 15. Busch R, Hobbs BD, Zhou J, et al; National Emphysema Treatment Trial Genetics; Evaluation of COPD Longitudinally to Identify Predictive Surrogate End-Points; International COPD Genetics Network; COPDGene Investigators. Genetic association and risk scores in a chronic obstructive pulmonary disease meta-analysis of 16,707 subjects. Am J Respir Cell Mol Biol. 2017;57(1):35-46. doi: 10.1165/rcmb.2016-0331OC
- 16. Wain LV, Shrine N, Artigas MS, et al; Understanding Society Scientific Group; Geisinger-Regeneron DiscovEHR Collaboration. Genome-wide association analyses for lung function and chronic obstructive pulmonary disease identify new loci and potential druggable targets. Nat Genet. 2017;49(3):416-425. doi: 10.1038/ng.3787
- 17. Hobbs BD, de Jong K, Lamontagne M, et al; COPDGene Investigators; ECLIPSE Investigators; LifeLines Investigators; SPIROMICS Research Group; International COPD Genetics Network Investigators; UK BiLEVE Investigators; International COPD Genetics Consortium. Genetic loci associated with chronic obstructive pulmonary disease overlap with loci for lung function and pulmonary fibrosis. Nat Genet. 2017;49(3):426-432. doi: 10.1038/ng.3752
- 18. Sakornsakolpat P, Prokopenko D, Lamontagne M, et al; SpiroMeta Consortium; International COPD Genetics Consortium. Genetic landscape of chronic obstructive pulmonary disease identifies heterogeneous cell-type and phenotype associations. Nat Genet. 2019;51(3):494-505. doi: 10.1038/s41588-018-0342-2
- 19. Shrine N, Guyatt AL, Erzurumluoglu AM, et al; Understanding Society Scientific Group. New genetic signals for lung function highlight pathways and chronic obstructive pulmonary disease associations across multiple ancestries. Nat Genet. 2019;51(3):481-493. doi: 10.1038/s41588-018-0321-7
- 20. Moll M, Sakornsakolpat P, Shrine N, et al; International COPD Genetics Consortium; SpiroMeta Consortium. Chronic obstructive pulmonary disease and related phenotypes: polygenic risk scores in population-based and case-control cohorts. Lancet Respir Med. 2020;8(7):696-708. doi: 10.1016/S2213-2600(20)30101-6
- 21. Peyrot WJ, Van der Auwera S, Milaneschi Y, et al; Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Does childhood trauma moderate polygenic risk for depression: a meta-analysis of 5765 subjects from the Psychiatric Genomics Consortium. Biol Psychiatry. 2018;84(2):138-147. doi: 10.1016/j.biopsych.2017.09.009
- 22. Hindy G, Wiberg F, Almgren P, Melander O, Orho-Melander M. Polygenic risk score for coronary heart disease modifies the elevated risk by cigarette smoking for disease incidence. Circ Genom Precis Med. 2018;11(1):e001856. doi: 10.1161/CIRCGEN.117.001856
- 23. Aschard H, Tobin MD, Hancock DB, et al; Understanding Society Scientific Group. Evidence for large-scale gene-by-smoking interaction effects on pulmonary function. Int J Epidemiol. 2017;46(3):894-904. doi: 10.1093/ije/dyw318
- 24. Bycroft C, Freeman C, Petkova D, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203-209. doi: 10.1038/s41586-018-0579-z
- 25. Mak TSH, Porsch RM, Choi SW, Zhou X, Sham PC. Polygenic scores via penalized regression on summary statistics. Genet Epidemiol. 2017;41(6):469-480. doi: 10.1002/gepi.22050
- 26. Gupta AP, Lewontin RC. A study of reaction norms in natural populations of Drosophila pseudoobscura. Evolution. 1982;36(5):934-948. doi: 10.1111/j.1558-5646.1982.tb05464.x
- 27. Kassambara A. Pipe-friendly framework for basic statistical tests. rstatix. Accessed November 10, 2021. https://rdrr.io/github/kassambara/rstatix/
- 28. Wan ES, Castaldi PJ, Cho MH, et al; COPDGene Investigators. Epidemiology, genetics, and subtyping of preserved ratio impaired spirometry (PRISm) in COPDGene. Respir Res. 2014;15(1):89. doi: 10.1186/s12931-014-0089-y
- 29. Silverman EK, Chapman HA, Drazen JM, et al. Genetic epidemiology of severe, early-onset chronic obstructive pulmonary disease: risk to relatives for airflow obstruction and chronic bronchitis. Am J Respir Crit Care Med. 1998;157(6 Pt 1):1770-1778. doi: 10.1164/ajrccm.157.6.9706014
- 30. Fawcett KA, Song K, Qian G, et al. Pleiotropic effects of heterozygosity for the SERPINA1 Z allele in the UK Biobank. medRxiv. 2021;7:00049-2021. doi: 10.1101/2020.06.04.20115923
- 31. Zhang P-D, Zhang X-R, Zhang A, et al. Associations of genetic risk and smoking with incident chronic obstructive pulmonary disease. Eur Respir J. 2021:2101320. doi: 10.1183/23120541.00049-2021
- 32. Wain LV, Shrine N, Miller S, et al; UK Brain Expression Consortium (UKBEC); OxGSK Consortium. Novel insights into the genetics of smoking behaviour, lung function, and chronic obstructive pulmonary disease (UK BiLEVE): a genetic association study in UK Biobank. Lancet Respir Med. 2015;3(10):769-781. doi: 10.1016/S2213-2600(15)00283-0
- 33. McGeachie MJ, Yates KP, Zhou X, et al. Patterns of growth and decline in lung function in persistent childhood asthma. N Engl J Med. 2016;374(19):1842-1852. doi: 10.1056/NEJMoa1513737
- 34. Elbau IG, Cruceanu C, Binder EB. Genetics of resilience: gene-by-environment interaction studies as a tool to dissect mechanisms of resilience. Biol Psychiatry. 2019;86(6):433-442. doi: 10.1016/j.biopsych.2019.04.025
- 35. Tuder RM. Bringing light to chronic obstructive pulmonary disease pathogenesis and resilience. Ann Am Thorac Soc. 2018;15(Supplement 4):S227-S233. doi: 10.1513/AnnalsATS.201808-583MG
- 36. Liao S-YY, Lin X, Christiani DC. Gene-environment interaction effects on lung function: a genome-wide association study within the Framingham heart study. Environ Health. 2013;12(1):1-9. doi: 10.1186/1476-069X-12-101
- 37. Carpenter MJ, Strange C, Jones Y, et al. Does genetic testing result in behavioral health change: changes in smoking behavior following testing for alpha-1 antitrypsin deficiency. Ann Behav Med. 2007;33(1):22-28. doi: 10.1207/s15324796abm3301_3
- 38. Lange P, Celli B, Agustí A, et al. Lung-function trajectories leading to chronic obstructive pulmonary disease. N Engl J Med. 2015;373(2):111-122. doi: 10.1056/NEJMoa1411532
- 39. Wyss AB, Sofer T, Lee MK, et al. Multiethnic meta-analysis identifies ancestry-specific and cross-ancestry loci for pulmonary function. Nat Commun. 2018;9(1):2976. doi: 10.1038/s41467-018-05369-0
- 40. Duncan L, Shen H, Gelaye B, et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nat Commun. 2019;10(1):3328. doi: 10.1038/s41467-019-11112-0
- 41. Dudbridge F, Fletcher O. Gene-environment dependence creates spurious gene-environment interaction. Am J Hum Genet. 2014;95(3):301-307. doi: 10.1016/j.ajhg.2014.07.014