Summary
We conducted a gene–smoking interaction analysis in GWAS data of pancreatic cancer. We found a possible interaction of axon guidance pathway genes with smoking in modifying the risk of pancreatic cancer. If confirmed, this finding will open a new avenue to unveiling the etiology of smoking-associated pancreatic cancer.
Abstract
Cigarette smoking is the best established modifiable risk factor for pancreatic cancer. Genetic factors that underlie smoking-related pancreatic cancer have not previously been examined at the genome-wide level. Taking advantage of the existing genome-wide association study (GWAS) genotype and risk factor data from the Pancreatic Cancer Case Control Consortium, we conducted a discovery study in 2028 cases and 2109 controls to examine gene–smoking interactions at the pathway, gene and single nucleotide polymorphism (SNP) levels. Using the likelihood ratio test nested in logistic regression models and ingenuity pathway analysis (IPA), we examined 172 KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, 3 manually curated gene sets, 3 nicotine dependency gene ontology pathways, 17 912 genes and 468 114 SNPs. No individual pathway, gene or SNP showed a significant interaction with smoking after adjusting for multiple comparisons. Six KEGG pathways showed nominal interactions (P < 0.05) with smoking, and the top two were the pancreatic secretion and salivary secretion pathways (major contributing genes: RAB8A, PLCB1 and CTRB1). Nine genes, i.e. ZBED2, EXO1, PSG2, SLC36A1, CLSTN1, MTHFSD, FAT2, IL10RB and ATXN2, had P interaction < 0.0005. Five intergenic region SNPs and two SNPs of the EVC and KCNIP4 genes had P interaction < 0.00003. In IPA analysis of genes with nominal interactions with smoking, axonal guidance signaling and α-adrenergic signaling were significantly overrepresented canonical pathways. Genes contributing to the axon guidance signaling pathway included the SLIT/ROBO signaling genes that are frequently altered in pancreatic cancer. These observations need to be confirmed in additional data sets. If confirmed, they will open a new avenue to unveiling the etiology of smoking-associated pancreatic cancer.
Introduction
Pancreatic cancer is the fourth leading cause of cancer death, resulting in >37 600 deaths each year in the USA (1). Epidemiological studies have identified cigarette smoking as the most consistent modifiable risk factor for this lethal disease, with a relative risk of ~2.0 and a contribution to 25% of all cases (2). Because only a small portion of smokers develop pancreatic cancer, genetic factors and other unidentified non-genetic factors may play a crucial role in determining the risk of smoking-related pancreatic cancer. Although a few case–control studies have previously found some associations of carcinogen metabolic genes or DNA repair genes with risk of pancreatic cancer among smokers, these findings either have not been followed up or could not be replicated (3). Genome-wide association studies (GWAS) have identified novel susceptibility loci or chromosome regions in pancreatic cancer (4,5), and these data have also provided an opportunity to uncover genes that were not previously thought to be related to this disease. However, genetic factors that influence individual susceptibility to smoking-induced pancreatic cancer have not previously been examined at the genome-wide level.
We have previously analyzed the interactions of two other known risk factors for pancreatic cancer, obesity and diabetes, at the pathway, gene and single nucleotide polymorphism (SNP) levels using the GWAS data and exposure information from the Pancreatic Cancer Case Control Consortium (6). A significant interaction between obesity and the chemokine signaling pathway was observed. By analogy with those analyses, we expected that cigarette smoking may significantly interact with a group of functionally related genes (a pathway) in modifying pancreatic cancer risk. In this study, we explored hierarchical interactions of genetic factors with smoking at the pathway, gene and SNP levels using an agnostic approach. In addition, we used the candidate gene approach and examined GWAS top hits that have been identified in smoking-related cancers, i.e. cancers of the lung, bladder and head and neck. Furthermore, based on the previous observation that nicotine dependence genes may modify the risk of smoking-related diseases, e.g. lung cancer and chronic obstructive pulmonary disease (7,8), we also examined nicotine-dependence-related gene ontology (GO) pathways in this study.
Materials and methods
Study population and data set
The study population was a subset of seven studies participating in the previously conducted GWAS of the Pancreatic Cancer Cohort Consortium (PanScan) and the Pancreatic Cancer Case Control Consortium (4,5) containing six case–control studies conducted at MD Anderson Cancer Center, Yale University, Mayo Clinic, Memorial Sloan-Kettering Cancer Center, University of California at San Francisco and University of Toronto, and one nested case–control study from the European Prospective Investigation into Cancer and Nutrition cohort. Cases were patients pathologically diagnosed with pancreatic adenocarcinoma; controls were free of pancreatic cancer on recruitment and frequency matched to cases in each institution on sex, birth year and self-reported race/ethnicity.
GWAS scanning was conducted at the National Cancer Institute Core Genotyping Facility using the Illumina HumanHap550-Duo SNP arrays and Illumina Human 610-Quad arrays (4,5). We downloaded the genotype data of 562 000 and 621 000 SNPs for 4195 study individuals (2163 cases and 2232 controls) from the Database of Genotypes and Phenotypes (dbGaP) website (http://www.ncbi.nlm.nih.gov/gap) with the approval of the MD Anderson Institutional Review Board. The quality control process removed SNPs deviating from Hardy–Weinberg equilibrium or with minor allele frequency < 5%, resulting in a final data set containing 468 114 SNPs that are common to both data sets. With the International HapMap Project genotype data (phase 3 release #3, National Center for Biotechnology Information build 36, dbSNP b126, 2010-05-28, minor allele frequency > 5%) as references for CEU (population with ancestry from northern and western Europe), JPT/CHB (Japanese/Chinese) and YRI (Yoruba in Ibadan, Nigeria) (9), we seeded 10 155 high-quality markers (r² < 0.004) in STRUCTURE (10) and identified 4137 individuals (2028 cases and 2109 controls) with 0.75–1.00 similarity to CEU as the study subjects in the current analysis. The top five principal components for population stratification were then derived from these Caucasian subjects using EIGENSTRAT (11).
Definition of pathways and genes
The pathways analyzed in this study were either defined by KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO (Gene Ontology) databases or manually created according to the information in the literature. As described in our previous study (12), we identified 197 pathways each with 10–500 genes from KEGG database (13). Prior to G × E analysis, we performed principal component analysis (PCA) to refine the contributing genes in each pathway. The genetic variation in each gene was decomposed into orthogonal components (eigenSNPs) through PCA. We tested association of each gene within a pathway with disease in likelihood ratio test (LRT) nested in logistic regression model based on principal components (eigenSNPs) accounting for at least 85% variation of a gene (12,14), and only those genes with marginal association with disease (P ≤ 0.10) were retained in the pathway for interaction analysis (PCA–LRT). This approach is in line with the two-step gene–environment interaction analysis approach by Kooperberg et al. (15). Finally, a total of 172 KEGG pathways each having ≥2 genes surviving PCA–LRT screening (Supplementary Table 1, available at Carcinogenesis Online) remained for pathway–smoking interaction analysis. From GO database (16), we retrieved three nicotine dependence pathways, i.e. ‘response to nicotine’ (GO:0035094), ‘cellular response to nicotine’ (GO:0071316) and ‘behavioral response to nicotine’ (GO:0035095; Supplementary Table 1, available at Carcinogenesis Online). Furthermore, we curated three additional gene sets using the Phenotype-Genotype Integrator (PheGenI; http://www.ncbi.nlm.nih.gov/gap/phegeni) based on GWAS top hits for smoking-related cancers: lung cancer, head and neck cancer and bladder cancer (Supplementary Table 1, available at Carcinogenesis Online). Due to relatively small pathway size, the PCA–LRT screening was not applied for the three GO pathways and the self-curated gene sets.
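The eigenSNP step described above can be sketched as follows. This is a minimal illustration, assuming genotypes coded as minor-allele counts in a NumPy array; the function name `eigen_snps` and the toy genotypes are hypothetical, and the text does not specify the software the authors actually used.

```python
import numpy as np

def eigen_snps(genotypes, min_var=0.85):
    """Return the leading principal-component scores (eigenSNPs) that
    together explain at least `min_var` of the genotype variance.

    genotypes: (n_subjects, n_snps) array of minor-allele counts (0, 1, 2).
    Assumes at least one SNP varies across subjects.
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)                      # center each SNP
    # Squared singular values are proportional to the variance
    # carried by each principal component.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), min_var)) + 1
    k = min(k, len(s))                          # guard against rounding
    return U[:, :k] * s[:k], explained[:k]

# Hypothetical toy gene: two SNPs in perfect LD plus two independent SNPs.
rng = np.random.default_rng(0)
base = rng.integers(0, 3, size=200)
noise = rng.integers(0, 3, size=(200, 2))
geno = np.column_stack([base, base, noise])
scores, explained = eigen_snps(geno)
print(scores.shape, explained.round(3))
```

The retained score columns would then replace the raw SNPs as genetic covariates in the logistic models, which reduces the number of correlated terms per gene.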
We retrieved 19 058 genes from the human genome database version 18 (hg18) using the University of California at Santa Cruz Genome Browser data retrieval tool (17). We further extended each gene region to include SNPs within 20 kb upstream or downstream. There were 17 912 genes each genotyped for ≥2 SNPs in the current GWAS data set, resulting in a total of 468 114 SNPs for the G × E analysis.
Exposure variables
Exposure information without personal identifiers was provided by each collaborating institution to MD Anderson under Institutional Review Board approvals and Material Transfer Agreements (MTAs). Exposure variables included sex, age, race/ethnicity, history of cigarette smoking, pack-years, adulthood body mass index (BMI), history of diabetes and family history of cancer. All variables were coded following the same data coding dictionary. Missing values for pack-years in 228 smokers were imputed using the mean values of study-age-sex specific pack-years (18). In this G × E analysis, the exposure variable was defined as 0, <20 and ≥20 pack-years. Other exposure variables that were adjusted for in the multivariable models included sex, age (continuous), BMI (categorical: ≤25, 25−29.9 and ≥30 kg/m2) and diabetes (yes versus no). Due to a large number of missing values, family history of cancer was not adjusted for in the model, and former versus current smoking status was not examined in this analysis.
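The imputation and categorization of pack-years can be illustrated with a short sketch. The record layout, field names and toy values below are assumptions made for illustration; only the stratum definition (study, age, sex) and the 0, <20 and ≥20 pack-year categories come from the text.

```python
from collections import defaultdict
from statistics import mean

def impute_pack_years(records):
    """Fill missing pack-years of smokers with the mean of observed
    pack-years among smokers in the same study/age-group/sex stratum."""
    strata = defaultdict(list)
    for r in records:
        if r["smoker"] and r["pack_years"] is not None:
            strata[(r["study"], r["age_group"], r["sex"])].append(r["pack_years"])
    imputed = []
    for r in records:
        r = dict(r)  # copy so the input is not modified
        if r["smoker"] and r["pack_years"] is None:
            observed = strata.get((r["study"], r["age_group"], r["sex"]))
            if observed:  # stays missing if the stratum has no observed values
                r["pack_years"] = mean(observed)
        imputed.append(r)
    return imputed

def smoking_category(pack_years):
    """0 = never smoker, 1 = <20 pack-years, 2 = >=20 pack-years."""
    if pack_years == 0:
        return 0
    return 1 if pack_years < 20 else 2

# Hypothetical records: one smoker with missing pack-years.
toy = [
    {"study": "A", "age_group": "61-70", "sex": "F", "smoker": True, "pack_years": 10},
    {"study": "A", "age_group": "61-70", "sex": "F", "smoker": True, "pack_years": 30},
    {"study": "A", "age_group": "61-70", "sex": "F", "smoker": True, "pack_years": None},
    {"study": "A", "age_group": "61-70", "sex": "F", "smoker": False, "pack_years": 0},
]
filled = impute_pack_years(toy)
categories = [smoking_category(r["pack_years"]) for r in filled]
```

Because an imputed stratum mean can fall on either side of the 20 pack-year cutoff, imputation can move a subject between exposure categories.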
Statistical methods
An LRT nested in a logistic regression model was applied throughout all G × E analyses. The full model included sex, age (continuous), study sites (categorical), five principal components (quantitative) capturing population stratification, diabetes (yes versus no), BMI (categorical), genetic factors (eigenSNPs), smoking (0, <20 versus ≥20 pack-years) and the interaction terms (the products of smoking and genetic factors). The reduced model was the same as the full model except that it excluded the interaction term(s). Different smoking intensities may show different G × E spectra (19). Balancing statistical power and exposure level, as an exploratory effort, we also examined the gene–smoking interactions at the pathway level using 30 pack-years as the cutoff (0, <30 versus ≥30 pack-years) or using pack-years as a continuous variable.
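The nested comparison can be sketched for a single SNP as follows. This is a simplified illustration and not the authors' implementation: smoking is assumed to be coded as an ordinal score (0, 1, 2), a single hypothetical covariate stands in for the full adjustment set, and the data are simulated. With one interaction term the test has 1 degree of freedom; gene- and pathway-level tests with several eigenSNPs would have correspondingly more.

```python
import math
import numpy as np

def logistic_loglik(X, y, n_iter=50):
    """Fit logistic regression by Newton-Raphson and return the
    maximized log-likelihood. An intercept column is added here."""
    X = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def interaction_lrt(snp, smoke, covars, y):
    """LRT of full (with SNP x smoking product) versus reduced model."""
    reduced = np.column_stack([covars, snp, smoke])
    full = np.column_stack([reduced, snp * smoke])
    stat = 2.0 * (logistic_loglik(full, y) - logistic_loglik(reduced, y))
    stat = max(stat, 0.0)
    # Survival function of a chi-square with 1 df: P(Z^2 > stat).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Simulated data with a built-in interaction (illustration only).
rng = np.random.default_rng(1)
n = 2000
snp = rng.binomial(2, 0.3, size=n).astype(float)   # minor-allele count
smoke = rng.integers(0, 3, size=n).astype(float)   # 0, <20, >=20 pack-years
age = rng.normal(size=n)                           # standardized covariate
logit = -0.5 + 0.2 * age + 0.1 * smoke + 0.6 * snp * smoke
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
stat, p_value = interaction_lrt(snp, smoke, age, y)
print(f"LRT statistic = {stat:.2f}, P = {p_value:.2e}")
```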
We conducted G × E analysis at the pathway and gene levels based on the eigenSNPs accounting for at least 85% of the genetic variation of a pathway or a gene. We identified contributing genes to a pathway or contributing SNPs to a gene using the criterion of P G × E value ≤ 0.05 at the gene or SNP level. SNPs were coded as 0, 1 or 2 based on the number of copies of the minor allele. We further explored the marginal effects of genes or SNPs with nominal significance in a subgroup analysis stratified by 20 pack-years [<20 (including non-smokers), ≥20]. In addition, we analyzed the genes with nominal interactions with smoking using ingenuity pathway analysis (IPA; Ingenuity® Systems, www.ingenuity.com). The null hypothesis tested in the IPA is that nominally significant interacting genes are equally represented in a given pathway, compared with the set of all other genes not in the pathway, which can be tested using Fisher’s exact test based on the hypergeometric distribution. This type of pathway analysis belongs to the category of ‘competitive’ pathway-based tests; in contrast, the LRT-based pathway interaction test belongs to the category of ‘self-contained’ tests, whose null hypothesis is that none of the genes/SNPs in a given biological pathway interacts with smoking on the disease risk (20). As the IPA and LRT pathway-based methods test different null hypotheses, they may not identify the same pathways, and we employed both in our analysis to complement each other.
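The competitive test can be written down directly from the hypergeometric distribution. The sketch below is a generic right-tailed Fisher's exact test, not IPA's internal code; the function name and the counts are hypothetical, since the reference gene set used by IPA is not stated in the text.

```python
from math import comb

def fisher_right_tail(k, K, n, N):
    """Right-tailed hypergeometric probability P(X >= k).

    N: genes in the reference set
    K: genes in the pathway
    n: genes on the 'interacting' list
    k: interacting genes that fall in the pathway
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical counts: 12 of 100 listed genes fall in a 50-gene pathway
# drawn from a 1000-gene reference set (5 would be expected by chance).
p_enriched = fisher_right_tail(12, 50, 100, 1000)
p_expected = fisher_right_tail(5, 50, 100, 1000)
print(p_enriched, p_expected)
```

A small right-tail probability indicates that the pathway holds more interacting genes than its size alone would predict, which is the sense in which the test is competitive.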
To control the false-positive findings incurred by multiple testing, we applied both the Bonferroni correction and the q value method with false discovery rate (FDR) at 0.10 for G × E analysis at the pathway, gene and SNP levels (21). P values below the Bonferroni-corrected thresholds of 0.05/178, 0.05/17 912 and 0.05/468 114 were considered statistically significant for G × E at the pathway, gene and SNP levels, respectively. q values < 0.10 were considered statistically significant at FDR = 10%. IPA automatically provides Benjamini–Hochberg (B-H) adjusted P values for overrepresented pathways, and B-H adjusted P values < 0.10 were considered statistically significant at FDR = 10%.
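For orientation, the Bonferroni thresholds and a B-H adjustment can be computed as in the sketch below. The B-H step-up procedure shown here is the one named for the IPA output; the q value method used for the LRT results is a related but distinct FDR estimator and is not reproduced. The example P values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error rate."""
    return alpha / n_tests

def benjamini_hochberg(pvalues):
    """Return B-H adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Numbers of tests taken from the text: 178 pathways, 17 912 genes, 468 114 SNPs.
thresholds = {
    level: bonferroni_threshold(0.05, n)
    for level, n in [("pathway", 178), ("gene", 17912), ("snp", 468114)]
}
for level, t in thresholds.items():
    print(f"{level}: P < {t:.2e}")

# Hypothetical P values, for illustration only.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```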
Results
The demographics and exposure variables for the study population are described in Table I. The distributions of age, race and sex across case and control groups were balanced (all P ≥ 0.12). Self-reported non-Hispanic whites made up >99% of the study population. The case–control association plot did not suggest the presence of population stratification (genomic control lambda = 0.99) (22). Smoking, obesity and diabetes were associated with increased risk of pancreatic cancer after adjusting for other factors (Ps < 0.001). A clear dose–response relationship was observed between pack-years smoked and risk of pancreatic cancer.
Table I.
Variable | Case (n = 2028) n (%) | Control (n = 2109) n (%) | P (χ2) | Adjusted odds ratio (95% CI)a |
---|---|---|---|---|
Age group | | | | |
≤50 | 199 (9.81) | 236 (11.19) | | |
51–60 | 563 (27.76) | 575 (27.26) | | |
61–70 | 710 (35.01) | 713 (33.81) | | |
>70 | 556 (27.42) | 585 (27.74) | 0.49 | |
Raceb | | | | |
Non-Hispanic whites | 2008 (99.26) | 2092 (99.19) | | |
Hispanics | 8 (0.40) | 13 (0.62) | | |
Blacks | 0 (0) | 2 (0.09) | | |
Others | 7 (0.35) | 2 (0.09) | 0.12 | |
Sex | | | | |
Female | 920 (45.36) | 968 (45.90) | | |
Male | 1108 (54.64) | 1141 (54.10) | 0.73 | |
Smokingc | | | | |
Never | 801 (39.63) | 1008 (47.91) | | 1.00 |
Ever | 1220 (60.37) | 1096 (52.09) | <0.001 | 1.43 (1.26–1.63) |
Pack-yearsc | | | | |
Never smoker | 801 (39.63) | 1008 (47.91) | | 1.00 |
0– | 463 (22.91) | 485 (23.05) | | 1.23 (1.04–1.45) |
20– | 225 (11.13) | 201 (9.55) | | 1.45 (1.16–1.80) |
30– | 165 (8.16) | 135 (6.42) | | 1.55 (1.20–2.00) |
>40 | 367 (18.16) | 275 (13.07) | <0.001 | 1.75 (1.44–2.13) |
History of diabetesd | | | | |
No | 1583 (79.71) | 1877 (90.50) | | 1.00 |
Yes | 403 (20.29) | 197 (9.50) | <0.001 | 2.35 (1.94–2.84) |
BMI (kg/m2)e | | | | |
≤25 | 764 (37.95) | 885 (42.45) | | 1.00 |
25−29.9 | 824 (40.93) | 854 (40.96) | | 1.07 (0.93–1.24) |
≥30 | 425 (21.11) | 346 (16.59) | <0.001 | 1.22 (1.02–1.47) |
Abbreviation: CI, confidence interval.
aOdds ratio was adjusted for age, sex, smoking/pack-years, history of diabetes or BMI (categorical), and study sites.
bMissing values from five cases.
cMissing values from seven cases and five controls.
dMissing values from 42 cases and 35 controls.
eMissing values from 15 cases and 24 controls.
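As a worked check on Table I, the crude (unadjusted) odds ratio for ever versus never smoking can be computed from the cell counts. The sketch below uses the standard cross-product ratio with a Wald confidence interval; it ignores the covariates in footnote a, so it is expected to be close to, but not identical to, the adjusted estimate of 1.43 (1.26–1.63).

```python
import math

def odds_ratio_ci(exp_cases, exp_controls, ref_cases, ref_controls, z=1.96):
    """Crude odds ratio and Wald 95% CI from a 2 x 2 table."""
    odds_ratio = (exp_cases * ref_controls) / (ref_cases * exp_controls)
    se_log = math.sqrt(
        1 / exp_cases + 1 / exp_controls + 1 / ref_cases + 1 / ref_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Counts from Table I: ever smokers 1220 cases / 1096 controls,
# never smokers 801 cases / 1008 controls.
crude_or, lower, upper = odds_ratio_ci(1220, 1096, 801, 1008)
print(f"crude OR = {crude_or:.2f} ({lower:.2f}-{upper:.2f})")
```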
G × E interactions at pathway level
Ten out of 172 KEGG pathways but none of the six candidate pathways showed nominal interactions with smoking (P < 0.05). After adjustment for multiple testing, none of them reached the significance level (Table II). The top two pathways were the pancreatic secretion (P = 0.0054) and salivary secretion (P = 0.012) pathways, with major contributing genes RAB8A, PLCB1 and CTRB1 (Table II). When the smoking variable was categorized as 0, <30 and ≥30 pack-years, or pack-years was used as a continuous variable, the P values for these two pathways were 0.0089 and 0.026, and 0.000039 and 0.0025, respectively (Supplementary Table 1, available at Carcinogenesis Online).
Table II.
KEGG pathway code | Pathway description | P valuea | No. of SNPs/No. of eigenSNPsb | Contributing genes |
---|---|---|---|---|
hsa04972 | Pancreatic secretion | 0.0054 | 714/175 | RAB8A PLCB1 CTRB1 |
hsa04970 | Salivary secretion | 0.012 | 724/162 | PLCB1 |
hsa05010 | Alzheimer’s disease | 0.013 | 552/139 | PLCB1 NDUFS2 MME |
hsa05322 | Systemic lupus erythematosus | 0.015 | 72/18 | ACTN1 |
hsa04010 | Mitogen-activated protein kinase signaling pathway | 0.021 | 523/154 | ELK4 DUSP3 GNG12 FGF6 |
hsa03020 | RNA polymerase | 0.023 | 37/11 | POLR2E |
hsa04540 | Gap junction | 0.025 | 738/154 | PLCB1 |
hsa04912 | GnRH signaling pathway | 0.038 | 634/149 | PLCB1 |
hsa00562 | Inositol phosphate metabolism | 0.039 | 405/87 | PLCB1 |
hsa04270 | Vascular smooth muscle contraction | 0.048 | 707/154 | PLCB1 PPP1R12B |
aObtained from LRT adjusted for age, race, sex, study, five principal components (quantitative) for population substructure, BMI (categorical) and diabetes.
bNo. of surviving SNPs is the number of SNPs in the ‘new’ pathway; No. of eigenSNPs refers to the number of principal components.
IPA analysis of 915 genes nominally interacting with smoking identified two significant pathways: axonal guidance signaling and α-adrenergic signaling pathway (Table III). The B-H adjusted P values were, respectively, and . The results remained the same when the IPA analysis was restricted to genes with P values of ≤0.04 or ≤0.03 (data not shown). Furthermore, removing 13 smoking- or nicotine-associated genes defined in previous studies (23–25) and three nicotine-response GO pathways (16) from the gene list did not significantly change the results of IPA analysis, excluding the possibility that the significance was inflated by these genes. Notably, one of the three nicotine dependency pathways ‘behavioral response to nicotine’ (GO:0035095) showed a nominal association with heavy smoking (30 or more pack-years) in combined data set of cases and controls.
Table III.
Biological process | P valuea | Ratiob | Contributing genes |
---|---|---|---|
Axonal guidance signaling | | 44/469 (0.094) | KLC1 ITSN1 SLIT1 MAPK3 BMP3 SEMA6B KRAS EPHA4 PLXNA2 ROBO1 BCAR1 PRKCZ EPHA8 LNPEP TUBA8 SUFU ABLIM3 ADAM19 FIGF PLCB1 WNT4 ARPC1A CHMP1A GNG12 PRKD1 WNT5B NGEF PAK6 BMP5 PDGFB GNG10 EFNA1 SRGAP3 ADAMTS6 PRKAR2B WNT10A WAS RTN4 ADAM10 PRKAG2 BMP7 PIK3CD ADAM9 OPN1SW |
α-Adrenergic signaling | | 14/105 (0.13) | ADCY2 CAMK4 MAPK3 ADCY3 KRAS PRKCZ GNG10 CALM1 PRKAR2B PRKAG2 ADCY7 PRKD1 GNG12 OPN1SW |
aCalculated using Fisher’s exact test (right-tailed).
bNumber of genes interacting with a risk factor of interest in a given pathway divided by total number of genes making up that pathway.
As sensitivity analysis, we conducted G × E analysis at pathway level based on the data set with complete pack-years only (without imputed pack-years). The LRT results were almost identical to those obtained from analysis including the imputed pack-years (Supplementary Table 1, available at Carcinogenesis Online). We redid IPA analysis using two P value cutoffs for top smoking-interacting genes. At 0.05 level, two most overrepresented canonical pathways were α-adrenergic signaling ( , B-H adjusted ) and axon guidance signaling ( , B-H adjusted ); at 0.10 level, two most overrepresented canonical pathways were axon guidance signaling ( , B-H adjusted ) and α-adrenergic signaling ( , B-H adjusted ).
G × E interactions at gene level
A total of 915 of the 17 912 tested genes showed nominal interaction with smoking (Supplementary Table 2, available at Carcinogenesis Online) but none reached significance after adjusting for multiple testing. The top nine genes with P G × E < 0.0005 were ZBED2, EXO1, PSG2, SLC36A1, CLSTN1, MTHFSD, FAT2, IL10RB and ATXN2 (Table IV). Subgroup analysis by smoking status [<20 (including non-smokers) versus ≥20 pack-years] found that some genes had differential marginal associations with risk of pancreatic cancer. For example, ZBED2 and CLSTN1 had marginal P values < 0.001 in those with ≥20 pack-years of smoking but P > 0.1 in those with <20 pack-years. In contrast, ATXN2 had marginal P values of 0.0029 and 0.21 among those with <20 and ≥20 pack-years of smoking, respectively (Table IV). We also analyzed marginal effects for the axon guidance genes by smoking status (≥20 versus <20 pack-years). Eleven genes (PLCB1, GNG12, WNT10A, PLXNA2, PRKCZ, ITSN1, PIK3CD, ADAMTS6, ROBO1, KLC1 and BMP5) in individuals with ≥20 pack-years, four genes (NGEF, WNT4, BCAR1 and PRKAR2B) in those with <20 pack-years, and one gene (ADAM9) in both subgroups showed a nominal effect on risk of pancreatic cancer (P < 0.05; Supplementary Table 3, available at Carcinogenesis Online). These observations are in line with an interaction between the axon guidance pathway and smoking in modifying the risk of pancreatic cancer.
Table IV.
Gene | Gene description | No. of SNPs/eigenSNPsa | P (G × E)b | P value for marginal association, ≥20 pack-yearsb | P value for marginal association, <20 pack-yearsb,c |
---|---|---|---|---|---|
ZBED2 | Zinc finger BED domain-containing protein 2 | 4/3 | 0.00013 | 0.00053 | 0.38 |
EXO1 | Exonuclease 1 | 19/7 | 0.00017 | 0.17 | 0.033 |
PSG2 | Pregnancy-specific beta-1-glycoprotein 2 | 3/3 | 0.00019 | 0.035 | 0.15 |
SLC36A1 | Proton-coupled amino acid transporter 1 | 28/5 | 0.00022 | 0.089 | 0.047 |
CLSTN1 | Calsyntenin-1 | 10/4 | 0.00023 | 0.00088 | 0.64 |
MTHFSD | Methenyltetrahydrofolate synthase domain-containing protein | 24/7 | 0.00036 | 0.03 | 0.057 |
FAT2 | Protocadherin Fat 2 | 35/8 | 0.00038 | 0.047 | 0.036 |
IL10RB | Interleukin-10 receptor subunit beta | 17/7 | 0.00043 | 0.016 | 0.035 |
ATXN2 | Ataxin-2 | 7/2 | 0.00047 | 0.21 | 0.0029 |
aNumber of SNPs in the gene/number of principal components used for the analysis.
bObtained from LRT in logistic regression with adjustment for age, sex, study, diabetes, and five principal components for population substructure.
cIncluding non-smokers.
When the G × E analysis at the gene level was conducted on the data set excluding missing pack-years, the results were almost the same as those obtained from the analysis of the data set including the imputed pack-years (Supplementary Table 2, available at Carcinogenesis Online).
G × E interactions at SNP level
Among the 468 114 SNPs tested, 22 509 SNPs nominally interacted with smoking (P < 0.05), and these SNPs were considered as potential contributing SNPs to the genes interacting with smoking (Supplementary Table 4, available at Carcinogenesis Online). After correction for multiple testing, none of them remained significant at an FDR of 0.10. Stratified analysis also showed differential effects on cancer risk by smoking status [≥20 versus <20 pack-years (including non-smokers)] for a number of SNPs (Supplementary Table 4, available at Carcinogenesis Online). Table V summarizes the top seven SNPs with P G × E < 0.00003; two of these SNPs belong to the EVC and KCNIP4 genes.
Table V.
SNP ID | Gene | Genotype | ≥20 pack-years: n (case/control) | ≥20 pack-years: odds ratio (95% CI)b | <20 pack-yearsa: n (case/control) | <20 pack-yearsa: odds ratio (95% CI)b | All subjects: odds ratio (95% CI)c | P (G × E)d |
---|---|---|---|---|---|---|---|---|
rs1383180 | EVC | CC | 265/257 | 1.00 | 551/569 | 1.00 | 1.00 | |
rs1383180 | EVC | TC | 334/271 | 1.20 (0.94–1.51) | 609/718 | 0.87 (0.75–1.03) | 0.98 (0.86–1.12) | |
rs1383180 | EVC | TT | 122/71 | 1.69 (1.20–2.38) | 143/226 | 0.66 (0.52–0.83) | 0.90 (0.74–1.09) | 0.000012 |
rs11248542 | Inter-genic | TT | 612/550 | 1.00 | 1161/1320 | 1.00 | 1.00 | |
rs11248542 | Inter-genic | CT | 105/49 | 1.92 (1.34–2.75) | 134/177 | 0.87 (0.69–1.11) | 1.17 (0.96–1.43) | |
rs11248542 | Inter-genic | CC | 4/0 | — | 6/16 | 0.44 (0.17–1.14) | 0.64 (0.28–1.45) | 0.000013 |
rs11195244 | Inter-genic | TT | 373/256 | 1.00 | 619/751 | 1.00 | 1.00 | |
rs11195244 | Inter-genic | CT | 293/264 | 0.77 (0.61–0.97) | 559/651 | 1.03 (0.88–1.21) | 0.94 (0.83–1.07) | |
rs11195244 | Inter-genic | CC | 56/79 | 0.51 (0.35–0.74) | 124/111 | 1.35 (1.02–1.78) | 0.94 (0.75–1.17) | 0.000014 |
rs2043385 | KCNIP4 | CC | 301/290 | 1.00 | 615/638 | 1.00 | 1.00 | |
rs2043385 | KCNIP4 | TC | 334/255 | 1.28 (1.01–1.61) | 569/676 | 0.88 (0.75–1.03) | 0.98 (0.86–1.12) | |
rs2043385 | KCNIP4 | TT | 87/54 | 1.56 (1.07–2.29) | 119/199 | 0.62 (0.48–0.80) | 0.84 (0.68–1.04) | 0.000017 |
rs7141385 | Inter-genic | GG | 523/375 | 1.00 | 860/1059 | 1.00 | 1.00 | |
rs7141385 | Inter-genic | AG | 175/204 | 0.61 (0.48–0.77) | 397/417 | 1.17 (0.99–1.38) | 0.96 (0.84–1.11) | |
rs7141385 | Inter-genic | AA | 24/20 | 0.79 (0.43–1.47) | 46/37 | 1.51 (0.97–2.35) | 1.25 (0.87–1.81) | 0.000024 |
rs17256483 | Inter-genic | CC | 518/373 | 1.00 | 854/1050 | 1.00 | 1.00 | |
rs17256483 | Inter-genic | AC | 179/205 | 0.62 (0.49–0.79) | 400/426 | 1.15 (0.98–1.36) | 0.96 (0.84–1.10) | |
rs17256483 | Inter-genic | AA | 24/21 | 0.75 (0.41–1.38) | 48/36 | 1.60 (1.03–2.50) | 1.28 (0.89–1.84) | 0.000027 |
rs2470697 | Inter-genic | CC | 377/255 | 1.00 | 608/767 | 1.00 | 1.00 | |
rs2470697 | Inter-genic | TC | 277/267 | 0.69 (0.55–0.88) | 564/600 | 1.19 (1.02–1.39) | 1.02 (0.90–1.17) | |
rs2470697 | Inter-genic | TT | 68/77 | 0.56 (0.39–0.81) | 131/146 | 1.14 (0.88–1.48) | 0.92 (0.74–1.14) | 0.000028 |
aIncluding non-smokers.
bOdds ratio (95% CI) adjusted for age, sex, and five principal components for population substructure.
cIn addition to the above covariates, BMI (categorical), diabetes and study were further adjusted for.
d P value for interaction term obtained by LRT in logistic regression analysis of all study subjects.
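One informal way to read Table V is to compare the genotype odds ratio between the two smoking strata; on the multiplicative scale, a ratio of stratum-specific odds ratios far from 1 is what an interaction term captures. The sketch below does this for rs1383180 (TT versus CC) using the crude cell counts from the table. It is an unadjusted illustration, so the values differ slightly from the adjusted odds ratios reported, and it is not the LRT used to obtain P (G × E).

```python
def crude_odds_ratio(case_exp, ctrl_exp, case_ref, ctrl_ref):
    """Cross-product odds ratio for an exposed versus reference genotype."""
    return (case_exp * ctrl_ref) / (case_ref * ctrl_exp)

# rs1383180, TT versus CC, counts (case/control) from Table V.
or_heavy = crude_odds_ratio(122, 71, 265, 257)    # >=20 pack-years
or_light = crude_odds_ratio(143, 226, 551, 569)   # <20 pack-years, incl. non-smokers

# Ratio of odds ratios: 1.0 would mean no multiplicative interaction.
ratio_of_ors = or_heavy / or_light
print(f"OR (>=20) = {or_heavy:.2f}, OR (<20) = {or_light:.2f}, "
      f"ratio = {ratio_of_ors:.2f}")
```

Here the genotype effect points in opposite directions in the two strata, which also explains why the odds ratio in all subjects combined is close to 1.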
Discussion
This study is the first to comprehensively explore gene–smoking interactions in modifying the risk of pancreatic cancer at the pathway, gene and SNP levels using GWAS data. We identified significant interactions of the axonal guidance and α-adrenergic signaling pathways with smoking in modifying the risk of pancreatic cancer by IPA analysis. We also observed a possible interaction of the pancreatic secretion pathway with smoking using the LRT. We did not find supporting evidence that nicotine dependence genes/pathways or genes identified in other smoking-related cancers modify the risk of pancreatic cancer. These preliminary findings, if confirmed, may reveal novel molecular mechanisms underlying the development of smoking-related pancreatic cancer.
Our observation on the axon guidance genes in smoking-related pancreatic cancer is novel and interesting. The axon guidance pathway is largely involved in neuronal extension and positioning during embryogenesis. There is increasing evidence supporting a role of axon guidance genes in cell proliferation, migration, adhesion, invasiveness, apoptosis, survival, metastasis and angiogenesis in various cancers including pancreatic cancer (26–28). In fact, a recently reported whole exome sequencing analysis of human pancreatic adenocarcinoma found frequent alterations of the axon guidance genes including the SLIT/ROBO signaling genes (29). Careful examination of the reported data (Fig. 1 of ref. 29) revealed a higher frequency of axon guidance gene alterations in never smokers (26/49, 53%) than in smokers (14/46, 29%). Findings from other studies suggest that these observations may not simply be due to chance. For example, SEMA5A, one of the axon guidance genes, has also been identified as a novel biomarker for lung cancer in non-smokers (30). Although polymorphic variants of the axon guidance genes have been shown to influence the success of cigarette smoking cessation (31,32), we did not find any evidence that the nicotine addiction genes were involved in pancreatic cancer. Clearly, both the mutation data and the GWAS data showing a possible interaction of smoking with axon guidance genes in pancreatic cancer need to be confirmed in future studies. Elucidating the biological mechanisms underlying these associations may shed new light on the molecular mechanisms of pancreatic cancer in non-smokers.
One of the major components of cigarette smoke is nicotine, which by itself does not cause cancer, but its metabolites, the tobacco-specific nitrosamines such as NNK, are known pancreatic carcinogens (33). NNK not only binds covalently to DNA and contributes to K-ras gene mutations but also activates protumorigenic signaling pathways. For example, NNK interacts with β-adrenergic receptors (β-ARs) to stimulate the COX-2-mediated inflammatory response and to activate the epidermal growth factor receptor and its downstream Raf/mitogen-activated protein (MAP) kinase kinase (MEK)/extracellular signal-regulated kinase pathway (34). Although experimental evidence linking NNK to α-ARs is missing, α-ARs are essential to the activation of mitogen-activated protein kinase in vitro, and α2-AR antagonists can completely suppress mitogen-activated protein kinase activity (35,36). In addition, α2-ARs may trans-activate the PI3K and Akt pathways, which are very important for cell proliferation and survival (37).
The interaction of smoking with the pancreatic secretion pathway was nominal but reproducible: it remained the top pathway when different smoking variables were used in the interaction analyses. The major contributing genes to this pathway include RAB8A, PLCB1 and CTRB1. The protein encoded by the RAB8A gene is a member of the RAS superfamily, which may play a role in the transport of proteins from the endoplasmic reticulum to the Golgi and the plasma membrane. PLCB1 (phospholipase C, beta 1) was a major contributing gene to seven of the top ten pathways nominally interacting with smoking (Table II). The enzyme encoded by this gene catalyzes the formation of inositol 1,4,5-trisphosphate and diacylglycerol from phosphatidylinositol 4,5-bisphosphate. This reaction plays an important role in the intracellular transduction of many extracellular signals. CTRB1 encodes chymotrypsinogen, the precursor of a serine protease that is secreted into the gastrointestinal tract in inactive form and is activated by proteolytic cleavage by trypsin. Although we do not know how these genes contribute to smoking-related pancreatic carcinogenesis, the impact of cigarette smoking on exocrine and endocrine functions of the pancreas has been well documented (38). Long-term treatment with nicotine reduces pancreatic secretion (39) and increases acinar cell proliferation (40). Upon secretin stimulation, serum levels of secreted pancreatic proteins were significantly elevated in smokers but not in non-smokers (41). Among chronic pancreatitis patients, smokers had significantly lower levels of insulin and glucagon in the pancreas than non-smokers (42). Further study of the role of pancreatic secretion pathway genes in smoking-induced pancreatic tumor models is needed.
Five intergenic region SNPs and two SNPs of the EVC (Ellis van Creveld syndrome) and KCNIP4 (Kv channel interacting protein 4) genes showed interactions with smoking at P < 0.00003. At the gene level, EVC (P = 0.0034) but not KCNIP4 (P = 0.22) showed a nominal interaction with smoking. Neither of the two genes has been linked to the pancreatic secretion, axon guidance signaling or α-adrenergic signaling pathways or to smoking-related disease. Whether the observed associations are due to chance alone or represent unknown mechanisms underlying smoking-induced pancreatic cancer remains to be investigated.
In this study, we employed both the Bonferroni procedure to control the family-wise error rate and q value/B-H methods to control the FDR. The family-wise error rate is the probability of making at least one false-positive finding across all comparisons, whereas the FDR is the expected proportion of false positives among all tests determined to be significant. For large-scale genetic and genomic testing problems, controlling the FDR is less conservative than controlling the family-wise error rate, i.e. the FDR method often leads to more true discoveries while allowing some false positives (21). In the current analysis, we performed interaction tests at the SNP, gene and pathway levels. We expected that the FDR method might lead to more true discoveries at the SNP and gene levels as there were, respectively, 468 114 and 17 912 interaction tests. It turned out, however, that whether we employed the Bonferroni procedure or the q value/B-H-based FDR methods, the conclusions were the same for all levels of analysis. We provide both P values and FDR q values for pathway/gene/SNP interactions in Supplementary Tables 1, 2 and 4, available at Carcinogenesis Online, so that readers can choose between the two false-positive control methods.
The need for large statistical power remains a daunting challenge for G × E analysis of GWAS data. Previous studies have shown that restricting G × E analyses to genes with marginal effects may increase statistical power (15). However, we observed that some SNPs conferred differential effects between ever and never smokers but exhibited no marginal effect when smokers and non-smokers were combined. A similar scenario was observed in recent studies on gene–alcohol interactions in esophageal squamous-cell carcinoma (43,44). Thus, G × E analysis restricted to genes with marginal effects may miss genes that have no main effect but truly interact with an exposure. Therefore, we suggest that G × E analysis should make use of combined methodologies with complementary strengths, as used here and suggested by other investigators (45), to discover the missing heritability of pancreatic cancer (46).
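A small worked example shows how a variant can interact with smoking yet show no marginal effect. The counts below are hypothetical and were chosen so that the effects in the two strata cancel exactly; they are not study data, and the function name is an illustrative assumption.

```python
def odds_ratio(case_carrier, control_carrier, case_noncarrier, control_noncarrier):
    """Odds ratio for carriers versus non-carriers from a 2 x 2 table."""
    return (case_carrier * control_noncarrier) / (control_carrier * case_noncarrier)


# Hypothetical counts: (carrier cases, carrier controls,
#                       non-carrier cases, non-carrier controls)
ever_smokers = (300, 200, 200, 300)
never_smokers = (200, 300, 300, 200)

# Pool the two strata to mimic an analysis that ignores smoking.
combined = tuple(a + b for a, b in zip(ever_smokers, never_smokers))

or_ever = odds_ratio(*ever_smokers)     # risk-increasing in ever smokers
or_never = odds_ratio(*never_smokers)   # protective in never smokers
or_marginal = odds_ratio(*combined)     # effects cancel when pooled

# Ratio of stratum-specific odds ratios: the multiplicative interaction.
interaction_ratio = or_ever / or_never

print(f"OR in ever smokers:  {or_ever:.2f}")
print(f"OR in never smokers: {or_never:.2f}")
print(f"Marginal OR:         {or_marginal:.2f}")
print(f"Interaction ratio:   {interaction_ratio:.2f}")
```

In this constructed case a screen on marginal effects would discard the variant, because the pooled odds ratio is exactly 1, even though the stratum-specific odds ratios differ about fivefold. Real data would rarely cancel this neatly, but partial cancellation reduces marginal signals in the same way.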
This study has strengths and limitations. It is by far the largest G × E analysis in pancreatic cancer of all biological pathways defined by KEGG and IPA using an agnostic approach. We applied a PCA approach to reduce the dimensionality of the GWAS data, which increases the sensitivity for detecting true signals. Quality control was applied at each step of genotyping, exposure measurement and data collection. A stringent statistical threshold was applied to reduce false-positive discoveries. Nevertheless, the relatively small sample size limited the power of the G × E GWAS analysis. Our analysis was based on pack-years rather than on former and current smoking status, which may limit the generalization of our observations. Misclassification of pack-years due to imputation may have affected our results. The unavailability of external data sets limited validation of our findings and the generalization of the results. Despite these limitations, the pathways identified in this study are highly relevant to pancreatic cancer and are supported by other studies. G × E analysis in GWAS provides an unprecedented opportunity to discover genetic factors bridging smoking and pancreatic cancer.
Supplementary material
Supplementary Tables 1–4 can be found at http://carcin.oxfordjournals.org/.
Funding
National Institutes of Health (RO1 CA98380-05 to D.L.); MD Anderson Cancer Center Support Grant CA016672; R01 grants (CA169122 and HL116720 to P.W.); PO1 grant (5P30CA023108-28) and UO1 grant (7U19CA148127-03) (to C.A.); Sheikh Ahmed Center for Pancreatic Cancer Research at The University of Texas MD Anderson Cancer Center Funds (to D.L.).
Conflict of Interest Statement: None declared.
Glossary
Abbreviations:
- AR: adrenergic receptors
- B-H: Benjamini–Hochberg
- BMI: body mass index
- CI: confidence interval
- FDR: false discovery rate
- GO: gene ontology
- GWAS: genome-wide association study
- IPA: ingenuity pathway analysis
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- LRT: likelihood ratio test
- PCA: principal component analysis
- SNP: single nucleotide polymorphism
References
- 1. ACS. (2013). Cancer Facts and Figures 2013. American Cancer Society, Atlanta, GA.
- 2. Bosetti C., et al. (2012). Cigarette smoking and pancreatic cancer: an analysis from the International Pancreatic Cancer Case-Control Consortium (Panc4). Ann. Oncol., 23, 1880–1888.
- 3. Jiao L., et al. (2013). Molecular Genetics of Pancreatic Cancer. Springer Science+Business Media, New York, NY.
- 4. Amundadottir L., et al. (2009). Genome-wide association study identifies variants in the ABO locus associated with susceptibility to pancreatic cancer. Nat. Genet., 41, 986–990.
- 5. Petersen G.M., et al. (2010). A genome-wide association study identifies pancreatic cancer susceptibility loci on chromosomes 13q22.1, 1q32.1 and 5p15.33. Nat. Genet., 42, 224–228.
- 6. Tang H., et al. (2013). Genes-environment interactions in obesity- and diabetes-associated pancreatic cancer: a GWAS data analysis. Cancer Epidemiol. Biomarkers Prev., 23, 98–106.
- 7. Thorgeirsson T.E., et al. (2008). A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature, 452, 638–642.
- 8. Improgo M.R., et al. (2010). The nicotinic acetylcholine receptor CHRNA5/A3/B4 gene cluster: dual role in nicotine addiction and lung cancer. Prog. Neurobiol., 92, 212–226.
- 9. Frazer K.A., et al. (2007). A second generation human haplotype map of over 3.1 million SNPs. Nature, 449, 851–861.
- 10. Pritchard J.K., et al. (2000). Inference of population structure using multilocus genotype data. Genetics, 155, 945–959.
- 11. Price A.L., et al. (2006). Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet., 38, 904–909.
- 12. Wei P., et al. (2012). Insights into pancreatic cancer etiology from pathway analysis of genome-wide association study data. PLoS One, 7, e46887.
- 13. Kanehisa M., et al. (2010). KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res., 38, D355–D360.
- 14. Chen L.S., et al. (2010). Insights into Colon Cancer Etiology via a Regularized Approach to Gene Set Analysis of GWAS Data. Am. J. Hum. Genet., 86, 860–871.
- 15. Kooperberg C., et al. (2008). Increasing the power of identifying gene x gene interactions in genome-wide association studies. Genet. Epidemiol., 32, 255–263.
- 16. Gene Ontology Consortium (2008). The Gene Ontology project in 2008. Nucleic Acids Res., 36, D440–D444.
- 17. Karolchik D., et al. (2004). The UCSC Table Browser data retrieval tool. Nucleic Acids Res., 32, D493–D496.
- 18. Garshick E., et al. (2006). Smoking imputation and lung cancer in railroad workers exposed to diesel exhaust. Am. J. Ind. Med., 49, 709–718.
- 19. Moore L.E., et al. (2011). GSTM1 null and NAT2 slow acetylation genotypes, smoking intensity and bladder cancer risk: results from the New England bladder cancer study and NAT2 meta-analysis. Carcinogenesis, 32, 182–189.
- 20. Wang K., et al. (2010). Analysing biological pathways in genome-wide association studies. Nat. Rev. Genet., 11, 843–854.
- 21. Storey J.D., et al. (2003). Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. U. S. A., 100, 9440–9445.
- 22. de Bakker P.I., et al. (2008). Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum. Mol. Genet., 17(R2), R122–R128.
- 23. Uhl G.R., et al. (2008). Molecular genetics of successful smoking cessation: convergent genome-wide association study results. Arch. Gen. Psychiatry, 65, 683–693.
- 24. Uhl G.R., et al. (2007). Molecular genetics of nicotine dependence and abstinence: whole genome association using 520,000 SNPs. BMC Genet., 8, 10.
- 25. Picciotto M.R., et al. (2013). Molecular mechanisms underlying behaviors related to nicotine addiction. Cold Spring Harb. Perspect. Med., 3, a012112.
- 26. Mehlen P., et al. (2011). Novel roles for Slits and netrins: axon guidance cues as anticancer targets? Nat. Rev. Cancer, 11, 188–197.
- 27. Capparuccia L., et al. (2009). Semaphorin signaling in cancer cells and in cells of the tumor microenvironment–two sides of a coin. J. Cell Sci., 122(Pt 11), 1723–1736.
- 28. Pasquale E.B. (2010). Eph receptors and ephrins in cancer: bidirectional signalling and beyond. Nat. Rev. Cancer, 10, 165–180.
- 29. Biankin A.V., et al.; Australian Pancreatic Cancer Genome Initiative. (2012). Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes. Nature, 491, 399–405.
- 30. Uhl G.R., et al. (2010). Genome-wide association for smoking cessation success in a trial of precessation nicotine replacement. Mol. Med., 16, 513–526.
- 31. Lu T.P., et al. (2010). Identification of a novel biomarker, SEMA5A, for non-small cell lung carcinoma in nonsmoking women. Cancer Epidemiol. Biomarkers Prev., 19, 2590–2597.
- 32. Rose J.E., et al. (2010). Personalized smoking cessation: interactions between nicotine dose, dependence and quit-success genotype score. Mol. Med., 16, 247–253.
- 33. Pandol S.J., et al. (2012). The burning question: why is smoking a risk factor for pancreatic cancer? Pancreatology, 12, 344–349.
- 34. Askari M.D., et al. (2005). The tobacco-specific carcinogen, 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone stimulates proliferation of immortalized human pancreatic duct epithelia through beta-adrenergic transactivation of EGF receptors. J. Cancer Res. Clin. Oncol., 131, 639–648.
- 35. Pierce K.L., et al. (2000). Role of endocytosis in the activation of the extracellular signal-regulated kinase cascade by sequestering and nonsequestering G protein-coupled receptors. Proc. Natl. Acad. Sci. U. S. A., 97, 1489–1494.
- 36. Taraviras S., et al. (2002). Subtype-specific neuronal differentiation of PC12 cells transfected with alpha2-adrenergic receptors. Eur. J. Cell Biol., 81, 363–374.
- 37. Karkoulias G., et al. (2006). alpha(2)-Adrenergic receptors activate MAPK and Akt through a pathway involving arachidonic acid metabolism by cytochrome P450-dependent epoxygenase, matrix metalloproteinase activation and subtype-specific transactivation of EGFR. Cell. Signal., 18, 729–739.
- 38. Wittel U.A., et al. (2012). The pathobiological impact of cigarette smoke on pancreatic cancer development (review). Int. J. Oncol., 41, 5–14.
- 39. Lindkvist B., et al. (2008). Long-term nicotine exposure causes increased concentrations of trypsinogens and amylase in pancreatic extracts in the rat. Pancreas, 37, 288–294.
- 40. Chowdhury P., et al. (2007). Nicotine-induced proliferation of isolated rat pancreatic acinar cells: effect on cell signalling and function. Cell Prolif., 40, 125–141.
- 41. Balldin G., et al. (1980). Elevated serum levels of pancreatic secretory proteins in cigarette smokers after secretin stimulation. J. Clin. Invest., 66, 159–162.
- 42. Milnerowicz H., et al. (2007). Dysfunction of the pancreas in healthy smoking persons and patients with chronic pancreatitis. Pancreas, 34, 46–54.
- 43. Wu C., et al. (2013). The case-only test for gene–environment interaction is not uniformly powerful: an empirical example. Genet. Epidemiol., 37, 402–407.
- 44. Wu C., et al. (2012). Genome-wide association analyses of esophageal squamous cell carcinoma in Chinese identify multiple susceptibility loci and gene-environment interactions. Nat. Genet., 44, 1090–1097.
- 45. Hsu L., et al. (2012). Powerful cocktail methods for detecting genome-wide gene-environment interaction. Genet. Epidemiol., 36, 183–194.
- 46. Manolio T.A., et al. (2009). Finding the missing heritability of complex diseases. Nature, 461, 747–753.