PLOS Genetics. 2021 Mar 5;17(3):e1009254. doi: 10.1371/journal.pgen.1009254

Genome-wide association meta-analysis identifies pleiotropic risk loci for aerodigestive squamous cell cancers

Corina Lesseur 1,2, Aida Ferreiro-Iglesias 1, James D McKay 3, Yohan Bossé 4, Mattias Johansson 1, Valerie Gaborieau 1, Maria Teresa Landi 5, David C Christiani 6, Neil C Caporaso 5, Stig E Bojesen 7, Christopher I Amos 8, Sanjay Shete 9, Geoffrey Liu 10, Gadi Rennert 11, Demetrius Albanes 5, Melinda C Aldrich 12, Adonina Tardon 13, Chu Chen 14, Liloglou Triantafillos 15, John K Field 15, Marion Dawn Teare 16, Lambertus A Kiemeney 17, Brenda Diergaarde 18,19,20, Robert L Ferris 20, Shanbeh Zienolddiny 21, Stephen Lam 22, Andrew F Olshan 23, Mark C Weissler 24, Martin Lacko 25, Angela Risch 26,27,28, Heike Bickeböller 29, Andy R Ness 30,31, Steve Thomas 31, Loic Le Marchand 32, Matthew B Schabath 33, Victor Wünsch-Filho 34, Eloiza H Tajara 35, Angeline S Andrew 36, Gary M Clifford 37, Philip Lazarus 38, Kjell Grankvist 39, Mikael Johansson 40, Susanne Arnold 41, Olle Melander 42,43, Hans Brunnström 44, Stefania Boccia 45,46, Gabriella Cadoni 47,48, Wim Timens 49,50, Ma’en Obeidat 51, Xiangjun Xiao 8, Richard S Houlston 52, Rayjean J Hung 53,*, Paul Brennan 1,*
Editor: Stephen J Chanock 54
PMCID: PMC7968735  PMID: 33667223

Abstract

Squamous cell carcinomas (SqCC) of the aerodigestive tract have similar etiological risk factors. Although genetic risk variants for individual cancers have been identified, an agnostic, genome-wide search for shared genetic susceptibility has not been performed. To identify novel and pleiotropic SqCC risk variants, we performed a meta-analysis of GWAS data on lung SqCC (LuSqCC), oro/pharyngeal SqCC (OSqCC), laryngeal SqCC (LaSqCC) and esophageal SqCC (ESqCC), totaling 13,887 cases and 61,961 controls of European ancestry. We identified one novel genome-wide significant (Pmeta<5×10⁻⁸) aerodigestive SqCC susceptibility locus in the 2q33.1 region (rs56321285, TMEM237). Additionally, three previously unknown loci reached suggestive significance (Pmeta<5×10⁻⁷): 1q32.1 (rs12133735, near MDM4), 5q31.2 (rs13181561, TMEM173) and 19p13.11 (rs61494113, ABHD8). Multiple previously identified loci for aerodigestive SqCC also showed evidence of pleiotropy in at least one other SqCC site; these include 4q23 (ADH1B), 6p21.33 (STK19), 6p21.32 (HLA-DQB1), 9p21.3 (CDKN2B-AS1) and 13q13.1 (BRCA2). Gene-based association and gene set enrichment identified a set of 48 SqCC-related genes related to DNA damage and epigenetic regulation pathways. Our study highlights the importance of cross-cancer analyses to identify pleiotropic risk loci of histology-related cancers arising at distinct anatomical sites.

Author summary

Squamous cell carcinomas are a specific type of cancer that can arise in multiple organs of the aerodigestive tract including the lung, oral cavity, oropharynx, larynx and esophagus. Previous studies have shown that aerodigestive squamous cell carcinomas share common environmental risk factors (tobacco smoking and alcohol intake). Here, we investigate genetic factors involved in the risk of aerodigestive squamous cell carcinomas as a group in a large genetic association study involving 13,887 cancer cases and 61,961 controls. We identified one genome-wide significant region within 2q33.1 and three other suggestive regions at 1q32.1, 5q31.2 and 19p13.11. Gene-based analyses also identified a list of SqCC-related genes that are involved in DNA damage response and epigenetic regulation. Our results suggest some overlap in the genetic factors influencing the risk of aerodigestive squamous cell carcinomas in European populations and highlight the importance of cross-cancer studies.

Introduction

The squamous cell carcinomas (SqCC) of the aerodigestive tract [1], comprising lung squamous cell carcinoma (LuSqCC) and head and neck cancers (HNC, >90% SqCCs), including oral/pharyngeal SqCC (OSqCC), larynx SqCC (LaSqCC) and esophageal SqCC (ESqCC), are strongly associated with common risk factors such as tobacco smoking, alcohol consumption and human papilloma virus (HPV) infection [2]. Similarly, recent molecular characterization studies across anatomically distinct SqCCs have shown that histology is more important than tissue of origin in defining tumor molecular profiles, which are determined by shared features including somatic mutations, copy number alterations, deregulation of DNA methylation and/or gene expression [2–4].

Along with behavioral risk factors, it is increasingly recognized that inherited factors also play a role in aerodigestive SqCC risk. Previous genome-wide association studies (GWAS) have identified multiple genetic risk variants for individual aerodigestive SqCC types, notably variants in smoking-related genes at 15q25.1 for LuSqCC [5–7] and 4q23 variants in alcohol-related genes for upper aerodigestive tract (UADT) cancers [8]. Importantly, candidate-gene and GWAS studies have previously described rare genetic variants linked to aerodigestive SqCC risk, including variants near BRCA2 (13q13.1), first identified as a risk factor for ESqCC in Middle Eastern populations [9] and later described to increase risk of LuSqCC [10] and UADT SqCC in Europeans [11]. Similarly, at 22q12.1 another rare missense variant within CHEK2 (rs17879961, p.Ile157Thr) has been linked to reduced risk of lung and UADT SqCCs [10–13]. Such studies provide evidence of genetic pleiotropy across aerodigestive SqCCs, as these variants exert cross-cancer effects possibly related to similar underlying mechanisms (e.g. DNA repair). Furthermore, a recent large-scale genome-wide genetic correlation analysis across six solid tumors (breast, colorectal, head/neck, lung, ovary and prostate cancer) highlighted that the strongest genetic correlation was between lung and head and neck cancers [14].

Collectively, aerodigestive SqCCs are an important public health issue, not only because as a group they are among the most common types of solid tumors [2], but also because of the increasing global incidence of HPV-related head and neck SqCCs [15]. Identifying genetic risk loci that can have pleiotropic effects across aerodigestive SqCC sites is important for gaining insight into the shared or divergent molecular basis of different tumors. To further examine pleiotropic risk genomic regions across LuSqCC [16], OSqCC [17], LaSqCC [8] and ESqCC [8] and to identify novel associations not detected in single-cancer analyses, we performed a GWAS meta-analysis combining data from the largest existing GWAS in Europeans for these malignancies.

Results

Overview

We performed GWAS meta-analysis on aerodigestive SqCC risk including 13,887 cancer cases and 61,961 non-overlapping controls of European ancestry. The SqCC cases comprised 7,426 LuSqCC and 5,452 OSqCC cases, part of the OncoArray Consortium [16–18], and an additional 693 LaSqCC and 316 ESqCC cases previously included in an upper aerodigestive cancer GWAS [8] (Table 1). Summary association statistics were used to perform fixed-effects (F-E) and subset-based meta-analyses using the ASSET software [19]. This approach allows exploration of all possible subsets of studies to identify the strongest association signal, while accounting for subset search multiple testing, and adjusting standard errors to account for overlapping controls between analyses: partial overlap (N = 2,500) between LuSqCC and OSqCC and complete overlap between the ESqCC and LaSqCC. After quality control steps, 8,468,885 genetic variants with summary statistics in at least three of the four interrogated SqCC types were used for analyses. The quantile–quantile plot (F-E meta-analysis, S1 Fig) shows little evidence of genomic inflation after correcting for sample size (λ = 1.006). Loci that reached Pmeta < 5×10⁻⁷ were considered noteworthy; meta-analysis results for all SNPs below P<5×10⁻⁵ are shown in S1 Table. From noteworthy loci, those not previously reported in the lung [16] or oral/pharyngeal [17] analyses (single-cancer P > 5×10⁻⁷) were considered as novel SqCC regions. We identified one novel aerodigestive SqCC locus at genome-wide significance (F-E meta-analysis, Pmeta < 5×10⁻⁸) within 2q33.1. We detected suggestive associations with SqCC risk at 1q32.1, 5q31.2 and 19p13.11, not detected in previous analyses (Fig 1 and Table 2). Other loci that reached Pmeta <5×10⁻⁷ were considered pleiotropic if these had at least two cancer sites at P<5×10⁻⁴ and the same effect direction in all tumor sites.
Using these criteria, the loci categorized as pleiotropic (4q23, 6p21.32, 6p21.33, 6p22, 9p21.3 and 13q13.1) included 108 SNPs (S2 Table); the lead SNP (lowest Pmeta) for each of these regions is shown in Table 3. In contrast, other known cancer regions that reached the GWAS threshold (12p13.33, 15q25, 19q13.2) or Pmeta < 5×10⁻⁷ (4p14, 9q34.1, 10q24.31, 11q21, 15q15.3) in the SqCC meta-analysis were not pleiotropic (Fig 1). We did not observe additional associations reaching the GWAS threshold or suggestive significance in the ASSET subset-based meta-analysis, indicating that, at least for the strongest associations, the effects have consistent direction across the examined aerodigestive SqCC types. For noteworthy SNPs, we performed expression quantitative trait loci (eQTL) analyses with normal lung tissues from the multicenter Lung Microarray Study (S3 Table). We also queried these variants in multiple public genomic annotation databases: the Genotype-Tissue Expression (GTEx) catalog for lung and esophageal eQTLs (S4 Table); ROADMAP and the Encyclopedia of DNA Elements (ENCODE) for epi/genomic annotations (S5 Table); the NHGRI-EBI GWAS Catalog for disease/phenotype associations (S6 Table); and the COSMIC catalogue for cancer somatic mutation information (S7 Table). Lastly, we performed a genome-wide gene-based association analysis (GWGAS) of the SqCC meta-analysis results using MAGMA (Multi-marker Analysis of GenoMic Annotation) [20] (S8 Table). To map individual SNPs to genes we used the Functional Mapping and Annotation tool (FUMA, S9 Table) [21]. Overlapping genes from these analyses were used to assemble a list of aerodigestive SqCC genes (S10 Table) used in enrichment analyses (S11 Table).
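
The genomic inflation factor quoted in the overview (λ = 1.006) is conventionally computed as the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). The sketch below illustrates that calculation from two-sided P-values. It is not the authors' code, the function name `genomic_inflation` is hypothetical, the input P-values are synthetic, and the sample-size correction mentioned in the text is not applied.

```python
from statistics import NormalDist, median

# Median of a 1-df chi-square distribution under the null: (z at the 75th percentile)^2
EXPECTED_MEDIAN_CHI2 = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda = median(observed chi-square) / expected null median."""
    std_normal = NormalDist()
    # Convert each two-sided P-value to a 1-df chi-square statistic (z squared).
    chi2 = [std_normal.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / EXPECTED_MEDIAN_CHI2


# Synthetic input only: evenly spaced P-values mimic a perfectly null scan.
n = 9999
null_like = [(i + 0.5) / n for i in range(n)]
lam = genomic_inflation(null_like)
print(round(lam, 3))
```

Values of λ close to 1, as reported here, indicate that the bulk of test statistics behaves as expected under the null; values well above 1 would suggest residual confounding such as population stratification.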

Table 1. Summary of studies included in the aerodigestive SqCC meta-analysis.

| Study | Tumor site | Cases | Controls a | Array | Imputation panel | Imputation quality | Number of variants | Covariates | Ancestry | Publication |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Lung cancer OncoArray | Lung | 7,426 | 55,630 | Illumina OncoArray | 1000 Genomes v3 | R²>0.3 | 10,439,017 | Age, sex, PCs | European | McKay, Hung et al 2017 |
| Oral and oropharynx cancer OncoArray | Oral and oropharynx | 5,452 | 5,984 | Illumina OncoArray | HRC | R²>0.3 | 7,542,495 | Age, sex, PCs | European | Lesseur et al 2016 |
| | Oral | 2,698 | | | | | | | | |
| | Oropharynx | 2,414 | | | | | | | | |
| | Other b | 340 | | | | | | | | |
| UADT cancer GWAS | Larynx | 693 | 2,847 | Illumina HumanHap300 | HRC | R²>0.3 | 8,840,446 | Age, sex, PCs | European | McKay et al 2011 |
| | Esophageal | 316 | 2,847 | | | | | | | |

R² = imputation quality measure; MAF = minor allele frequency; UADT = upper aerodigestive tract; HRC = Haplotype Reference Consortium panel.

a Overlapping controls N = 2,500 (lung and oral/oropharynx) and N = 2,847 (larynx and esophageal).

b Cases with overlapping oral and oropharyngeal tumors.

Fig 1. Manhattan plot of aerodigestive SqCC genome-wide association fixed-effects meta-analysis results.

The y-axis corresponds to −log10 P-values and the x-axis to genomic positions. The horizontal red dashed line marks P = 5×10⁻⁸ and the black dashed line P = 5×10⁻⁷. Highlighted in red are the newly identified pleiotropic aerodigestive SqCC loci (P<5×10⁻⁸), in blue new loci at P<5×10⁻⁷ and in purple previously identified pleiotropic loci (at least 2 cancer sites). Loci labeled in black reached P<5×10⁻⁷ but were not pleiotropic (associated only in single-cancer analyses).

Table 2. Novel genomic regions with pleiotropic aerodigestive SqCC associations.

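| Region a | EA/OA b | Gene | EAF | SqCC site | OR | 95%CI | P | ORmeta | 95%CImeta | Pmeta c |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Genome-wide significant loci | | | | | | | | | | |
| 2q33.1, rs56321285, 2:202505545 | A/G | TMEM237 | 0.31 | Lung | 0.92 | 0.88–0.96 | 2.51E-04 | 0.902 | 0.87–0.94 | 6.99E-09 |
| | | | | Oral/oropharynx | 0.89 | 0.83–0.94 | 2.34E-04 | | | |
| | | | | Larynx | 0.79 | 0.67–0.93 | 3.83E-03 | | | |
| | | | | Esophagus | 0.8 | 0.64–0.98 | 3.56E-02 | | | |
| Suggestive loci | | | | | | | | | | |
| 1q32.1, rs12133735, 1:204556836 | G/T | MDM4 | 0.35 | Lung | 1.07 | 1.03–1.11 | 4.63E-04 | 1.08 | 1.05–1.12 | 2.16E-07 |
| | | | | Oral/oropharynx | 1.14 | 1.08–1.21 | 9.82E-06 | | | |
| | | | | Larynx | 1.05 | 0.92–1.20 | 5.14E-01 | | | |
| | | | | Esophagus | 1 | 0.83–1.19 | 9.74E-01 | | | |
| 5q31.2, rs13181561, 5:138850905 | G/A | TMEM173 | 0.28 | Lung | 1.09 | 1.05–1.14 | 5.29E-05 | 1.09 | 1.05–1.13 | 1.74E-07 |
| | | | | Oral/oropharynx | 1.1 | 1.04–1.17 | 1.74E-03 | | | |
| | | | | Larynx | 1.12 | 0.96–1.3 | 1.44E-01 | | | |
| | | | | Esophagus | 1.06 | 0.87–1.3 | 5.62E-01 | | | |
| 19p13.11, rs61494113, 19:17401859 | A/G | ABHD8 | 0.3 | Lung | 1.09 | 1.05–1.14 | 2.05E-05 | 1.09 | 1.05–1.12 | 9.86E-08 |
| | | | | Oral/oropharynx | 1.09 | 1.03–1.16 | 4.91E-03 | | | |
| | | | | Larynx | 1.15 | 1.00–1.31 | 5.22E-02 | | | |
| | | | | Esophagus | 1.05 | 0.87–1.27 | 5.90E-01 | | | |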

a Lead SNP (lowest P, F-E meta-analysis), regions at P<5×10⁻⁷.

b EA = Effect allele/OA = other allele; EAF = average effect allele frequency between sites.

c F-E meta-analysis (ASSET) accounting for control overlap between Lung SCC and oral/pharynx analysis and larynx and esophageal analysis.

Table 3. Variants within known genomic loci with pleiotropic aerodigestive SqCC associations.

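| Region a | EA/OA b | Gene | EAF | SqCC site | OR | 95%CI | P | ORmeta | 95%CImeta | Pmeta c |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4q23, rs1229984, 4:100239319 | T/C | ADH1B | 0.05 | Lung | 0.92 | 0.84–1.01 | 7.35E-02 | 0.80 | 0.74–0.86 | 1.89E-09 |
| | | | | Oral/oropharynx | 0.58 | 0.50–0.67 | 8.32E-13 | | | |
| | | | | Larynx | 0.67 | 0.49–0.92 | 1.42E-02 | | | |
| | | | | Esophagus | 0.28 | 0.15–0.53 | 9.28E-05 | | | |
| 6p21.33, rs389884, 6:31940897 | G/A | STK19 | 0.09 | Lung | 1.29 | 1.22–1.37 | 1.41E-16 | 1.26 | 1.19–1.32 | 2.42E-19 |
| | | | | Oral/oropharynx | 1.21 | 1.09–1.35 | 4.41E-04 | | | |
| | | | | Larynx | 1.21 | 0.95–1.52 | 1.19E-01 | | | |
| | | | | Esophagus | 1.24 | 0.92–1.68 | 1.65E-01 | | | |
| 6p21.32, rs9271611, 6:32591609 | G/A | HLA-DQA1 | 0.41 | Lung | 1.17 | 1.12–1.21 | 3.01E-14 | 1.16 | 1.12–1.19 | 4.82E-19 |
| | | | | Oral/oropharynx | 1.15 | 1.08–1.22 | 1.23E-05 | | | |
| | | | | Larynx | 1.27 | 1.07–1.49 | 5.14E-03 | | | |
| | | | | Esophagus | 1.02 | 0.82–1.28 | 8.36E-01 | | | |
| 9p21.3, rs7857345, 9:22087473 | T/C | CDKN2B-AS1 | 0.30 | Lung | 1.09 | 1.05–1.14 | 1.34E-05 | 1.11 | 1.07–1.14 | 5.55E-10 |
| | | | | Oral/oropharynx | 1.14 | 1.07–1.21 | 3.73E-05 | | | |
| | | | | Larynx | 1.18 | 1.03–1.36 | 1.91E-02 | | | |
| | | | | Esophagus | 1.07 | 0.89–1.29 | 4.63E-01 | | | |
| 13q13.1, rs11571815, 13:32968550 | A/G | BRCA2 | 0.01 | Lung | 2.12 | 1.77–2.55 | 1.10E-15 | 2 | 1.73–2.32 | 2.30E-21 |
| | | | | Oral/oropharynx | 1.67 | 1.28–2.18 | 1.67E-04 | | | |
| | | | | Larynx | 3.04 | 1.41–6.57 | 4.53E-03 | | | |
| | | | | Esophagus | 4.73 | 2.14–10.5 | 1.24E-04 | | | |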

a Lead SNP (lowest P, F-E meta-analysis), regions at P<5×10⁻⁷.

b EA = Effect allele/OA = other allele; EAF = average allele frequency between sites.

c F-E meta-analysis (ASSET) accounting for control overlap.

Novel loci with pleiotropic aerodigestive SqCC associations

At 2q33.1, the intronic variant rs56321285[A] within the transmembrane protein 237 (TMEM237) gene was associated with reduced risk of aerodigestive SqCC (OR = 0.90, Pmeta = 6.99×10⁻⁹). This association showed little heterogeneity across cancer sites (LuSqCC: OR = 0.92, P = 2.51×10⁻⁴; OSqCC: OR = 0.89, P = 2.34×10⁻⁴; LaSqCC: OR = 0.79, P = 3.83×10⁻³; ESqCC: OR = 0.80, P = 3.56×10⁻²; Fig 2A). rs56321285 is in low linkage disequilibrium (LD) with other variants in the region (S2 Fig) including rs10931936 (r² = 0.02, 1000 Genomes (1KG), Europeans), the lead SNP of a weaker 2q33.1 association (SqCC ORmeta = 1.08, Pmeta = 1.83×10⁻⁶). rs10931936 is in LD with nearby CASP8-ALS2CR12 variants, which have been previously linked with risk of multiple cancers in Europeans [22], as well as esophageal and lung cancer in Chinese populations [23,24]. CASP8 plays an important role in apoptosis; mutations in this gene have been described in 2% of LuSqCC and 6% of UADT SqCC tumors (S7 Table) [25]. However, the 2q33.1 genome-wide significant SNP (rs56321285) associated with aerodigestive SqCC risk seems independent from CASP8-ALS2CR12 variants. In eQTL analyses using lung tissues from the Lung Microarray Study (S3 Table), rs56321285 is a nominally significant cis-eQTL for AOX2P (Laval and Groningen datasets) and CDK15 (Laval and UBC datasets). However, rs56321285 is not a lung or esophageal eQTL in the GTEx catalog [26]. Regulatory annotations from ENCODE [27] and ROADMAP [28] are consistent with rs56321285 mapping to an H3K4me1 enhancer in lung fibroblasts (S5 Table).
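
To make the 2q33.1 result concrete, the sketch below recomputes an inverse-variance-weighted fixed-effects estimate and Cochran's Q from the per-site odds ratios and 95% confidence intervals in Table 2. It is an approximation under stated assumptions: standard errors are back-calculated from the rounded confidence intervals, and the adjustment for overlapping controls that ASSET applies is ignored, so the P-value is not expected to match the published one exactly. Function and variable names are hypothetical.

```python
import math

# Per-site OR and 95% CI for rs56321285, copied from Table 2.
SITES = {
    "Lung": (0.92, 0.88, 0.96),
    "Oral/oropharynx": (0.89, 0.83, 0.94),
    "Larynx": (0.79, 0.67, 0.93),
    "Esophagus": (0.80, 0.64, 0.98),
}


def log_or_and_se(odds_ratio, lower, upper):
    """Back-calculate log(OR) and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    return math.log(odds_ratio), se


def fixed_effects(sites):
    """Inverse-variance-weighted meta-analysis; assumes independent studies."""
    betas, weights = [], []
    for odds_ratio, lower, upper in sites.values():
        beta, se = log_or_and_se(odds_ratio, lower, upper)
        betas.append(beta)
        weights.append(1.0 / se ** 2)
    total_w = sum(weights)
    beta_meta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se_meta = math.sqrt(1.0 / total_w)
    z = beta_meta / se_meta
    # Two-sided normal P-value; erfc stays accurate far in the tail.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta_meta) ** 2 for w, b in zip(weights, betas))
    return math.exp(beta_meta), p_two_sided, q


or_meta, p_meta, cochran_q = fixed_effects(SITES)
print(f"OR_meta={or_meta:.3f}  P={p_meta:.2e}  Q={cochran_q:.2f} (3 df)")
```

Under these assumptions the pooled odds ratio lands close to the reported ORmeta of 0.90, and Q stays below 7.81, the 5% critical value of a chi-square distribution with 3 degrees of freedom, which is in line with the limited heterogeneity described above.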

Fig 2. Forest plots of ORs for new aerodigestive SqCC-related loci.

a) rs56321285 at 2q33.1 (TMEM237). Subset refers to subset-based meta-analyses (ASSET). b) rs12133735 at 1q32.1 (near MDM4).

At 1q32.1, the G allele of the lead SNP (rs12133735) was associated with increased risk of aerodigestive SqCC (ORmeta = 1.08, Pmeta = 2.16×10⁻⁷, Table 2 and Fig 2B), predominantly driven by the LuSqCC (OR = 1.07, P = 4.63×10⁻⁴) and OSqCC (OR = 1.14, P = 9.82×10⁻⁶) results. rs12133735 is located 3’ of MDM4 (S3 Fig) and is an MDM4 eQTL in all datasets from the lung eQTL study (S3 Table and S4 Fig), and in lung and esophageal GTEx tissues [26] (S4 Table). MDM4 is a crucial negative regulator of p53 and its upregulation has been described as a common p53 inactivation mechanism in tumors [29,30]. In contrast, in our analyses rs12133735-G is associated with lower MDM4 expression in lung tissues and increased SqCC risk. However, the regulation of MDM4 expression and its interaction with p53 involve complex mechanisms (including alternative splicing) and are reported to differ between normal and cancer tissues [29]. In Europeans, rs12133735[G] is in moderate LD (r² = 0.61, 1KG) with rs4245739[C] (SqCC ORmeta = 1.07, P = 7.62×10⁻⁶), which has been associated with increased risk of triple negative breast cancer [31,32] and ovarian cancer [33] (S6 Table). Intriguingly, rs4245739[C] has also been associated with reduced prostate cancer risk (Europeans) [34] and lower risk of all cancers in Asians [35,36]. A candidate-gene study [37] also described associations between risk of HPV16-associated OSqCC and 1q32.1 MDM4 SNPs including rs11801299 (r² = 0.12 with rs12133735, 1KG, Europeans), which was marginally associated with SqCC risk in our analysis (ORmeta = 0.91, Pmeta = 1.34×10⁻⁵).

The lead variant at 5q31.2, rs13181561[G] (ORmeta = 1.09, Pmeta = 1.74×10⁻⁷, Fig 3A), near TMEM173 (S5 Fig), showed homogeneous associations across tumor sites but was significant only in LuSqCC and OSqCC. rs13181561 is associated with DNAJC18 and SPATA24 gene expression in lung tissues (Laval, Groningen and UBC; S3 Table), and with DNAJC18 expression in esophagus (GTEx, S4 Table). rs13181561 overlaps with an enhancer in esophageal and lung tissues (S5 Table). Additionally, rs13181561[G] is highly correlated with rs7447927[C] (r² = 0.94, 1KG, Europeans); the latter (ORmeta = 1.08, P = 5.45×10⁻⁷) has been previously linked to increased ESqCC risk in Chinese populations [38] (S6 Table).

Fig 3. Forest plots of ORs for new aerodigestive SqCC-related loci.

a) rs13181561 at 5q31.2 (TMEM173); b) rs61494113 at 19p13.11 (ABHD8). Subset refers to subset-based meta-analyses (ASSET).

Another suggestive SqCC association was detected at rs61494113[A] within 19p13.11 (ORmeta = 1.09, Pmeta = 9.9×10⁻⁸), which showed similar odds ratios across SqCC sites, albeit with limited power for larynx and esophageal SqCCs (Table 1 and Fig 3B). Lung eQTL analysis showed rs61494113 as a significant eQTL for OCEL1 (Laval and Groningen, S3 Table). However, the GTEx catalog shows rs61494113 as an esophageal ABHD8 eQTL and a BABAM1 splice-QTL (lung and esophagus) but not as an eQTL for OCEL1 (S4 Table). rs61494113 maps within H3K4me1 histone and DNase marks in normal lung tissues and in lung carcinoma cells (S5 Table). rs61494113[A] is in complete LD with rs56069439[A] (r² = 1, 1KG, Europeans), which is also associated with SqCC risk (S1 Table and S6 Fig) and was previously linked with increased risk of ER-negative breast [39] and ovarian [39,40] cancers (S6 Table). The 19p13.11 region of LD contains multiple genes involved in DNA damage repair, including BABAM1, a BRCA1-interacting protein [41], and ANKLE1 [42].

Known risk loci with pleiotropic aerodigestive SqCC associations

Chromosome 6 showed a large aerodigestive SqCC association signal overlapping the human leukocyte antigen (HLA) region, previously identified in the LuSqCC [16] and oral/pharyngeal SqCC [17] cancer analyses. rs389884 near STK19 was the top pleiotropic SNP at 6p21.33 (SqCC ORmeta = 1.26; P = 2.4×10⁻¹⁹, Table 3 and S2 Table). 6p21.33 SNPs are in moderate LD (rs389884 and rs115785414, r²>0.4, 1KG, Europeans) with variants at 6p22.1, suggesting a common haplotype. We also detected associations at 6p21.32; rs9271611 near HLA-DQA1 (class II) is not correlated (S7 Fig) with 6p21.33 SNPs (pairwise LD between rs9267123 and rs9271611: r² = 0.09, 1KG, Europeans). This second association reduced aerodigestive SqCC risk (ORmeta = 0.85; P = 1×10⁻¹⁷) mainly for OSqCC and LuSqCC (Table 3). These observations are consistent with our previous findings [17,43] of at least two haplotypes within the HLA region with different effects on cancer risk. Genetic variants at 9p21.3 have also been associated with multiple malignancies including lung adenocarcinoma [16] and OSqCC [17]. rs7857345 (9p21.3 lead SNP mapped to CDKN2B-AS1, S8 Fig) was associated with a slight increase in SqCC risk (ORmeta = 1.11, Pmeta = 5.55×10⁻¹⁰, Table 3). rs7857345 is in LD with rs61271866 (r² = 0.51), previously associated with ESqCC in Chinese populations [38]. However, rs7857345 is not in LD with rs885518 (r² = 0.006, 1KG, Europeans), suggesting that this is a different signal from that previously reported for lung adenocarcinoma risk in Europeans [16].

As expected, two other previously reported rare variants at 4q23 (rs1229984, ADH1B) and 13q13.1 (rs11571815, BRCA2) also displayed pleiotropy in this analysis [10–12,44] (Fig 1, Table 3). Of note, we did not observe pleiotropy for variants at 15q25.1, a known locus related to lung cancer [5] and smoking behavior [45]. The lead SNP in this region, rs55781567 (CHRNA5), was prominent for SqCC (ORmeta = 1.19; Pmeta = 1.74×10⁻²⁹), but this result was primarily driven by LuSqCC (OR = 1.3; P = 4.6×10⁻⁴¹), with no effect in any other SqCC site (OSqCC P = 0.26; LaSqCC P = 0.68 and ESqCC P = 0.7, S1 Table and S9 Fig). Variants at 15q25 have been interrogated before in relation to upper aerodigestive cancer risk (including samples used in this study) [46]; significant associations were found only in women and were unrelated to smoking behavior, suggesting that 15q25.1 SNPs relate differently to LuSqCC compared to oral/oropharyngeal SqCCs. Importantly, none of the published HNC GWAS in Chinese or Europeans have reported associations with 15q25 variants. Thus, while the relation between this locus and smoking behavior and lung cancer is unequivocal, to date there is no evidence of a clear link between 15q25 and head and neck cancer. Future studies including more cases and stratified analyses should examine this further.

Aerodigestive SqCCs risk genes and pathways

To gain further functional insight into aerodigestive SqCC genetic susceptibility, we used the results of the F-E meta-analyses to map risk variants to genes (FUMA) and to perform a genome-wide gene-based association analysis (GWGAS) using MAGMA. The SNP-to-gene analyses highlighted 182 genes within 21 genomic regions (S8 Table) and the gene-based analysis identified 51 significant genes related to aerodigestive SqCC (Bonferroni correction P<2.67×10⁻⁶, S9 Table). Next, we overlapped the results from the two analyses and obtained a list of 48 SqCC-related genes (S10 Table), which includes TMEM237 (2q33.1), MDM4 (1q32.1), AC138517.1 (5q31.2) and BABAM1 (19p13.11), located within the pleiotropic SqCC risk regions identified in the meta-analyses. As expected, the HLA region on chromosome 6 had the highest number of genes mapped (24); interestingly, however, the most prominent signals were located within 6p22 and included >10 histone genes. Gene-set enrichment analyses using these SqCC-related genes and canonical pathways resulted in 63 significant gene sets (S11 Table), most of which mapped to DNA damage pathways (telomere, checkpoint, oxidative stress and strand break response) as well as epigenetic regulation pathways related to histones and DNA methylation.
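
The Bonferroni threshold quoted above can be related to the number of genes tested. The sketch below assumes a family-wise alpha of 0.05, which the text does not state explicitly, and back-calculates the implied number of gene-level tests. The function names are hypothetical and this is not part of the MAGMA workflow itself.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error rate."""
    return alpha / n_tests


def implied_tests(alpha, threshold):
    """Number of tests implied by a reported Bonferroni threshold."""
    return round(alpha / threshold)


ALPHA = 0.05                  # assumed family-wise error rate (not stated in the text)
REPORTED_THRESHOLD = 2.67e-6  # gene-based threshold reported in the Results

n_genes = implied_tests(ALPHA, REPORTED_THRESHOLD)
print(n_genes)
print(f"{bonferroni_threshold(ALPHA, n_genes):.2e}")
```

If alpha was indeed 0.05, the reported threshold corresponds to roughly 18,700 gene-level tests.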

Discussion

This study identified one novel genome-wide significant locus associated with aerodigestive SqCC risk (2q33.1). Three other loci (1q32.1, 5q31.2 and 19p13.11) showed suggestive associations with SqCC. Amongst known SqCC loci, four showed evidence of pleiotropy across cancer sites. Our results demonstrate the power of cross-cancer analyses of histologically related tumors to identify genetic risk loci.

It is notable that many of the detected associations are plausibly related to cancer risk. Our results from 2q33.1 and 5q31.2 combined with previous evidence [23,24,38] indicate that these loci relate to SqCC risk across distinct ancestries (Asians and Europeans). Moreover, the signal at 2q33.1 was also proximal to CASP8-ALS2CR12, a region previously associated with other cancers [47,48]. Likewise, the 1q32.1 and 19p13.11 genomic regions implicate genes like MDM4 and BABAM1, which have been previously associated with risk of other epithelial malignancies including breast, prostate and ovary [31,36,39,49]. Together, these observations not only make our findings more plausible but also expand our understanding of cross-cancer genetic susceptibility and of the complex biology behind these associations.

The top associations displayed homogeneous effect direction across SqCC sites and stronger associations in the F-E meta-analyses compared to the subset-based meta-analyses. This could relate to the shared histology and risk factors of aerodigestive SqCCs. Nonetheless, we cannot rule out heterogeneous SqCC associations that we were not able to detect in our data. Not surprisingly, effect sizes for these loci were small (range of ORmeta 0.89–1.09), which limited association detection in the smaller single-cancer analyses using commonly applied GWAS P-value thresholds. In contrast, most of the known loci that exhibited pleiotropy in our analysis have larger effect sizes; this is particularly true for the less common variants within BRCA2 and CHEK2.

Our study has several major strengths. Firstly, we leveraged available European GWAS data sets to perform a large-scale meta-analysis of aerodigestive SqCC risk. Secondly, we analyzed tissue-specific gene expression data from multiple studies and integrated these data with publicly available information on epigenetic regulatory profiles of tissues relevant to aerodigestive SqCCs. Thirdly, for the newly discovered loci we also integrated our results with existing data from genetic susceptibility studies in other populations as well as available tumor repository information. However, this study has a number of limitations. The sample sizes for laryngeal and esophageal SqCC were very limited; this impacted our power to identify more signals at the GWAS threshold. The described associations (particularly those at P > 5×10⁻⁸) could be spurious due to the high testing burden and lack of replication; other studies should examine these regions further to replicate these results. Our criteria to identify pleiotropic loci tried to capture robust loci across multiple aerodigestive SqCCs while accommodating the sample size imbalances across tumor sites. We recognize that this approach did not fully account for multiple testing and could have missed some pleiotropic regions. Pleiotropic studies are limited by the sample size of existing GWAS data, as well as the frequency of variants in these regions. Future studies should investigate this further using larger samples, different methodology, and if possible, including SqCCs from other sites (e.g. cervical, anal and bladder). Also, our analyses were restricted to individuals of European ancestry; performing a similar analysis including other genetic backgrounds offers the potential to pinpoint loci that exert effects across ethnicities. In summary, we provide evidence for one new locus (2q33.1) influencing aerodigestive SqCC risk, and highlight loci for future investigation.
Future work should investigate the biological mechanisms underlying these associations to unearth shared and divergent molecular features of these histologically similar tumors.

Methods

Ethics statement

Informed written consent was obtained from all participants, and all contributing studies have been approved by the IARC Institutional Review Board (IRB; references: 14–03, 13–17, 07–02), which requires that local ethics committee approvals be obtained prior to enrolment and evaluation.

Study population

This meta-analysis includes data from three previous studies of lung squamous cell [16], oral/pharyngeal [17] and upper aerodigestive tract (UADT) cancers [8], totaling 13,887 cases and 61,961 non-overlapping controls. The characteristics and references for each study are summarized in Table 1. The SqCC cases comprise 7,426 LuSqCC and 5,452 OSqCC cases, part of the OncoArray Consortium (http://epi.grants.cancer.gov/oncoarray/) [16,17], and an additional 693 LaSqCC and 316 ESqCC cases previously included in an upper aerodigestive cancer GWAS [8]. Controls partially overlapped (N = 2,500) between the LuSqCC and OSqCC analyses, and completely overlapped (N = 2,847) between the ESqCC and LaSqCC analyses. For this analysis, GWAS summary statistics for single-site SqCCs were derived using only individuals of European ancestry across multiple epidemiological studies from Europe, North America and South America.
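
The case and control totals follow from the per-study counts in Table 1 once shared controls are counted only once. A small arithmetic check, using the counts given in this section (variable names are illustrative):

```python
# Cases per SqCC site (Table 1).
cases = {"LuSqCC": 7426, "OSqCC": 5452, "LaSqCC": 693, "ESqCC": 316}

# Control sets per GWAS; larynx and esophagus share one set of 2,847 controls.
controls = {"lung_oncoarray": 55630, "oral_oncoarray": 5984, "uadt_gwas": 2847}

# Controls shared between the lung and oral/oropharynx analyses.
shared_lung_oral = 2500

total_cases = sum(cases.values())
unique_controls = sum(controls.values()) - shared_lung_oral

print(total_cases, unique_controls)
```

Counting the UADT control set once and subtracting the 2,500 controls shared between the two OncoArray analyses reproduces the totals stated in the text.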

Genotyping and imputation

For each of the studies, genomic DNA samples were previously isolated from blood or buccal cells. Genotyping for the lung and oral/pharyngeal cancer OncoArray Consortium [18] studies was performed at the Center for Inherited Disease Research (CIDR) using the Illumina OncoArray, custom designed for cancer studies. Genotype calls were made in GenomeStudio software (Illumina) using a standardized cluster file for OncoArray studies. The esophageal and larynx cancer cases and controls from the upper aerodigestive tract GWA study [8] were genotyped using the Illumina Sentrix HumanHap300 BeadChip at the Centre d’Etude du Polymorphisme Humain (CEPH) and the Centre National de Génotypage (CNG, Evry, France) as previously described [8]. Genotype data have been deposited in dbGaP (https://www.ncbi.nlm.nih.gov/gap/) under accession number phs001202.v1.p1 for the oral and pharyngeal study [17] and accession numbers phs001273.v3.p2 and phs000876.v2.p1 for the lung data [16]. The lung cancer GWA study [16] was imputed using the 1000 Genomes reference panel (phase 3) (http://phase3browser.1000genomes.org/index.html/), and the oral/pharyngeal, larynx and esophageal cancer GWAS were imputed using the Haplotype Reference Consortium Panel [50] (http://www.haplotype-reference-consortium.org/) on the University of Michigan Imputation Server [51] (https://imputationserver.sph.umich.edu/). Only variants with imputation quality of R² > 0.3 were used in the meta-analysis.

Summary association statistics and meta-analyses

Cancer risk association results were taken from two previous OncoArray Consortium studies (LuSqCC [16] and OSqCC [17]) and from the esophageal and laryngeal analyses. ORs, P-values and standard errors for each SNP and each cancer site were obtained using logistic regression with a log-additive model adjusted for age, sex and principal components using plink2 [52] (https://www.cog-genomics.org/plink2/) and R [53] (http://www.r-project.org/). Summary statistics for the lung SqCC data are deposited in dbGaP (phs001273.v3.p2). The oral and pharyngeal GWAS summary statistics by cancer site and world region have been deposited in the IEU Open GWAS platform (https://gwas.mrcieu.ac.uk/) under the GWAS IDs: ieu-b-89, ieu-b-90, ieu-b-94, ieu-b-96, ieu-b-93, ieu-b-97, ieu-b-91, ieu-b-95 and ieu-b-98. Meta-analyses were performed using fixed-effects (F-E) and subset-based meta-analysis with the ASSET software tool [19] (https://dceg.cancer.gov/tools/analysis/asset/). ASSET allows exploration of all possible subsets of studies to identify the strongest association signal, while accounting for subset search multiple testing, and adjusts standard errors to account for overlapping controls between analyses: partial overlap (N = 2,500) between the LuSqCC and OSqCC analyses and complete overlap between the ESqCC and LaSqCC analyses. Meta-analysis for a SNP was performed when at least three cancer sites had association results. P-values from both analyses were two-sided. Fixed-effects and subset-based meta-analysis results were considered noteworthy if they reached P<5×10⁻⁷. Loci were considered new if they had not been previously reported in the single SqCC cancer analyses (P>5×10⁻⁷ for any single site). Loci with previously reported LuSqCC or OSqCC associations were characterized as pleiotropic if they had: 1) Pmeta<5×10⁻⁷; 2) two single-cancer association results at P<5×10⁻⁴ and consistent effect direction across all cancer sites. All analyses were performed using the R statistical environment version 3.4.3 [53].
Linkage disequilibrium (LD) calculations (R²) were performed using the LDlink[54] tool and the 1000 Genomes Project European ancestry populations. Regional association plots were generated using stand-alone LocusZoom v1.4[55] (https://github.com/statgen/locuszoom-standalone/). Forest plots of association results were produced using the metafor R package[56].
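For readers unfamiliar with the LD statistic reported here, the sketch below computes the squared correlation between two biallelic variants from phased haplotypes using the standard definition R² = D² / (pA(1 - pA) pB(1 - pB)), where D = pAB - pA pB. It is an illustration only and not how LDlink is implemented; the function name and the toy haplotypes are hypothetical.

```python
def ld_r2(haplotypes):
    """Squared correlation (R^2) between two biallelic variants.

    haplotypes: list of (allele_a, allele_b) pairs, one per phased chromosome,
    with each allele coded 0 (reference) or 1 (alternate).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("R^2 is undefined when either variant is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Toy example: the alternate alleles always travel together.
    linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
    # Toy example: every allele combination is equally frequent.
    unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
    print(ld_r2(linked), ld_r2(unlinked))
```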

Lung and esophageal cis-eQTLs

To investigate the association between lead SqCC variants and mRNA expression, we used three lung eQTL datasets from the Microarray eQTL study[57]. In this study, lung tissues for eQTL analysis were obtained from patients who underwent lung surgery at three academic sites: Laval University, the University of British Columbia (UBC) and the University of Groningen. Whole-genome gene expression profiling in the lung was performed on a custom Affymetrix array, and the data are available through GEO (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE23546. Genotyping was carried out on the Illumina Human 1M-Duo BeadChip array; the data are accessible in dbGaP (phs001745.v1.p1). Genotypes and gene expression levels were available for 408 (Laval University), 342 (Groningen) and 287 (UBC) patients. Microarray and genotype preprocessing, quality control and eQTL mapping have been described previously[58]. We also investigated the top aerodigestive SqCC associations in the public GTEx catalog (V8)[26] for lung and esophageal tissue eQTLs and sQTLs, using summary statistics from RNA-seq and genotype analyses obtained via the GTEx data portal (https://www.gtexportal.org).

Functional genomic annotation and gene-based analyses

To functionally annotate newly identified aerodigestive SqCC regions, we leveraged multiple resources: the Encyclopedia of DNA Elements (ENCODE)[27] (https://www.encodeproject.org/) and ROADMAP Epigenomics[28] (http://www.roadmapepigenomics.org/) catalogs, to obtain epigenomic and genomic regulatory annotations (chromatin states, histone marks, enhancers, promoters and transcription factor binding sites) for lung and esophageal tissues and cell types, accessed through HaploReg 4.1 using the haploR R package[59]; the NHGRI-EBI GWAS Catalog (v1.0 e98, https://www.ebi.ac.uk/gwas/)[60], for previously reported disease/phenotype associations; and the COSMIC catalogue (v90, https://cancer.sanger.ac.uk/cosmic), for cancer somatic mutation information. To provide additional insight into functional and biological mechanisms underlying aerodigestive SqCC genetic susceptibility, we performed a genome-wide gene-based association analysis (GWGAS) of the SqCC meta-analysis results using MAGMA (Multi-marker Analysis of GenoMic Annotation)[20]. We also used Functional Mapping and Annotation (FUMA, https://fuma.ctglab.nl/)[21], which maps individually significant SNPs to genes. Genes overlapping between the MAGMA results (P-value below the Bonferroni-corrected threshold of 2.7×10⁻⁶) and the FUMA results were used to assemble a list of genes implicated in aerodigestive SqCC genetic risk. This gene list was used to perform a gene-set analysis for curated canonical biological pathways (containing between 10 and 500 genes) from MSigDB collections[61], including GO[62], KEGG[63], REACTOME[64] and BIOCARTA[61]. Pathway analyses were performed using the MAGMA default setting of 10,000 permutations, and a Bonferroni correction was applied.
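The gene-selection logic above can be sketched in a few lines of Python. The example shows how a Bonferroni threshold of about 2.7×10⁻⁶ follows from dividing an assumed family-wise alpha of 0.05 by the 18,669 protein-coding genes reported for the MAGMA analysis, how the MAGMA and FUMA gene lists might be intersected, and how gene sets could be restricted to 10 to 500 genes. This is not MAGMA or FUMA code: the alpha of 0.05, the function names, the gene identifiers and the P-values are all hypothetical placeholders.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def overlapping_genes(magma_p, fuma_genes, threshold):
    """Genes significant in the gene-based test that are also mapped by FUMA.

    magma_p: dict of gene -> gene-based P-value.
    fuma_genes: iterable of genes mapped from significant SNPs.
    """
    significant = {gene for gene, p in magma_p.items() if p < threshold}
    return sorted(significant & set(fuma_genes))


def filter_gene_sets(gene_sets, min_size=10, max_size=500):
    """Keep only gene sets whose size lies within [min_size, max_size]."""
    return {
        name: genes
        for name, genes in gene_sets.items()
        if min_size <= len(genes) <= max_size
    }


# Assumed alpha of 0.05; 18,669 is the gene count given for the MAGMA analysis.
threshold = bonferroni_threshold(0.05, 18669)

# Placeholder inputs, not real results.
magma_p = {"GENE_A": 1.2e-7, "GENE_B": 4.0e-6, "GENE_C": 9.0e-9}
fuma_genes = ["GENE_A", "GENE_D", "GENE_C"]
shared = overlapping_genes(magma_p, fuma_genes, threshold)

gene_sets = {
    "SET_TOO_SMALL": [f"G{i}" for i in range(5)],
    "SET_IN_RANGE": [f"G{i}" for i in range(12)],
}
kept_sets = filter_gene_sets(gene_sets)

if __name__ == "__main__":
    print(f"threshold = {threshold:.2e}")
    print("shared genes:", shared)
    print("gene sets kept:", sorted(kept_sets))
```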

Supporting information

S1 Fig. SqCC F-E meta-analyses.

Quantile-quantile plot of the P-values from the ASSET F-E meta-analysis results including lung, oral/oropharyngeal, laryngeal and esophageal SqCCs (corrected λ = 1.006).

(TIFF)

S2 Fig. Regional association plot at 2q33.1.

Chromosome positions (x-axis) and -log10 P-value (y-axis) of the SqCC meta-analysis at 2q33.1. Genetic variants are colored in red according to their LD with rs56321285 (2q33.1 lead SNP) and in blue according to their LD with the second lead SNP rs1830298. rs56321285 and rs1830298: r² = 0.02.

(TIF)

S3 Fig. Regional association plot at 1q32.1.

Chromosome positions (x-axis) and -log10 P-value (y-axis) of the SqCC F-E meta-analysis at 1q32.1. Genetic variants are colored according to their LD with rs12133735 (red) and with rs4245739 (blue), a variant previously associated with cancer risk. rs12133735 and rs4245739: r² = 0.63.

(TIF)

S4 Fig. rs12133735 MDM4 lung eQTL.

Boxplots for rs12133735 and MDM4 gene expression in the three datasets from the Microarray eQTL study, from left to right: Laval University, University of British Columbia (UBC) and University of Groningen.

(TIF)

S5 Fig. Regional association plot at 5q31.2.

Chromosome positions (x-axis) and -log10 P-value (y-axis) SqCC F-E meta-analysis at 5q31.2. Genetic variants colored according to their LD with the labeled SNP (purple diamond). rs13181561 and rs7447927 (r2 = 0.94).

(TIF)

S6 Fig. Regional association plot at 19p13.11.

Chromosome positions (x-axis) and -log10 P-value (y-axis) SqCC F-E meta-analysis at 19p13.11. Genotyped and imputed variants colored according to their LD with the labeled SNP (purple diamond). rs61494113 and rs56069439 r2 = 1.

(TIF)

S7 Fig. Regional association plot at 6p22.1-6p21.33.

Chromosome positions (x-axis) and -log10 P-value (y-axis) of the SqCC F-E meta-analysis at 6p22.1-6p21.33. Variants are colored according to their LD with SNP rs9267123 (lead variant at 6p21.33). rs3116813 (6p22.1) is in moderate LD with rs9267123 (r² = 0.5). rs1049213 at 6p21.33 is not correlated with rs9267123 (r² = 0.01).

(TIF)

S8 Fig. Regional association plot at 9p21.3.

Chromosome positions (x-axis) and -log10 P-value (y-axis) SqCC meta-analysis at 9p21.3. Variants colored according to their LD with SNP rs7857345 (9p21.3 lead variant).

(TIF)

S9 Fig. Regional association plot at 15q25.1.

Chromosome positions (x-axis) and -log10 P-value (y-axis) at 15q25. A. Aerodigestive SqCC P-values; B. Lung SqCC P-values; C. Oral and oropharyngeal SqCC P-values. Genetic variants are colored in red according to their LD with rs55781567 (lowest P-value at 15q25 in the meta-analysis).

(TIF)

S1 Table. Results with P<5×10⁻⁵ in the aerodigestive SqCC meta-analyses.

All variants with P<5×10⁻⁵ in the fixed-effects (F-E) ASSET meta-analyses of aerodigestive SqCC. Results for each SqCC site are also shown.

(XLSX)

S2 Table. Pleiotropic aerodigestive SqCC risk variants.

108 variants with Pmeta<5×10⁻⁷ in the fixed-effects (F-E) meta-analyses, two single-cancer analyses at P<5×10⁻⁴ and consistent effect direction across cancer sites.

(CSV)

S3 Table. Lung Cis-eQTLs for aerodigestive SqCC loci.

Cis-eQTLs for novel SqCC loci in the lung Microarray eQTL study datasets.

(XLSX)

S4 Table. Cis-eQTLs and cis-sQTLs for new SqCC loci.

Lung and esophageal Cis-eQTLs and cis-sQTLs in the GTEx catalog V8 for new SqCC loci.

(XLSX)

S5 Table. Chromatin states and histone marks in lung and esophageal tissues or cells for new SqCC loci.

Chromatin and histone annotations for new SqCC loci from the Roadmap and ENCODE projects.

(XLSX)

S6 Table. Summary of reported cancer risk associations within the new SqCC risk loci.

NHGRI-EBI GWAS Catalog (v1.0 e98 2020-02-08) reported cancer risk associations for the lead SNPs (or proxies with r²>0.6) within the new SqCC loci.

(XLSX)

S7 Table. Aerodigestive loci genes with somatic mutations.

Genes within the new SqCC loci with somatic mutations in the COSMIC catalogue (release v90, 5th September 2019).

(XLSX)

S8 Table. Significant results from the gene-based aerodigestive SqCC associations.

Analyses performed with MAGMA with 18669 protein-coding genes.

(XLSX)

S9 Table. Aerodigestive SqCC results from the FUMA SNPs to genes mapping.

(XLSX)

S10 Table. Genes overlapping between FUMA and MAGMA analyses.

(XLSX)

S11 Table. Aerodigestive SqCC results from gene set enrichment analyses.

(XLSX)

Acknowledgments

We thank all the participants involved in this research, and we acknowledge the funders for their support.

The authors would like to thank the staff at the Respiratory Health Network Tissue Bank of the FRQS for their valuable assistance with the lung eQTL data set at Laval University. The lung eQTL study at Laval University was supported by the Fondation de l’Institut Universitaire de Cardiologie et de Pneumologie de Québec, the Respiratory Health Network of the FRQS and the Canadian Institutes of Health Research (MOP-123369). Y. Bossé holds a Canada Research Chair in Genomics of Heart and Lung Diseases.

We thank the ARCAGE study investigators and team including: Pagona Lagiou, Tatiana V. Macfarlane, Franco Merletti, Jerry Polesel, Kristina Kjaerheim, Max Robinson, Wolfgang Ahrens, Lorenzo Simonato, Ariana Znaor, Xavier Castellsague (deceased June 2016), David I. Conway, Ivana Holcátová, Claire M. Healy and Peter Thomson. We thank L. Fernandez for her contribution to the IARC ORC multicenter study. We are also grateful to S. Koifman for his contribution to the IARC Latin America multicenter study (S. Koifman passed away in May 2014).

Where authors are identified as personnel of the International Agency for Research on Cancer / World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer / World Health Organization.

Data Availability

Genotype data have been deposited in dbGaP under accession number phs001202.v1.p1 for the oral and pharyngeal study[17] and under accession numbers phs001273.v3.p2 and phs000876.v2.p1 for the lung data[16]. The summary statistics for the lung squamous dataset are deposited in dbGaP (phs001273.v3.p2). The oral and pharyngeal GWAS summary statistics by cancer site and world region have been deposited in the IEU Open GWAS platform (https://gwas.mrcieu.ac.uk/) under the GWAS IDs: ieu-b-89, ieu-b-90, ieu-b-94, ieu-b-96, ieu-b-93, ieu-b-97, ieu-b-91, ieu-b-95 and ieu-b-98.

Funding Statement

The INTEGRAL-ILCCO OncoArray was supported by the Centre for Inherited Disease Research (26820120008i-0-26800068-1). Genotyping for the oral and oropharyngeal cancer OncoArray was funded through the U.S. National Institute of Dental and Craniofacial Research (NIDCR) grant 1X01HG007780-0. The Integrative Analysis of Lung Cancer Risk and (INTEGRAL) of the International Lung Cancer Consortium (ILCCO) was supported by grants U19-CA148127 and CA148127S1 and more recently by the INTEGRAL grant U19CA203654. ILCCO data harmonization is supported by the Canada Research Chair to R.J.H. and U19 CA203654. C.I.A. is a Research Scholar of the Cancer Prevention Institute of Texas and supported by RR170048. The work of the Houlston Laboratory is funded by Cancer Research UK. The CAPUA study was supported by FIS-FEDER/Spain grant numbers FIS-01/310, FIS-PI03-0365, and FIS-07-BI060604, FICYT/Asturias grant numbers FICYT PB02-67 and FICYT IB09-133, and the University Institute of Oncology (IUOPA) of the University of Oviedo and the Ciber de Epidemiologia y Salud Pública (CIBERESP), Spain. The work performed in the CARET study was supported by the National Institute of Health/National Cancer Institute: UM1 CA167462 (PI: Goodman), National Institute of Health UO1-CA6367307 (PIs Omen, Goodman); National Institute of Health R01 CA111703 (PI Chen), National Institute of Health 5R01 CA151989-01A1 (PI Doherty). The Liverpool Lung project is supported by the Roy Castle Lung Cancer Foundation. The Harvard Lung Cancer Study was supported by the NIH (National Cancer Institute) grants CA092824, CA090578, CA074386. The Multiethnic Cohort Study was partially supported by NIH Grants CA164973, CA033619, CA63464 and CA148127. The work performed in MSH-PMH study was supported by The Canadian Cancer Society Research Institute (020214), Ontario Institute of Cancer and Cancer Care Ontario Chair Award to R.J.H. and G.L. and the Alan Brown Chair and Lusi Wong Programs at the Princess Margaret Hospital Foundation. The Norway study was supported by Norwegian Cancer Society, Norwegian Research Council. The work in TLC study has been supported in part by the James & Esther King Biomedical Research Program (09KN-15), National Institutes of Health Specialized Programs of Research Excellence (SPORE) Grant (P50 CA119997), and by a Cancer Center Support Grant (CCSG) at the H. Lee Moffitt Cancer Center and Research Institute, an NCI designated Comprehensive Cancer Center (grant number P30-CA76292). The Vanderbilt Lung Cancer Study – BioVU dataset used for the analyses described was obtained from Vanderbilt University Medical Center’s BioVU, which is supported by institutional funding, the 1S10RR025141-01 instrumentation award, and by the Vanderbilt CTSA grant UL1TR000445 from NCATS/NIH. Dr. Aldrich was supported by NIH/National Cancer Institute K07CA172294 (PI: Aldrich) and Dr. Bush was supported by NHGRI/NIH U01HG004798 (PI: Crawford). The Copenhagen General Population Study (CGPS) was supported by the Chief Physician Johan Boserup and Lise Boserup Fund, the Danish Medical Research Council and Herlev Hospital. The NELCS study: Grant Number P20RR018787 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH). The Kentucky Lung Cancer Research Initiative was supported by the Department of Defense [Congressionally Directed Medical Research Program, U.S. Army Medical Research and Materiel Command Program] under award number: 10153006 (W81XWH-11-1-0781). Views and opinions of, and endorsements by the author(s) do not reflect those of the US Army or the Department of Defense.
This research was also supported by unrestricted infrastructure funds from the UK Center for Clinical and Translational Science, NIH grant UL1TR000117 and Markey Cancer Center NCI Cancer Center Support Grant (P30CA177558) Shared Resource Facilities: Cancer Research Informatics, Biospecimen and Tissue Procurement, and Biostatistics and Bioinformatics. The M.D. Anderson Cancer Center study was supported in part by grants from the NIH (P50CA070907, R01 CA176568) (to X. Wu), Cancer Prevention & Research Institute of Texas (RP130502) (to X. Wu), and The University of Texas MD Anderson Cancer Center institutional support for the Center for Translational and Public Health Genomics. Head and Neck studies included in the VOYAGER consortium were supported by NIDCR RO1 DE025712-01. The University of Pittsburgh head and neck cancer case–control study is supported by US National Institutes of Health grants P50CA097190 and P30CA047904. The Carolina Head and Neck Cancer Study (CHANCE) was supported by the National Cancer Institute (R01CA90731). The Head and Neck Genome Project (GENCAPO) was supported by the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP; grants 04/12054-9 and 10/51168-0). The authors thank all the members of the GENCAPO team. The HN5000 study was funded by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research scheme (RP-PG-0707-10034); the views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the NIHR or the UK Department of Health. The Toronto study was funded by the Canadian Cancer Society Research Institute (020214) and the National Cancer Institute (U19CA148127) and by the Cancer Care Ontario Research Chair. The Rome Study was supported by the Associazione Italiana per la Ricerca sul Cancro (AIRC). 
The Alcohol-Related Cancers and Genetic Susceptibility Study in Europe (ARCAGE) was funded by the European Commission’s fifth framework programme (QLK1-2001-00182), the Italian Association for Cancer Research, Compagnia di San Paolo/FIRMS, Region Piemonte and Padova University (CPDA057222). The funders did not participate in study design, data collection and analysis, decision to publish or preparation of the manuscript.

References

  • 1. Yan W, Wistuba II, Emmert-Buck MR, Erickson HS. Squamous Cell Carcinoma—Similarities and Differences among Anatomical Sites. Am J Cancer Res. 2011;1(3):275–300. Epub 2011/09/23. doi: 10.1158/1538-7445.am2011-275.
  • 2. Dotto GP, Rustgi AK. Squamous Cell Cancers: A Unified Perspective on Biology and Genetics. Cancer Cell. 2016;29(5):622–37. Epub 2016/05/12. doi: 10.1016/j.ccell.2016.04.004.
  • 3. Campbell JD, Yau C, Bowlby R, Liu Y, Brennan K, Fan H, et al. Genomic, Pathway Network, and Immunologic Features Distinguishing Squamous Carcinomas. Cell Rep. 2018;23(1):194–212 e6. Epub 2018/04/05. doi: 10.1016/j.celrep.2018.03.063.
  • 4. Hoadley KA, Yau C, Hinoue T, Wolf DM, Lazar AJ, Drill E, et al. Cell-of-Origin Patterns Dominate the Molecular Classification of 10,000 Tumors from 33 Types of Cancer. Cell. 2018;173(2):291–304.e6. Epub 2018/04/07. doi: 10.1016/j.cell.2018.03.022.
  • 5. Amos CI, Wu X, Broderick P, Gorlov IP, Gu J, Eisen T, et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nature genetics. 2008;40(5):616–22. Epub 2008/04/04. doi: 10.1038/ng.109.
  • 6. Hung RJ, McKay JD, Gaborieau V, Boffetta P, Hashibe M, Zaridze D, et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008;452(7187):633–7. Epub 2008/04/04. doi: 10.1038/nature06885.
  • 7. Truong T, Hung RJ, Amos CI, Wu X, Bickeboller H, Rosenberger A, et al. Replication of lung cancer susceptibility loci at chromosomes 15q25, 5p15, and 6p21: a pooled analysis from the International Lung Cancer Consortium. Journal of the National Cancer Institute. 2010;102(13):959–71. Epub 2010/06/16. doi: 10.1093/jnci/djq178.
  • 8. McKay JD, Truong T, Gaborieau V, Chabrier A, Chuang S-C, Byrnes G, et al. A genome-wide association study of upper aerodigestive tract cancers conducted within the INHANCE consortium. PLoS genetics. 2011;7(3):e1001333. doi: 10.1371/journal.pgen.1001333.
  • 9. Akbari MR, Malekzadeh R, Nasrollahzadeh D, Amanian D, Islami F, Li S, et al. Germline BRCA2 mutations and the risk of esophageal squamous cell carcinoma. Oncogene. 2008;27(9):1290–6. Epub 2007/08/29. doi: 10.1038/sj.onc.1210739.
  • 10. Wang Y, McKay JD, Rafnar T, Wang Z, Timofeeva MN, Broderick P, et al. Rare variants of large effect in BRCA2 and CHEK2 affect risk of lung cancer. Nature genetics. 2014;46(7):736–41. Epub 2014/06/02. doi: 10.1038/ng.3002.
  • 11. Delahaye-Sourdeix M, Anantharaman D, Timofeeva MN, Gaborieau V, Chabrier A, Vallee MP, et al. A rare truncating BRCA2 variant and genetic susceptibility to upper aerodigestive tract cancer. J Natl Cancer Inst. 2015;107(5). Epub 2015/04/04. doi: 10.1093/jnci/djv037.
  • 12. Brennan P, McKay J, Moore L, Zaridze D, Mukeria A, Szeszenia-Dabrowska N, et al. Uncommon CHEK2 mis-sense variant and reduced risk of tobacco-related cancers: case–control study. Human molecular genetics. 2007;16(15):1794–801. doi: 10.1093/hmg/ddm127.
  • 13. Cybulski C, Masojc B, Oszutowska D, Jaworowska E, Grodzki T, Waloszczyk P, et al. Constitutional CHEK2 mutations are associated with a decreased risk of lung and laryngeal cancers. Carcinogenesis. 2008;29(4):762–5. Epub 2008/02/19. doi: 10.1093/carcin/bgn044.
  • 14. Jiang X, Finucane HK, Schumacher FR, Schmit SL, Tyrer JP, Han Y, et al. Shared heritability and functional enrichment across six solid cancers. Nature communications. 2019;10(1):431. Epub 2019/01/27. doi: 10.1038/s41467-018-08054-4.
  • 15. de Martel C, Plummer M, Vignat J, Franceschi S. Worldwide burden of cancer attributable to HPV by site, country and HPV type. Int J Cancer. 2017;141(4):664–70. Epub 2017/04/04. doi: 10.1002/ijc.30716.
  • 16. McKay JD, Hung RJ, Han Y, Zong X, Carreras-Torres R, Christiani DC, et al. Large-scale association analysis identifies new lung cancer susceptibility loci and heterogeneity in genetic susceptibility across histological subtypes. Nature genetics. 2017;49(7):1126–32. Epub 2017/06/13. doi: 10.1038/ng.3892.
  • 17. Lesseur C, Diergaarde B, Olshan AF, Wunsch-Filho V, Ness AR, Liu G, et al. Genome-wide association analyses identify new susceptibility loci for oral cavity and pharyngeal cancer. Nature genetics. 2016;48(12):1544–50. Epub 2016/11/01. doi: 10.1038/ng.3685.
  • 18. Amos CI, Dennis J, Wang Z, Byun J, Schumacher FR, Gayther SA, et al. The OncoArray Consortium: A Network for Understanding the Genetic Architecture of Common Cancers. Cancer Epidemiol Biomarkers Prev. 2017;26(1):126–35. Epub 2016/10/05. doi: 10.1158/1055-9965.EPI-16-0106.
  • 19. Bhattacharjee S, Rajaraman P, Jacobs KB, Wheeler WA, Melin BS, Hartge P, et al. A subset-based approach improves power and interpretation for the combined analysis of genetic association studies of heterogeneous traits. Am J Hum Genet. 2012;90(5):821–35. Epub 2012/05/09. doi: 10.1016/j.ajhg.2012.03.015.
  • 20. de Leeuw CA, Mooij JM, Heskes T, Posthuma D. MAGMA: generalized gene-set analysis of GWAS data. PLoS computational biology. 2015;11(4):e1004219. Epub 2015/04/18. doi: 10.1371/journal.pcbi.1004219.
  • 21. Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nature communications. 2017;8(1):1826. Epub 2017/12/01. doi: 10.1038/s41467-017-01261-5.
  • 22. Fehringer G, Kraft P, Pharoah PD, Eeles RA, Chatterjee N, Schumacher FR, et al. Cross-Cancer Genome-Wide Analysis of Lung, Ovary, Breast, Prostate, and Colorectal Cancer Reveals Novel Pleiotropic Associations. Cancer Res. 2016;76(17):5103–14. Epub 2016/05/20. doi: 10.1158/0008-5472.CAN-15-2980.
  • 23. Abnet CC, Wang Z, Song X, Hu N, Zhou FY, Freedman ND, et al. Genotypic variants at 2q33 and risk of esophageal squamous cell carcinoma in China: a meta-analysis of genome-wide association studies. Human molecular genetics. 2012;21(9):2132–41. Epub 2012/02/11. doi: 10.1093/hmg/dds029.
  • 24. Zhao XK, Mao YM, Meng H, Song X, Hu SJ, Lv S, et al. Shared susceptibility loci at 2q33 region for lung and esophageal cancers in high-incidence areas of esophageal cancer in northern China. PLoS One. 2017;12(5):e0177504. Epub 2017/05/26. doi: 10.1371/journal.pone.0177504.
  • 25. Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, et al. COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic acids research. 2019;47(D1):D941–d7. Epub 2018/10/30. doi: 10.1093/nar/gky1015.
  • 26. The Genotype-Tissue Expression (GTEx) project. Nature genetics. 2013;45(6):580–5. Epub 2013/05/30. doi: 10.1038/ng.2653.
  • 27. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57–74. Epub 2012/09/08. doi: 10.1038/nature11247.
  • 28. Kundaje A, Meuleman W, Ernst J, Bilenky M, Yen A, Heravi-Moussavi A, et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518(7539):317–30. Epub 2015/02/20. doi: 10.1038/nature14248.
  • 29. Haupt S, Mejía-Hernández JO, Vijayakumaran R, Keam SP, Haupt Y. The long and the short of it: the MDM4 tail so far. Journal of molecular cell biology. 2019;11(3):231–44. Epub 2019/01/29. doi: 10.1093/jmcb/mjz007.
  • 30. Marine JC, Jochemsen AG. MDMX (MDM4), a Promising Target for p53 Reactivation Therapy and Beyond. Cold Spring Harbor perspectives in medicine. 2016;6(7). Epub 2016/07/03. doi: 10.1101/cshperspect.a026237.
  • 31. Garcia-Closas M, Couch FJ, Lindstrom S, Michailidou K, Schmidt MK, Brook MN, et al. Genome-wide association studies identify four ER negative-specific breast cancer risk loci. Nature genetics. 2013;45(4):392–8, 8e1-2. Epub 2013/03/29. doi: 10.1038/ng.2561.
  • 32. Milne RL, Kuchenbaecker KB, Michailidou K, Beesley J, Kar S, Lindstrom S, et al. Identification of ten variants associated with risk of estrogen-receptor-negative breast cancer. Nature genetics. 2017;49(12):1767–78. Epub 2017/10/24. doi: 10.1038/ng.3785.
  • 33. Gansmo LB, Bjornslett M, Halle MK, Salvesen HB, Dorum A, Birkeland E, et al. The MDM4 SNP34091 (rs4245739) C-allele is associated with increased risk of ovarian-but not endometrial cancer. Tumour Biol. 2016;37(8):10697–702. Epub 2016/02/13. doi: 10.1007/s13277-016-4940-2.
  • 34. Schumacher FR, Al Olama AA, Berndt SI, Benlloch S, Ahmed M, Saunders EJ, et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nature genetics. 2018;50(7):928–36. Epub 2018/06/13. doi: 10.1038/s41588-018-0142-8.
  • 35. Zhai Y, Dai Z, He H, Gao F, Yang L, Dong Y, et al. A PRISMA-compliant meta-analysis of MDM4 genetic variants and cancer susceptibility. Oncotarget. 2016;7(45):73935–44. Epub 2016/10/16. doi: 10.18632/oncotarget.12558.
  • 36. Wang MJ, Luo YJ, Shi ZY, Xu XL, Yao GL, Liu RP, et al. The associations between MDM4 gene polymorphisms and cancer risk. Oncotarget. 2016;7(34):55611–23. Epub 2016/10/16. doi: 10.18632/oncotarget.10877.
  • 37. Yu H, Sturgis EM, Liu Z, Wang LE, Wei Q, Li G. Modifying effect of MDM4 variants on risk of HPV16-associated squamous cell carcinoma of oropharynx. Cancer. 2012;118(6):1684–92. Epub 2011/08/09. doi: 10.1002/cncr.26423.
  • 38. Wu C, Wang Z, Song X, Feng XS, Abnet CC, He J, et al. Joint analysis of three genome-wide association studies of esophageal squamous cell carcinoma in Chinese populations. Nature genetics. 2014;46(9):1001–6. Epub 2014/08/19. doi: 10.1038/ng.3064.
  • 39. Lawrenson K, Kar S, McCue K, Kuchenbaeker K, Michailidou K, Tyrer J, et al. Functional mechanisms underlying pleiotropic risk alleles at the 19p13.1 breast-ovarian cancer susceptibility locus. Nature communications. 2016;7:12675. Epub 2016/09/08. doi: 10.1038/ncomms12675.
  • 40. Phelan CM, Kuchenbaecker KB, Tyrer JP, Kar SP, Lawrenson K, Winham SJ, et al. Identification of 12 new susceptibility loci for different histotypes of epithelial ovarian cancer. Nature genetics. 2017;49(5):680–91. Epub 2017/03/28. doi: 10.1038/ng.3826.
  • 41. Vikrant, Sawant UU, Varma AK. Role of MERIT40 in stabilization of BRCA1 complex: a protein-protein interaction study. Biochem Biophys Res Commun. 2014;446(4):1139–44. Epub 2014/03/29. doi: 10.1016/j.bbrc.2014.03.073.
  • 42. Brachner A, Foisner R. Lamina-associated polypeptide (LAP) 2alpha and other LEM proteins in cancer biology. Adv Exp Med Biol. 2014;773:143–63. Epub 2014/02/25. doi: 10.1007/978-1-4899-8032-8_7.
  • 43. Ferreiro-Iglesias A, Lesseur C, McKay J, Hung RJ, Han Y, Zong X, et al. Fine mapping of MHC region in lung cancer highlights independent susceptibility loci by ethnicity. Nature communications. 2018;9(1):3927. Epub 2018/09/27. doi: 10.1038/s41467-018-05890-2.
  • 44. Meijers-Heijboer H, van den Ouweland A, Klijn J, Wasielewski M, de Snoo A, Oldenburg R, et al. Low-penetrance susceptibility to breast cancer due to CHEK2(*)1100delC in noncarriers of BRCA1 or BRCA2 mutations. Nature genetics. 2002;31(1):55–9. Epub 2002/04/23. doi: 10.1038/ng879.
  • 45. Hancock DB, Reginsson GW, Gaddis NC, Chen X, Saccone NL, Lutz SM, et al. Genome-wide meta-analysis reveals common splice site acceptor variant in CHRNA4 associated with nicotine dependence. Transl Psychiatry. 2015;5:e651. Epub 2015/10/07. doi: 10.1038/tp.2015.149.
  • 46. Chen D, Truong T, Gaborieau V, Byrnes G, Chabrier A, Chuang S-c, et al. A sex-specific association between a 15q25 variant and upper aerodigestive tract cancers. Cancer Epidemiology Biomarkers & Prevention. 2011;20(4):658–64. doi: 10.1158/1055-9965.EPI-10-1008.
  • 47. Stacey SN, Helgason H, Gudjonsson SA, Thorleifsson G, Zink F, Sigurdsson A, et al. New basal cell carcinoma susceptibility loci. Nature communications. 2015;6:6825. Epub 2015/04/10. doi: 10.1038/ncomms7825.
  • 48. Couch FJ, Kuchenbaecker KB, Michailidou K, Mendoza-Fandino GA, Nord S, Lilyquist J, et al. Identification of four novel susceptibility loci for oestrogen receptor negative breast cancer. Nature communications. 2016;7:11375. Epub 2016/04/28. doi: 10.1038/ncomms11375.
  • 49. Eeles RA, Olama AA, Benlloch S, Saunders EJ, Leongamornlert DA, Tymrakiewicz M, et al. Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nature genetics. 2013;45(4):385–91, 91e1-2. Epub 2013/03/29. doi: 10.1038/ng.2560.
  • 50. Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, et al. A global reference for human genetic variation. Nature. 2015;526(7571):68–74. Epub 2015/10/04. doi: 10.1038/nature15393.
  • 51. Das S, Forer L, Schonherr S, Sidore C, Locke AE, Kwong A, et al. Next-generation genotype imputation service and methods. Nature genetics. 2016;48(10):1284–7. Epub 2016/08/30. doi: 10.1038/ng.3656.
  • 52. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. Epub 2015/02/28. doi: 10.1186/s13742-015-0047-8.
  • 53. R Core Development Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2019.
  • 54. Machiela MJ, Chanock SJ. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics. 2015;31(21):3555–7. Epub 2015/07/04. doi: 10.1093/bioinformatics/btv402.
  • 55. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics. 2010;26(18):2336–7. Epub 2010/07/17. doi: 10.1093/bioinformatics/btq419.
  • 56. Viechtbauer W. Conducting meta-analyses in R with the metafor package. Journal of statistical software. 2010;36(3).
  • 57. Lamontagne M, Berube JC, Obeidat M, Cho MH, Hobbs BD, Sakornsakolpat P, et al. Leveraging lung tissue transcriptome to uncover candidate causal genes in COPD genetic associations. Human molecular genetics. 2018;27(10):1819–29. Epub 2018/03/17. doi: 10.1093/hmg/ddy091.
  • 58. Hao K, Bosse Y, Nickle DC, Pare PD, Postma DS, Laviolette M, et al. Lung eQTLs to help reveal the molecular underpinnings of asthma. PLoS Genet. 2012;8(11):e1003029. Epub 2012/12/05. doi: 10.1371/journal.pgen.1003029.
  • 59. Zhbannikov IY, Arbeev K, Ukraintseva S, Yashin AI. haploR: an R package for querying web-based annotation tools. F1000Res. 2017;6:97. Epub 2017/06/20. doi: 10.12688/f1000research.10742.2.
  • 60. MacArthur J, Bowler E, Cerezo M, Gil L, Hall P, Hastings E, et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic acids research. 2017;45(D1):D896–D901. Epub 2016/12/03. doi: 10.1093/nar/gkw1133.
  • 61. Liberzon A, Birger C, Thorvaldsdottir H, Ghandi M, Mesirov JP, Tamayo P. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell systems. 2015;1(6):417–25. Epub 2016/01/16. doi: 10.1016/j.cels.2015.12.004.
  • 62. Gene Ontology Consortium: going forward. Nucleic acids research. 2015;43(Database issue):D1049–56. Epub 2014/11/28. doi: 10.1093/nar/gku1179.
  • 63. Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic acids research. 2016;44(D1):D457–62. Epub 2015/10/18. doi: 10.1093/nar/gkv1070.
  • 64. Jassal B, Matthews L, Viteri G, Gong C, Lorente P, Fabregat A, et al. The reactome pathway knowledgebase. Nucleic acids research. 2020;48(D1):D498–d503. Epub 2019/11/07. doi: 10.1093/nar/gkz1031.

Decision Letter 0

Peter McKinnon, Stephen J Chanock

17 Jul 2020

Dear Dr Brennan,

Thank you very much for submitting your Research Article entitled 'Genome-wide association meta-analysis identifies pleiotropic risk loci for aerodigestive squamous cell cancers' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review again a much-revised version. We cannot, of course, promise publication at that time.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see our guidelines.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool.  PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Stephen J. Chanock

Guest Editor

PLOS Genetics

Peter McKinnon

Section Editor: Cancer Genetics

PLOS Genetics

The editors would be willing to consider a new manuscript in which the current work serves as a starting point, and that includes a more thorough and rigorous presentation of the statistical rigor, including the rationale and interpretation.

There was a substantial difference of opinion between reviewers about the statistical analyses and interpretation.

A major revision should address these issues.

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: This manuscript presents a meta-analysis of four aerodigestive squamous cell carcinoma (SqCC) GWAS. The study primarily searches for new loci that reach significance via meta-analysis and for evidence of pleiotropy at previously-identified loci (i.e., effects on two or more cancer subtypes). While power is limited, the analysis looks generally well-done. The authors find one new genome-wide significant locus (TMEM237) and three new loci with suggestive significance (P<5e-7); of the three suggestive loci, the MDM4 locus seems quite plausible. However, I have a concern about statistical significance in the pleiotropy analysis and have a few other questions and comments.

1. In their pleiotropy analyses, the authors used a significance threshold of P<5e-3 in at least one other cancer site, but they do not provide justification for this threshold. I am concerned that P<5e-3 does not sufficiently correct for multiple hypotheses tested, which at least include 10 loci x 3 other cancer sites. Also, was only one SNP tested per locus in these pleiotropy analyses, or were multiple SNPs tested? Line 343 "The lead SNP exhibiting pleiotropy..." suggests the latter; if so, then even more hypotheses were tested.

2. Table 2 is titled "Novel genomic regions..." but only the first of the four listed associations is actually genome-wide significant. The authors should separate significant vs. suggestive results in the table. They could also consider moving some of the main text describing suggestive results (which is quite long) to a supplementary note.

3. The overview of analyses indicates that the pleiotropy analyses considered loci that previously reached P<5e-7 in a single-cancer analysis, but the 9p21.3 locus in Table 3 does not appear to satisfy this criterion. Perhaps they should just use a criterion of loci reaching genome-wide significance in their meta-analysis?

4. The authors note that their lead SNP at the MDM4 locus is an eQTL for MDM4 and that MDM4 upregulation can inactivate p53, leading to cancer. Is the eQTL effect direction consistent with this proposed mechanism?

5. A few software packages mentioned in the main text (e.g., ASSET and FUMA) are not defined until Methods. Providing brief descriptions in the main text would improve readability.

6. The legend of Figure 1 indicates that the y-axis was truncated at P=5e-30. Indicating the most significant P-value would be helpful (either in the figure or just by stating it in the legend).
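
The multiple-testing concern in Reviewer #1's first point can be illustrated with a little arithmetic. The sketch below assumes a Bonferroni correction at a family-wise error rate of 0.05 (an assumption; the review does not name a correction method) and uses the reviewer's minimum count of 10 loci x 3 other cancer sites. The function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Reviewer's lower bound on the number of hypotheses: 10 loci x 3 other cancer sites.
n_loci = 10
n_other_sites = 3
n_tests = n_loci * n_other_sites

alpha = 0.05           # assumed family-wise error rate, not stated in the review
used_threshold = 5e-3  # pleiotropy threshold questioned by the reviewer

threshold = bonferroni_threshold(alpha, n_tests)
# Expected number of tests passing the questioned threshold if every null is true.
expected_null_hits = n_tests * used_threshold

print(f"Bonferroni threshold for {n_tests} tests: {threshold:.2e}")
print(f"P<5e-3 is at least as strict as Bonferroni: {used_threshold <= threshold}")
print(f"Expected null hits at P<5e-3: {expected_null_hits:.2f}")
```

Under these assumptions the per-test threshold is stricter than 5e-3, and it would tighten further if more than one SNP per locus were tested.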

Reviewer #2: This paper combines the results from genome-wide association studies (GWAS) of four aerodigestive squamous cell cancers in an effort to (a) identify novel loci that are associated with risk for more than one of the four cancers and (b) identify regions (perhaps already identified by one or more GWAS) that are plausibly associated with more than one cancer. The authors adopt two sensible and complementary approaches to combine results: (i) a simple fixed-effect meta-analysis (which will have greatest power when an allele is associated with all four cancers in the same direction and magnitude) and (ii) a subset-based test (which allows for the possibility that not all cancers will be associated with an allele or in the same direction). They report one new locus and a handful of pleiotropic loci.

The paper's methods are reasonable and sound, and the results are reported with care and appropriate caution (given the quite small sample sizes for two of the studied cancers). Like many GWAS, the ultimate biological implications of the findings are unclear and left for others to sort out--not that there's anything wrong with that. My one substantive comment has to do with the striking lack of detectable pleiotropy at 15q25.1. Considering that lung cancer and head and neck cancer both have very strong genetic correlations with smoking behaviors (see ref 14), it's quite puzzling that the locus with the strongest cigarettes per day association is associated with LuSqCC but not OSqCC. This really deserves more comment--why do the authors think this is the case? Are the analyses from ref 14 not broken out by subtype, or at least not in the same way as here?
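
The power argument for the fixed-effect approach can be sketched with inverse-variance weighting, a common form of fixed-effect meta-analysis (assumed here; this does not reproduce the ASSET analyses used in the manuscript). The per-cancer log odds ratios and standard errors below are made-up illustrative numbers, not study results, and the function name `fixed_effect_meta` is hypothetical.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect estimate, its SE, Z and two-sided P."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and the same length")
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return beta, se, z, p


# Hypothetical per-cancer standard errors and log odds ratios (illustrative only).
ses = [0.03, 0.05, 0.06, 0.08]
same_direction = fixed_effect_meta([0.10, 0.12, 0.09, 0.11], ses)
mixed_direction = fixed_effect_meta([0.10, -0.12, 0.09, -0.11], ses)

print("concordant effects: beta=%.3f se=%.3f p=%.2e"
      % (same_direction[0], same_direction[1], same_direction[3]))
print("discordant effects: beta=%.3f se=%.3f p=%.2e"
      % (mixed_direction[0], mixed_direction[1], mixed_direction[3]))
```

Both calls return the same pooled standard error, but opposing effects partly cancel in the pooled estimate, so the discordant case yields a much weaker signal. That is the situation a subset-based test is meant to handle.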

My remaining comments are minor presentation and grammar points. One more good read over by a copyeditor to clean up run-on sentences, subject-verb agreement and the like would be helpful.

line 217 "...fixed-effects (F-E) ^{meta-analysis} and a subset-based meta-analysis ^{using ASSET}." To disambiguate.

lines 220-221 "Loci that reached suggestive .... considered noteworthy." Circular definitions. "Colors we defined as red were labeled as crimson." Instead: "Loci with p<5e-7 were considered noteworthy" or "We defined suggestive genome-wide significance as p<5e-7."

line 225 Can you give some motivation as to why p<5e-3 was considered as evidence for pleiotropic effects on a trait at a locus known to be genome-wide significantly associated with another trait?

lines 226-228. "Additionally... 19p13.11." This is not a sentence. Delete "that"?

lines 273-276. Not a sentence. Seems to be two sentences pasted together, with the second sentence missing a verb.

line 306 "this analysis" not "this analyses"

line 333 state the reference panel used to calculate r2. Obviously the r2 cannot be 1 in the study samples or the p-values for rs614etc and rs560etc would be identical.

lines 344-346 and throughout the text (tables are better about this): SNPs are not associated with increases or decreases in risk, alleles (relative to other alleles) are. If you must use "[xxx] is associated with increased risk" language, you have to refer to the effect allele. "rs12345[G] is associated with an increased risk," not "rs12345 is associated with an increased risk."

line 352 see comment on 344-346.

line 358 see comment on 344-346.

line 365 see comment on 344-346.

line 430 no comma after based

lines 442-443 I don't know how "interesting" this observation is. You see what you are powered to see. There may be lots of pleiotropic loci with heterogeneous effects, but in these sample sizes, they may be hard to detect--certainly less power than the SNPs where everything aligns. This sentence is a little tautological, a little post-hoc-power-calculation-y--"we were well powered to detect the things we detected." You should note that there may be heterogeneous pleiotropic loci that you've missed b/c of low power.

line 455 discovered not discover

lines 457-459 Run on. Semi-colon or full stop after "limited."

Reviewer #3: This manuscript describes an analysis in which GWAS results from multiple aero-digestive squamous cell cancers (lung, oral cavity, oropharynx, larynx, and esophagus) are combined in order to identify loci with pleiotropic effects on multiple cancer types. This is a novel and unexplored hypothesis. There is one novel association observed at a genome-wide significance P-value threshold (in the TMEM237 gene region). All other loci passing this threshold have been reported previously for specific cancer types. The authors report several suggestive association signals and describe pleiotropy for previously identified association signals.

My primary concern is the lack of replication for the novel locus and the strong emphasis on describing and characterizing the suggestive association signals (which seems like more attention than is warranted based on the statistical evidence). A few additional comments are below:

Introduction: One point of clarification: this GWAS is Europeans only? Or does it also include individuals of European ancestry from outside of Europe?

In my view, there is an over-emphasis of “suggestive” association signals in this paper (i.e., associations with P<10-7). Several signals passing this threshold are expected under the null. A substantial amount of text in this paper is devoted to discussing the biological and epidemiological evidence supporting regions identified at this suggestive P-value threshold.

A P-value threshold of 5x10-3 was used to identify pleiotropic loci. How was this threshold determined? Consider a systematic approach to test for pleiotropy with an explicit multiple testing adjustment.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: No: No mention is made re: availability of the summary statistics from the four GWAS contributing to this analysis. To the best of my knowledge, none are publicly available.

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Decision Letter 1

Peter McKinnon, Stephen J Chanock

12 Oct 2020

* Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. *

Dear Dr Brennan,

Thank you very much for submitting your Research Article entitled 'Genome-wide association meta-analysis identifies pleiotropic risk loci for aerodigestive squamous cell cancers' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some aspects of the manuscript that should be improved.

We therefore ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Stephen J. Chanock

Guest Editor

PLOS Genetics

Peter McKinnon

Section Editor: Cancer Genetics

PLOS Genetics

The authors have answered nearly all of the queries and should provide the details of exactly how the summary data can be made available for verification and further exploration.

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have satisfactorily addressed my comments. Providing some justification for the significance criterion of "two single cancer association results at P<5x10-4" would still be helpful. (The response letter seems to have a typo and states P<0.0001 instead of P<5x10-4.)

Reviewer #2: The authors have thoughtfully responded to my comments.

Reviewer #3: The authors have addressed my comments.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: No: It is not clear that summary statistics for individual cancers or the cross-cancer analyses have been made publicly available. Hence it would be nearly impossible to reproduce the Manhattan plot--researchers would have to download dbGAP data sets, run their own QC and GWAS analyses, then run their own cross-cancer analyses. Inevitably, small differences in analysis pipelines will lead to different results.

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Decision Letter 2

Peter McKinnon, Stephen J Chanock

5 Nov 2020

Dear Dr Brennan,

We are pleased to inform you that your manuscript entitled "Genome-wide association meta-analysis identifies pleiotropic risk loci for aerodigestive squamous cell cancers" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional accept, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about one way to make your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Stephen J. Chanock

Guest Editor

PLOS Genetics

Peter McKinnon

Section Editor: Cancer Genetics

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-20-00903R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Peter McKinnon, Stephen J Chanock

28 Feb 2021

PGENETICS-D-20-00903R2

Genome-wide association meta-analysis identifies pleiotropic risk loci for aerodigestive squamous cell cancers

Dear Dr Brennan,

We are pleased to inform you that your manuscript entitled "Genome-wide association meta-analysis identifies pleiotropic risk loci for aerodigestive squamous cell cancers" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Alice Ellingham

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. SqCC F-E meta-analyses.

    Quantile-quantile plot of the p-values for the ASSET F-E meta-analysis results, including lung, oral/oropharyngeal, larynx and esophageal SqCCs (corrected λ = 1.006).

    (TIFF)

    S2 Fig. Regional association plot at 2q33.1.

    Chromosome positions (x-axis) and -log10 P-value (y-axis) of the SqCC meta-analysis at 2q33.1. Genetic variants are colored red according to their LD with rs56321285 (2q33.1 lead SNP) and blue according to their LD with the second lead SNP, rs1830298; rs56321285 and rs1830298 have r2 = 0.02.

    (TIF)

    S3 Fig. Regional association plot at 1q32.1.

    Chromosome positions (x-axis) and -log10 P-value (y-axis) of SqCC F-E meta-analysis at 1q32.1. Genetic variants are colored according to their LD with the rs12133735 (red) and with rs4245739 (blue) a variant previously associated with cancer risk; rs12133735 and rs4245739 (r2 = 0.63).

    (TIF)

    S4 Fig. rs12133735 MDM4 lung eQTL.

    Boxplots for rs12133735 and MDM4 gene expression in the 3 datasets from the Microarray eQTL study, from left to right: Laval University, University of British Columbia (UBC) and University of Groningen.

    (TIF)

    S5 Fig. Regional association plot at 5q31.2.

    Chromosome positions (x-axis) and -log10 P-value (y-axis) SqCC F-E meta-analysis at 5q31.2. Genetic variants colored according to their LD with the labeled SNP (purple diamond). rs13181561 and rs7447927 (r2 = 0.94).

    (TIF)

    S6 Fig. Regional association plot at 19p13.11.

    Chromosome positions (x-axis) and -log10 P-value (y-axis) SqCC F-E meta-analysis at 19p13.11. Genotyped and imputed variants colored according to their LD with the labeled SNP (purple diamond). rs61494113 and rs56069439 r2 = 1.

    (TIF)

    S7 Fig. Regional association plot at 6p22.1- 6p21.33.

    Chromosome positions (x-axis) and -log10 P-value (y-axis) SqCC F-E meta-analysis at 6p22.1- 6p21.33. Variants colored according to their LD with SNP rs9267123 (lead variant at 6p21.33). rs3116813 (6p22.1) is in moderate LD with rs9267123 (r2 = 0.5). rs1049213 at 6p21.33 is not correlated with rs9267123 (r2 = 0.01).

    (TIF)

    S8 Fig. Regional association plot at 9p21.3.

    Chromosome positions (x-axis) and -log10 P-value (y-axis) SqCC meta-analysis at 9p21.3. Variants colored according to their LD with SNP rs7857345 (9p21.3 lead variant).

    (TIF)

    S9 Fig. Regional association plot at 15q25.1.

    Regional association plot at 15q25. Chromosome positions (x-axis) and -log10 P-value (y-axis). A. Aerodigestive SqCC P-values; B. Lung SqCC P-values; C. Oral and oropharyngeal SqCC P-values. Genetic variants are colored red according to their LD with rs55781567 (lowest P-value at 15q25 in the meta-analysis).

    (TIF)

    S1 Table. Results with P<5x10-5 aerodigestive SqCC meta-analyses.

    All variants with P<5x10-5 in the fixed-effects (F-E) ASSET meta-analyses of aerodigestive SqCC. Results for each SqCC site are also shown.

    (XLSX)

    S2 Table. Pleiotropic aerodigestive SqCC risk variants.

    108 variants with Pmeta<5x10-7 in the fixed-effects (F-E) meta-analyses, P<5x10-4 in two single-cancer analyses, and a consistent effect direction across cancer sites.

    (CSV)

    S3 Table. Lung Cis-eQTLs for aerodigestive SqCC loci.

    Cis-eQTLs for novel SqCC loci in the lung Microarray eQTL study datasets.

    (XLSX)

    S4 Table. Cis-eQTLs and cis-sQTLs for new SqCC loci.

    Lung and esophageal Cis-eQTLs and cis-sQTLs in the GTEx catalog V8 for new SqCC loci.

    (XLSX)

    S5 Table. Chromatin states and histone marks in lung and esophageal tissues or cells for new SqCC loci.

    Chromatin and histone annotations for new SqCC loci from the Roadmap and ENCODE projects.

    (XLSX)

    S6 Table. Summary of reported cancer risk associations within the new SqCC risk loci.

    NHGRI-EBI Catalog (v1.0 e98 2020-02-08) reported cancer risk associations for lead SNP (or proxies r2>0.6) within the new SqCC loci.

    (XLSX)

    S7 Table. Aerodigestive loci genes with somatic mutations.

    Genes within new SqCC loci with somatic mutations in the COSMIC catalogue (release v90, 5th September 2019).

    (XLSX)

    S8 Table. Significant results from the gene-based aerodigestive SqCC associations.

    Analyses performed with MAGMA with 18669 protein-coding genes.

    (XLSX)

    S9 Table. Aerodigestive SqCC results from the FUMA SNPs to genes mapping.

    (XLSX)

    S10 Table. Genes overlapping between FUMA and MAGMA analyses.

    (XLSX)

    S11 Table. Aerodigestive SqCC results from gene set enrichment analyses.

    (XLSX)

    Attachment

    Submitted filename: PlosGen_Review_Response_2.docx

    Attachment

    Submitted filename: PlosGen_Review_ResponseV2-2Nov20.docx

    Data Availability Statement

    Genotype data have been deposited in dbGaP under accession number phs001202.v1.p1 for the oral and pharyngeal study [17] and under accession numbers phs001273.v3.p2 and phs000876.v2.p1 for the lung data [16]. The summary statistics for the lung squamous dataset are deposited in dbGaP (phs001273.v3.p2). The oral and pharyngeal GWAS summary statistics by cancer site and world region have been deposited in the IEU Open GWAS platform (https://gwas.mrcieu.ac.uk/) under the GWAS IDs: ieu-b-89, ieu-b-90, ieu-b-94, ieu-b-96, ieu-b-93, ieu-b-97, ieu-b-91, ieu-b-95 and ieu-b-98.


    Articles from PLoS Genetics are provided here courtesy of PLOS
