Abstract
Recent genome-wide association studies (GWAS) and subsequent meta-analyses have identified over 25 SNPs at 18 loci, together accounting for >15% of the genetic susceptibility to testicular germ cell tumour (TGCT). To identify further common SNPs associated with TGCT, here we report a three-stage experiment, involving 4098 cases and 18 972 controls. Stage 1 comprised previously published GWAS analysis of 307 291 SNPs in 986 cases and 4946 controls. In Stage 2, we used previously published customised Illumina iSelect genotyping array (iCOGs) data across 694 SNPs in 1064 cases and 10 082 controls. Here, we report new genotyping of eight SNPs showing some evidence of association in combined analysis of Stage 1 and Stage 2 in an additional 2048 cases of TGCT and 3944 controls (Stage 3). Through fixed-effects meta-analysis across three stages, we identified a novel locus at 3q25.31 (rs1510272) demonstrating association with TGCT [per-allele odds ratio (OR) = 1.16, 95% confidence interval (CI) = 1.06–1.27; P = 1.2 × 10−9].
INTRODUCTION
Testicular germ cell tumour (TGCT) is the most common malignancy in men aged 15–45 years (1). Incidence of the disease varies considerably between ethnic groups, with populations of Western European descent showing significantly higher incidence than African or Asian populations (2). Incidence of TGCT has increased rapidly in recent decades, with rates more than doubling in Norway, Finland and parts of Israel/Germany over the past 30 years (3,4). Rates in the UK, USA and the majority of other European countries have increased by 50–75% across the same time period (2). A plateauing effect is beginning to emerge in certain countries, such as Denmark and Switzerland (5); however, the general trend of rapidly growing incidence is expected to continue (2). Reported risk factors for TGCT include a family history of the disease, a history of undescended testis, microlithiasis (6) and a history of previous germ cell tumour. Numerous studies have investigated the influence of environmental and maternal risk factors on TGCT, but these associations remain inconclusive and an explanation for the rapid rise in incidence remains elusive (7,8). Studies in families demonstrate that, compared with the general male population, the risk of TGCT to brothers of a case is elevated 8- to 10-fold and to fathers/sons of a case 4- to 6-fold. These risks are substantially higher than the equivalent familial relative risks of ∼2-fold typical for common cancers such as breast, colorectal and prostate (9). Twin studies of testicular cancer, although small in size, demonstrate a higher risk to monozygotic than dizygotic twins (10). In addition, migration studies show that lower risk immigrant populations maintain lower disease rates, even after several generations of settlement in higher risk host countries (11). Together these epidemiologic and genetic epidemiologic observations support a significant genetic component contributing to development of TGCT.
Linkage analysis undertaken in 237 families identified no genetic loci showing clear evidence of linkage to TGCT, and through candidate association studies, only a single rare deletion of the Y-chromosome was identified, accounting for <1% of the familial risk of TGCT (12,13). More recently, genome-wide association studies (GWAS) of TGCT have been conducted, which together have identified a total of 18 loci predisposing to testicular cancer, accounting for 15–20% of the familial risk of TGCT (14–20) (Table 1). The first locus identified at 12q21, encompassing KITLG, remains the strongest genetic risk factor for TGCT, with a per-allele odds ratio (OR) in excess of 2.5. KITLG encodes the ligand for the membrane-bound receptor tyrosine kinase KIT, and the KIT/KITLG system is of particular relevance to TGCT owing to its role in regulating the survival, proliferation and migration of germ cells (21). Furthermore, somatic mutations in KIT are observed in ∼10% of human TGCTs, and mouse models with germline heterozygous deletion of KITLG demonstrate an increased risk of testicular tumours (22,23). In addition, the second (5q31, SPRY4) and third (6p21, BAK1) TGCT risk loci also reside in related pathways. For example, SPRY4 inhibits the mitogen-activated protein kinase pathway, which is activated by KIT/KITLG, whereas BAK1 expression is negatively regulated by KIT (24,25). The most recent TGCT risk loci identified across 2010–2013 have also provided plausible insights into testicular germ cell tumorigenesis, implicating genes involved in germ cell differentiation/specification (DAZL at 3p24.3, PRDM14 at 8q13.3), sex determination (DMRT1 at 9p24), telomerase regulation (TERT/CLPTM1L at 5p15, ATF7IP at 12p13, PITX1 at 5q31.1) and microtubule assembly (TEX14 at 17q22, CENPE at 4q24, PMF1 at 1q22, MAD1L1 at 7p22.3) (14–16,19).
Table 1.
Eighteen TGCT predisposition loci identified in published GWAS to date
| SNPa | Gene(s) | Chromosome position | Study | Risk allele frequency | Per-allele OR |
|---|---|---|---|---|---|
| rs995030 | KITLG | 12q21 | GWAS, Rapley et al. (14)/Kanetsky et al. (17) | 0.8 | 2.55 |
| rs4624820 | SPRY4 | 5q31 | GWAS, Rapley et al. (14)/Kanetsky et al. (17) | 0.54 | 1.37 |
| rs210138 | BAK1 | 6p21 | GWAS, Rapley et al. (14) | 0.2 | 1.5 |
| rs4635969 | TERT/CLPTM1L | 5p15 | GWAS, Turnbull et al. (16) | 0.2 | 1.54 |
| rs2900333 | ATF7IP | 12p13 | GWAS, Turnbull et al. (16) | 0.62 | 1.27 |
| rs755383 | DMRT1 | 9p24 | GWAS, Turnbull et al.(16)/Kanetsky et al. (18) | 0.62 | 1.37 |
| rs10510452 | DAZL | 3p24.3 | iCOGs, Ruark et al. (15) | 0.7 | 1.24 |
| rs2720460 | CENPE | 4q24 | iCOGs, Ruark et al. (15) | 0.62 | 1.24 |
| rs8046148 | HEATR3 | 16q12.1 | iCOGs, Ruark et al. (15) | 0.79 | 1.32 |
| rs7010162 | PRDM14 | 8q13.3 | iCOGs, Ruark et al. (15) | 0.62 | 1.22 |
| rs2839243 | Non-coding | 21q22.3 | iCOGs, Ruark et al. (15) | 0.47 | 1.26 |
| rs2072499 | Non-coding | 1q22 | iCOGs, Ruark et al. (15) | 0.35 | 1.19 |
| rs3805663 | CATSPER3/PITX1 | 5q31.1 | iCOGs, Ruark et al. (15) | 0.63 | 1.25 |
| rs3790672 | Non-coding | 1q24.1 | iCOGs, Ruark et al. (15)/Schumacher et al. (20) | 0.28 | 1.2 |
| rs9905704 | RAD51C/TEX14/PPM1E | 17q22 | iCOGs, Ruark et al. (15)/meta-analysis, Chung et al. (19) | 0.68 | 1.21 |
| rs4888262 | RFWD3 | 16q22.3 | Meta-analysis, Chung et al. (19) | 0.458 | 1.21 |
| rs12699477 | MAD1L1 | 7p22.3 | Meta-analysis, Chung et al. (19) | 0.38 | 1.16 |
| rs17021463 | HPGDS | 4q22.2 | Meta-analysis, Chung et al. (19) | 0.42 | 1.15 |
a For loci with multiple reported SNPs, the marker listed is taken from the first study referenced in the fourth column.
It is likely that a significant proportion of the remaining ∼80–85% of unexplained genetic susceptibility to TGCT resides in additional common SNPs, as yet undetected. Power calculations for TGCT GWAS analyses suggest that whereas it is unlikely that additional SNPs of the frequency and effect size of the 12q21 (KITLG) locus remain unidentified, there are multiple further SNPs associated with TGCT of more modest effect sizes (14–16). However, as in studies of other diseases, identification of these additional novel SNPs is increasingly challenging, demanding studies of larger sample size or more complex design to acquire the necessary statistical power to detect common SNPs of progressively lower effect size. To identify further common SNPs associated with TGCT, here we undertook a three-stage experiment, involving 4098 cases of TGCT and 18 972 controls. We report a novel locus at 3q25 (rs1510272) for which association with TGCT was evident across all three experimental stages with resulting fixed-effects meta-analysis achieving genome-wide significance of P < 5 × 10−8.
RESULTS
Genome-wide association analysis was undertaken in 986 cases of TGCT and 4946 controls for 307 291 SNPs that passed our quality control (QC) criteria (Stage 1) as previously described (14,16). Six hundred and ninety-four of the most strongly associated SNPs from Stage 1 were included on a customized Illumina iSelect genotyping array for which we genotyped 1064 additional cases and 10 082 additional controls, as previously described (Stage 2) (15). From these analyses, we have previously reported 17 SNPs at 15 loci for which association with TGCT was demonstrated at genome-wide significance (14–16). We then undertook a third stage in which we selected eight additional SNPs, prioritized based on a strong overall association from meta-analysis of Stages 1 and 2 and consistent OR effect sizes in both prior stages. These eight SNPs were genotyped in an additional 2048 cases of TGCT and 3944 controls (Stage 3, see Fig. 1 for experiment design). All case and control samples were from the UK and formed unique sets, with no individuals overlapping between stages. We tested the association between each SNP and TGCT risk at each stage using a 1-df trend test, with data from Stages 1 and 2 being adjusted for six principal components. Inflation in the test statistics was observed at only modest levels, with values before adjustment for principal components being: Stage 1 inflation factor (λ) = 1.08 [equivalent to the inflation for a study of 1000 cases and 1000 controls of (λ1000) = 1.05] and Stage 2 λ = 1.14 (λ1000 = 1.07). After adjustment for principal components: Stage 1 λ = 1.00 (λ1000 = 1.00) and Stage 2 λ = 1.04 (λ1000 = 1.02). Principal component analysis and inflation calculations were not performed in Stage 3 owing to the small number of SNPs.
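The relationship between λ and λ1000 quoted above can be illustrated with a short calculation. The sketch below assumes the commonly used sample-size rescaling of the genomic inflation factor; the text does not state which formula was applied, so the function `lambda_1000` is an illustrative assumption rather than the study's own code.

```python
def lambda_1000(lam, n_cases, n_controls):
    """Rescale an inflation factor to a study of 1000 cases and 1000 controls.

    Assumed formula (not stated in the text):
    lambda_1000 = 1 + (lambda - 1) * (1/n_cases + 1/n_controls) / (1/1000 + 1/1000)
    """
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale


# Sample sizes and lambda values are those reported for Stages 1 and 2.
stage1_unadjusted = lambda_1000(1.08, 986, 4946)
stage2_unadjusted = lambda_1000(1.14, 1064, 10082)
stage2_adjusted = lambda_1000(1.04, 1064, 10082)

for label, value in [
    ("Stage 1, unadjusted", stage1_unadjusted),
    ("Stage 2, unadjusted", stage2_unadjusted),
    ("Stage 2, adjusted", stage2_adjusted),
]:
    print(f"{label}: lambda_1000 = {value:.2f}")
```

Under this assumed rescaling, the reported λ values and sample sizes give values that round to the λ1000 figures quoted in the text (1.05, 1.07 and 1.02).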
Figure 1.
Three-stage study design: Genotyping conducted over three stages comprising non-overlapping samples from the UK.
A combined fixed-effects meta-analysis was performed using data from all three experimental stages, across the eight selected SNPs (see section Materials and methods). In the combined meta-analysis, two SNPs were identified with P-values < 5 × 10−8. One of these associations was the marker rs3778969, which is at the 7p22.3 locus (MAD1L1). This association has since been reported elsewhere, with an OR consistent with that which we detected (19). We also identified a novel association for SNP rs1510272 at locus 3q25.31 (per-allele OR = 1.16, 95% CI = 1.06–1.27; P = 1.2 × 10−9) (Table 2).
Table 2.
Summary results for all SNPs
| SNPa | Chromosomeb | Locationc | Allelesd | RAFe | Stage 1 (GWAS): ORf (95% CI) | Stage 1 (GWAS): Ptrend | Stage 2 (iCOGs): ORf (95% CI) | Stage 2 (iCOGs): Ptrendg | Stage 3 (replication): OR (95% CI) | Stage 3 (replication): Ptrend | Combined Pmetah |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rs351413 | 1 | 210524037 | T/C | 0.40 | 1.24 (1.12–1.37) | 2E − 05 | 1.17 (1.06–1.28) | 1.1E − 03 | 1.06 (0.98–1.15) | 1.2E − 01 | 5.8E − 07 |
| rs6738087 | 2 | 65445829 | C/T | 0.61 | 1.21 (1.1–1.34) | 1E − 04 | 1.15 (1.05–1.27) | 2.5E − 03 | 0.96 (0.89–1.04) | 3.4E − 01 | 3.1E − 03 |
| rs1510272 | 3 | 157783418 | C/T | 0.73 | 1.30 (1.16–1.46) | 9E − 06 | 1.17 (1.06–1.31) | 3.1E − 03 | 1.16 (1.06–1.27) | 8.8E − 04 | 1.2E − 09 |
| rs204990 | 6 | 32269408 | T/G | 0.14 | 1.21 (1.08–1.36) | 8E − 04 | 1.20 (1.08–1.34) | 6.2E − 04 | 1.10 (1.01–1.21) | 3.5E − 02 | 5.2E − 07 |
| rs2280152 | 6 | 112482582 | T/C | 0.78 | 1.24 (1.11–1.38) | 2E − 04 | 1.15 (1.04–1.28) | 8.2E − 03 | 1.02 (0.94–1.11) | 6.5E − 01 | 2.6E − 04 |
| rs3778969 | 7 | 2106516 | A/G | 0.31 | 1.19 (1.08–1.32) | 5E − 04 | 1.19 (1.08–1.3) | 2.9E − 04 | 1.12 (1.03–1.21) | 4.7E − 03 | 1.7E − 08 |
| rs1849003 | 19 | 21934101 | A/G | 0.61 | 1.23 (1.11–1.36) | 6E − 05 | 1.18 (1.07–1.3) | 6.0E − 04 | 1.03 (0.96–1.11) | 4.3E − 01 | 9.0E − 06 |
| rs1824794 | 19 | 22881436 | T/C | 0.78 | 1.31 (1.16–1.48) | 2E − 05 | 1.21 (1.08–1.35) | 1.0E − 03 | 1.08 (0.99–1.19) | 7.4E − 02 | 3.1E − 07 |
a dbSNP rs number.
b Chromosome.
c Build 36 bp position.
d Risk-associated/non-risk-associated alleles.
e RAF: frequency of the risk allele.
f OR: per-allele odds ratio.
g Ptrend: P-value for trend, via logistic regression.
h Pmeta: P-value for fixed-effects meta-analysis.
For our newly identified SNP at 3q25, we examined evidence of departure from a log-additive (multiplicative) model, to assess any genotype-specific effect. Using Stage 3 data, the individual genotypic ORs were calculated to be ORhet = 1.10, ORhom = 1.29, compared with the per-allele OR = 1.16 with the common variant as the baseline. A test for a difference between the 1-df and 2-df logistic regression models showed no evidence of deviation from a log-additive model (P = 0.61). We also tested for evidence of interaction between this new SNP and any of the 15 existing TGCT loci previously identified within our group, using Stage 1 data. For each SNP:SNP pair, the combined risk was consistent with the product of the individual risks; hence, no significant deviation from a log-additive combination of risks was observed.
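To make the log-additive comparison concrete, the sketch below computes the genotypic ORs expected under a strictly log-additive model from the per-allele OR, and shows how a 1-df likelihood ratio statistic converts to a P-value. The genotype-level data are not available here, so the statistic passed to `chi2_sf_1df` is back-calculated for illustration only and is not a value reported in the study.

```python
import math

PER_ALLELE_OR = 1.16  # Stage 3 per-allele OR for rs1510272 (Table 2)

# Under a log-additive model, each extra risk allele multiplies the odds
# by the same factor, so ORhom is the square of ORhet.
expected_or_het = PER_ALLELE_OR
expected_or_hom = PER_ALLELE_OR ** 2


def chi2_sf_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Illustrative only: a likelihood ratio statistic of about 0.26 between the
# 2-df genotypic model and the 1-df trend model gives P close to 0.61.
illustrative_p = chi2_sf_1df(0.26)

print(f"Expected ORhet = {expected_or_het:.2f}, expected ORhom = {expected_or_hom:.2f}")
print(f"P for an illustrative statistic of 0.26: {illustrative_p:.2f}")
```

The expected ORhom of roughly 1.35 sits close to the observed 1.29, which is in line with the absence of evidence for deviation from the log-additive model.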
rs1510272 lies in a linkage disequilibrium (LD) block of 51 kb at 3q25.31, in an intergenic region flanked by genes SSR3 (encoding signal sequence receptor, gamma) and KCNAB1 (encoding potassium voltage-gated channel, shaker-related subfamily, beta member 1) in the 5′ direction and TIPARP [encoding TCDD-Inducible Poly(ADP-Ribose) Polymerase] located 92 kb 3′ to the SNP (Figs 2 and 3). Of these flanking genes (which lie in different LD blocks to that of rs1510272), TIPARP is of greatest biological relevance, owing to its reported function in the platelet-derived growth factor (PDGF) signalling pathway (27,28). The PDGF pathway is required for correct development of both foetal and adult Leydig cells, which are located adjacent to the seminiferous tubules in the testicle and are responsible for producing testosterone (29). Of note, in male humans, a number of testicular abnormalities including undescended testes have been linked to testosterone insufficiency (30).
Figure 2.
Regional plot of the new TGCT locus at 3q25: Plot shows the genomic region of association with TGCT on chromosome 3q25.31. Shown by diamonds are the −log10 association P-values of genotyped SNPs, based on meta-analysis across genotyping Stages 1–3 (for rs1510272) and Stages 1–2 for all other genotyped SNPs. Shown by squares are imputed SNPs at this locus, which were imputed from the Stage 1 dataset (see Materials and methods). The intensity of red shading indicates the strength of LD with the index SNP (labelled). Also shown are the SNP-build 36 coordinates in kilobases (kb), recombination rates in centimorgans (cM) per megabase (Mb) (in blue) and the genes in the region (in green).
Figure 3.
Linkage disequilibrium structure for the new TGCT locus: Plot shows the LD structure for the new TGCT locus at 3q25 (data from Stage 1). rs1510272 is shown as a green triangle, with other genotyped SNPs shown as green bars. The LD block of 51 kb in which rs1510272 resides is outlined with a thick black line. Other LD blocks, as defined by Oxford recombination hotspots (26), are outlined with thick black lines.
To fine map the new locus, imputation was performed based on Stage 1 experimental data (986 cases and 4946 controls) and using 1000 Genomes reference panel data, as detailed in the section Materials and methods. There were no imputed SNPs with a stronger signal than rs1510272; the strongest imputed association was from rs10663738 (P = 9.5 × 10−7), located 23 kb upstream of our reported SNP, with all of the top 10 imputed hits (P < 3.0 × 10−6) clustered in a region 10–25 kb upstream of our lead SNP (Fig. 3). There were no strongly associated SNPs within the coding regions of either SSR3 or TIPARP; the strongest association at this locus within any coding region was at rs2280031 (P = 8.7 × 10−5), in exon 13 of KCNAB1; however, the base substitution is synonymous.
We then investigated evidence of association between genotype and differential gene expression at rs1510272 using eQTL (expression quantitative trait loci) data. The strongest association identified (P = 0.007) was for SSR3 expression in data from T-cells; however, after correction for testing in multiple tissue types, this result no longer remained significant (corrected for 20 tests, adjusted P-value threshold <0.0025). There was no evidence of association between rs1510272 and TIPARP expression in the range of cell types available (31). To refine the signal at this locus, we next assessed SNPs in the same LD block as rs1510272, under the hypothesis that the functional mechanism of association lies elsewhere in the block. There were 112 SNPs in LD with our lead SNP (r2 >0.4), which we again tested for association between genotype and differential gene expression using the same eQTL data as mentioned above. Of the 112 SNPs tested, the strongest association was mediated through SNP rs3773714 (r2 = 0.40), which had association with differential SSR3 expression of P = 3.1 × 10−4. Using a P-value threshold of P < 6.0 × 10−5 (corrected for 784 tests, i.e. 112 SNPs in 7 cell lines), this association no longer remained significant. These expression data relate to analyses in fibroblasts and lymphocytes: evaluation of the influence of rs1510272 and other SNPs in the LD block upon gene expression in testicular germ cells would be more informative.
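The multiple-testing thresholds quoted above are consistent with a Bonferroni correction. The sketch below assumes a nominal significance level of 0.05, which the text does not state explicitly; `bonferroni_threshold` is an illustrative helper, not part of the study's analysis code.

```python
ALPHA = 0.05  # assumed nominal significance level (not stated in the text)


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Per-test P-value threshold after Bonferroni correction."""
    return alpha / n_tests


# Lead SNP tested across 20 tissue/dataset combinations.
lead_snp_threshold = bonferroni_threshold(20)
# 112 SNPs in the LD block tested in 7 cell lines.
ld_block_threshold = bonferroni_threshold(112 * 7)

# Reported eQTL P-values for SSR3 expression.
lead_snp_significant = 0.007 < lead_snp_threshold
ld_block_significant = 3.1e-4 < ld_block_threshold

print(f"Lead SNP threshold: {lead_snp_threshold:.4f}, significant: {lead_snp_significant}")
print(f"LD block threshold: {ld_block_threshold:.1e}, significant: {ld_block_significant}")
```

With these assumptions the first threshold is 0.0025 and the second is approximately 6.4 × 10−5, slightly above the rounded value of 6.0 × 10−5 used in the text; neither reported association passes its threshold.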
Enhancer histone markers at rs1510272 are reported in several cell types in both ENCODE and ROADMAP (32,33). In addition, rs1510272 is shown to be a DNase hypersensitivity site in several cell types and also has chromatin immunoprecipitation (ChIP) evidence of the transcriptional regulator Early B-Cell Factor 1 (EBF1) binding at this site (32). Again these data largely relate to analyses in lymphocytes: evaluation in testicular germ cells of functional elements relating to rs1510272 would be more informative with regard to possible mechanisms of transcriptional regulation.
DISCUSSION
In this study, we report a three-stage GWAS experiment, genotyping over 23 000 individuals, to identify a novel locus at 3q25 demonstrating association with TGCT risk, taking the total count of TGCT risk loci to 19. In addition, we confirmed association of the recently reported locus at 7p22.3, with an OR consistent with published estimates.
Functional annotation of rs1510272 is challenging given its position in an intergenic region, although analysis of publicly available transcriptional regulation and expression data has provided evidence suggestive of a possible long-range regulatory mechanism. Expression data from testicular germ cells would be required to further assess the exact nature of these functional relationships.
This new locus at 3q25 accounts for a modest 0.2% of the genetic susceptibility (excess familial risk) to TGCT, bringing the cumulative total across all 19 loci to 16.4 and 23.8% for brothers and fathers/sons of TGCT cases, respectively. In more common cancer types, increasingly large association studies are continuing to identify common variants associated with disease. In breast cancer, >77 SNPs have been associated with the disease and in prostate cancer >70 (34,35). Modelling of data from these analyses in breast and prostate cancer suggests that there may be thousands of common loci of very modest effect underlying these cancers (34,35). Based on our prior analysis (15), it is likely that there are several additional loci for TGCT of similar or lesser effect to rs1510272, which may be identifiable via additional follow-up studies of our data or via meta-analyses in combination with other GWAS datasets. Novel GWA studies and genome-wide imputation using data from densely genotyped reference panels will likely assist in identification of additional loci, further explaining the missing heritability derived from common genetic variation. However, it is unclear what proportion of the common variation associated with TGCT will be tractable in experiments of a size feasible to perform in this uncommon cancer type. Given the rapid increase in TGCT incidence, studies of environmental risk factors will also be of particular interest, as well as the identification of gene–environment interactions which may also account for a portion of unexplained heritable risk.
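The contribution of a single SNP to the excess familial risk can be approximated with a standard calculation. The sketch below assumes a log-additive (multiplicative) allele model and an overall familial relative risk of 8 for brothers, taken as the lower end of the 8- to 10-fold range quoted in the Introduction; the exact formula and familial relative risks used for the figures in the text are not stated, so both are assumptions here.

```python
import math


def locus_familial_rr(risk_allele_freq, per_allele_or):
    """Familial relative risk to first-degree relatives attributable to one SNP.

    Assumes a multiplicative allele model:
    lambda = (p * r**2 + q) / (p * r + q)**2
    """
    p = risk_allele_freq
    q = 1.0 - p
    r = per_allele_or
    return (p * r ** 2 + q) / (p * r + q) ** 2


def fraction_of_familial_risk(risk_allele_freq, per_allele_or, overall_familial_rr):
    """Share of the overall familial relative risk explained, on the log scale."""
    locus_rr = locus_familial_rr(risk_allele_freq, per_allele_or)
    return math.log(locus_rr) / math.log(overall_familial_rr)


# rs1510272: risk allele frequency and Stage 3 per-allele OR from Table 2.
# The familial relative risk of 8.0 for brothers is an assumed input.
fraction_brothers = fraction_of_familial_risk(0.73, 1.16, 8.0)
print(f"Fraction of familial risk to brothers explained: {fraction_brothers:.2%}")
```

Under these assumptions the estimate is about 0.2%, in line with the figure given in the text for the new locus.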
The question of potential clinical utility from these common SNPs still remains an area of active debate as the strong effect sizes observed in TGCT loci support the potential for meaningful risk profiling among men. This could in turn enable strategies of screening and targeted prevention. Using these 19 TGCT risk markers under a polygenic risk score (PRS) model and assuming no interaction between SNPs, the top 1% of men with the highest risk genotypes have an 8.7-fold elevated risk of TGCT compared with population average. In comparable PRS models for breast and prostate cancer, the individuals in the top 1% of risk show only a 3- and 5-fold increase in risk, respectively (34) (Mavaddat et al. 2014, manuscript submitted). Despite this superior performance in risk discrimination, a number of factors continue to limit the potential utility of genetic screening at a population level, not least the rare nature and good clinical prognosis of TGCT. However, targeted strategies such as screening populations with already elevated baseline risk could afford a more immediate clinical opportunity.
In summary, via genotyping of 4098 cases of TGCT and 18 972 controls across a three-stage experiment, we have identified a new risk locus for TGCT at 3q25 in addition to the 15 previously reported by our group, taking the total number of associated loci reported to 19 (14–16). It is likely there are several additional loci for TGCT of similar or lesser effect to rs1510272, which may be identifiable via additional follow-up studies and collaborative meta-analyses in combination with other GWAS datasets (15).
MATERIALS AND METHODS
Samples
Stage 1: GWAS
Cases were genotyped on the Illumina HumanCNV370-Duo bead array, and controls were genotyped on the Illumina Infinium 1.2M array at the Wellcome Trust Sanger Institute, as previously described (14,16). We used data on 314 861 SNPs that were successfully genotyped on both arrays. We excluded individuals with a low call rate (<95%), with abnormal autosomal heterozygosity or with >10% non-Western European ancestry (based on multi-dimensional scaling). We included 986 cases and 4946 controls in the final analysis. We filtered out all SNPs with (i) a minor allele frequency of <1%, (ii) a call rate of <95% in cases or controls, (iii) a minor allele frequency of 1–5% and a call rate of <99% or (iv) deviation from Hardy–Weinberg equilibrium (10−12 in controls and 10−5 in cases). The final number of SNPs passing QC filters was 307 291. For these SNPs, association with TGCT risk was tested, and per-allele ORs were estimated, using 1-df logistic regression with principal components 1–6 as covariates. All other statistical methods were performed as previously described (14,16).
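The SNP-level QC criteria above can be expressed as a simple filter. The sketch below is illustrative only: the function name, its arguments and the example values are hypothetical, the Hardy–Weinberg thresholds are applied exactly as written in the text, and criterion (iii) is assumed to apply to the lower of the case and control call rates.

```python
def passes_stage1_snp_qc(maf, call_rate_cases, call_rate_controls,
                         hwe_p_cases, hwe_p_controls):
    """Return True if a SNP passes the four Stage 1 QC criteria."""
    # Assumption: call-rate criteria apply to the lower of the two groups.
    min_call_rate = min(call_rate_cases, call_rate_controls)

    if maf < 0.01:                                  # (i) rare variant
        return False
    if min_call_rate < 0.95:                        # (ii) low call rate
        return False
    if maf <= 0.05 and min_call_rate < 0.99:        # (iii) MAF 1-5% needs call rate >= 99%
        return False
    if hwe_p_controls < 1e-12 or hwe_p_cases < 1e-5:  # (iv) Hardy-Weinberg deviation
        return False
    return True


# Hypothetical example SNPs, for illustration only.
common_snp_ok = passes_stage1_snp_qc(0.30, 0.98, 0.97, 0.40, 0.60)
low_freq_snp_ok = passes_stage1_snp_qc(0.03, 0.98, 0.995, 0.40, 0.60)
rare_snp_ok = passes_stage1_snp_qc(0.005, 1.00, 1.00, 0.40, 0.60)

print(common_snp_ok, low_freq_snp_ok, rare_snp_ok)
```

In the example, the common SNP passes, the low-frequency SNP fails because its call rate in cases is below 99%, and the rare SNP fails on minor allele frequency.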
Stage 2: iCOGs
Genotyping was conducted using a custom Illumina Infinium array (iCOGS array) comprising 211 155 SNPs selected across multiple consortia within the Collaborative Oncological Gene-environment Study (COGS), as previously described (15,36). Of the 1050 SNPs submitted from our analysis, 740 attained an Illumina design score of ≥0.8 and were included on the array. Quality control (QC) was conducted on the full SNP set of 211 155 SNPs on the iCOGS array, with QC exclusions applied as follows to subjects: (i) subjects with overall call rate of <95% or low or high heterozygosity (P < 10−6) (5 cases), (ii) using identity-by-state estimates based on 37 046 uncorrelated SNPs, we identified ‘cryptic’ duplicates and related samples and the sample with the lower call rate was excluded (7 cases), (iii) we identified ethnic outliers by multi-dimensional scaling by combining the iCOGS data with the three HapMap2 populations using 37 046 uncorrelated markers and removed individuals with >10% non-Western European ancestry (18 cases). We included 1064 cases and 10 082 controls in the final analysis. SNPs were excluded for: (i) discrepant calls in >2% of duplicate samples across COGS consortia, (ii) call rate of <95%, MAF <1%, call rate <99% if MAF = 1–5%, (iii) deviation from Hardy–Weinberg (P < 10−5 in controls, P < 10−12 in cases). Following QC exclusions applied individually to case and control data, we included genotypes from 694 SNPs in subsequent analyses. Association between genotype and TGCT risk was tested using a 1-df trend test, adjusted for principal components 1–6. All other statistical methods were performed as previously described (15).
Stage 3: Taqman genotyping
Genotyping for Stage 3 was performed by 5′ exonuclease assay (Taqman) using the ABI Prism 7900HT Sequence Detection System according to the manufacturer's instructions. Primers and probes were supplied directly by Applied Biosystems, with seven of the eight SNPs available pre-designed ‘off-the-shelf’ and one SNP (rs1849003) requiring custom design. Assays included four negative controls per plate, and 2% of all genotyped samples were duplicated (spread across the plates) as within-platform duplicates (observed concordance 97.1%). In addition, 1696 controls genotyped in Stage 1 were included as cross-platform duplicates in Stage 3 (observed concordance 99.3%). All duplicated samples were excluded from the subsequent Stage 3 analysis.
Statistical analyses
The eight SNPs in Stage 3 were tested for TGCT risk association, and per-allele ORs were estimated, 1 df, using logistic regression in-line with the Stage 1 and Stage 2 analyses. We then obtained overall combined significance levels across all three stages using a fixed-effects meta-analysis, to derive a 1-df test, using a threshold of P < 5 × 10−8 to denote genome-wide significance.
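A fixed-effects meta-analysis of this kind is commonly implemented by inverse-variance weighting of the per-stage log ORs. The sketch below assumes that approach and recovers standard errors from the rounded 95% CIs in Table 2, so it only approximates the published combined P-value; the function names are illustrative and do not come from the study's code.

```python
import math


def log_or_and_se(odds_ratio, ci_lower, ci_upper):
    """Convert an OR and its 95% CI to a log OR and its standard error."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * 1.96)
    return beta, se


def fixed_effects_meta(estimates):
    """Inverse-variance weighted pooling of (log OR, SE) pairs; two-sided P."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled_beta, pooled_se, p_value


# rs1510272: OR (95% CI) for Stages 1, 2 and 3, as rounded in Table 2.
stages = [(1.30, 1.16, 1.46), (1.17, 1.06, 1.31), (1.16, 1.06, 1.27)]
estimates = [log_or_and_se(*stage) for stage in stages]
pooled_beta, pooled_se, p_meta = fixed_effects_meta(estimates)

print(f"Pooled log OR = {pooled_beta:.3f} (SE {pooled_se:.3f}), P = {p_meta:.1e}")
```

Because the inputs are rounded to two decimal places, the resulting P-value is of the same order of magnitude as the combined value in Table 2 (10−9) rather than an exact reproduction, and it passes the genome-wide significance threshold of P < 5 × 10−8.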
We examined for statistical interaction between rs1510272 and the existing 15 TGCT predisposition loci identified in our group (using Stage 1 data) by evaluating the effect of adding an interaction term to the regression model, using a likelihood ratio test (using a significance threshold of P < 0.003 to account for 15 tests).
The LD block at this locus was evaluated using the HapMap recombination rates (cM/Mb) and defined using the Oxford recombination hotspots (26). The LD block was mapped and visualized using Haploview software. To further refine LD patterns at this locus, we identified all variants correlated with our sentinel SNP (r2 > 0.4 and <250 kb) as reported within the 1000 Genomes Project. We used the ENCODE and HaploReg databases to investigate evidence of transcriptional regulation at our identified locus, to assess whether: (i) the variant resides in a region in which modification of histone proteins is suggestive of enhancer and other regulatory activity (H3K4Me1 and H3K27Ac histone modification) or promoter activity (H3K4Me3 histone modification), (ii) the variant lies in a region where the chromatin is hypersensitive to cutting by the DNase enzyme (suggestive of a regulatory region), (iii) the variant lies in a region of binding of transcription factor proteins [as assayed by chromatin immunoprecipitation with antibodies specific to the transcription factor followed by sequencing of the precipitated DNA (ChIP-seq)], (iv) the variant affects a specific regulatory motif, as evaluated from position-weighted matrices assembled from TRANSFAC, JASPAR and protein-binding microarray experiments.
We investigated evidence of association between the SNPs at our locus and changes in gene expression using GENEVAR, a database for analysis of SNP–gene interactions using eQTL (expression quantitative trait loci) studies. Data were available for four sources: (i) lymphoblastoid cell lines (LCL) from 726 HapMap individuals (HapMap3), (ii) LCL and skin collected from 856 healthy female twins of the MuTHER resource (Muther pilot Twin 1), (iii) adipose, LCL and skin derived from a subset of 160 MuTHER healthy female twins (Muther pilot Twin 2) and (iv) fibroblast, LCL and T-cells derived from umbilical cords of 75 Geneva GenCord individuals (Gencord). Spearman Rank Correlation between normalized gene expression levels and the count of one of the alleles of the SNP (0, 1 or 2) was analysed with significance assessed by permutation (10 000 permutations). Expression of genes within 250 kb of the sentinel SNP was examined; a formal threshold to account for multiple testing was not established owing to insufficient levels of association to warrant further testing. All genomic references are based on NCBI Build 36. Analyses were performed using R (v3.02), Stata 12 (StataCorp, College Station) and PLINK (v1.07) software.
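The eQTL test described above, a Spearman rank correlation between allele count and expression with significance assessed by permutation, can be sketched as follows. This is a minimal standard-library illustration and not the GENEVAR implementation; the genotype and expression values are toy data invented for the example, and a two-sided test is assumed.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(genotypes, expression, n_permutations=10000, seed=0):
    """Two-sided permutation P-value for the Spearman correlation."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(spearman_rho(genotypes, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data: allele counts (0, 1 or 2) and normalized expression levels.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.0, 2.1, 3.0, 3.3]

toy_rho = spearman_rho(toy_genotypes, toy_expression)
toy_p = permutation_p_value(toy_genotypes, toy_expression, n_permutations=2000)
print(f"rho = {toy_rho:.3f}, permutation P = {toy_p:.3f}")
```

Permutation avoids distributional assumptions about the expression values, which is useful when allele counts take only three distinct values and produce many tied ranks.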
Imputation
To fine map this locus, imputation was performed across a 5-Mb region centred on rs1510272, using the genotyped data from Stage 1. The 1000 Genomes Phase 1 data (Sept-13 release) were used as the reference panel, with haplotypes pre-phased using SHAPEIT2. Imputation was performed using IMPUTE2 software; the association between imputed genotype and TGCT was tested using SNPTEST, under a frequentist model of association. QC was performed on the imputed SNPs, excluding those with INFO score <0.8 and MAF <0.01.
SUPPLEMENTARY MATERIAL
FUNDING
We acknowledge National Health Service funding to the National Institute for Health Research Biomedical Research Centre. The COGS research initiative, leading to the results in Stage 2, has received support from the European Community's Seventh Framework Programme under grant agreement n° 223175 (HEALTH-F2-2009-223175), Cancer Research UK (Grants C1287/A10710, C5047/A7357 and C588/A10589) and Prostate Action (now known as Prostate Cancer UK). The Stage 1 GWAS study was supported by the Institute of Cancer Research, Cancer Research UK and the Wellcome Trust. Stage 2 and Stage 3 of this study were supported by Movember and the Institute of Cancer Research. K.L. is supported by a PhD fellowship from Cancer Research UK. D.F.E. is a Principal Research Fellow of Cancer Research UK. We acknowledge support from the NIHR to the Biomedical Research Centre at The Institute of Cancer Research and Royal Marsden NHS Foundation Trust.
ACKNOWLEDGEMENTS
We thank the subjects with TGCT and the clinicians involved in their care for participation in this study. We also thank J. Pugh, R. Linger, J. Marke, D. Hughes and D. Pernet for recruitment of subjects and database entry for the TGCT collections. We thank Liz Rapley for her contributions to sample collection, genotyping and study design. We thank the UK Genetics of Prostate Cancer Study (UKGPCS) study teams for the recruitment of the UKGPCS controls. This study makes use of data generated by the Wellcome Trust Case Control Consortium (WTCCC2). A full list of the investigators who contributed to the generation of the data is available from the WTCCC website. The second stage of this study would not have been possible without the contributions of the following: P. Hall (COGS); D.F. Easton (BCAC), A. Berchuck (OCAC), R. Eeles (PRACTICAL), G. Chenevix-Trench (CIMBA), J. Dennis, A.M. Dunning, A. Lee, E. Dicks and D.F. Easton (Cambridge); J. Benitez, A. Gonzalez-Neira and the staff of the CNIO genotyping unit; J. Simard and D.C. Tessier, F. Bacot, D. Vincent, S. LaBoissière and F. Robidoux and the staff of the McGill University and Génome Québec Innovation Centre; S.E. Bojesen, S.F. Nielsen, B.G. Nordestgaard and the staff of the Copenhagen DNA laboratory; and J.M. Cunningham, S.A. Windebank, C.A. Hilker, J. Meyer and the staff of Mayo Clinic Genotyping Core Facility.
Conflict of Interest statement. None declared.
Website Resource
PLINK: http://pngu.mgh.harvard.edu/~purcell/plink/
Haploview: http://www.broadinstitute.org/haploview/haploview/
SNAP: http://www.broadinstitute.org/mpg/snap/
Wellcome Trust Case Control Consortium 2: http://www.wtccc.org.uk/ccc2/
COGS: http://www.cogseu.org/
Haploreg: http://www.broadinstitute.org/mammals/haploreg/haploreg.php
Genevar: http://www.sanger.ac.uk/resources/software/genevar/
Contributor Information
Collaborators: J. Pugh, R. Linger, J. Marke, D. Hughes, D. Pernet, P. Hall, D.F. Easton, A. Berchuck, R. Eeles, G. Chenevix-Trench, J. Dennis, A.M. Dunning, A. Lee, E. Dicks, D.F. Easton, J. Benitez, A. Gonzalez-Neira, J. Simard, D.C. Tessier, F. Bacot, D. Vincent, S. LaBoissière, F. Robidoux, S.E. Bojesen, S.F. Nielsen, B.G. Nordestgaard, J.M. Cunningham, S.A. Windebank, C.A. Hilker, and J. Meyer
REFERENCES
- 1. Bray F., Ferlay J., Devesa S.S., McGlynn K.A., Moller H. Interpreting the international trends in testicular seminoma and nonseminoma incidence. Nat. Clin. Pract. Urol. 2006;3:532–543. doi: 10.1038/ncpuro0606.
- 2. Le Cornet C., Lortet-Tieulent J., Forman D., Beranger R., Flechon A., Fervers B., Schuz J., Bray F. Testicular cancer incidence to rise by 25% by 2025 in Europe? Model-based predictions in 40 countries using population-based registry data. Eur. J. Cancer. 2014;50:831–839. doi: 10.1016/j.ejca.2013.11.035.
- 3. Rosen A., Jayram G., Drazer M., Eggener S.E. Global trends in testicular cancer incidence and mortality. Eur. Urol. 2011;60:374–379. doi: 10.1016/j.eururo.2011.05.004.
- 4. Stang A., Trabert B., Wentzensen N., Cook M.B., Rusner C., Oosterhuis J.W., McGlynn K.A. Gonadal and extragonadal germ cell tumours in the United States, 1973–2007. Int. J. Androl. 2012;35:616–625. doi: 10.1111/j.1365-2605.2011.01245.x.
- 5. Chia V.M., Quraishi S.M., Devesa S.S., Purdue M.P., Cook M.B., McGlynn K.A. International trends in the incidence of testicular cancer, 1973–2002. Cancer Epidemiol. Biomarkers Prev. 2010;19:1151–1159. doi: 10.1158/1055-9965.EPI-10-0031.
- 6. Tan I.B., Ang K.K., Ching B.C., Mohan C., Toh C.K., Tan M.H. Testicular microlithiasis predicts concurrent testicular germ cell tumors and intratubular germ cell neoplasia of unclassified type in adults: a meta-analysis and systematic review. Cancer. 2010;116:4520–4532. doi: 10.1002/cncr.25231.
- 7. Lutke Holzik M.F., Rapley E.A., Hoekstra H.J., Sleijfer D.T., Nolte I.M., Sijmons R.H. Genetic predisposition to testicular germ-cell tumours. Lancet Oncol. 2004;5:363–371. doi: 10.1016/S1470-2045(04)01493-7.
- 8. Beranger R., Le Cornet C., Schuz J., Fervers B. Occupational and environmental exposures associated with testicular germ cell tumours: systematic review of prenatal and life-long exposures. PLoS One. 2013;8:e77130. doi: 10.1371/journal.pone.0077130.
- 9. Hemminki K., Li X. Familial risk in testicular cancer as a clue to a heritable and environmental aetiology. Br. J. Cancer. 2004;90:1765–1770. doi: 10.1038/sj.bjc.6601714.
- 10. Swerdlow A.J., De Stavola B.L., Swanwick M.A., Maconochie N.E. Risks of breast and testicular cancers in young adult twins in England and Wales: evidence on prenatal and genetic aetiology. Lancet. 1997;350:1723–1728. doi: 10.1016/s0140-6736(97)05526-8.
- 11. McGlynn K.A., Devesa S.S., Graubard B.I., Castle P.E. Increasing incidence of testicular germ cell tumors among black men in the United States. J. Clin. Oncol. 2005;23:5757–5761. doi: 10.1200/JCO.2005.08.227.
- 12. Crockford G.P., Linger R., Hockley S., Dudakia D., Johnson L., Huddart R., Tucker K., Friedlander M., Phillips K.A., Hogg D., et al. Genome-wide linkage screen for testicular germ cell tumour susceptibility loci. Hum. Mol. Genet. 2006;15:443–451. doi: 10.1093/hmg/ddi459.
- 13. Nathanson K.L., Kanetsky P.A., Hawes R., Vaughn D.J., Letrero R., Tucker K., Friedlander M., Phillips K.A., Hogg D., Jewett M.A., et al. The Y deletion gr/gr and susceptibility to testicular germ cell tumor. Am. J. Hum. Genet. 2005;77:1034–1043. doi: 10.1086/498455.
- 14. Rapley E.A., Turnbull C., Al Olama A.A., Dermitzakis E.T., Linger R., Huddart R.A., Renwick A., Hughes D., Hines S., Seal S., et al. A genome-wide association study of testicular germ cell tumor. Nat. Genet. 2009;41:807–810. doi: 10.1038/ng.394.
- 15. Ruark E., Seal S., McDonald H., Zhang F., Elliot A., Lau K., Perdeaux E., Rapley E., Eeles R., Peto J., et al. Identification of nine new susceptibility loci for testicular cancer, including variants near DAZL and PRDM14. Nat. Genet. 2013;45:686–689. doi: 10.1038/ng.2635.
- 16. Turnbull C., Rapley E.A., Seal S., Pernet D., Renwick A., Hughes D., Ricketts M., Linger R., Nsengimana J., Deloukas P., et al. Variants near DMRT1, TERT and ATF7IP are associated with testicular germ cell cancer. Nat. Genet. 2010;42:604–607. doi: 10.1038/ng.607.
- 17. Kanetsky P.A., Mitra N., Vardhanabhuti S., Li M., Vaughn D.J., Letrero R., Ciosek S.L., Doody D.R., Smith L.M., Weaver J., et al. Common variation in KITLG and at 5q31.3 predisposes to testicular germ cell cancer. Nat. Genet. 2009;41:811–815. doi: 10.1038/ng.393.
- 18. Kanetsky P.A., Mitra N., Vardhanabhuti S., Vaughn D.J., Li M., Ciosek S.L., Letrero R., D'Andrea K., Vaddi M., Doody D.R., et al. A second independent locus within DMRT1 is associated with testicular germ cell tumor susceptibility. Hum. Mol. Genet. 2011;20:3109–3117. doi: 10.1093/hmg/ddr207.
- 19. Chung C.C., Kanetsky P.A., Wang Z., Hildebrandt M.A., Koster R., Skotheim R.I., Kratz C.P., Turnbull C., Cortessis V.K., Bakken A.C., et al. Meta-analysis identifies four new loci associated with testicular germ cell tumor. Nat. Genet. 2013;45:680–685. doi: 10.1038/ng.2634.
- 20. Schumacher F.R., Wang Z., Skotheim R.I., Koster R., Chung C.C., Hildebrandt M.A., Kratz C.P., Bakken A.C., Bishop D.T., Cook M.B., et al. Testicular germ cell tumor susceptibility associated with the UCK2 locus on chromosome 1q23. Hum. Mol. Genet. 2013;22:2748–2753. doi: 10.1093/hmg/ddt109.
- 21. Boldajipour B., Raz E. What is left behind—quality control in germ cell migration. Sci. STKE. 2007;2007:pe16. doi: 10.1126/stke.3832007pe16.
- 22. Heaney J.D., Lam M.Y., Michelson M.V., Nadeau J.H. Loss of the transmembrane but not the soluble kit ligand isoform increases testicular germ cell tumor susceptibility in mice. Cancer Res. 2008;68:5193–5197. doi: 10.1158/0008-5472.CAN-08-0779.
- 23. Forbes S.A., Tang G., Bindal N., Bamford S., Dawson E., Cole C., Kok C.Y., Jia M., Ewing R., Menzies A., et al. COSMIC (the Catalogue of Somatic Mutations in Cancer): a resource to investigate acquired mutations in human cancer. Nucl. Acids Res. 2010;38:D652–D657. doi: 10.1093/nar/gkp995.
- 24. Sasaki A., Taketomi T., Kato R., Saeki K., Nonami A., Sasaki M., Kuriyama M., Saito N., Shibuya M., Yoshimura A. Mammalian Sprouty4 suppresses Ras-independent ERK activation by binding to Raf1. Cell Cycle. 2003;2:281–282.
- 25. Yan W., Samson M., Jegou B., Toppari J. Bcl-w forms complexes with Bax and Bak, and elevated ratios of Bax/Bcl-w and Bak/Bcl-w correspond to spermatogonial and spermatocyte apoptosis in the testis. Mol. Endocrinol. 2000;14:682–699. doi: 10.1210/mend.14.5.0443.
- 26. Myers S., Bottolo L., Freeman C., McVean G., Donnelly P. A fine-scale map of recombination rates and hotspots across the human genome. Science. 2005;310:321–324. doi: 10.1126/science.1117196.
- 27. Schmahl J., Raymond C.S., Soriano P. PDGF signaling specificity is mediated through multiple immediate early genes. Nat. Genet. 2007;39:52–60. doi: 10.1038/ng1922.
- 28. Wu E., Palmer N., Tian Z., Moseman A.P., Galdzicki M., Wang X., Berger B., Zhang H., Kohane I.S. Comprehensive dissection of PDGF-PDGFR signaling pathways in PDGFR genetically defined cells. PLoS One. 2008;3:e3794. doi: 10.1371/journal.pone.0003794.
- 29. Schmahl J., Rizzolo K., Soriano P. The PDGF signaling pathway controls multiple steroid-producing lineages. Genes Dev. 2008;22:3255–3267. doi: 10.1101/gad.1723908.
- 30. Habert R., Lejeune H., Saez J.M. Origin, differentiation and regulation of fetal and adult Leydig cells. Mol. Cell Endocrinol. 2001;179:47–74. doi: 10.1016/s0303-7207(01)00461-0.
- 31. Yang T.P., Beazley C., Montgomery S.B., Dimas A.S., Gutierrez-Arcelus M., Stranger B.E., Deloukas P., Dermitzakis E.T. Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies. Bioinformatics. 2010;26:2474–2476. doi: 10.1093/bioinformatics/btq452.
- 32. Consortium E.P., Bernstein B.E., Birney E., Dunham I., Green E.D., Gunter C., Snyder M. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
- 33. Bernstein B.E., Stamatoyannopoulos J.A., Costello J.F., Ren B., Milosavljevic A., Meissner A., Kellis M., Marra M.A., Beaudet A.L., Ecker J.R., et al. The NIH roadmap epigenomics mapping consortium. Nat. Biotechnol. 2010;28:1045–1048. doi: 10.1038/nbt1010-1045.
- 34. Eeles R.A., Olama A.A., Benlloch S., Saunders E.J., Leongamornlert D.A., Tymrakiewicz M., Ghoussaini M., Luccarini C., Dennis J., Jugurnauth-Little S., et al. Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nat. Genet. 2013;45:385–391. doi: 10.1038/ng.2560. 391e381–382.
- 35. Michailidou K., Hall P., Gonzalez-Neira A., Ghoussaini M., Dennis J., Milne R.L., Schmidt M.K., Chang-Claude J., Bojesen S.E., Bolla M.K., et al. Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat. Genet. 2013;45:353–361. doi: 10.1038/ng.2563. 361e351–352.
- 36. Sakoda L.C., Jorgenson E., Witte J.S. Turning of COGS moves forward findings for hormonally mediated cancers. Nat. Genet. 2013;45:345–348. doi: 10.1038/ng.2587.