JNCI Journal of the National Cancer Institute. 2015 Oct 12;107(12):djv279. doi: 10.1093/jnci/djv279

Analysis of Heritability and Shared Heritability Based on Genome-Wide Association Studies for 13 Cancer Types

Joshua N Sampson 1,, William A Wheeler 1,*, Meredith Yeager 1,*, Orestis Panagiotou 1,*, Zhaoming Wang 1,*, Sonja I Berndt 1,*, Qing Lan 1,*, Christian C Abnet 1,*, Laufey T Amundadottir 1,*, Jonine D Figueroa 1,*, Maria Teresa Landi 1,*, Lisa Mirabello 1,*, Sharon A Savage 1,*, Philip R Taylor 1,*, Immaculata De Vivo 1,*, Katherine A McGlynn 1,*, Mark P Purdue 1,*, Preetha Rajaraman 1,*, Hans-Olov Adami 1, Anders Ahlbom 1, Demetrius Albanes 1, Maria Fernanda Amary 1, She-Juan An 1, Ulrika Andersson 1, Gerald Andriole Jr 1, Irene L Andrulis 1, Emanuele Angelucci 1, Stephen M Ansell 1, Cecilia Arici 1, Bruce K Armstrong 1, Alan A Arslan 1, Melissa A Austin 1, Dalsu Baris 1, Donald A Barkauskas 1, Bryan A Bassig 1, Nikolaus Becker 1, Yolanda Benavente 1, Simone Benhamou 1, Christine Berg 1, David Van Den Berg 1, Leslie Bernstein 1, Kimberly A Bertrand 1, Brenda M Birmann 1, Amanda Black 1, Heiner Boeing 1, Paolo Boffetta 1, Marie-Christine Boutron-Ruault 1, Paige M Bracci 1, Louise Brinton 1, Angela R Brooks-Wilson 1, H Bas Bueno-de-Mesquita 1, Laurie Burdett 1, Julie Buring 1, Mary Ann Butler 1, Qiuyin Cai 1, Geraldine Cancel-Tassin 1, Federico Canzian 1, Alfredo Carrato 1, Tania Carreon 1, Angela Carta 1, John K C Chan 1, Ellen T Chang 1, Gee-Chen Chang 1, I-Shou Chang 1, Jiang Chang 1, Jenny Chang-Claude 1, Chien-Jen Chen 1, Chih-Yi Chen 1, Chu Chen 1, Chung-Hsing Chen 1, Constance Chen 1, Hongyan Chen 1, Kexin Chen 1, Kuan-Yu Chen 1, Kun-Chieh Chen 1, Ying Chen 1, Ying-Hsiang Chen 1, Yi-Song Chen 1, Yuh-Min Chen 1, Li-Hsin Chien 1, María-Dolores Chirlaque 1, Jin Eun Choi 1, Yi Young Choi 1, Wong-Ho Chow 1, Charles C Chung 1, Jacqueline Clavel 1, Françoise Clavel-Chapelon 1, Pierluigi Cocco 1, Joanne S Colt 1, Eva Comperat 1, Lucia Conde 1, Joseph M Connors 1, David Conti 1, Victoria K Cortessis 1, Michelle Cotterchio 1, Wendy Cozen 1, Simon Crouch 1, Marta Crous-Bou 1, Olivier Cussenot 1, Faith G Davis 1, Ti Ding 1, W Ryan Diver 1, Miren Dorronsoro 1, Laure 
Dossus 1, Eric J Duell 1, Maria Grazia Ennas 1, Ralph L Erickson 1, Maria Feychting 1, Adrienne M Flanagan 1, Lenka Foretova 1, Joseph F Fraumeni Jr 1, Neal D Freedman 1, Laura E Beane Freeman 1, Charles Fuchs 1, Manuela Gago-Dominguez 1, Steven Gallinger 1, Yu-Tang Gao 1, Susan M Gapstur 1, Montserrat Garcia-Closas 1, Reina García-Closas 1, Randy D Gascoyne 1, Julie Gastier-Foster 1, Mia M Gaudet 1, J Michael Gaziano 1, Carol Giffen 1, Graham G Giles 1, Edward Giovannucci 1, Bengt Glimelius 1, Michael Goggins 1, Nalan Gokgoz 1, Alisa M Goldstein 1, Richard Gorlick 1, Myron Gross 1, Robert Grubb III 1, Jian Gu 1, Peng Guan 1, Marc Gunter 1, Huan Guo 1, Thomas M Habermann 1, Christopher A Haiman 1, Dina Halai 1, Goran Hallmans 1, Manal Hassan 1, Claudia Hattinger 1, Qincheng He 1, Xingzhou He 1, Kathy Helzlsouer 1, Brian Henderson 1, Roger Henriksson 1, Henrik Hjalgrim 1, Judith Hoffman-Bolton 1, Chancellor Hohensee 1, Theodore R Holford 1, Elizabeth A Holly 1, Yun-Chul Hong 1, Robert N Hoover 1, Pamela L Horn-Ross 1, G M Monawar Hosain 1, H Dean Hosgood III 1, Chin-Fu Hsiao 1, Nan Hu 1, Wei Hu 1, Zhibin Hu 1, Ming-Shyan Huang 1, Jose-Maria Huerta 1, Jen-Yu Hung 1, Amy Hutchinson 1, Peter D Inskip 1, Rebecca D Jackson 1, Eric J Jacobs 1, Mazda Jenab 1, Hyo-Sung Jeon 1, Bu-Tian Ji 1, Guangfu Jin 1, Li Jin 1, Christoffer Johansen 1, Alison Johnson 1, Yoo Jin Jung 1, Rudolph Kaaks 1, Aruna Kamineni 1, Eleanor Kane 1, Chang Hyun Kang 1, Margaret R Karagas 1, Rachel S Kelly 1, Kay-Tee Khaw 1, Christopher Kim 1, Hee Nam Kim 1, Jin Hee Kim 1, Jun Suk Kim 1, Yeul Hong Kim 1, Young Tae Kim 1, Young-Chul Kim 1, Cari M Kitahara 1, Alison P Klein 1, Robert J Klein 1, Manolis Kogevinas 1, Takashi Kohno 1, Laurence N Kolonel 1, Charles Kooperberg 1, Anne Kricker 1, Vittorio Krogh 1, Hideo Kunitoh 1, Robert C Kurtz 1, Sun-Seog Kweon 1, Andrea LaCroix 1, Charles Lawrence 1, Fernando Lecanda 1, Victor Ho Fun Lee 1, Donghui Li 1, Haixin Li 1, Jihua Li 1, Yao-Jen Li 1, Yuqing Li 1, 
Linda M Liao 1, Mark Liebow 1, Tracy Lightfoot 1, Wei-Yen Lim 1, Chien-Chung Lin 1, Dongxin Lin 1, Sara Lindstrom 1, Martha S Linet 1, Brian K Link 1, Chenwei Liu 1, Jianjun Liu 1, Li Liu 1, Börje Ljungberg 1, Josep Lloreta 1, Simonetta Di Lollo 1, Daru Lu 1, Eiluv Lund 1, Nuria Malats 1, Satu Mannisto 1, Loic Le Marchand 1, Neyssa Marina 1, Giovanna Masala 1, Giuseppe Mastrangelo 1, Keitaro Matsuo 1, Marc Maynadie 1, James McKay 1, Roberta McKean-Cowdin 1, Mads Melbye 1, Beatrice S Melin 1, Dominique S Michaud 1, Tetsuya Mitsudomi 1, Alain Monnereau 1, Rebecca Montalvan 1, Lee E Moore 1, Lotte Maxild Mortensen 1, Alexandra Nieters 1, Kari E North 1, Anne J Novak 1, Ann L Oberg 1, Kenneth Offit 1, In-Jae Oh 1, Sara H Olson 1, Domenico Palli 1, William Pao 1, In Kyu Park 1, Jae Yong Park 1, Kyong Hwa Park 1, Ana Patiño-Garcia 1, Sofia Pavanello 1, Petra H M Peeters 1, Reury-Perng Perng 1, Ulrike Peters 1, Gloria M Petersen 1, Piero Picci 1, Malcolm C Pike 1, Stefano Porru 1, Jennifer Prescott 1, Ludmila Prokunina-Olsson 1, Biyun Qian 1, You-Lin Qiao 1, Marco Rais 1, Elio Riboli 1, Jacques Riby 1, Harvey A Risch 1, Cosmeri Rizzato 1, Rebecca Rodabough 1, Eve Roman 1, Morgan Roupret 1, Avima M Ruder 1, Silvia de Sanjose 1, Ghislaine Scelo 1, Alan Schned 1, Fredrick Schumacher 1, Kendra Schwartz 1, Molly Schwenn 1, Katia Scotlandi 1, Adeline Seow 1, Consol Serra 1, Massimo Serra 1, Howard D Sesso 1, Veronica Wendy Setiawan 1, Gianluca Severi 1, Richard K Severson 1, Tait D Shanafelt 1, Hongbing Shen 1, Wei Shen 1, Min-Ho Shin 1, Kouya Shiraishi 1, Xiao-Ou Shu 1, Afshan Siddiq 1, Luis Sierrasesúmaga 1, Alan Dart Loon Sihoe 1, Christine F Skibola 1, Alex Smith 1, Martyn T Smith 1, Melissa C Southey 1, John J Spinelli 1, Anthony Staines 1, Meir Stampfer 1, Marianna C Stern 1, Victoria L Stevens 1, Rachael S Stolzenberg-Solomon 1, Jian Su 1, Wu-Chou Su 1, Malin Sund 1, Jae Sook Sung 1, Sook Whan Sung 1, Wen Tan 1, Wei Tang 1, Adonina Tardón 1, David Thomas 1, Carrie A 
Thompson 1, Lesley F Tinker 1, Roberto Tirabosco 1, Anne Tjønneland 1, Ruth C Travis 1, Dimitrios Trichopoulos 1, Fang-Yu Tsai 1, Ying-Huang Tsai 1, Margaret Tucker 1, Jenny Turner 1, Claire M Vajdic 1, Roel C H Vermeulen 1, Danylo J Villano 1, Paolo Vineis 1, Jarmo Virtamo 1, Kala Visvanathan 1, Jean Wactawski-Wende 1, Chaoyu Wang 1, Chih-Liang Wang 1, Jiu-Cun Wang 1, Junwen Wang 1, Fusheng Wei 1, Elisabete Weiderpass 1, George J Weiner 1, Stephanie Weinstein 1, Nicolas Wentzensen 1, Emily White 1, Thomas E Witzig 1, Brian M Wolpin 1, Maria Pik Wong 1, Chen Wu 1, Guoping Wu 1, Junjie Wu 1, Tangchun Wu 1, Wei Wu 1, Xifeng Wu 1, Yi-Long Wu 1, Jay S Wunder 1, Yong-Bing Xiang 1, Jun Xu 1, Ping Xu 1, Pan-Chyr Yang 1, Tsung-Ying Yang 1, Yuanqing Ye 1, Zhihua Yin 1, Jun Yokota 1, Ho-Il Yoon 1, Chong-Jen Yu 1, Herbert Yu 1, Kai Yu 1, Jian-Min Yuan 1, Andrew Zelenetz 1, Anne Zeleniuch-Jacquotte 1, Xu-Chao Zhang 1, Yawei Zhang 1, Xueying Zhao 1, Zhenhong Zhao 1, Hong Zheng 1, Tongzhang Zheng 1, Wei Zheng 1, Baosen Zhou 1, Meng Zhu 1, Mariagrazia Zucca 1, Simina M Boca 1,, James R Cerhan 1,, Giovanni M Ferri 1,, Patricia Hartge 1,, Chao Agnes Hsiung 1,, Corrado Magnani 1,, Lucia Miligi 1,, Lindsay M Morton 1,, Karin E Smedby 1,, Lauren R Teras 1,, Joseph Vijai 1,, Sophia S Wang 1,, Paul Brennan 1,, Neil E Caporaso 1,, David J Hunter 1,, Peter Kraft 1,, Nathaniel Rothman 1,, Debra T Silverman 1,, Susan L Slager 1,, Stephen J Chanock 1,, Nilanjan Chatterjee 1,
PMCID: PMC4806328  PMID: 26464424

Abstract

Background:

Studies of related individuals have consistently demonstrated notable familial aggregation of cancer. We aim to estimate the heritability and genetic correlation attributable to the additive effects of common single-nucleotide polymorphisms (SNPs) for cancer at 13 anatomical sites.

Methods:

Between 2007 and 2014, the US National Cancer Institute generated data from genome-wide association studies (GWAS) for 49 492 cancer case patients and 34 131 control patients. We apply novel mixed model methodology (GCTA) to these GWAS data to estimate the heritability of individual cancers, the proportion of heritability attributable to cigarette smoking in smoking-related cancers, and the genetic correlation between pairs of cancers.

Results:

GWAS heritability was statistically significant at nearly all sites, with the estimates of array-based heritability, $h_l^2$, on the liability threshold (LT) scale ranging from 0.05 to 0.38. Estimating the combined heritability of multiple smoking characteristics, we calculate that at least 24% (95% confidence interval [CI] = 14% to 37%) and 7% (95% CI = 4% to 11%) of the heritability for lung and bladder cancer, respectively, can be attributed to genetic determinants of smoking. Most pairs of cancers studied did not show evidence of strong genetic correlation. We found only four pairs of cancers with marginally statistically significant correlations, specifically kidney and testes (ρ = 0.73, SE = 0.28), diffuse large B-cell lymphoma (DLBCL) and pediatric osteosarcoma (ρ = 0.53, SE = 0.21), DLBCL and chronic lymphocytic leukemia (CLL) (ρ = 0.51, SE = 0.18), and bladder and lung (ρ = 0.35, SE = 0.14). Correlation analysis also indicates that the genetic architecture of lung cancer differs between a smoking population of European ancestry and a nonsmoking Asian population, allowing for the possibility that the genetic etiology for the same disease can vary by population and environmental exposures.

Conclusion:

Our results provide important insights into the genetic architecture of cancers and suggest new avenues for investigation.


Studies of related individuals have consistently demonstrated that there is notable familial aggregation of cancer. The three largest studies, based on the Swedish Family-Cancer Database (1–3), the Utah Population and Cancer Registry Database (4,5), and the Icelandic Cancer Registry (6), have shown familial aggregation for cancer at nearly every anatomical site. For common cancers such as prostate, breast, and lung, the familial relative risk (FRR), defined as the increase in risk associated with each affected first-degree relative of an individual, is generally estimated to be below or around 2.0. In contrast, for some rare cancers occurring early in life, such as those of testes and bone, estimates of FRR can exceed 5. Although shared environmental factors contribute to this aggregation, studies of twins (7,8) and extended family members (6,9,10) have clearly identified a substantial genetic contribution, commonly known as heritability.

Genome-wide association studies (GWAS) have provided an opportunity to study the contribution of common single-nucleotide polymorphisms (SNPs) to the heritability of complex traits, including cancers. In addition to identifying specific susceptibility SNPs, novel mixed-effect modeling methods (11–13) can utilize GWAS data to quantify the additive heritability attributable to all common susceptibility SNPs captured by genotyping arrays, regardless of whether those SNPs individually have reached the stringent level of genome-wide statistical significance. Understanding the total contribution of common SNPs will be instrumental for evaluating the potential clinical applications of genetics in risk stratification and for guiding future genetic studies of cancer (14,15).

Evidence also suggests that there is overlap between cancers with respect to their genetic architectures. Previous family studies have observed familial co-aggregation of cancers among certain sites (5,6), including pairs of cancers at neighboring sites, such as the colon and rectum, as well as for more distant and seemingly unrelated sites such as the cervix and esophagus (6). GWAS have also directly identified shared regions such as 8q24.1 and 5p15.33 (TERT-CLPTM1L) containing SNPs affecting cancer at multiple sites, while earlier studies have identified major genes such as TP53 (16) and BRCA (17) containing highly penetrant rare variants affecting multiple cancers. Sites with overlapping genetic architectures may be studied together to understand shared biology and to increase power to detect susceptibility loci.

In this study, we performed an analysis of heritability and shared heritability for cancer at 13 different sites using data from case/control GWAS of more than 80 000 individuals carried out at or reported to the US National Cancer Institute. We expand upon recent GWAS estimates of heritability (13) by nearly doubling the number of cases evaluated, exploring six new cancer (sub)types, and considering populations of non-European ancestry. We use detailed information on multiple smoking characteristics to assess the proportion of heritability in smoking-related cancers that can be attributed to the genetic determinants of cigarette smoking. Furthermore, we use genetic correlation analysis to assess shared heritability across cancer sites and, for lung cancer, across two distinct ethnic populations to assess the evidence of gene-environment interactions for this complex malignancy. These analyses provide insights into the contribution of common SNPs to cancer heritability, coheritability, and their relationships to a major environmental risk factor, smoking.

Methods

Study Populations

The study includes 49 492 cancer case patients and 34 131 control patients who participated in large case/control GWAS that were genotyped at or reported to the National Cancer Institute’s (NCI’s) Cancer Genomics Research Laboratory (CGR) between 2007 and 2014. GWAS were conducted in individuals of European ancestry unless otherwise stated. Details about the GWAS can be found in the Supplementary Methods (available online). Studies were approved by the local institutional review boards, and all participants provided written informed consent.

Estimation of Heritability and Genetic Correlation

We estimated heritability, $h_l^2$, and genetic correlation using mixed model methodology (GCTA) software (11,12), adjusting for sex, substudy, and the top 20 eigenvectors, and after following the stringent quality control procedures described in the Supplementary Methods (available online). Genetic correlation was considered statistically significant if the two-sided test statistic provided a P value below .01. To estimate heritability attributable to undiscovered loci, we identified SNPs (Supplementary Table 1, available online) from the National Human Genome Research Institute (NHGRI) catalog or from recent publications (18–20) that were associated with a given cancer ($P < 5 \times 10^{-8}$) and removed all SNPs within 250 kb of those loci prior to calculation of the genetic relation matrix. A description of the GCTA methodology, the determination of FRR, and sensitivity analyses are provided in the Supplementary Methods (available online).
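The locus-removal step described above can be illustrated with a short sketch. It is not the authors' pipeline: the function name, the tuple layout, and the toy coordinates are assumptions made for illustration, and only the 250 kb window comes from the text.

```python
WINDOW_BP = 250_000  # 250 kb window, as stated in the text


def drop_snps_near_known_loci(snps, known_loci, window_bp=WINDOW_BP):
    """Keep SNPs farther than window_bp from every known locus on the same chromosome.

    snps:       list of (snp_id, chromosome, position_bp)  -- hypothetical layout
    known_loci: list of (chromosome, position_bp)          -- hypothetical layout
    """
    kept = []
    for snp_id, chrom, pos in snps:
        near_known = any(
            chrom == k_chrom and abs(pos - k_pos) <= window_bp
            for k_chrom, k_pos in known_loci
        )
        if not near_known:
            kept.append((snp_id, chrom, pos))
    return kept


# Toy data, made up for illustration only.
snps = [("rsA", 5, 1_100_000), ("rsB", 5, 1_600_000), ("rsC", 8, 1_100_000)]
known = [(5, 1_300_000)]
kept = drop_snps_near_known_loci(snps, known)
print([snp_id for snp_id, _, _ in kept])  # ['rsB', 'rsC']
```

In this toy case rsA lies 200 kb from the known locus and is dropped, rsB lies 300 kb away and is kept, and rsC is on a different chromosome. The remaining SNPs would then be used to build the genetic relation matrix.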

Shared SNP Analysis

Among the 375 SNPs associated with at least one of the studied cancers in the NHGRI catalog or a recent publication, we considered only the 318 SNPs that were outside regions surrounding TERT (1 280 000 ± 500 kb on chromosome 5) and 8q24 (128 000 000 ± 500 kb on chromosome 8), both already known to have pleiotropic effects on multiple cancers, and that were either genotyped or could be imputed accurately in at least one GWAS. Among those SNPs, we evaluated their associations with all other cancers and reported their associations if the corresponding two-sided P value was below .001.

Next, we created a polygenic risk score for each cancer. Let $T$ denote a specific cancer site, $R_{iT}$ be an individual’s risk of cancer $T$, $\Omega_T$ be the subset of the 318 SNPs associated with cancer $T$, and $G_{ij}$ be the number of minor alleles for individual $i$ at SNP $j$ in $\Omega_T$. Then the polygenic risk score, $Z_{Ti}$, for individual $i$ was defined as

$$Z_{Ti} = \sum_{j \in \Omega_T} \hat{\beta}_{Tj} G_{ij},$$

where $\hat{\beta}_{Tj}$ was estimated from fitting the model

$$\operatorname{logit}(R_{iT}) \sim \sum_{j \in \Omega_T} \beta_{Tj} G_{ij}$$

using our GWAS. We then tested for an association between the polygenic risk score of the primary cancer and the risk of each of the other cancers and considered the test to be statistically significant if the two-sided P value was below .05.
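As a minimal sketch of the score itself, the weighted allele count can be computed as below once the per-SNP log-odds coefficients have been estimated. The SNP names, coefficient values, and genotypes are hypothetical, and the logistic model fit that produces the coefficients is not shown.

```python
def polygenic_risk_score(beta_hat, genotypes):
    """Weighted sum of minor-allele counts for one individual.

    beta_hat:  dict mapping SNP id -> estimated log-odds ratio per minor allele
    genotypes: dict mapping SNP id -> minor-allele count (0, 1, or 2)
    """
    return sum(beta * genotypes[snp] for snp, beta in beta_hat.items())


# Hypothetical coefficients and genotypes, for illustration only.
beta_hat = {"snp1": 0.10, "snp2": -0.05, "snp3": 0.20}
individual = {"snp1": 2, "snp2": 1, "snp3": 0}

score = polygenic_risk_score(beta_hat, individual)
print(round(score, 2))  # 0.15
```

The score for one cancer can then be entered as a single covariate in a model for the risk of a second cancer, which is the cross-cancer test described above.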

Estimation of the Heritability of Smoking and Its Contribution to Cancer Heritability

To estimate the proportion of cancer heritability that can be explained by genetic determinants of smoking, we first defined a smoking-related risk score variable ($S_T$) that weights three smoking-related risk factors, namely smoking status, intensity, and duration, according to their strength of association with a particular cancer (Supplementary Methods, available online). Variability of $S_T$, a mathematical construct which we denote by $\mathrm{Var}(S_T)$, explains the variation of cancer risk because of smoking in a population on the log-risk scale. The component of $\mathrm{Var}(S_T)$ attributable to genetics can be estimated as $V_{GS} = h_{TS}^2 \mathrm{Var}(S_T)$, where $h_{TS}^2$ is the heritability of $S_T$ and $\mathrm{Var}(S_T)$ is the total variability of $S_T$ in the underlying population, which we approximate by the 10 530 control patients with relevant smoking variables. The proportion, $P$, of cancer heritability attributable to smoking is then estimated as $P = V_{GS}/V_G$, where $V_G = 2\log(\mathrm{FRR})$ transforms our estimate of cancer heritability to the variability in risk on the log-risk scale (21).
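The arithmetic of this ratio can be checked with a short sketch. The function names are hypothetical. The inputs are the rounded values reported in this article for lung cancer in individuals of European ancestry (GWAS FRR of 1.42 and a smoking genetic variance of 0.166), so small differences from the published ratio are to be expected.

```python
import math


def cancer_genetic_variance(frr):
    """Genetic variance of cancer on the log-risk scale: 2 * log(FRR)."""
    return 2.0 * math.log(frr)


def smoking_genetic_variance(h2_score, var_score):
    """Genetic variance of the smoking risk score: heritability times total variance."""
    return h2_score * var_score


def smoking_share(v_gs, v_g):
    """Proportion of cancer heritability attributable to smoking genetics."""
    return v_gs / v_g


v_g_lung = cancer_genetic_variance(1.42)     # reported GWAS FRR, lung (European)
share_lung = smoking_share(0.166, v_g_lung)  # reported smoking genetic variance, lung

print(f"cancer genetic variance = {v_g_lung:.2f}")
print(f"share attributable to smoking genetics = {share_lung:.1%}")
```

With these rounded inputs the cancer genetic variance comes out at about 0.70 and the share at just under 24%, in line with the values reported for lung cancer.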

Results

The estimates of array-based heritability, $h_l^2$, on the liability threshold (LT) scale ranged from 0.05 to 0.38 across the 13 cancer sites (Table 1), with esophageal cancer ($h_l^2$ = 0.38, 95% confidence interval [CI] = 0.17 to 0.59, Asian population), prostate cancer ($h_l^2$ = 0.38, 95% CI = 0.24 to 0.51), and testicular cancer ($h_l^2$ = 0.30, 95% CI = 0.08 to 0.51) displaying the strongest heritable components. After removing known susceptibility loci (Supplementary Table 1, available online), the adjusted estimates of heritability remained similar (Table 1) for most of the cancer sites, indicating that the majority of the common underlying susceptibility loci remain to be discovered. Known loci constituted the largest proportion of heritability for cancers of the testes (33%) and prostate (25%).

Table 1.

GWAS estimates of cancer heritability on the liability scale*

| Cancer | $h_l^2$ (95% CI) | $h_r^2$ (95% CI) |
| --- | --- | --- |
| Bladder | 0.123 (0.086 to 0.160) | 0.112 (0.075 to 0.148) |
| Breast (ER-) | 0.096 (0 to 0.199) | 0.079 (0 to 0.181) |
| Endometrium | 0.178 (0.085 to 0.270) | 0.177 (0.085 to 0.270) |
| Esophagus† | 0.381 (0.174 to 0.588) | 0.370 (0.164 to 0.577) |
| Glioma | 0.046 (0 to 0.116) | 0.036 (0 to 0.106) |
| Kidney | 0.147 (0.023 to 0.270) | 0.136 (0.012 to 0.259) |
| Lung, Asian†,‡ | 0.121 (0.064 to 0.177) | 0.102 (0.046 to 0.159) |
| Lung, European | 0.206 (0.142 to 0.271) | 0.189 (0.125 to 0.253) |
| Lymphoma, CLL | 0.220 (0.162 to 0.278) | 0.164 (0.105 to 0.222) |
| Lymphoma, DLBCL | 0.092 (0.038 to 0.145) | 0.088 (0.035 to 0.142) |
| Osteosarcoma | 0.159 (0.079 to 0.239) | 0.158 (0.078 to 0.238) |
| Pancreas | 0.098 (0.037 to 0.160) | 0.084 (0.023 to 0.145) |
| Prostate, overall | 0.378 (0.244 to 0.513) | 0.285 (0.151 to 0.419) |
| Prostate, nonadvanced stage | 0.351 (0.211 to 0.491) | 0.259 (0.120 to 0.399) |
| Prostate, advanced stage | 0.232 (0.157 to 0.307) | 0.193 (0.119 to 0.268) |
| Stomach† (noncardia) | 0.253 (0 to 0.522) | 0.243 (0 to 0.512) |
| Testes | 0.299 (0.084 to 0.513) | 0.199 (0 to 0.415) |

* $h_l^2$ (95% confidence interval [CI]) is the estimated heritability on the liability scale (95% CI) using all qualifying single-nucleotide polymorphisms (SNPs), while $h_r^2$ is the heritability after removing SNPs within 250 kb of a previous genome-wide association study hit. CI = confidence interval; CLL = chronic lymphocytic leukemia; DLBCL = diffuse large B-cell lymphoma; ER = estrogen receptor; GWAS = genome-wide association study.

† Asian population.

‡ Nonsmoking women.

We transformed our measures of the genetic contribution to cancer from heritability on the liability threshold scale to familial relative risk (FRR) for direct comparison with registry-based family studies (Table 2). For most cancers, the FRR explained by GWAS SNPs was between 1.28 and 1.63. The estimates of FRR were highest for testicular cancer (FRR = 3.09, 95% CI = 1.41 to 6.05), osteosarcoma (2.90, 95% CI = 1.73 to 4.70), and CLL (2.28, 95% CI = 1.86 to 2.77). The GWAS estimates of excess risk (FRR-1) in the studies of European ancestry were in the range of 15% to 53% of the average estimates based on the Icelandic, Swedish, and Utah registry data, with the exception of DLBCL (4.5%). For lung cancer in an Asian population, our observed FRR of 1.31 (95% CI = 1.16 to 1.46) (Table 2) can be compared with an estimate of 2.44 (95% CI = 1.79 to 3.32) from a Taiwanese family study (22).
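The excess-risk comparison used above is a simple ratio of (FRR - 1) values. The sketch below applies it to the Asian lung cancer figures quoted in the text. The function name is hypothetical, and the article does not itself report this particular ratio, so the result is shown only to illustrate the calculation.

```python
def excess_risk_fraction(frr_gwas, frr_family):
    """Share of the family-study excess risk (FRR - 1) captured by the GWAS estimate."""
    if frr_family <= 1.0:
        raise ValueError("family-based FRR must exceed 1 for the ratio to be meaningful")
    return (frr_gwas - 1.0) / (frr_family - 1.0)


# GWAS FRR of 1.31 versus 2.44 from the Taiwanese family study (point estimates only).
frac = excess_risk_fraction(1.31, 2.44)
print(f"{frac:.1%}")  # 21.5%
```

The calculation uses point estimates only and ignores the confidence intervals around both FRRs, which are wide for several sites.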

Table 2.

Estimates of first-degree familial relative risk from familial registries and GWAS*

Cancer | Sweden, all 1st-degree relationships: FRR (95% CI) | Sweden, parent/child: FRR (95% CI) | Sweden, sibling: FRR (95% CI) | Iceland†: FRR (90% CI) | Utah†: FRR (95% CI) | GWAS: FRR (95% CI)
Bladder 1.69 (1.33 to 2.14) 1.53 (1.16 to 1.99) 3.30 (1.70 to 5.78) 1.68 (1.39 to 2.05) 1.8 (1.4 to 2.3) 1.37 (1.25 to 1.50)
Breast (ER-) 1.28 (0.98 to 1.63)
Endometrium 3.02 (2.33 to 3.92) 2.85 (2.08 to 3.82) 3.97 (1.97 to 7.13) 1.86 (1.31 to 2.62) 1.4 (1.1 to 1.8) 1.56 (1.25 to 1.92)
Esophagus 2.14 (0.77 to 4.70) 2.09 (1.30 to 3.31) 1.3 (0.2 to 10.0) 1.63‡ (1.27 to 2.05)
Glioma 1.67 (1.43 to 1.94) 3.31 (2.08 to 5.02) 1.41 (0.74 to 2.40) 2.3 (0.99 to 4.5) 1.19 (0.91 to 1.54)
Kidney 1.78 (1.33 to 2.39) 1.52 (1.06 to 2.11) 4.52 (2.15 to 8.35) 2.30 (1.89 to 2.80) 2.1 (1.3 to 3.5) 1.54 (1.07 to 2.13)
Lung
 European 1.70 (1.42 to 2.05) 1.64 (1.34 to 2.00) 2.61 (1.29 to 4.68) 2.00 (1.83 to 2.16) 2.4 (1.9 to 3.0) 1.42 (1.28 to 1.57)
 Asian 1.31‡,§ (1.16 to 1.46)
Lymphoma
 CLL 8.5 (6.1 to 11.7) 6.1 (4.75 to 7.65) 2.28 (1.86 to 2.77)
 DLBCL 9.8 (3.1 to 31.0) 1.40 (1.15 to 1.68)
Osteosarcoma 2.90 (1.73 to 4.70)
Pancreas 1.68 (1.16 to 2.35) 2.33 (1.83 to 2.96) 2.1 (1.3 to 3.2) 1.35 (1.12 to 1.62)
Prostate 2.75 (2.32 to 3.25) 2.71 (2.26 to 3.22) 4.91 (1.28 to 12.7) 1.89 (1.75 to 2.01) 2.1 (1.9 to 2.2) 1.51 (1.32 to 1.72)
Stomach 1.99 (1.47 to 2.71) 1.72 (1.19 to 2.40) 8.82 (3.50 to 18.3) 1.90 (1.74 to 2.05) 2.0 (1.1 to 3.7) 1.94‡ (0.95 to 3.49)
Testes 7.07 (5.34 to 9.37) 4.31 (2.05 to 7.95) 8.50 (6.01 to 11.7) 3.52 (1.18 to 7.37) 1.8 (0.4 to 8.6) 3.09 (1.41 to 6.05)

* Comparison of first-degree familial relative risk (95% CI) measured by our genome-wide association study (last column) with estimates of familial relative risk from the three largest family studies. Prior estimates are from (1, 6, 10), except for CLL (31), DLBCL (32), glioma (sibling) (33), pancreas (34), and esophagus (3). CI = confidence interval; CLL = chronic lymphocytic leukemia; DLBCL = diffuse large B-cell lymphoma; ER = estrogen receptor; FRR = familial relative risk; GWAS = genome-wide association study.

† All first-degree relationships.

‡ Asian population.

§ Nonsmoking women.

Smoking is heritable and is a strong risk factor for both lung and bladder cancers. We estimated the heritability of the lung cancer smoking-related risk score to be 0.15 (95% CI = 0.10 to 0.20) and the heritability of the bladder cancer smoking-related risk score to be 0.16 (95% CI = 0.09 to 0.22) (Table 3; Supplementary Tables 3–4 and Supplementary Figures 1–2, available online). Based on a comparison with our estimates for the total heritability of cancer (Table 3), we estimated that 23.6% (95% CI = 14.2% to 37.4%) and 7.1% (95% CI = 4.3% to 11.3%) of the heritability for lung and bladder cancer, respectively, can be attributed to the genetic determinants of the three smoking characteristics. When restricting the analysis to ever-smokers, the heritability of the smoking-related risk scores remained similar (Table 3) and we estimated that 22.0% (95% CI = 7.8% to 43.0%) and 3.0% (95% CI = 1.2% to 5.9%) of the total heritability, respectively, was attributable to that of smoking behavior for lung and bladder cancers. Although limited by the number of never-smokers, we also noted lower estimates of heritability in this subgroup (Supplementary Table 5, available online).

Table 3.

Estimates of the contribution of smoking to cancer heritability*

| Cancer | All subjects: VGS, smoking (95% CI) | All subjects: VG, cancer (95% CI) | All subjects: ratio, % (95% CI) | Former/current smokers: VGS, smoking (95% CI) | Former/current smokers: VG, cancer (95% CI) | Former/current smokers: ratio, % (95% CI) |
| --- | --- | --- | --- | --- | --- | --- |
| Bladder | 0.045 (0.029 to 0.061) | 0.62 (0.44 to 0.81) | 7.1 (4.3 to 11.3) | 0.020 (0.0073 to 0.032) | 0.66 (0.41 to 0.88) | 3.0 (1.2 to 5.9) |
| Lung | 0.166 (0.098 to 0.23) | 0.70 (0.50 to 0.90) | 23.6 (14.2 to 37.4) | 0.11 (0.043 to 0.167) | 0.47 (0.27 to 0.67) | 22.0 (7.8 to 43.0) |

* The ratio, VGS:smoking/VG:cancer, estimates the proportion of the genetic variability of cancer that can be attributed to the genetic determinants of smoking, where VGS:smoking and VG:cancer are the genetic variances of the smoking-related risk score and cancer, respectively. CI = confidence interval.

The estimated genetic correlations between most pairs of cancer sites were modest and statistically nonsignificant (Figure 1; Supplementary Table 6, available online). Among the 91 compared pairs, all combinations of the 12 solid tumors, CLL, and DLBCL, the median (mean) correlation was 0.031 (0.055) and, in tests for statistical significance, only 12% and 22% of these 91 P values were below .1 and .2, providing little evidence of the enrichment that would be expected to occur with many strongly correlated pairs. The four pairs with the strongest correlations (P < .01) were kidney and testes (ρ = 0.73, SE = 0.28), DLBCL and osteosarcoma (ρ = 0.53, SE = 0.21), DLBCL and CLL (ρ = 0.51, SE = 0.18), and bladder and lung (ρ = 0.35, SE = 0.14). The two types, CLL and osteosarcoma, that were individually correlated with DLBCL showed minimal correlation (ρ = 0.19, SE = 0.15) with each other. For prostate cancer, the best-fitting estimate of the genetic correlation between aggressive and nonaggressive disease was at the upper boundary (ρ = 1, SE = 0.25), indicating a shared genetic architecture between the two clinical subtypes of this malignancy. For lung cancer, on the other hand, we estimated the genetic correlation across two distinct racial/ethnic groups and found the correlation between nonsmoking Asian women and individuals of European ancestry to be only modest and statistically nonsignificant (ρ = 0.10, SE = 0.15), suggesting distinct genetic etiologies for the same malignancy in two distinct populations with different ethnic background and exposure history.
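The enrichment argument above can be made concrete with a small calculation. If no pair were genetically correlated, P values would be roughly uniform, so about 10% and 20% of them would fall below .1 and .2. The sketch below compares those expectations with the reported 12% and 22%. The observed counts are approximate because they are derived from rounded percentages, and the tests for different pairs are not independent, so this is an illustration rather than a formal test.

```python
from math import comb

n_types = 14                 # 12 solid tumors plus CLL and DLBCL
n_pairs = comb(n_types, 2)   # number of unordered pairs
print(n_pairs)               # 91

# (threshold, fraction of P values reported below that threshold)
reported = [(0.1, 0.12), (0.2, 0.22)]

summary = []
for threshold, observed_fraction in reported:
    expected_count = threshold * n_pairs          # uniform P values under the null
    observed_count = observed_fraction * n_pairs  # approximate, from a rounded percentage
    summary.append((threshold, expected_count, observed_count))
    print(f"P < {threshold}: expected about {expected_count:.1f}, "
          f"observed about {observed_count:.1f}")
```

The observed counts exceed the null expectation by only about two pairs at each threshold, which is the sense in which the text finds little evidence of enrichment.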

Figure 1.


Genetic correlation of cancer pairs. A) The genetic correlation between cancer sites. Dots indicate P < .01. B) Distribution of the corresponding Z-statistics from testing the null hypotheses of no genetic correlations. Black curve illustrates the expected distribution under the null hypothesis of no genetic correlation. CLL = chronic lymphocytic leukemia; DLBCL = diffuse large B-cell lymphoma.

As a second means to assess genetic overlap, we tested whether the 318 SNPs previously associated with one of the 13 cancers at GWAS levels of statistical significance ($P < 5 \times 10^{-8}$), but outside the known pleiotropic regions of 5p15.33 and 8q24, were associated with other cancers (Supplementary Tables 7–8, available online). Among the 318 SNPs, 25 were associated with a second cancer at P values of less than .001. We further calculated a polygenic risk score (Methods) for each cancer with at least 10 associated SNPs among the 318 previously discovered. We found individuals with a high risk score for lung cancer were at an increased risk of bladder cancer (RR of bladder cancer comparing 90th vs 10th percentile = 1.07, 95% CI = 1.05 to 1.10) and individuals with a high risk score for CLL were at an increased risk of DLBCL (RR = 1.12, 95% CI = 1.07 to 1.16) (Supplementary Tables 9–12, available online). After adjusting for status, intensity, and duration of smoking, the strength of association between the lung cancer score and bladder cancer was reduced but remained statistically significant (RR = 1.05, 95% CI = 1.02 to 1.08).

We estimated heritability separately for men and women (Supplementary Tables 13–14, available online). Although we observed differences by sex, none reached statistical significance. When studies were divided by sex, the genetic correlation between cancer in men and women exceeded 0.9 in nearly all scenarios, with the 95% confidence interval always including 1. Moreover, we also estimated heritability under multiple additional analytical settings (Supplementary Table 15, available online) and found the results from these sensitivity analyses to be qualitatively similar. Sensitivity analyses further suggested that population stratification was adequately handled by adjusting for principal components (Supplementary Table 16, available online).

Discussion

Our analysis confirms that common SNPs meaningfully contribute to risk across the spectrum of cancers and account for different fractions of estimated heritability. The estimates of array-based heritability on the liability threshold scale spanned a wide range, from 5% (glioma) to 38% (prostate, esophageal cancer), and the majority of the observed heritability for most cancers is likely attributable to genetic variants that are not located in previously identified susceptibility regions. The corresponding estimates of familial relative risk were between 1.19 (glioma) and 2.90 (osteosarcoma). The estimated FRRs of the most common adult cancers were restricted to a narrow range of 1.4 to 1.6, which are lower than those previously observed in family-based studies (1,5,6). A number of additional factors are likely to contribute to familial aggregation, including rare genetic variants (23–25), nonadditive effects, and shared environment.

Expanding GWAS, either through new genotyping or meta-analyses, should continue to yield new susceptibility loci. The pace of expected discovery, measured by the number of additional associations for given study sizes, will likely continue to be faster for those cancers with the highest FRRs. For stomach cancer (Asian population), osteosarcoma, testes, and CLL, all with FRRs over 2, we predict that more loci may be identified for studies of comparable size (15). Consistent with this prediction, comparatively smaller GWAS of these highly heritable cancers have historically yielded more associations.

Genetic variants can influence cancer risk, either indirectly through associations with nicotine dependence and smoking behavior, or directly through other mechanisms (26). We estimate that the genetic determinants of smoking behavior accounted for at least 24% (95% CI = 14% to 37%) and 7% (4% to 11%) of the total heritability of lung and bladder cancers, respectively, in our study populations. These percentages likely underestimate the true influence of smoking genetics by not accounting for other, unmeasured, smoking characteristics, and may be slightly biased by systematic differences in the distribution of smoking characteristics between the general population and our GWAS control sample selected according to experimental design (Supplementary Table 17, available online). Our result is consistent with the largest meta-analysis (27) of case-control studies in which the excess relative risk (FRR = 1.82, 95% CI = 1.64 to 2.05) for lung cancer was reduced by 36% when focused on nonsmokers (FRR = 1.52, 95% CI = 1.11 to 2.06).

Our analysis of coheritability indicated that, in general, most pairs of cancers studied here are unlikely to have strong genetic correlations. Although our study had low power to detect modest but possibly important correlations between specific pairs of cancers (28), the distribution of test statistics over all pairs of cancers deviated little from the null (Figure 1), suggesting that at most a small fraction have modest or high (eg, ρ > 0.3) correlations (Supplementary Tables 18–19, available online). Four pairs of sites that show notable correlation in our study are bladder and lung, testes and kidney, DLBCL and CLL, and DLBCL and osteosarcoma. Pleiotropic analyses of polygenic risk scores offered additional evidence for overlapping genetic architectures between bladder and lung cancer and between DLBCL and CLL. While no family study has been of sufficient size to explore the connection between DLBCL and CLL or DLBCL and osteosarcoma, the Icelandic registry study (6) found that the relative risks for testicular cancer were elevated in relatives of kidney cancer case patients, consistent with our results.

Analysis of the genetic correlation for the same cancer site but across distinct subtypes and populations can provide important insights. For instance, the strong genetic correlation between aggressive and nonaggressive prostate cancer indicates that common SNPs are unlikely to offer a diagnostic means to distinguish these two subtypes of cancer with different prognoses. On the other hand, there was an absence of correlation for lung cancer between studies in nonsmoking Asian females and a Caucasian population where case patients were primarily smokers. This result is consistent with the observation that distinct sets of susceptibility SNPs have emerged in these two populations (29,30). In general, low correlation across populations can be caused by different causal SNPs, differences in effect sizes or allele frequencies of shared causal SNPs, and differences in linkage disequilibrium (LD). Simulations (results not shown) based on empirical genotype data suggest that the observed low correlation (ρ = 0.10, SE = 0.15) is unlikely to be explained by general differences in the allele frequency and LD pattern between these two populations. However, substantial confounding may arise if disease-causing loci are particularly selected to have strong differences with respect to these factors.
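To illustrate how much uncertainty surrounds the cross-population estimate (ρ = 0.10, SE = 0.15), one can form an approximate interval from the point estimate and its standard error. The sketch below assumes a normal (Wald) approximation, which the text does not state and which may be inaccurate for a correlation parameter; the function name is hypothetical.

```python
from statistics import NormalDist


def wald_interval(estimate: float, se: float, level: float = 0.95):
    """Approximate two-sided interval: estimate +/- z * SE (normal assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level = 0.95
    return estimate - z * se, estimate + z * se


# Values quoted in the text for lung cancer across the two populations.
low, high = wald_interval(0.10, 0.15)
print(f"Approximate 95% interval for rho: {low:.2f} to {high:.2f}")
```

Under this assumption the interval runs from roughly −0.19 to 0.39, so it includes zero while lying well below 1. That is compatible with the reading above of a weak or absent correlation rather than a fully shared genetic architecture.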

Our study has some limitations. While the total sample size was large, the number of cases for certain cancer sites, especially when stratified by subtype, was limited, resulting in wide confidence intervals for estimates of heritability. Such uncertainty, together with various sources of bias and variability in the prior estimates of FRR from registry-based studies, makes it more difficult to assess the proportion of observed FRR that can be attributed to common SNPs. Because of the limited sample size, our analysis also lacked power to detect modest genetic correlations among specific cancer types/subtypes. This lack of precision may partially explain the more extensive familial co-aggregation observed in a study based on the Icelandic Cancer Registry (6). However, the discrepancy can also be attributed to different cancer pathologies, environmental differences across populations, and, most importantly, other sources of shared heritability. Another limitation is that principal components may not perfectly adjust for ancestry, and therefore shared environments or behaviors among individuals with similar ancestry may inflate our estimates of heritability. However, tests for such inflation found little evidence to support this concern.

For some cancer sites, the individuals used in the previous study (13) of GWAS heritability overlap with a subset of the individuals used here, and therefore the two studies do not offer independent assessments of heritability. For kidney cancer, our population overlaps considerably with that studied by Lu and colleagues (13), who estimated its heritability to be 0.18. For other cancers, including bladder, lung, pancreas, and prostate, the prior study populations are primarily subsets of our own populations, including the 25% to 50% of individuals from the earliest GWAS. Prior estimates for these four cancers were, respectively, 0.01, 0.10, 0.18, and 0.81, and show substantive differences from our updated estimates (Table 1).

This analysis of more than 80 000 individuals provides important perspectives on the heritability of cancer across anatomical sites. We affirm that there is a large heritable component to most cancers, but the majority of cancer heritability cannot be attributed to known susceptibility loci. We further demonstrate that marker SNPs are not omnipresent across cancers, in that there do not appear to be strong genetic correlations between most pairs of cancer sites. Among smoking-related cancers, we showed that the genetic determinants of smoking make a statistically significant contribution to the heritability of lung and bladder cancer. Overall, as GWAS expand in size and design, a comprehensive analysis should continue to focus on cancers individually, look across groups of cancers identified as being genetically similar, and account for important environmental risk factors.

Funding

This study was supported by the Intramural Research Program of the National Institutes of Health.

Supplementary Material

Supplementary Data

Acknowledgments

For acknowledgements and author contributions, please see the Supplementary Material (available online).

The funders had no direct role in the design of the study, the collection, analysis, or interpretation of the data, the writing of the manuscript, or the decision to submit the manuscript for publication.

References

  • 1. Czene K, Lichtenstein P, Hemminki K. Environmental and heritable causes of cancer among 9.6 million individuals in the Swedish family-cancer database. Int J Cancer. 2002;99(2):260–266.
  • 2. Hemminki K, Li X. Familial risk in testicular cancer as a clue to a heritable and environmental aetiology. Br J Cancer. 2004;90(9):1765–1770.
  • 3. Ji J, Hemminki K. Familial risk for esophageal cancer: an updated epidemiologic study from Sweden. Clin Gastroenterol Hepatol. 2006;4(7):840–845.
  • 4. Goldgar DE, Easton DF, Cannon-Albright LA, et al. Systematic population-based assessment of cancer risk in first-degree relatives of cancer probands. J Natl Cancer Inst. 1994;86(21):1600–1608.
  • 5. Kerber RA, O’Brien E. A cohort study of cancer risk in relation to family histories of cancer in the Utah population database. Cancer. 2005;103(9):1906–1915.
  • 6. Amundadottir LT, Thorvaldsson S, Gudbjartsson DF, et al. Cancer as a complex phenotype: pattern of cancer distribution within and beyond the nuclear family. PLoS Med. 2004;1(3):e65.
  • 7. Ahlbom A, Lichtenstein P, Malmström H, et al. Cancer in twins: genetic and nongenetic familial risk factors. J Natl Cancer Inst. 1997;89(4):287–293.
  • 8. Lichtenstein P, Holm NV, Verkasalo PK, et al. Environmental and heritable factors in the causation of cancer--analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med. 2000;343(2):78–85.
  • 9. Zaitlen N, Kraft P, Patterson N, et al. Using Extended Genealogy to Estimate Components of Heritability for 23 Quantitative and Dichotomous Traits. PLoS Genet. 2013;9(5):e1003520.
  • 10. Cannon-Albright LA, Thomas A, Goldgar DE, et al. Familiality of cancer in Utah. Cancer Res. 1994;54(9):2378–2385.
  • 11. Lee SH, Wray NR, Goddard ME, et al. Estimating missing heritability for disease from genome-wide association studies. Am J Hum Genet. 2011;88(3):294–305.
  • 12. Yang J, Benyamin B, McEvoy BP, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010;42(7):565–569.
  • 13. Lu Y, Ek WE, Whiteman D, et al. Most common ‘sporadic’ cancers have a significant germline genetic component. Hum Mol Genet. 2014;23(22):6112–6118.
  • 14. Chatterjee N, Wheeler B, Sampson J, et al. Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies. Nat Genet. 2013;45(4):400–405, 405e1–405e3.
  • 15. Park JH, Wacholder S, Gail MH, et al. Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nat Genet. 2010;42(7):570–575.
  • 16. Petitjean A, Achatz MI, Borresen-Dale AL, et al. TP53 mutations in human cancers: functional selection and impact on cancer prognosis and outcomes. Oncogene. 2007;26(15):2157–2165.
  • 17. Friedenson B. BRCA1 and BRCA2 pathways and the risk of cancers other than breast or ovarian. Medscape Gen Med. 2005;7(2):60.
  • 18. Al Olama AA, Kote-Jarai Z, Berndt SI, et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat Genet. 2014;46(10):1103–1109.
  • 19. Cerhan JR, Berndt SI, Vijai J, et al. Genome-wide association study identifies multiple susceptibility loci for diffuse large B cell lymphoma. Nat Genet. 2014;46(11):1233–1238.
  • 20. Wolpin BM, Rizzato C, Kraft P, et al. Genome-wide association study identifies multiple susceptibility loci for pancreatic cancer. Nat Genet. 2014;46(9):994–1000.
  • 21. Pharoah PDP, Antoniou A, Bobrow M, et al. Polygenic susceptibility to breast cancer and implications for prevention. Nat Genet. 2002;31(1):33–36.
  • 22. Lo YL, Hsiao CF, Chang GC, et al. Risk factors for primary lung cancer among never smokers by gender in a matched case-control study. Cancer Causes Control. 2013;24(3):567–576.
  • 23. Jones S, Hruban RH, Kamiyama M, et al. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science. 2009;324(5924):217.
  • 24. Meindl A, Hellebrand H, Wiek C, et al. Germline mutations in breast and ovarian cancer pedigrees establish RAD51C as a human cancer susceptibility gene. Nat Genet. 2010;42(5):410–414.
  • 25. Wang Y, McKay JD, Rafnar T, et al. Rare variants of large effect in BRCA2 and CHEK2 affect risk of lung cancer. Nat Genet. 2014;46(7):736–741.
  • 26. Subramanian J, Govindan R. Molecular genetics of lung cancer in people who have never smoked. Lancet Oncol. 2008;9(7):676–682.
  • 27. Matakidou A, Eisen T, Houlston RS. Systematic review of the relationship between family history and lung cancer risk. Br J Cancer. 2005;93(7):825–833.
  • 28. Visscher PM, Hemani G, Vinkhuyzen AAE, et al. Statistical Power to Detect Genetic (Co)Variance of Complex Traits Using SNP Data in Unrelated Samples. PLoS Genet. 2014;10(4):e1004269.
  • 29. Lan Q, Hsiung CA, Matsuo K, et al. Genome-wide association analysis identifies new lung cancer susceptibility loci in never-smoking women in Asia. Nat Genet. 2012;44(12):1330–1335.
  • 30. Landi MT, Chatterjee N, Yu K, et al. A genome-wide association study of lung cancer identifies a region of chromosome 5p15 associated with risk for adenocarcinoma. Am J Hum Genet. 2009;85(5):679–691.
  • 31. Goldin LR, Bjorkholm M, Kristinsson SY, et al. Elevated risk of chronic lymphocytic leukemia and other indolent non-Hodgkin’s lymphomas among relatives of patients with chronic lymphocytic leukemia. Haematologica. 2009;94(5):647–653.
  • 32. Goldin LR, Bjorkholm M, Kristinsson SY, et al. Highly increased familial risks for specific lymphoma subtypes. Br J Haematol. 2009;146(1):91–94.
  • 33. Scheurer ME, Etzel CJ, Liu M, et al. Familial aggregation of glioma: a pooled analysis. Am J Epidemiol. 2010;172(10):1099–1107.
  • 34. Hemminki K, Li X. Familial and second primary pancreatic cancers: a nationwide epidemiologic study from Sweden. Int J Cancer. 2003;103(4):525–530.

Articles from JNCI Journal of the National Cancer Institute are provided here courtesy of Oxford University Press
