Author manuscript; available in PMC: 2011 Jun 1.
Published in final edited form as: Curr Opin Genet Dev. 2010 Apr 21;20(3):257–261. doi: 10.1016/j.gde.2010.03.006

Candidate gene association studies: Successes and failures

Boris Pasche 1, Nengjun Yi 2
PMCID: PMC2885524  NIHMSID: NIHMS199447  PMID: 20417090

Summary

Epidemiologic studies of twins indicate that 20-40% of common tumors such as breast, colorectal and prostate cancers are inherited. However, high penetrance tumor susceptibility genes such as APC, BRCA1, BRCA2, MLH1, MSH2 and MSH6 account for only a small fraction of these cancers. A combination of low to moderate penetrance tumor susceptibility genes likely accounts for the large remaining proportion of familial cancer risk. Candidate tumor susceptibility genes have been identified based on the discovery of tumor-specific mutations, in vitro experiments, as well as animal models of cancer. Translational studies based on in vitro and in vivo discoveries have led to the identification of novel phenotypes and genotypes associated with cancer in humans. Case control studies followed by validation studies and meta-analyses have unveiled several novel tumor susceptibility genes, several of which encode metabolizing enzymes or belong to the TGF-β signaling pathway. Together with genome wide association studies, candidate gene approaches are likely to fill a large gap in our knowledge of the genetic basis of cancer within the next decade.

Introduction

Analyses of cohorts of twins show a relatively large effect of heritability for several forms of cancer, suggesting that our current knowledge of the genetics of cancer is limited(1). This effect is likely due to a combination of low penetrance tumor susceptibility genes. Variants in such genes are relatively common in the population and as such may confer a much higher attributable risk in the general population than rare mutations in high penetrance cancer susceptibility genes.
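
The attributable-risk argument above can be illustrated with Levin's population attributable fraction, PAF = p(RR − 1) / [1 + p(RR − 1)], where p is the proportion of the population carrying the variant and RR is the relative risk in carriers. The sketch below is only an illustration: the carrier frequencies and relative risks are hypothetical values chosen for the example and are not taken from any study cited here.

```python
def attributable_fraction(carrier_prevalence: float, relative_risk: float) -> float:
    """Levin's population attributable fraction for a single risk factor."""
    excess = carrier_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs (assumptions for illustration only):
# a common variant with a small effect versus a rare mutation with a large effect.
common_low_penetrance = attributable_fraction(carrier_prevalence=0.30, relative_risk=1.2)
rare_high_penetrance = attributable_fraction(carrier_prevalence=0.001, relative_risk=10.0)

print(f"common, low penetrance: {common_low_penetrance:.4f}")
print(f"rare, high penetrance:  {rare_high_penetrance:.4f}")
```

With these assumed inputs the common, weakly penetrant variant accounts for a larger share of cases in the population than the rare, highly penetrant mutation, even though its effect on any individual carrier is far smaller.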

A large amount of data on the location, quantity, type, and frequency of genetic variants in the human genome have been generated by both the International Human Genome Sequencing Project and the International HapMap Project(2-4). Advances in genotyping technologies have significantly decreased the costs of genotyping. Furthermore, the inclusion of DNA collection in many observational studies has resulted in a large number of reports investigating the association of genetic variants with cancer risk. Two different approaches have led to the identification of several low penetrance tumor susceptibility genes in the past decade: 1) genome wide association studies, and 2) candidate gene association studies. The main topic of this article is a succinct review of candidate gene association studies.

Identification of candidate tumor susceptibility genes

Several lines of evidence indicate that cancer development is a multistep process and that these steps are the results of genetic alterations that transform normal human cells into malignant clones(5). It has been proposed that cancer genotypic changes are the manifestation of several essential alterations in cell physiology that collectively determine malignant growth. These essential alterations include mechanisms regulating cell growth, angiogenesis, invasion, and metastasis(5). Candidate low-penetrance genes are chosen on the basis of biological plausibility. Alterations in their protein sequence, and therefore function, could affect pathways involved in cell growth control, detoxification, and carcinogenesis. Genes that belong to these pathways have been the major focus of genetic association studies testing selected variants.

Systematic review of candidate tumor susceptibility genes

A systematic review of all meta-analyses and pooled analyses on genetic polymorphisms and risk of cancer was published by Dong et al. in 2008(6). This review was restricted to published analyses that included at least 500 cases combined from all summarized studies and evaluated cancer risk as the outcome; high-penetrance genes such as APC, BRCA1 and BRCA2 were excluded. Three parameters, P value, study power and false-positive report probability, were used to evaluate statistical tests of an association between a genetic variant and cancer risk. The P value is the commonly used statistic, representing the probability of obtaining an estimate at least as extreme as the one observed when the null hypothesis of no association is true. The study power is the probability of detecting an association when one exists. The false-positive report probability (FPRP) is defined as the probability of no association given a statistically significant finding. It is particularly useful when the prior probability that an association hypothesis is true is small, because a statistically significant finding then has a high chance of being a false positive. The FPRP is calculated from the statistical power of the test, the observed P value, and a given prior probability for the association(7). Prior probability assignment was determined before obtaining results from the analysis and was independent of any data used in the analysis. Prior probability was subjective and was determined by both previous epidemiologic findings and experimental evidence about known functions of a genetic variant. Two levels of prior probabilities were used to calculate the FPRP: a low level similar to what would be expected for a candidate gene (0.001) and a very low level similar to what would be expected for a random SNP (0.000001). A total of 161 published meta-analyses and pooled analyses addressing 344 gene-variant cancer associations were identified.
Ninety-eight (28%) of the 344 gene-variant associations were statistically significant (P values between .05 and 8.6 × 10^-15). Statistically significant associations were found among 16 cancer sites, predominantly among studies investigating breast cancer, glioma, and lung cancer. Thirteen of these associations were noteworthy at a prior probability of 0.001 and statistical power to detect an OR of 1.5, of which 4 remained noteworthy at an even lower prior probability similar to one appropriate for a randomly selected SNP in a genome-wide association study (1/1 000 000 = 0.000001), with P values between 10^-7 and 10^-15. These four gene-variant cancer associations were GSTM1 null and bladder cancer (OR, 1.5; 95% CI, 1.3-1.6; P = 1.9 × 10^-14), NAT2 slow acetylator and bladder cancer (OR, 1.46; 95% CI, 1.26-1.68; P = 2.5 × 10^-7), MTHFR C677T and gastric cancer (OR, 1.52; 95% CI, 1.31-1.77; P = 4.9 × 10^-8), and GSTM1 null and acute leukemia (OR, 1.20; 95% CI, 1.14-1.25; P = 8.6 × 10^-15). These associations are less likely to be false positives and have a high likelihood of being true associations with cancer risk. The authors also observed that genes encoding metabolizing enzymes made up the majority of noteworthy associations.
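
As a rough consistency check, a two-sided P value can be approximately recovered from a reported odds ratio and its 95% confidence interval, assuming the interval was built as a symmetric Wald interval on the log-odds scale. The sketch below makes that assumption explicitly; the function name is hypothetical, and because published estimates are rounded the result should only be expected to agree with a reported P value in order of magnitude.

```python
import math


def p_value_from_ci(odds_ratio: float, ci_lower: float, ci_upper: float,
                    z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate z statistic and two-sided P value from an OR and its CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    standard_error = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = log_or / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# NAT2 slow acetylator and bladder cancer, values as quoted in the text above.
z, p = p_value_from_ci(odds_ratio=1.46, ci_lower=1.26, ci_upper=1.68)
print(f"z = {z:.2f}, two-sided P = {p:.1e}")
```

For the NAT2 estimate this back-calculation gives a P value on the order of 10^-7, in line with the value quoted above.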

While the Dong et al. analysis provides a comprehensive overview of genetic association studies of cancer, it has some limitations. The authors followed the guidelines suggested by Wacholder et al. for the calculation of FPRP from its three determinants, i.e. the magnitude of the P value, statistical power and the fraction of tested hypotheses that is true(7). The main limitation of the FPRP approach is the daunting task of assigning a range for prior probability. On the other hand, the main strength of this approach over non-Bayesian methods is that it limits the proportion of false-positive reports. Another limitation of the Dong et al. analysis is that the methods and quality control used to assess some of the polymorphisms were not uniform, so studies using different methods were combined.
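
The three determinants combine as FPRP = α(1 − π) / [α(1 − π) + (1 − β)π], where α is the P value, 1 − β is the statistical power and π is the prior probability of a true association(7). The following is a minimal sketch of that formula. The power value of 0.8 and the two P values are assumptions chosen for illustration; in practice the power is computed for a specified odds ratio, which is not reproduced here.

```python
def fprp(p_value: float, power: float, prior: float) -> float:
    """False-positive report probability from P value, power and prior."""
    false_positive_mass = p_value * (1.0 - prior)  # significant, but no true association
    true_positive_mass = power * prior             # significant and truly associated
    return false_positive_mass / (false_positive_mass + true_positive_mass)


# The two prior levels described in the text.
CANDIDATE_GENE_PRIOR = 0.001
RANDOM_SNP_PRIOR = 0.000001

# Hypothetical P values and an assumed power of 0.8 (illustration only).
for p_value in (0.05, 1e-8):
    for prior in (CANDIDATE_GENE_PRIOR, RANDOM_SNP_PRIOR):
        print(f"P = {p_value:g}, prior = {prior:g}: "
              f"FPRP = {fprp(p_value, power=0.8, prior=prior):.3g}")
```

Under these assumptions a result that is just significant at P = 0.05 is almost certainly a false positive at either prior, whereas a very small P value keeps the FPRP low even at the prior appropriate for a random SNP.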

The TGF-β signaling pathway

In vitro and in vivo studies have shown that the Transforming Growth Factor Beta (TGF-β) pathway is a central regulator of normal and transformed epithelial cell phenotypes(8). For most cell types, activation of TGF-β signaling results in potent inhibition of proliferation and migration, while promoting apoptosis and other properties associated with tumor suppression. However, these tumor suppressor properties are overridden in most cancer cells and TGF-β signaling may induce cellular changes associated with invasion and metastasis(8). Hence, the TGF-β pathway fulfills the requirements of a cancer-related pathway and has been extensively investigated with respect to tumor susceptibility alleles.

Several groups have tested the hypothesis that constitutively altered TGF-β signaling may affect cancer risk. Dunning et al. studied the association of several SNPs within the gene encoding TGF-β1 (TGFB1) and risk for invasive breast cancer in three case-control series(9). These studies identified a T29C (L10P) SNP associated with breast cancer risk. This particular SNP was also found to result in increased TGFB1 secretion(9). Validation studies included 11 studies with a total of 15,109 controls and 12,946 breast cancer cases. They were conducted by the Breast Cancer Association Consortium (BCAC) and confirmed a significant association of TGFB1 T29C (L10P) with breast cancer for heterozygotes and rare homozygotes as compared to common homozygotes: OR 1.07, 95% CI 1.02-1.13 and OR 1.16, 95% CI 1.08-1.25, respectively(10). Validation studies from the BCAC have also identified a variant in the gene encoding caspase 8 (CASP8). CASP8 D302H was significantly associated with breast cancer risk: OR 0.89, 95% CI 0.85–0.94, and OR 0.74, 95% CI 0.62–0.87 for heterozygotes and rare homozygotes, respectively, as compared to common homozygotes(10). Of note, only two (22.2%) of the nine candidate variants were associated with breast cancer in these validation studies(10).

We have previously identified TGFBR1*6A(11), a common variant of the type I TGF-β receptor, TGFBR1. Functional studies have shown that TGFBR1*6A, which encodes 6 alanines (*6A), transduces TGF-β growth inhibitory signals less effectively than wild-type TGFBR1, which encodes 9 alanines (*9A)(12, 13). Furthermore, we have shown that *6A may switch TGF-β growth inhibitory signals into growth stimulatory signals in cancer cells(14), enhancing the migration and invasion of breast cancer cells(15). The most recent meta-analysis of all reports investigating the association of *6A with cancer included 32 studies with a total of 27,809 cases and controls(16). Overall, *6A was significantly associated with cancer risk using all genetic models (for allelic effect: OR = 1.11; 95% CI = 1.03–1.21; for *6A/*6A vs. *9A/*9A: OR = 1.30; 95% CI = 1.01–1.69; for *9A/*6A vs. *9A/*9A: OR = 1.08; 95% CI = 1.01–1.15; for dominant model: OR = 1.08; 95% CI = 1.02–1.15; for recessive model: OR = 1.29; 95% CI = 1.00–1.68)(16).
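
The genetic models listed above differ only in how the three genotype groups are collapsed into a 2 × 2 table before the odds ratio is computed. The sketch below shows one way to do that collapsing for a single study. The genotype counts are hypothetical and the function names are illustrative; the cited meta-analysis pooled such estimates across studies, which is not shown here.

```python
def odds_ratio(case_exposed: float, case_unexposed: float,
               control_exposed: float, control_unexposed: float) -> float:
    """Cross-product odds ratio from a 2 x 2 table."""
    return (case_exposed * control_unexposed) / (case_unexposed * control_exposed)


def model_odds_ratios(cases: tuple, controls: tuple) -> dict:
    """Odds ratios under common genetic models.

    cases and controls are genotype counts ordered as
    (*9A/*9A, *9A/*6A, *6A/*6A).
    """
    a0, a1, a2 = cases
    c0, c1, c2 = controls
    return {
        # Count alleles rather than people: *6A alleles versus *9A alleles.
        "allelic": odds_ratio(a1 + 2 * a2, 2 * a0 + a1, c1 + 2 * c2, 2 * c0 + c1),
        # *6A/*6A versus *9A/*9A.
        "homozygote": odds_ratio(a2, a0, c2, c0),
        # *9A/*6A versus *9A/*9A.
        "heterozygote": odds_ratio(a1, a0, c1, c0),
        # Any *6A carrier versus *9A/*9A.
        "dominant": odds_ratio(a1 + a2, a0, c1 + c2, c0),
        # *6A/*6A versus all other genotypes.
        "recessive": odds_ratio(a2, a0 + a1, c2, c0 + c1),
    }


# Hypothetical genotype counts (illustration only, not from any cited study).
cases = (800, 180, 20)
controls = (830, 160, 10)
results = model_odds_ratios(cases, controls)
for model, value in results.items():
    print(f"{model:>12}: OR = {value:.2f}")
```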

Discovery of novel phenotypes and genotypes based on animal models of cancer

Given the potent role of TGF-β signaling in controlling cell growth and the accumulating evidence that TGF-β signaling is disrupted in most tumor types, we hypothesized that constitutively decreased TGF-β signaling is associated with increased cancer risk(13). To test this hypothesis, we generated a novel mouse model of Tgfbr1 haploinsufficiency. A knockout mouse model of Tgfbr1 generated by targeted deletion of exon 3 has been previously described(17). However, there is growing evidence that the signal sequence of human *6A may have intrinsic biological effects, which are caused by mutations within the exon 1 GCG repeat sequence(18, 19). While the exon 3 Tgfbr1 knockout model does not result in the generation of functional Tgfbr1(17), the generation of a functionally active signal sequence cannot be excluded. To circumvent this potential problem, we designed a classical knockout vector to insert a Neomycin resistance cassette (Neo) into a Not I site located immediately after the start codon and to remove 1.1 kb of mouse genomic sequence immediately upstream of this Not I site(20). We found that Tgfbr1 haploinsufficient mice do not exhibit any particular phenotype and have a normal life span. However, when crossed with one of the most common mouse models of colorectal cancer, the ApcMin/+ mouse model, these mice exhibited a markedly increased propensity to develop colorectal cancer with a phenotype analogous to human colorectal cancer(20). We validated these findings in humans and discovered that constitutively decreased TGFBR1 expression is a common finding in patients with colorectal cancer(21). We also found that this novel phenotype is associated with 2 different haplotypes in humans, one of which includes *6A(21). The association of TGFBR1 haplotypes with cancer susceptibility was further studied in non-small cell lung cancer using seven TGFBR1 haplotype-tagging SNPs(22).
While none of the SNPs were associated with cancer risk, there was a strong association of a 4-SNP haplotype with non-small cell lung cancer risk (OR 0.11; 95% CI 0.03-0.39). This association is analogous to the *6A association with colorectal cancer risk, as *6A is not consistently associated with colorectal cancer risk(23) but appears to be part of the haplotype strongly associated with colorectal cancer risk(21). We are currently studying the association of TGFBR1 haplotypes as well as constitutively decreased TGFBR1 expression with breast cancer and colorectal cancer risk in large sibling case control studies, which should shed light on the precise role of these genetic and phenotypic alterations.

It is generally accepted that interactions between SNPs or genes and interactions between genes and environmental factors contribute substantially to the genetic risk of cancer. Identification of such interactions could lead to an increased understanding of disease mechanisms and is thus an important and active research area(24). In 2009 Cordell published a survey of the methods and related software packages that are currently being used to detect these interactions(24). Using the Bayesian hierarchical generalized linear models of Yi and Banerjee(25), we detected strong interactions among the TGFBR1 haplotype-tagging SNPs in the non-small cell lung cancer study(22) (Figure 1), and found that the model including interactions greatly improved the accuracy of data fit and risk prediction. The analysis of genetic interactions could help understand how the genetic effect(s) of a SNP varies with the genotypes at another SNP and aid the discovery of additional genetic variants.
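
To make the notion of a SNP × SNP interaction concrete, a generic two-locus logistic model can be written as below. This is a simplified sketch and not the exact specification of the hierarchical model of Yi and Banerjee(25), which additionally places prior distributions on the coefficients and can fit many loci at once. The additive 0/1/2 genotype coding and the symbols are assumptions introduced for illustration.

```latex
% y_i      : case (1) or control (0) status of subject i
% x_i      : covariates (e.g. gender, age, smoking status), coefficients \gamma
% g_{i1}, g_{i2} : genotypes at two SNPs, coded 0, 1, 2 (assumed additive coding)
\[
\operatorname{logit}\,\Pr(y_i = 1)
  = \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\gamma}
  + \beta_1\, g_{i1} + \beta_2\, g_{i2} + \beta_{12}\, g_{i1}\, g_{i2}
\]
% The odds ratio per copy of the variant allele at SNP 1 depends on the
% genotype at SNP 2:
\[
\mathrm{OR}_{1}(g_{i2}) = \exp\!\left(\beta_1 + \beta_{12}\, g_{i2}\right)
\]
```

When \beta_{12} = 0 the effect of the first SNP is the same at every genotype of the second; a nonzero \beta_{12} is what allows the effect of one SNP to vary with the genotype at another.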

Figure 1. Interactions among TGFBR1 haplotypes.

Probabilities of being case for two-locus genotypes at SNPs that show significant interactions under the epistatic model adjusted for gender, age, and smoking status. The notation c, h, and r represent common homozygote, heterozygote, and rare homozygote, respectively.

Conclusion

One of the main weaknesses of genetic epidemiology studies based on candidate gene approaches (Fig. 2) has been the lack of replication. Indeed, many studies that re-examined a previously published statistically significant finding for a genetic variant failed to reproduce it, suggesting a large number of false-positive reports(26, 27). The utilization of large validation studies as well as meta-analyses and pooled analyses to combine both statistically significant and nonsignificant results from individual studies has allowed for the validation of several candidate genes, but the majority of candidates have not been validated(6). Several gene variants that belong to the apoptosis pathway and the TGF-β signaling pathway have been identified through candidate gene approaches: CASP8 D302H, TGFB1 T29C (L10P) and TGFBR1*6A. Some variants have been validated in large validation studies(10) and some in large meta-analyses(16). The use of false-positive report probability allows for the integration of biological plausibility in the overall assessment of candidate genes(6). The association of three additional variants with cancer risk has been demonstrated using this rigorous approach: GSTM1 null, NAT2 slow acetylator and MTHFR C677T(6). There is growing evidence that a deeper understanding of gene × gene interactions may further help in the characterization of candidate genes. Furthermore, in vitro as well as in vivo models have allowed for the identification of novel haplotypes associated with cancer risk(21, 22), which are currently being validated. These findings provide strong support for additional studies of candidate genes that belong to cancer-related pathways. These studies are likely to either complement or confirm some of the GWA studies.
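
One common way to combine significant and nonsignificant results from individual studies is inverse-variance pooling of log odds ratios. The sketch below uses a fixed-effect model, which is an assumption made for simplicity; the meta-analyses cited in this review may have used random-effects or other models. The three study estimates are hypothetical.

```python
import math


def pool_fixed_effect(studies, z_crit: float = 1.96):
    """Fixed-effect inverse-variance pooled OR with a 95% CI.

    studies is an iterable of (odds_ratio, ci_lower, ci_upper) tuples,
    with intervals assumed to be Wald intervals on the log scale.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for or_value, lower, upper in studies:
        standard_error = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
        weight = 1.0 / standard_error ** 2  # more precise studies count more
        total_weight += weight
        weighted_sum += weight * math.log(or_value)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (math.exp(pooled_log),
            math.exp(pooled_log - z_crit * pooled_se),
            math.exp(pooled_log + z_crit * pooled_se))


# Hypothetical study results (illustration only): two of the three
# individual intervals include 1 and would be reported as nonsignificant.
studies = [(1.30, 0.95, 1.78), (1.10, 0.90, 1.34), (1.15, 1.00, 1.32)]
pooled_or, pooled_lower, pooled_upper = pool_fixed_effect(studies)
print(f"pooled OR = {pooled_or:.2f} (95% CI {pooled_lower:.2f}-{pooled_upper:.2f})")
```

The pooled interval is always narrower than that of any single contributing study, which is why pooling can confirm or refute a modest effect that individual studies were too small to resolve.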

Figure 2.


Acknowledgements

This work was supported by grants CA112520, CA108741, CA137000, GM069430 and 5P60AR048098 from the NIH.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Reference List

  • (1) Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, et al. Environmental and heritable factors in the causation of cancer - Analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med. 2000;343:78–85. doi: 10.1056/NEJM200007133430201.
  • (2) Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35057062.
  • (3) Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, et al. The sequence of the human genome. Science. 2001;291:1304–51. doi: 10.1126/science.1058040.
  • (4) The International HapMap Consortium. A haplotype map of the human genome. Nature. 2005;437:1299–320. doi: 10.1038/nature04226.
  • (5) Hanahan D, Weinberg RA. The hallmarks of cancer. Cell. 2000;100:57–70. doi: 10.1016/s0092-8674(00)81683-9.
  • (6) Dong LM, Potter JD, White E, Ulrich CM, Cardon LR, Peters U. Genetic Susceptibility to Cancer: The Role of Polymorphisms in Candidate Genes. JAMA: The Journal of the American Medical Association. 2008;299:2423–36. doi: 10.1001/jama.299.20.2423. This is the first systematic review of candidate gene association studies making use of the false-positive report probability. This study shows that approximately one-third of gene-variant cancer associations were statistically significant, with variants in genes encoding metabolizing enzymes being the most significantly associated with cancer risk.
  • (7) Wacholder S, Chanock S, Garcia-Closas M, El ghormli L, Rothman N. Assessing the Probability That a Positive Report is False: An Approach for Molecular Epidemiology Studies. J Natl Cancer Inst. 2004;96:434–42. doi: 10.1093/jnci/djh075.
  • (8) Siegel PM, Massague J. Cytostatic and apoptotic actions of TGF-beta in homeostasis and cancer. Nat Rev Cancer. 2003;3:807–20. doi: 10.1038/nrc1208.
  • (9) Dunning AM, Ellis PD, McBride S, Kirschenlohr HL, Healey CS, Kemp PR, et al. A Transforming Growth Factor β1 Signal Peptide Variant Increases Secretion in Vitro and Is Associated with Increased Incidence of Invasive Breast Cancer. Cancer Res. 2003;63:2610–5.
  • (10) Cox A, Dunning AM, Garcia-Closas M, Balasubramanian S, Reed MWR, Pooley KA, et al. A common coding variant in CASP8 is associated with breast cancer risk. Nat Genet. 2007;39:352–8. doi: 10.1038/ng1981.
  • (11) Pasche B, Luo Y, Rao PH, Nimer SD, Dmitrovsky E, Caron P, et al. Type I transforming growth factor beta receptor maps to 9q22 and exhibits a polymorphism and a rare variant within a polyalanine tract. Cancer Res. 1998;58:2727–32.
  • (12) Chen T, de Vries EG, Hollema H, Yegen HA, Vellucci VF, Strickler HD, et al. Structural alterations of transforming growth factor-beta receptor genes in human cervical carcinoma. Int J Cancer. 1999;82:43–51. doi: 10.1002/(sici)1097-0215(19990702)82:1<43::aid-ijc9>3.0.co;2-0.
  • (13) Pasche B, Kolachana P, Nafa K, Satagopan J, Chen YG, Lo RS, et al. T beta R-I(6A) is a candidate tumor susceptibility allele. Cancer Res. 1999;59:5678–82.
  • (14) Pasche B, Knobloch TJ, Bian Y, Liu J, Phukan S, Rosman D, et al. Somatic Acquisition and Signaling of TGFBR1*6A in Cancer. JAMA: The Journal of the American Medical Association. 2005;294:1634–46. doi: 10.1001/jama.294.13.1634.
  • (15) Rosman DS, Phukan S, Huang CC, Pasche B. TGFBR1*6A Enhances the Migration and Invasion of MCF-7 Breast Cancer Cells through RhoA Activation. Cancer Res. 2008;68:1319–28. doi: 10.1158/0008-5472.CAN-07-5424.
  • (16) Liao RY, Mao C, Qiu LX, Ding H, Chen Q, Pan HF. TGFBR1*6A/9A polymorphism and cancer risk: a meta-analysis of 13,662 cases and 14,147 controls. Mol Biol Rep. 2009. doi: 10.1007/s11033-009-9906-7.
  • (17) Larsson J, Goumans MJ, Sjostrand LJ, van Rooijen MA, Ward D, Leveen, et al. Abnormal angiogenesis but intact hematopoietic potential in TGF-beta type I receptor-deficient mice. EMBO Journal. 2001;20:1663–73. doi: 10.1093/emboj/20.7.1663.
  • (18) Pasche B, Knobloch TJ, Bian Y, Liu J, Phukan S, Rosman D, et al. Somatic Acquisition and Signaling of TGFBR1*6A in Cancer. JAMA: The Journal of the American Medical Association. 2005;294:1634–46. doi: 10.1001/jama.294.13.1634.
  • (19) Rosman DS, Phukan S, Huang CC, Pasche B. TGFBR1*6A Enhances the Migration and Invasion of MCF-7 Breast Cancer Cells through RhoA Activation. Cancer Res. 2008;68:1319–28. doi: 10.1158/0008-5472.CAN-07-5424.
  • (20) Zeng Q, Phukan S, Xu Y, Sadim M, Rosman DS, Pennison M, et al. Tgfbr1 Haploinsufficiency Is a Potent Modifier of Colorectal Cancer Development. Cancer Res. 2009;69:678–86. doi: 10.1158/0008-5472.CAN-08-3980. This study shows that constitutively decreased Tgfbr1 signaling is a potent modifier of colorectal cancer susceptibility in mice and provided the scientific underpinning for human studies.
  • (21) Valle L, Serena-Acedo T, Liyanarachchi S, Hampel H, Comeras I, Li Z, et al. Germline allele-specific expression of TGFBR1 confers an increased risk of colorectal cancer. Science. 2008;321:1361–5. doi: 10.1126/science.1159397. This study demonstrates that constitutively decreased expression of TGFBR1 in humans is a potent modifier of colorectal cancer.
  • (22) Lei Z, Liu RY, Zhao J, Liu Z, Jiang X, You W, et al. TGFBR1 Haplotypes and Risk of Non-Small-Cell Lung Cancer. Cancer Res. 2009;69:7046–52. doi: 10.1158/0008-5472.CAN-08-4602.
  • (23) Skoglund J, Song B, Dalen J, Dedorson S, Edler D, Hjern F, et al. Lack of an Association between the TGFBR1*6A Variant and Colorectal Cancer Risk. Clinical Cancer Research. 2007;13:3748–52. doi: 10.1158/1078-0432.CCR-06-2865.
  • (24) Cordell HJ. Detecting gene-gene interactions that underlie human diseases. Nat Rev Genet. 2009;10:392–404. doi: 10.1038/nrg2579.
  • (25) Yi N, Banerjee S. Hierarchical generalized linear models for multiple quantitative trait locus mapping. Genetics. 2009;181:1101–13. doi: 10.1534/genetics.108.099556. This study develops hierarchical generalized linear models and computationally efficient algorithms for multiple interacting genes analysis for various types of phenotypes. The proposed models can simultaneously fit a large number of effects, including covariates, main effects of numerous loci, gene-gene (epistasis) and gene-environment (G×E) interactions.
  • (26) Ioannidis JPA, Ntzani EE, Trikalinos TA, Contopoulos-Ioannidis DG. Replication validity of genetic association studies. Nat Genet. 2001;29:306–9. doi: 10.1038/ng749.
  • (27) Morgan TM, Krumholz HM, Lifton RP, Spertus JA. Nonvalidation of Reported Genetic Risk Factors for Acute Coronary Syndrome in a Large-Scale Replication Study. JAMA: The Journal of the American Medical Association. 2007;297:1551–61. doi: 10.1001/jama.297.14.1551.
