Author manuscript; available in PMC: 2009 Sep 1.
Published in final edited form as: Per Med. 2008 Nov 1;5(6):589–597. doi: 10.2217/17410541.5.6.589

Cancer genetic association studies in the genome-wide age

Sharon A Savage 1
PMCID: PMC2659339  NIHMSID: NIHMS79711  PMID: 19727433

Abstract

Genome-wide association studies of hundreds of thousands of SNPs have led to a deluge of studies of genetic variation in cancer and other common diseases. Large case–control and cohort studies have identified novel SNPs as markers of cancer risk. Genome-wide association study SNP data have also advanced understanding of population-specific genetic variation. While studies of risk profiles, combinations of SNPs that may increase cancer risk, are not yet clinically applicable, future, large-scale studies will make individualized cancer screening and prevention possible.

Keywords: cancer, cancer risk, genome-wide association study, genomics, population genetics, SNP

Genetic risk of cancer

Cancer is a complex, multifactorial disease with several well-described risk factors, such as smoking for lung cancer [1], infection with human papilloma virus for cervical cancer [2], and increasing age [3]. In addition to these factors, there is growing evidence that common and rare genetic variants also contribute to cancer risk.

Studies of cancer-prone families have led to important advances in cancer biology. For example, Knudson’s original description of the inheritance of retinoblastoma led to the identification of mutations in the RB gene, a tumor suppressor and transcriptional regulator, in these patients [4,5]. The study of families with Li–Fraumeni syndrome led to the identification of mutations in the TP53 gene, which has since been shown to be a critical transcription factor in normal cell growth, apoptosis and DNA repair [6-9]. Careful epidemiological study led to the identification of familial breast cancer pedigrees and the discovery of the BRCA1 and BRCA2 genes, which are important in DNA repair and highly penetrant breast and ovarian cancer risk factors [10-12].

However, despite these advances, our understanding of cancer genetic risk factors is still limited because the majority of people who develop cancer do not have a cancer predisposition syndrome. Instead, cancer usually appears to be a polygenic disease that can cluster in families but does not demonstrate a simple inheritance pattern. In this case, the combination of several common genetic variants could contribute to increased cancer risk in certain populations. The common disease/common variant hypothesis seeks to understand the effects of common genetic variation (typically >1% minor allele frequency) on polygenic disorders, such as cancer. The basis of this hypothesis is that several common variants may together contribute to the risk of common disease.

Growing from candidate genes to genome-wide association studies

SNPs are the most common form of genetic variation in the genome. Of the roughly 3 billion nucleotide bases in the genome, it is estimated that 10 million sites (1 in 300) are SNPs with a minor allele frequency of at least 1% [13]. Other types of genetic variation between individual human genomes include insertions and deletions (from a single base to thousands of bases) and copy-number variation. Advances in understanding these genetic variants are reviewed elsewhere [14].
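
The arithmetic behind the 1-in-300 figure, and the minor allele frequency cutoff that defines a common SNP, can be sketched as follows. This is only an illustration: the function names and the genotype counts are hypothetical and do not come from any of the cited resources.

```python
# Approximate figures quoted in the text.
GENOME_BASES = 3_000_000_000
COMMON_SNPS = 10_000_000


def snp_spacing(genome_bases: int, snp_count: int) -> float:
    """Average number of bases per SNP site."""
    return genome_bases / snp_count


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from diploid genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(freq_a, 1.0 - freq_a)


spacing = snp_spacing(GENOME_BASES, COMMON_SNPS)  # one SNP per 300 bases

# Hypothetical genotype counts for one SNP in 1000 individuals.
maf = minor_allele_frequency(n_aa=810, n_ab=180, n_bb=10)
is_common = maf >= 0.01  # the 1% cutoff used in the text
```

With these made-up counts the minor allele frequency is about 10%, so the SNP would count as common under the 1% cutoff.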

Gene sequencing and genotyping initiatives, such as the International HapMap Project [13], the SeattleSNP Variation Discovery Resource [101] and the SNP500Cancer project [15], led to improved understanding of population-specific genetic variation and the haplotype structure of the human genome. The presence of linkage disequilibrium, or the nonrandom segregation of alleles, has made it possible to choose SNPs that can serve as surrogates for untested markers by using the basic principle that shared haplotypes contain variants that track together across generations. Through the use of haplotype-tagging methods, based on various statistical algorithms, fewer SNPs can be genotyped and still provide a thorough interrogation of the genetic variation across a gene, a specific locus or the entire genome [16,17].
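
As a rough illustration of how one SNP can act as a surrogate for another, the sketch below computes r², a widely used measure of linkage disequilibrium, from phased haplotypes. The function name, the 0/1 allele coding, the example haplotypes and the r² ≥ 0.8 tagging cutoff are assumptions made for this example; the cited tagging methods [16,17] use more elaborate algorithms.

```python
from typing import Sequence, Tuple


def ld_r_squared(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """r^2 between two biallelic SNPs.

    Each haplotype is a pair (allele at SNP A, allele at SNP B), coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(h[1] for h in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical data: the two alleles always travel together (perfect LD).
linked = [(1, 1)] * 4 + [(0, 0)] * 6
# Hypothetical data: all four haplotypes equally common (no LD).
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

r2_linked = ld_r_squared(linked)
r2_unlinked = ld_r_squared(unlinked)
can_tag = r2_linked >= 0.8  # assumed cutoff, a common convention
```

When r² is close to 1, genotyping either SNP gives nearly the same information as genotyping both, which is why a tagged panel can be much smaller than the full set of variants.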

Technological advances have, in part, driven the advancement from single SNP studies to genome-wide association (GWA) studies of hundreds of thousands of SNPs. In less than 10 years, SNP assays have expanded from the time-consuming restriction fragment length polymorphism (RFLP) methods that evaluated one SNP at a time, to high-throughput genotyping of GWA studies on tens of thousands of samples. The next step, complete genome sequencing, will generate a more detailed map of human genetic variation. The 1000 Genomes Project, an international research consortium, will build upon the HapMap project by completely sequencing the genomes of 1000 individuals [102]. This project seeks to identify genetic variants with frequencies of 0.5–1% in the population studied. This high-resolution map will be important in understanding the genetic contributions to health and illness. Pilot studies in a small number of individuals are underway to evaluate the best approaches for this large-scale project.

Replicating candidate gene associations

Until recently, the majority of genetic association studies were small candidate gene or pathway studies of a relatively small number of SNPs. An a priori hypothesis regarding the role the gene of interest played in the pathogenesis of the disease was considered to be a crucial factor in study design. Using this approach, associations between increased gastric cancer risk and the -251 T/A promoter polymorphism in the interleukin 8 gene, which encodes a proinflammatory chemokine, were replicated in several studies [18-21], but appear to be population-dependent [22,23].

Studies such as these were encouraging, but the majority of such studies have not been reproducible. In a comprehensive review of over 600 positive association studies, only 166 associations were studied more than three times and only six were consistently replicated [24]. Similarly, a meta-analysis of 25 different associations in 201 genetic association studies of complex diseases showed evidence of replication in less than half of the studies [25].

A recent study of 161 meta-analyses and pooled analyses of genetic polymorphisms and cancer risk evaluated 344 gene-variant associations from the literature [26]. Of these, 98 gene-variant associations were reported as statistically significant in the published studies. The authors analyzed these studies using the False Positive Report Probability (FPRP) method [27], which incorporates biological plausibility into SNP data interpretation and is hypothesized to increase the probability of finding a true association. They found that 13 and 4 gene-variant associations were reproducible at FPRP values of 0.001 and 0.000001, respectively. Genes encoding metabolizing enzymes were among the most consistently associated with cancer risk in that study.
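
The FPRP calculation can be sketched in a few lines. The formula below follows the general form described by Wacholder et al. [27], in which the probability that a reported association is false depends on the significance level, the statistical power and the prior probability that the association is real. The function name and the example inputs are illustrative assumptions and are not taken from the studies discussed above.

```python
def fprp(alpha: float, power: float, prior: float) -> float:
    """False Positive Report Probability.

    alpha: significance level (or observed p-value) of the finding
    power: probability of detecting a true association of the assumed size
    prior: prior probability that the association is real
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    false_positive_mass = alpha * (1.0 - prior)
    true_positive_mass = power * prior
    return false_positive_mass / (false_positive_mass + true_positive_mass)


# Hypothetical inputs: a nominally significant result with a low prior.
weak_evidence = fprp(alpha=0.05, power=0.8, prior=0.001)
# Hypothetical inputs: the same prior, but a far smaller p-value.
strong_evidence = fprp(alpha=1e-7, power=0.8, prior=0.001)
```

Under these assumed inputs, a result at p = 0.05 with a prior of 0.001 is almost certainly a false positive, whereas a result at p = 10^-7 with the same prior is very likely to be real. This is one way to see why biologically implausible candidates need much stronger statistical evidence.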

Readily-accessible genotype datasets that are annotated with key clinical features from numerous studies are now critical for the continued advancement of genetic association research. The Genetic Association Database [103], which compiles data from published association studies, is increasingly useful in guiding study design, SNP selection, and genetic association study replication. Genetic association studies are also being compiled through the Human Genome Epidemiology Network (HuGeNet) [28]. HuGeNet forms the basis for meta-analyses of genetic association studies in cancer and other diseases. The systematic analysis of these studies will help to identify the true-positive associations and potentially generate leads for future studies.

There are numerous factors that may contribute to these reproducibility challenges. Publication bias, the tendency of scientists and journals to publish positive findings, plays a role. This is changing as more journals are reviewing and publishing null results in genetic-association studies [29]. Sample size and multiple testing comparisons may limit the statistical power of the study. Cancer is a heterogeneous disease and careful pathologic classification of cases is necessary. Matching of controls based on ethnic groups and other population genetic factors is crucial but also prone to error. There can also be some error in genotyping results due to factors such as variable DNA quality and assay conditions. The candidate gene/pathway approach assumes that current knowledge of the cancer biology related to these genes is relevant; however, there are likely numerous presently unknown genes and pathways that contribute to cancer risk. Cautious interpretation of these and other such epidemiological studies has been encouraged [30,31].

Genome-wide association studies in cancer

While the candidate gene approach has not been abandoned, the availability of whole-genome technologies has led investigators to take an agnostic approach: assume we know nothing about the genetics of the disorder, compile a sufficiently large sample size, and test as many SNPs as possible, without any a priori biological hypothesis. This has long been the approach of family-based linkage studies, which evaluate differences in the inheritance patterns of genetic markers (microsatellites or SNPs) to find loci associated with disease. The strength of such family-based studies is the relatedness of the subjects; however, they require large families and/or large numbers of families with the same disorder and do not always identify causative loci.

The present approach with GWA studies continues to evaluate and expand upon the common disease/common variant hypothesis. It takes advantage of high-throughput genotyping methods to evaluate a very large number of common SNPs. The current genotyping platforms utilize common SNPs (>5% minor allele frequency) chosen through haplotype tagging algorithms and are based on data from the HapMap and other sequencing and genotyping initiatives.

The major strengths of a GWA study are that common SNPs are comprehensively interrogated across the entire genome, and due to this, it is hypothesis generating, rather than driven by a priori assumptions [32,33]. The challenges in genome-wide scans are largely due to the enormous size of the dataset and statistical methods [34]. GWA studies of common disease in which genetic variation may contribute relative risks of less than 1.5 require very large sample sizes and several replication strategies. Owing to the large number of statistical tests needed in these datasets, very high significance thresholds and replication are required. These efforts have resulted in the formation of large national and international consortia in which cases and controls are pooled.
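
The need for very high significance thresholds follows from simple arithmetic, sketched below with a Bonferroni correction. The Bonferroni method is used here only as a familiar, conservative example; the text does not state which correction individual studies applied, and the figure of 500,000 SNPs is a round number chosen for illustration.

```python
def bonferroni_threshold(family_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error
    rate at or below family_alpha across n_tests tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


def expected_false_positives(per_test_alpha: float, n_tests: int) -> float:
    """Expected number of tests passing the threshold by chance alone,
    assuming no SNP is truly associated."""
    return per_test_alpha * n_tests


N_SNPS = 500_000  # illustrative size of a genome-wide panel

per_snp_threshold = bonferroni_threshold(0.05, N_SNPS)
chance_hits_uncorrected = expected_false_positives(0.05, N_SNPS)
```

With half a million tests, an uncorrected threshold of 0.05 would be expected to produce roughly 25,000 chance findings, while the corrected per-SNP threshold is about 10^-7.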

A summary of the growing list of cancer GWA studies is presented in Table 1. Thus far, the majority of studies have evaluated common cancers, such as breast, prostate, and colon cancer. GWA studies identified SNPs in the 8q24.21 locus as associated with prostate and colon cancer [35-39]. Subsequent studies suggest that certain variants in this region are also associated with increased breast and ovarian cancer risk [37,40]. This region at 8q24.21 is a ‘gene desert’ whose functional importance is yet to be discerned. Additional collaborative efforts between investigators in genomics, population genetics, epidemiology, molecular biology and other basic sciences will be critical to understanding the role of this region in cancer risk.

Table 1.

Overview of cancer genome-wide association studies.

Cancer type | Cases/controls in first stage (n) | Number of SNPs analyzed in first stage | Total number of cases/controls in study* | Gene or locus | Estimated attributable risk (%) | Ref.
Breast | 1145/1142 | 528,173 | 2921/3214 | FGFR2 | 16 | [41]
Breast | 390/364§ | 205,586 | 26,258/26,894 | FGFR2, TNRC9, MAP3K1, LSP1 | 5.5–10.5 | [64]
Breast | 249/299 | 150,080 | 1442/1465 | 6q22.33 (ECHDC1, RNF146) | 7 | [65]
Breast | 1599/11,546 | 311,524 | 4533/17,513 | 2q35, 16q12 (TNRC9) | 13–14# | [66]
Colorectal | 930/960§ | 547,647 | 8264/6206 | 8q24.21 | 20 | **[67]
Colorectal | 1257/1336 | 99,632 | 7480/7779 | 8q24.21 | NR | [35]
Colorectal | 922/927§ | 547,647 | 18,794/18,453 | 8q23.3, 10p14 | NR | **[68]
Colorectal | 940/965§ | 547,647 | 8413/6949 | 18q21 (SMAD7) | 15 | **[69]
Colorectal | 981/1002‡‡ | 541,628 | 17,457/16,353 | 11q23, 8q24.21, 18q21 (SMAD7) | 3.3–9.6 | [70]
Gastric | 188/752 | 85,576 | 932/1398 | PSCA | NR | [71]
Lung | 1926/2522 | 310,023 | 4493/7274 | CHRNA5, CHRNA3, CHRNB4## | 14 | [72]
Lung | 10,995§§ | 306,207 | 1024/32,244 | CHRNA5, CHRNA3, CHRNB4## | 18 | [73]
Lung | 1154/1137 | 317,498 | 3878/4831 | CHRNA##, CRP, IL1RAP | NR | [74]
Melanoma | 864/864 | 535,150 | 2094/2215 | 20q11.22 (CDC91L1) | 11 | [75]
Neuroblastoma | 1032/2043 | 464,934 | 1752/4171 | 6p22.3 (FLJ22536, FLJ44180) | NR | [43]
Prostate | 1435/3064 | 316,515 | 3018/5881 | 8q24.21 | 10–24 | [38]
Prostate | 1501/11,290 | 310,520 | 3493/14,348 | 17q12 (HNF1B¶¶), 17q24.3 | 36*** | [76]
Prostate | 1854/21,372 | 310,520 | 10,093/28,962 | 2p15, Xp11.22 | 5–7 | [77]
Prostate | 40/40 | 11,555 | 1454/1636 | 1q25, 7p21 | NR | [78]
Prostate | 1172/1157 | 538,548 | 4296/4299 | 8q24.21 | 21 | [36]
Prostate | 1235/1599‡‡‡ | 60,275 | 2477/2516 | DAB2IP | NR | [79]
Prostate | 1172/1157 | 527,869 | 5113/5121 | 8q24, 17q (HNF1Bi), chr10 (MSMB, CTBP2), chr7 (JAZF1) | 8–20 | [80]
Prostate | 1854/1894 | 541,129 | 5122/5260 | 8q24, 17q, other loci on chr 3, 6, 7, 10, 11, 19, X | 15 | [39]

This is a general overview and is not intended to be a comprehensive description of the data.

* Includes number of cases and controls in the original scan plus those in the replication studies.

This is the value, or range of values, reported in the manuscript for the population that was studied. Different methods to calculate the population attributable risk or fraction may have been used and as such, the values may not be directly comparable.

§ Included subjects with a family history of the cancer type studied.

In the first stage, cases were high-risk, BRCA1/2 mutation-negative Ashkenazi Jewish women with breast cancer who also had at least three family members with breast cancer. Controls were cancer-free Ashkenazi Jewish controls. In the replication studies, cases were Ashkenazi Jewish breast cancer cases.

# Combined population attributable risk for SNPs in both regions was 25%.

** Studies derived from familial colorectal cancer cases. The authors also reported data on subjects with adenoma only.

‡‡ This study included early-onset colon cancer cases.

§§ This study sought to identify genes that were associated with smoking, an important cause of lung cancer. The associated genes encode nicotinic acetylcholine receptors and were associated with both smoking and lung cancer.

¶¶ HNF1B was previously known as TCF2.

## CHRNA5, CHRNA3 and CHRNB4 have also been associated with nicotine dependence.

*** This was a joint population attributable risk due to the combination of the two high-risk SNPs, rs4430796 and rs1859962.

‡‡‡ Combined SNPs from a scan of 498 cases and 494 controls with cases and controls from the CGEMS study. The SNPs reported were SNPs that overlapped between the two studies.

CGEMS: Cancer Genetic Markers of Susceptibility; chr: Chromosome; CI: Confidence interval; NR: Not reported.

Two GWA studies of breast cancer identified SNPs in a biologically relevant gene, FGFR2, as associated with increased risk. This is particularly encouraging because the FGFR2 gene encodes a receptor tyrosine kinase that is amplified or overexpressed in some breast cancers [41,42]. Other breast cancer GWA studies have found at least six loci associated with modestly increased risk (Table 1).

To date, the majority of cancer genome-wide scans have been conducted in common cancers. Interestingly, a recent GWA study of neuroblastoma, a rare childhood tumor, found increased risk of neuroblastoma with certain SNPs in the 6p22.3 region [43]. Comprehensive genetic-association studies of childhood cancer have the potential for high-yield findings. It is possible that pediatric cancer risk has a stronger genetic component than cancers that affect the elderly, primarily because the latency period of environmental exposures that could contribute to pediatric cancer risk is significantly shorter. Several GWA studies of pediatric cancer are currently being planned to investigate this hypothesis.

Genome-wide scans & population genetics

Genome-wide scans are valuable beyond studies of disease risk. Genetic variation can differ widely between ethnic groups. These differences, which arise from a combination of evolutionary history, migration and population admixture, are now incorporated into study design and analysis. Genome-wide analyses have further advanced this understanding.

For example, the genetic relationships between 938 unrelated individuals from 51 populations were studied using a panel of 650,000 SNPs [44]. This study showed that the relationship between haplotype heterozygosity and geography was consistent with a serial founder effect originating in sub-Saharan Africa. An evaluation of 388,654 SNPs from 102 subjects in the SNP500Cancer panel showed that even with modest sample sizes, genes with high genetic distance between subpopulations could be identified [45]. In a larger subanalysis of two GWA studies in European Americans, genotyped in the Cancer Genetic Markers of Susceptibility (CGEMS) project, principal component analyses were used to identify sets of SNPs that can be used to detect and correct for population stratification [46].
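
One common way to quantify genetic distance between subpopulations at a single SNP is Wright's fixation index, F_ST. The sketch below is an illustration only: the text does not state which distance measure the cited study [45] used, the calculation assumes equally sized subpopulations, and the allele frequencies are hypothetical.

```python
from typing import Sequence


def fst_biallelic(subpop_freqs: Sequence[float]) -> float:
    """Wright's F_ST for one biallelic SNP.

    subpop_freqs holds the frequency of the same allele in each
    subpopulation; subpopulations are assumed to be equally sized.
    """
    if len(subpop_freqs) < 2:
        raise ValueError("need at least two subpopulations")
    k = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / k
    # Expected heterozygosity if all subpopulations were pooled.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # the SNP does not vary in any subpopulation
    # Mean expected heterozygosity within subpopulations.
    h_within = sum(2.0 * p * (1.0 - p) for p in subpop_freqs) / k
    return (h_total - h_within) / h_total


# Hypothetical SNP whose allele frequency differs sharply between two groups.
divergent = fst_biallelic([0.1, 0.9])
# Hypothetical SNP with the same frequency in both groups.
shared = fst_biallelic([0.3, 0.3])
```

SNPs with high values on a measure of this kind differ most between groups, and are therefore the ones most likely to produce spurious associations if cases and controls are drawn from different ancestral backgrounds.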

A challenge in the design of GWA studies is obtaining an appropriate number of cases and controls for adequate statistical power after correcting for multiple comparisons. One approach to increasing statistical power is through increasing the number of control subjects. Population genetic analyses of SNPs in GWA studies have suggested that sharing ‘genomic’ controls is a valid approach for increasing control sample size. This method evaluates the population-specific genetic variation in the control groups of several studies and tests for genetic similarities. Based on these analyses, subjects can be matched based on their SNP profiles. This approach was used in a large study of seven diseases in 14,000 cases and 3000 shared controls [47]. Previously known as well as novel risk loci were identified in diabetes, coronary artery disease, Crohn’s disease, and rheumatoid arthritis. The pooling of such controls could be used in studies of rare cancers in which case samples are most readily available through clinical trial networks and standard matched controls are not available. A major limitation of shared genomic controls is that it is not possible to include environmental risk factors in the analyses.
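
The gain from adding controls can be approximated with an effective sample size, a quantity often used to compare case–control designs. The sketch below is an assumption-laden illustration rather than the method of the cited study [47]: the formula is a standard approximation for balanced-equivalent sample size, and the figure of about 2000 cases per disease is simply 14,000 cases divided across seven diseases.

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Size of a balanced case-control study with roughly the same
    statistical information as an unbalanced one."""
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("counts must be positive")
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


CASES_PER_DISEASE = 2000  # approximation: 14,000 cases / 7 diseases

matched_design = effective_sample_size(CASES_PER_DISEASE, 2000)
shared_controls = effective_sample_size(CASES_PER_DISEASE, 3000)
# Upper limit as controls grow without bound: four times the case count.
ceiling = 4.0 * CASES_PER_DISEASE
```

Under this approximation, moving from 2000 to 3000 controls raises the effective sample size from 4000 to 4800, and no number of extra controls can push it past 8000. Sharing controls helps, but the number of cases remains the limiting factor, which matters most for rare cancers.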

Additional benefits of the genome-wide age are the formation of large international collaborations, the advancement of ‘team’ science, and data sharing. The NIH recently implemented a genome-wide association study data-sharing policy [104] for its grantees and intramural investigators that articulates the expectation that investigators will make such datasets available in the central NIH repository of genome-wide association study data, the Database of Genotypes and Phenotypes (dbGaP) [105]. Posting should occur as soon as the appropriate quality-control assessments are complete. The primary investigators will have a 12-month period of exclusive data access to permit them to submit genome-wide association studies dataset analyses for publication.

Clinical translation: are we ready?

The success of GWA studies in cancer and other diseases has helped to advance the concept of genomic medicine. Goals of the clinical translation of genomic research include:

  • Risk prediction and disease prevention based on these risks

  • Improved mechanistic understanding of disease

  • Development of preventive and/or therapeutic interventions

  • Medication response and side effects (pharmaco-genomics)

On an individual patient level, this is challenging because the majority of individual SNP associations resulting from GWA studies of cancer risk have odds ratios less than 1.5 and relatively low population attributable risks (Table 1). For a large population, an odds ratio of 1.2 could have public health implications, but until an intervention has been proven to be effective for that population, translation of these risks to large or small populations will be limited.
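
The gap between individual and population impact can be made concrete with the population attributable risk. The sketch below uses Levin's formula, one of several methods in use; as the footnote to Table 1 notes, the cited studies may have calculated this quantity differently. It also treats the odds ratio as an approximation of the relative risk, which is reasonable only when the disease is uncommon, and the allele frequencies are hypothetical.

```python
def population_attributable_risk(exposed_fraction: float,
                                 relative_risk: float) -> float:
    """Levin's formula: fraction of cases in the population that would
    not occur if the risk factor were absent."""
    if not 0.0 <= exposed_fraction <= 1.0:
        raise ValueError("exposed_fraction must be between 0 and 1")
    excess = exposed_fraction * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical common variant: carried by 30% of people, relative risk 1.2.
common_modest = population_attributable_risk(0.30, 1.2)
# Hypothetical rare variant: carried by 0.1% of people, relative risk 10.
rare_strong = population_attributable_risk(0.001, 10.0)
```

In this made-up comparison, the common variant with a relative risk of only 1.2 accounts for a larger share of cases in the population (about 6%) than the rare variant with a tenfold risk (under 1%), even though the common variant tells an individual carrier very little.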

Attempts are being made to evaluate the combined effects of several alleles that each confer a very modest increase in risk, with the overall goal of developing a test that is relevant to both the general population and the individual patient. The potential utility of a breast cancer risk profile based on seven SNPs generated in GWA studies was recently evaluated [48]. The authors created risk profiles based on these SNPs and evaluated the potential usefulness of the profiles in identifying women at increased risk of breast cancer who would benefit from cancer screening programs. They concluded that the risk profiles generated by these common moderate-risk alleles did not sufficiently identify individual women at increased risk who would benefit from individualized prevention. However, they suggest that if genotype data were available on all women in a large population, the risk profiles could be useful at the population screening level.
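
A simple way to picture a multi-SNP risk profile is to multiply per-allele odds ratios across SNPs. The sketch below assumes that the SNPs act independently and multiplicatively, which is a modelling assumption and not necessarily the approach taken in the cited evaluation [48]. The odds ratios and genotypes are hypothetical.

```python
import math
from typing import Sequence


def combined_odds_ratio(per_allele_ors: Sequence[float],
                        risk_allele_counts: Sequence[int]) -> float:
    """Odds ratio relative to a person carrying no risk alleles,
    under an independent, multiplicative (log-additive) model."""
    if len(per_allele_ors) != len(risk_allele_counts):
        raise ValueError("one allele count is needed per SNP")
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    log_or = sum(count * math.log(odds_ratio)
                 for odds_ratio, count
                 in zip(per_allele_ors, risk_allele_counts))
    return math.exp(log_or)


# Hypothetical per-allele odds ratios for three SNPs.
ORS = [1.2, 1.1, 1.3]

# Hypothetical person: two risk alleles at SNP 1, one at SNP 2, none at SNP 3.
profile_or = combined_odds_ratio(ORS, [2, 1, 0])
# Hypothetical person carrying every risk allele at all three SNPs.
max_or = combined_odds_ratio(ORS, [2, 2, 2])
```

Because few people carry most or all of the risk alleles, the large combined odds ratios at the extreme of such a profile apply to only a small fraction of the population, which is consistent with the limited individual-level usefulness described above.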

The relationship between the 8q24 locus and prostate cancer risk has been an active area of investigation at the individual level. Several SNPs in this region were identified in GWA studies and confirmed in several other studies that focused only on SNPs in that region [49]. The associations are statistically strong but most had small effect sizes (i.e., odds ratios were generally <1.5). The 8q24 SNP, rs1447295, had an odds ratio of 1.93 for prostate cancer in familial prostate cancer patients compared with control subjects [50]. In another study, men with a combination of two SNPs in 8q24 were 2.7 times more likely to develop prostate cancer than men with neither risk allele [51]. Subsequent studies of the same loci suggested that men with a certain combination of five SNPs from 8q24, 17q12, and 17q24.3 had a ninefold increased risk of prostate cancer and that this was worth consideration as a clinically relevant test [52]. This led to much debate in the literature, which made it clear that confirmatory studies and careful consideration of the global implications were warranted before such a test should move forward [53-60].

This debate continues to grow as several companies have now made available direct-to-consumer SNP panels that genotype the same SNPs as many of the GWA studies described here. The interested individual can, for a fee, have their SNP ‘profile’ determined and obtain a list of diseases and their level of risk [61]. The clinical utility and public health impact of such data are not proven. Testing women with a strong family history of breast and ovarian cancer for BRCA1/2 mutations has implications for cancer screening and even for decisions about prophylactic surgery [62]. However, the effect of a SNP or SNPs that may increase an individual’s risk by 1.5-fold over the general population is not known.

Owing to these concerns, a framework for the translation of genomic research into healthcare and disease prevention was proposed which outlines suggested phases of this research [63]. Four translation research phases were proposed:

  • Phase 1 translation – discovery

  • Phase 2 translation – health application to evidence-based guidelines

  • Phase 3 translation – practice guidelines to health practice

  • Phase 4 translation – practice to population health impact

Within Phase 2 translation, the analytic validity, clinical validity, clinical utility, and ethical, legal and social implications (ACCE) must also be considered.

Conclusion & future perspective

The rapid advancement of genomic findings in common diseases has led to expectations of clinical utility in the near future. However, there are several issues that need to be addressed before personalized prediction of cancer risk can be achieved. The dawn of the genome-wide age has led to a deluge of data that must be rigorously tested and replicated in studies with high statistical power. We must remember that SNPs are markers of the genome and are not likely to be causal on their own. To that end, thoughtful collaboration between basic scientists, genomicists, bioinformaticians and epidemiologists is required to advance understanding of the genetic basis of cancer risk and its contribution to cancer biology.

In order to follow up on recent findings, comprehensive, prospective clinical studies of large cohorts will need to be conducted before the widespread application of genomic medicine occurs. In the next 5–10 years, as the body of GWA studies of cancer continues to grow, individual risk profiles based on a combination of a large number of SNP markers will be created and tested. It will be possible to adjust screening practices for common cancers, such as breast or prostate cancer, based on an individual’s genetic cancer risk profile. However, in order for genomics to have a positive impact on disease prevention and/or treatment, careful study design and consideration of the ethical and legal consequences of such studies must be included. When that occurs, the goal of patient-specific screening and/or intervention, informed by individual genotype data, may be achievable.

Executive summary.

  • Both common and rare genetic variants may contribute to cancer risk.

  • Advances in genotyping technology have made large-scale genome-wide association (GWA) studies possible.

  • Prior to the advent of GWA studies, most studies of SNPs and cancer risk were not reproducible.

  • GWA studies comprehensively evaluate common (>5% minor allele frequency) genetic variation across the genome. They are hypothesis generating rather than driven by a priori assumptions.

  • Numerous novel risk loci have been identified in GWA studies of cancer.

  • The large number of SNPs evaluated in GWA studies has advanced our understanding of population-specific genetic variation and made the use of shared ‘genomic’ controls possible.

  • Collaboration between basic scientists, genomicists, bioinformaticians and epidemiologists is required to advance understanding of the genetic basis of cancer risk and its contribution to cancer biology.

  • Cancer risk profiles based on common genetic variants identified in GWA studies are not yet clinically applicable to individuals, but hold great promise for individualized cancer prevention.

Acknowledgments

Financial & competing interests disclosure

This research was supported by the Intramural Research Program of the National Institutes of Health and the National Cancer Institute. The author has no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

No writing assistance was utilized in the production of this manuscript.

Bibliography

Papers of special note have been highlighted as: • of interest

  • 1.Wogan GN, Hecht SS, Felton JS, Conney AH, Loeb LA. Environmental and chemical carcinogenesis. Semin Cancer Biol. 2004;14(6):473–486. doi: 10.1016/j.semcancer.2004.06.010.
  • 2.Schiffman M, Castle PE, Jeronimo J, Rodriguez AC, Wacholder S. Human papillomavirus and cervical cancer. Lancet. 2007;370(9590):890–907. doi: 10.1016/S0140-6736(07)61416-0.
  • 3.DePinho RA. The age of cancer. Nature. 2000;408(6809):248–254. doi: 10.1038/35041694.
  • 4.Knudson AG., Jr Mutation and cancer: statistical study of retinoblastoma. Proc Natl Acad Sci USA. 1971;68(4):820–823. doi: 10.1073/pnas.68.4.820.
  • 5.Zhu L. Tumour suppressor retinoblastoma protein Rb: a transcriptional regulator. Eur J Cancer. 2005;41(16):2415–2427. doi: 10.1016/j.ejca.2005.08.009.
  • 6.Li FP, Fraumeni JF., Jr Soft-tissue sarcomas, breast cancer, and other neoplasms. A familial syndrome? Ann Intern Med. 1969;71(4):747–752. doi: 10.7326/0003-4819-71-4-747. • First description of the Li Fraumeni cancer predisposition syndrome.
  • 7.Sengupta S, Harris CC. p53: traffic cop at the crossroads of DNA repair and recombination. Nat Rev Mol Cell Biol. 2005;6(1):44–55. doi: 10.1038/nrm1546.
  • 8.Malkin D, Li FP, Strong LC, et al. Germ line p53 mutations in a familial syndrome of breast cancer, sarcomas, and other neoplasms. Science. 1990;250(4985):1233–1238. doi: 10.1126/science.1978757.
  • 9.Li FP, Fraumeni JF, Jr, Mulvihill JJ, et al. A cancer family syndrome in twenty-four kindreds. Cancer Res. 1988;48(18):5358–5362.
  • 10.Wang W. Emergence of a DNA-damage response network consisting of Fanconi anaemia and BRCA proteins. Nat Rev Genet. 2007;8(10):735–748. doi: 10.1038/nrg2159.
  • 11.Wooster R, Neuhausen SL, Mangion J, et al. Localization of a breast cancer susceptibility gene, BRCA2, to chromosome 13q12–13. Science. 1994;265(5181):2088–2090. doi: 10.1126/science.8091231. • This was the first description of the BRCA2 as a breast cancer susceptibility gene.
  • 12.Hall JM, Lee MK, Newman B, et al. Linkage of early-onset familial breast cancer to chromosome 17q21. Science. 1990;250(4988):1684–1689. doi: 10.1126/science.2270482. • This work led to the discovery of the BRCA1 gene in familial breast and ovarian cancer.
  • 13.The International HapMap Consortium. The International HapMap Project. Nature. 2003;426(6968):789–796. doi: 10.1038/nature02168. • Describes the scientific background and plan for the HapMap.
  • 14.Eichler EE, Nickerson DA, Altshuler D, et al. Completing the map of human genetic variation. Nature. 2007;447(7141):161–165. doi: 10.1038/447161a.
  • 15.Packer BR, Yeager M, Burdett L, et al. SNP500Cancer: a public resource for sequence validation, assay development, and frequency analysis for genetic variation in candidate genes. Nucleic Acids Res. 2006;34(Database issue):D617–D621. doi: 10.1093/nar/gkj151.
  • 16.Stram DO. Tag SNP selection for association studies. Genet Epidemiol. 2004;27(4):365–374. doi: 10.1002/gepi.20028.
  • 17.de Bakker PI, Yelensky R, Pe’er I, Gabriel SB, Daly MJ, Altshuler D. Efficiency and power in genetic association studies. Nat Genet. 2005;37(11):1217–1223. doi: 10.1038/ng1669.
  • 18.Savage SA, Abnet CC, Mark SD, et al. Variants of the IL8 and IL8RB genes and risk for gastric cardia adenocarcinoma and esophageal squamous cell carcinoma. Cancer Epidemiol Biomarkers Prev. 2004;13(12):2251–2257.
  • 19.Lee WP, Tai DI, Lan KH, et al. The -251T allele of the interleukin-8 promoter is associated with increased risk of gastric carcinoma featuring diffuse-type histopathology in Chinese population. Clin Cancer Res. 2005;11(18):6431–6441. doi: 10.1158/1078-0432.CCR-05-0942.
  • 20.Ohyauchi M, Imatani A, Yonechi M, et al. The polymorphism interleukin 8 -251 A/T influences the susceptibility of Helicobacter pylori related gastric diseases in the Japanese population. Gut. 2005;54(3):330–335. doi: 10.1136/gut.2003.033050.
  • 21.Taguchi A, Ohmiya N, Shirai K, et al. Interleukin-8 promoter polymorphism increases the risk of atrophic gastritis and gastric cancer in Japan. Cancer Epidemiol Biomarkers Prev. 2005;14(11 Pt 1):2487–2493. doi: 10.1158/1055-9965.EPI-05-0326.
  • 22.Savage SA, Hou L, Lissowska J, et al. Interleukin-8 polymorphisms are not associated with gastric cancer risk in a Polish population. Cancer Epidemiol Biomarkers Prev. 2006;15(3):589–591. doi: 10.1158/1055-9965.EPI-05-0887.
  • 23.Garza-Gonzalez E, Bosques-Padilla FJ, Mendoza-Ibarra SI, Flores-Gutierrez JP, Maldonado-Garza HJ, Perez-Perez GI. Assessment of the toll-like receptor 4 Asp299Gly, Thr399Ile and interleukin-8 -251 polymorphisms in the risk for the development of distal gastric cancer. BMC Cancer. 2007;7:70. doi: 10.1186/1471-2407-7-70.
  • 24.Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K. A comprehensive review of genetic association studies. Genet Med. 2002;4(2):45–61. doi: 10.1097/00125817-200203000-00002. • One of many papers that have pointed out the lack of reproducibility in genetic association studies.
  • 25.Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN. Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet. 2003;33(2):177–182. doi: 10.1038/ng1071. • One of many papers that have pointed out the lack of reproducibility in genetic association studies.
  • 26.Dong LM, Potter JD, White E, Ulrich CM, Cardon LR, Peters U. Genetic susceptibility to cancer: the role of polymorphisms in candidate genes. JAMA. 2008;299(20):2423–2436. doi: 10.1001/jama.299.20.2423.
  • 27.Wacholder S, Chanock S, Garcia-Closas M, El Ghormli L, Rothman N. Assessing the probability that a positive report is false: an approach for molecular epidemiology studies. J Natl Cancer Inst. 2004;96(6):434–442. doi: 10.1093/jnci/djh075.
  • 28.Khoury MJ, Little J, Gwinn M, Ioannidis JP. On the synthesis and interpretation of consistent but weak gene-disease associations in the era of genome-wide association studies. Int J Epidemiol. 2007;36(2):439–445. doi: 10.1093/ije/dyl253. • An excellent review of modest findings in genetic-association studies.
  • 29.Rebbeck TR, Martinez ME, Sellers TA, Shields PG, Wild CP, Potter JD. Genetic variation and cancer: improving the environment for publication of association studies. Cancer Epidemiol Biomarkers Prev. 2004;13(12):1985–1986.
  • 30.Boffetta P, McLaughlin JK, La Vecchia C, Tarone RE, Lipworth L, Blot WJ. False-positive results in cancer epidemiology: a plea for epistemological modesty. J Natl Cancer Inst. 2008;100(14):988–995. doi: 10.1093/jnci/djn191.
  • 31.Ioannidis JP, Gwinn M, Little J, et al. A road map for efficient and reliable human genome epidemiology. Nat Genet. 2006;38(1):3–5. doi: 10.1038/ng0106-3. • Excellent description of important issues to consider in genetic epidemiology.
  • 32.Hunter DJ, Thomas G, Hoover RN, Chanock SJ. Scanning the horizon: what is the future of genome-wide association studies in accelerating discoveries in cancer etiology and prevention? Cancer Causes Control. 2007;18(5):479–484. doi: 10.1007/s10552-007-0118-y. • Excellent description of important issues to consider in genetic epidemiology.
  • 33.Amos CI. Successful design and conduct of genome-wide association studies. Hum Mol Genet. 2007;16(Spec No 2):R220–R225. doi: 10.1093/hmg/ddm161. • Excellent description of important issues to consider in genetic epidemiology.
  • 34.Hunter DJ, Kraft P. Drinking from the fire hose – statistical issues in genome wide association studies. N Engl J Med. 2007;357(5):436–439. doi: 10.1056/NEJMp078120.
  • 35.Zanke BW, Greenwood CM, Rangrej J, et al. Genome-wide association scan identifies a colorectal cancer susceptibility locus on chromosome 8q24. Nat Genet. 2007;39(8):989–994. doi: 10.1038/ng2089.
  • 36.Yeager M, Orr N, Hayes RB, et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet. 2007;39(5):645–649. doi: 10.1038/ng2022.
  • 37.Schumacher FR, Feigelson HS, Cox DG, et al. A common 8q24 variant in prostate and breast cancer from a large nested case-control study. Cancer Res. 2007;67(7):2951–2956. doi: 10.1158/0008-5472.CAN-06-3591.
  • 38.Gudmundsson J, Sulem P, Manolescu A, et al. Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24. Nat Genet. 2007;39(5):631–637. doi: 10.1038/ng1999.
  • 39.Eeles RA, Kote-Jarai Z, Giles GG, et al. Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet. 2008;40(3):316–321. doi: 10.1038/ng.90.
  • 40.Ghoussaini M, Song H, Koessler T, et al. Multiple loci with different cancer specificities within the 8q24 gene desert. J Natl Cancer Inst. 2008;100(13):962–966. doi: 10.1093/jnci/djn190.
  • 41.Hunter DJ, Kraft P, Jacobs KB, et al. A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet. 2007;39(7):870–874. doi: 10.1038/ng2075.
  • 42.Easton DF, Pooley KA, Dunning AM, et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature. 2007;447(7148):1087–1093. doi: 10.1038/nature05887.
  • 43.Maris JM, Mosse YP, Bradfield JP, et al. Chromosome 6p22 locus associated with clinically aggressive neuroblastoma. N Engl J Med. 2008;358(24):2585–2593. doi: 10.1056/NEJMoa0708698.
  • 44.Li JZ, Absher DM, Tang H, et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science. 2008;319(5866):1100–1104. doi: 10.1126/science.1153717.
  • 45.Hughes AL, Welch R, Puri V, et al. Genome-wide SNP typing reveals signatures of population history. Genomics. 2008;92(1):1–8. doi: 10.1016/j.ygeno.2008.03.005.
  • 46.Yu K, Wang Z, Li Q, et al. Population substructure and control selection in genome-wide association studies. PLoS ONE. 2008;3(7):e2551. doi: 10.1371/journal.pone.0002551. • Identified sets of SNPs that can identify and be used to correct population stratification.
  • 47.Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–678. doi: 10.1038/nature05911. • One of the largest studies thus far to use genomic controls.
  • 48.Pharoah PD, Antoniou AC, Easton DF, Ponder BA. Polygenes, risk prediction, and targeted prevention of breast cancer. N Engl J Med. 2008;358(26):2796–2803. doi: 10.1056/NEJMsa0708739.
  • 49.Savage SA, Greene MH. The evidence for prostate cancer risk loci at 8q24 grows stronger. J Natl Cancer Inst. 2007;99(20):1499–1501. doi: 10.1093/jnci/djm186.
  • 50.Wang L, McDonnell SK, Slusser JP, et al. Two common chromosome 8q24 variants are associated with increased risk for prostate cancer. Cancer Res. 2007;67(7):2944–2950. doi: 10.1158/0008-5472.CAN-06-3186.
  • 51.Zheng SL, Sun J, Cheng Y, et al. Association between two unlinked loci at 8q24 and prostate cancer risk among European Americans. J Natl Cancer Inst. 2007;99(20):1525–1533. doi: 10.1093/jnci/djm169.
  • 52.Zheng SL, Sun J, Wiklund F, et al. Cumulative association of five genetic variants with prostate cancer. N Engl J Med. 2008;358(9):910–919. doi: 10.1056/NEJMoa075819.
  • 53.Coates RJ, Khoury MJ, Gwinn M. Five genetic variants associated with prostate cancer. N Engl J Med. 2008;358(25):2738. doi: 10.1056/NEJMc080680.
  • 54.Gelmann EP. Complexities of prostate-cancer risk. N Engl J Med. 2008;358(9):961–963. doi: 10.1056/NEJMe0708703.
  • 55.Eisinger F. Five genetic variants associated with prostate cancer. N Engl J Med. 2008;358(25):2740–2741.
  • 56.Gartner CE, Barendregt JJ, Hall WD. Five genetic variants associated with prostate cancer. N Engl J Med. 2008;358(25):2738–2739.
  • 57.Janssens AC, van Duijn CM. Five genetic variants associated with prostate cancer. N Engl J Med. 2008;358(25):2739.
  • 58.Severi G, Byrnes GB, Hopper JL. Five genetic variants associated with prostate cancer. N Engl J Med. 2008;358(25):2739–2740.
  • 59.Thorat MA. Five genetic variants associated with prostate cancer. N Engl J Med. 2008;358(25):2740.
  • 60.Vickers A, Lilja H, Scardino P. Five genetic variants associated with prostate cancer. N Engl J Med. 2008;358(25):2740.
  • 61.Hunter DJ, Khoury MJ, Drazen JM. Letting the genome out of the bottle – will we get our wish? N Engl J Med. 2008;358(2):105–107. doi: 10.1056/NEJMp0708162.
  • 62.Greene MH, Piedmonte M, Alberts D, et al. A prospective study of risk-reducing salpingo-oophorectomy and longitudinal CA-125 screening among women at increased genetic risk of ovarian cancer: design and baseline characteristics: a Gynecologic Oncology Group study. Cancer Epidemiol Biomarkers Prev. 2008;17(3):594–604. doi: 10.1158/1055-9965.EPI-07-2703.
  • 63.Khoury MJ, Gwinn M, Yoon PW, Dowling N, Moore CA, Bradley L. The continuum of translation research in genomic medicine: how can we accelerate the appropriate integration of human genome discoveries into health care and disease prevention? Genet Med. 2007;9(10):665–674. doi: 10.1097/GIM.0b013e31815699d0. • Suggests a framework for translating genomic studies to clinical medicine.
  • 64.Easton DF, Pooley KA, Dunning AM, et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature. 2007;447(7148):1087–1093. doi: 10.1038/nature05887.
  • 65.Gold B, Kirchhoff T, Stefanov S, et al. Genome-wide association study provides evidence for a breast cancer risk locus at 6q22.33. Proc Natl Acad Sci USA. 2008;105(11):4340–4345. doi: 10.1073/pnas.0800441105.
  • 66.Stacey SN, Manolescu A, Sulem P, et al. Common variants on chromosomes 2q35 and 16q12 confer susceptibility to estrogen receptor-positive breast cancer. Nat Genet. 2007;39(7):865–869. doi: 10.1038/ng2064.
  • 67.Tomlinson I, Webb E, Carvajal-Carmona L, et al. A genome-wide association scan of tag SNPs identifies a susceptibility variant for colorectal cancer at 8q24.21. Nat Genet. 2007;39(8):984–988. doi: 10.1038/ng2085.
  • 68.Tomlinson IP, Webb E, Carvajal-Carmona L, et al. A genome-wide association study identifies colorectal cancer susceptibility loci on chromosomes 10p14 and 8q23.3. Nat Genet. 2008;40(5):623–630. doi: 10.1038/ng.111.
  • 69.Broderick P, Carvajal-Carmona L, Pittman AM, et al. A genome-wide association study shows that common alleles of SMAD7 influence colorectal cancer risk. Nat Genet. 2007;39(11):1315–1317. doi: 10.1038/ng.2007.18.
  • 70.Tenesa A, Farrington SM, Prendergast JG, et al. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet. 2008;40(5):631–637. doi: 10.1038/ng.133.
  • 71.Sakamoto H, Yoshimura K, Saeki N, et al. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet. 2008;40(6):730–740. doi: 10.1038/ng.152.
  • 72.Hung RJ, McKay JD, Gaborieau V, et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008;452(7187):633–637. doi: 10.1038/nature06885.
  • 73.Thorgeirsson TE, Geller F, Sulem P, et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008;452(7187):638–642. doi: 10.1038/nature06846.
  • 74.Amos CI, Wu X, Broderick P, et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat Genet. 2008;40(5):616–622. doi: 10.1038/ng.109.
  • 75.Brown KM, Macgregor S, Montgomery GW, et al. Common sequence variants on 20q11.22 confer melanoma susceptibility. Nat Genet. 2008;40(7):817–818. doi: 10.1038/ng.163.
  • 76.Gudmundsson J, Sulem P, Steinthorsdottir V, et al. Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against Type 2 diabetes. Nat Genet. 2007;39(8):977–983. doi: 10.1038/ng2062.
  • 77.Gudmundsson J, Sulem P, Rafnar T, et al. Common sequence variants on 2p15 and Xp11.22 confer susceptibility to prostate cancer. Nat Genet. 2008;40(3):281–283. doi: 10.1038/ng.89.
  • 78.Nam RK, Zhang WW, Loblaw DA, et al. A genome-wide association screen identifies regions on chromosomes 1q25 and 7p21 as risk loci for sporadic prostate cancer. Prostate Cancer Prostatic Dis. 2007;11(3):241–246. doi: 10.1038/sj.pcan.4501010.
  • 79.Duggan D, Zheng SL, Knowlton M, et al. Two genome-wide association studies of aggressive prostate cancer implicate putative prostate tumor suppressor gene DAB2IP. J Natl Cancer Inst. 2007;99(24):1836–1844. doi: 10.1093/jnci/djm250.
  • 80.Thomas G, Jacobs KB, Yeager M, et al. Multiple loci identified in a genome-wide association study of prostate cancer. Nat Genet. 2008;40(3):310–315. doi: 10.1038/ng.91.

Websites
