Author manuscript; available in PMC: 2013 Dec 1.
Published in final edited form as: Cancer Discov. 2012 Dec;2(12):1087–1090. doi: 10.1158/2159-8290.CD-12-0424

Integrative Cancer Epidemiology - The Next Generation

Margaret R Spitz, Neil E Caporaso, Thomas A Sellers
PMCID: PMC3531829  NIHMSID: NIHMS417110  PMID: 23230187

Summary

We outline an integrative approach to extend the boundaries of molecular cancer epidemiology by integrating modern and rapidly evolving “omics” technologies into state-of-the-art molecular epidemiology. In this way, one can comprehensively explore the mechanistic underpinnings of epidemiologic observations into cancer risk and outcome. We highlight the exciting opportunities to collaborate across large observational studies and to forge new interdisciplinary collaborative ventures.

Keywords: multidisciplinary epidemiologic research, integrating new technologies

Background

Epidemiologic studies of all designs have contributed to much of our understanding of cancer etiology (1). Yet the discipline has also received its fair share of criticism (2), with charges that it generates “conflicting results” that tend to “confuse” the public and “disorient” policy makers, and that it is “forever sounding false alarms”. The field has matured dramatically since molecular epidemiology first emerged as a defined discipline in the late 1980s as an extension of traditional (classical) epidemiologic research to analyze links between disease and both exposure and biologic risk factors (3). This mandated incorporating biospecimens into classic epidemiological study designs and enabled the merging of molecular and biochemical markers of exposure and/or early effect with questionnaire data. The goal was to understand mechanisms of carcinogenesis and the interplay between lifestyle behaviors, exposure, genes, and cancer etiology. We are now transitioning to an era where applying advanced technologies (including high-throughput platforms for genotyping and sequencing, omics-based approaches for biomarker discovery and targeted therapies, novel imaging opportunities, and advanced statistical and bioinformatic tools) allows us to dissect the molecular basis of carcinogenesis. Early in the development of this new research approach, Sellers (4) highlighted the need “to carefully consider the perspectives and expertise of epidemiology and genetics in the design, conduct, and execution” of these large studies, which were becoming data-driven, complex and expensive. In their recent commentary, “Bigger, Better, Sooner–Scaling Up for Success”, Thun et al. also pointed to the value of these large-scale collaborative studies (5). Kuller noted that the history of epidemiologic advances “is intimately intertwined between epidemiology, pathology and development of new technologies” (6).

The exciting opportunities to collaborate across large observational studies with the requisite tissue and biospecimens, together with the need to integrate rapidly evolving high-throughput technologies, present the challenge of forging new interdisciplinary collaborative ventures. These must transcend existing dichotomies (for example, environment vs. gene; risk vs. outcome; somatic vs. germline; genetic vs. epigenetic; laboratory vs. population based) that do little to advance in-depth understanding and can obscure a complex biological reality that requires the full participation of multidisciplinary teams of scientists to unravel. We therefore present an overarching concept of an integrative approach to merge the boundaries of molecular cancer epidemiology with biobehavioral research, tumor molecular genomics, and systems biology approaches to explore the mechanistic underpinnings of epidemiologic observations into cancer risk and outcome. We argue that the discipline of molecular epidemiology can provide the framework for the study designs that enable this type of team science.

Integrative epidemiology definition

Integrative Epidemiology was conceived as a cohesive approach that combines the rigor of epidemiologic study design with the rapid advances in analytical systems and biostatistical and bioinformatic tools mentioned above. It uses the same populations, biospecimens and data elements as case-control or cohort studies of risk, and extends them to studies of outcome and response to therapy, as well as cancer risk-taking behaviors (e.g., nicotine dependence or physical inactivity). It builds upon the theory that gene discovery and elucidation of broader molecular mechanisms move back and forth between studies of molecular epidemiology and those of tumor molecular genetics, and of intermediate phenotypes, thereby enriching and informing all disciplines (7, 8). Caporaso (8) has argued that such an approach is efficient: although the cost of such larger studies is greater, the marginal cost per unit of information is actually lower and the scientific payoff greater. A unifying premise in the concept of Integrative Epidemiology is that changes in the function of a single gene or pathway can contribute to susceptibility to carcinogenic exposure, predisposition to cancer development, patient prognosis, and prediction of response to therapy (7). Integrative Cancer Epidemiology (7, 8) represents a coalescence of diverse research interests and methodologies that are also relevant across other fields of medicine, for example cardiovascular epidemiology.

Since we first wrote about Integrative Epidemiology in 2005, we have witnessed the development and increasing availability of high-throughput genotyping platforms and massively parallel sequencing that provide orders-of-magnitude improvement in throughput over our early candidate gene studies using simple TaqMan platforms or Sanger sequencing approaches, enabling “genome-wide” applications that were not possible previously. This integrative concept has been drastically reshaped by the scale of data throughput now possible. The challenge for epidemiologists is to rigorously apply the principles of observational science to these new approaches, including the attention to study design, data and sample collection, and marker validation that is the hallmark of high-quality research. Linking of tissue repositories with well-characterized epidemiologic, clinical, phenotypic and omic data is another requisite. It follows, therefore, that there is a need to re-educate epidemiologists in integrating modern and rapidly evolving “omic” technologies into state-of-the-art molecular epidemiology, and in becoming facile in incorporating diverse and high-dimensional data. No single scientist can be accomplished in all types of research endeavors, but we all need to appreciate the accelerating pace of new technologies, understand each other’s languages, and recognize the potential applications to population studies in order to overcome the barriers posed by our individual disciplines and to effectively communicate and collaborate.

Advances in Integrative Epidemiology

There are examples in the literature that hint at the broad emergence of these integrative approaches. Ogino et al (9) urged the incorporation of molecular pathology into traditional epidemiologic studies to examine the relationship between exposures and molecular signatures in the tumor, as well as the interactive influences of exposure and molecular features on tumor progression. They termed this approach “molecular pathological epidemiology”. Thomas (10) has pointed out the largely untapped potential of Genome-Environment Wide Interaction Studies (GEWIS), and stressed the importance of well-designed studies with careful measurement and efficient analysis of both genetic and environmental factors. Khoury and Wacholder (11) also supported the notion that agnostic approaches to interrogating the human genome for genetic risk factors could be extended into a similar approach for gene-environment-wide interaction studies. Approaches to evaluate environmental factors associated with disease have not yet yielded the hoped-for technical advances, such as a “chip” or standard bioassays that can broadly survey exposures, although newer metabolomics technologies hold potential. Patel et al (12) have proposed borrowing the GWAS methodology to create a model Environment-Wide Association Study (EWAS), to search for environmental factors associated with disease on a broad scale.
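
As a rough illustration of how a GWAS-style agnostic scan might carry over to environmental exposures, the Python sketch below tests each exposure for a case-control difference and applies a Bonferroni threshold across all exposures tested. It is not the procedure used by Patel et al (12); the function names, the choice of a pooled two-proportion z-test, and the counts in the usage example are all assumptions made for illustration.

```python
import math


def two_proportion_p(exposed_cases, n_cases, exposed_controls, n_controls):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_cases = exposed_cases / n_cases
    p_controls = exposed_controls / n_controls
    pooled = (exposed_cases + exposed_controls) / (n_cases + n_controls)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    if se == 0:
        return 1.0  # no variation in exposure, so nothing to test
    z = (p_cases - p_controls) / se
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


def scan_exposures(counts, n_cases, n_controls, alpha=0.05):
    """Return {exposure: p} for exposures passing a Bonferroni threshold.

    counts maps an exposure name to (exposed cases, exposed controls).
    """
    threshold = alpha / len(counts)  # correct for the number of exposures scanned
    hits = {}
    for name, (exposed_cases, exposed_controls) in counts.items():
        p = two_proportion_p(exposed_cases, n_cases, exposed_controls, n_controls)
        if p < threshold:
            hits[name] = p
    return hits


if __name__ == "__main__":
    # Hypothetical counts: 1000 cases and 1000 controls, three exposures.
    example_counts = {
        "exposure_A": (300, 200),
        "exposure_B": (250, 245),
        "exposure_C": (120, 118),
    }
    print(scan_exposures(example_counts, n_cases=1000, n_controls=1000))
```

With these made-up counts only `exposure_A` passes the corrected threshold. A real scan would also need to handle confounding, exposure measurement error and replication, which this sketch ignores.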

Exome sequencing has proved successful in the identification of genes that cause some rare Mendelian diseases, although many familial cancers remain unexplained after the first wave of exomic exploration, attributable perhaps to how early we still are in applying this technology. Studies are in progress to assess whether a component of the missing heritability of many common disorders resides in rare gene variants of moderate/low penetrance that are potentially tractable by exome sequencing (13), in duplications/deletions (14), or in more subtle changes residing in non-coding regulatory regions of the genome. Epigenomic profiling technologies have reached the stage at which large-scale epigenome-wide association studies are also becoming feasible. The correlations that have been observed between genotype and epigenotype (methQTLs) are encouraging for the prospects of further integrated analysis (15).

Garnett et al (16) outlined how systematic pharmacogenomic profiling in cancer cell lines provides a powerful biomarker discovery platform to guide rational cancer therapeutic strategies. As a translational application of this technology, Platz et al (17) integrated data from an efficient, high-throughput in vitro screen with available drug use data from a large, prospective cohort study in a successful proof-of-principle study showing that linking biology and epidemiology can inform new indications for existing drugs. Each of the examples cited reinforces the pivotal role that epidemiology can play in bridging basic and clinical research.

No new technology can substitute for careful selection of population samples and refined hypothesis testing (6). We should focus efforts on “smarter” study designs, either within existing cohorts or as ancillary studies. For example, in association studies of genetic variation in cancer risk, it can be assumed that rare variants will be enriched at the extreme ends of the phenotype being investigated. Therefore, new studies should consider selecting individuals with extreme phenotypes, such as cancer probands from high-risk families, or young-onset cases. As a case in point, Cirulli and Goldstein (18) have advocated the value of extreme-trait sequencing because variants that contribute to the trait will be enriched in frequency in such groups. Even small sample sizes may suggest candidate variants that can then be replicated in larger samples. Kazma and Bailey (19) also stress how crucial sample selection is in designing these studies. Population-based designs are more suitable for detecting the effect of multiple rare variants, while family-based designs enrich for rare variants whose effect would likely be concealed at the population level.
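
The enrichment argument for extreme-phenotype sampling can be made concrete with a toy simulation. The Python sketch below is only an illustration under stated assumptions: a single rare variant with a carrier frequency of 1%, a quantitative trait that is standard normal in non-carriers and shifted upward by a fixed amount in carriers, and sampling of the top 2% of the trait distribution. None of these values or names come from the studies cited above.

```python
import random


def simulate_enrichment(n=20000, carrier_freq=0.01, effect=2.0, tail=0.02, seed=0):
    """Return (overall carrier frequency, carrier frequency in the upper tail).

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        carrier = rng.random() < carrier_freq
        # Trait: standard normal noise, shifted by `effect` SDs in carriers.
        trait = rng.gauss(0.0, 1.0) + (effect if carrier else 0.0)
        people.append((trait, carrier))

    # Select the individuals with the most extreme (highest) trait values.
    people.sort(key=lambda person: person[0], reverse=True)
    k = max(1, int(n * tail))
    extreme_group = people[:k]

    overall = sum(carrier for _, carrier in people) / n
    extreme = sum(carrier for _, carrier in extreme_group) / k
    return overall, extreme


if __name__ == "__main__":
    overall, extreme = simulate_enrichment()
    print(f"carrier frequency overall: {overall:.4f}")
    print(f"carrier frequency in extreme tail: {extreme:.4f}")
```

Under these assumptions the carrier frequency in the extreme tail comes out well above the population frequency, which is why sequencing a modest number of extreme individuals can surface candidate variants for replication in larger samples. A smaller `effect` weakens the enrichment accordingly.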

The Future

Khoury et al (20) recently challenged the cancer epidemiology community to reflect on the critical scientific priorities they will be confronting in the near future. We cite a few opportunities below. Beyond analysis of germline DNA, great value can be added by collection of tumor tissues for integration of somatic genetic alterations and extraction of RNA species for profiling. There are new opportunities available for analysis of blood samples for measurement of circulating miRNAs or tumor cells, and the specific requirements to ensure valid measurement of these species must be defined at the start of the study by scientists expert in their measurement. Advances in imaging play a major role in early detection of cancer, and such data have proven to be quite valuable in our understanding of breast cancer (through quantification of mammographic density) (21). Indeed, Kumar et al (22) have proposed that comprehensive interrogation of radiologic images (“radiomics”) can refine imaging studies for the early detection of cancer. Doing so within the framework of an epidemiologic study, complemented by other types of biological, risk factor and tissue data, may prove to be a powerful approach.

The value and contributions of team science are now recognized as essential to ensure the successful application of advances in technical capabilities to the understanding of underlying biologic complexities (23). To fully empower such collaboration across disciplines, education at many levels is needed. Senior scientists need exposure to new disciplines. Newer scientists need immersion in informatics and emerging technologies, always with the caveat that the ever more rapid march of technology will make continual re-education mandatory for all. Kuller (6) has pointed out that epidemiologists require a solid background in biologic sciences and an understanding of the new tools being developed, in addition to their solid quantitative skills. Recognizing the evolving challenges being faced in computational resources and data management, the National Cancer Institute sponsored a workshop in 2011 entitled “Next Generation Analytic Tools for Large-Scale Genetic Epidemiology Studies of Complex Diseases” (24), highlighting, among other needs, those related to annotation and curation of biological pathway databases, tools for data visualization, and new open-source, user-friendly analytical tools. The group also recommended improved computational training and support for graduate students and postdoctoral fellows, as such skills are critical to properly leverage and interpret increasingly dense data sets across multiple sources and platforms. It is likely that medical schools and schools of public health will need to develop integrated programs, and that grant review committees and funding agencies will have to be reoriented to this new research reality. Multiple discussions will have to take place about the need to establish standard research guidelines for this evolving area of research, as suggested by Ogino et al (25), along the lines of STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) (26). To implement genetic risk prediction in clinical practice, there needs to be a comprehensive evaluation of risk prediction models. Recommendations for the reporting of Genetic Risk Prediction Studies (GRIPS) have been proposed to maximize the synthesis of data across multiple studies (27).

We will also be facing challenges in translating scientific discoveries into meaningful interventions at the population level. Khoury (28) has discussed the role of “Translational Epidemiology” along the multidisciplinary research continuum from basic discovery through evidence guidelines to implementation in practice and in assessing population health outcomes. Greater efforts will be required in knowledge integration at all phases of the research continuum so that there is translation of findings to inform treatment and prevention trials to fill the “translational gap” from discovery to global impact.

In summary, there is a need for rapid and efficient integration of the emerging wealth of genomic, epigenomic and transcriptomic information for prediction of risk and improvements in disease outcomes. Inevitably this will mandate an integrated philosophy and a growing emphasis on molecular epidemiology research and the application of research approaches intrinsic to observational science to all aspects of translational research. We advocate this approach as it offers unprecedented opportunities for discovery of causes, mechanisms and outcomes of cancer, while being attentive to the rigor of study design, careful population selection and pristine data collection.

Acknowledgments

Financial support: National Cancer Institute CA55769 (MRS), CA127219 (MRS),

Footnotes

Disclosure of Potential Conflicts of Interest: The authors declare that they have no competing financial interests. None of the sponsors played a role in the study design, in the writing of this report, or in the decision to submit the paper for publication.

References

  1. Greenwald P, Dunn BK. Landmarks in the history of cancer epidemiology. Cancer Res. 2009 Mar 15;69(6):2151–62. doi: 10.1158/0008-5472.CAN-09-0416.
  2. Taubes G. Epidemiology faces its limits. Science. 1995 Jul 14;:164–169. doi: 10.1126/science.7618077.
  3. Perera FP. Molecular cancer epidemiology: a new tool in cancer prevention. J Natl Cancer Inst. 1987 May;78(5):887–98.
  4. Sellers TA. The beginning of the end for the epidemiologic focus on gene x environment interactions? Cancer Epidemiol Biomarkers Prev. 2006 Jun;15(6):1059–60. doi: 10.1158/1055-9965.EPI-06-0366.
  5. Thun MJ, Hoover RN, Hunter DJ. Bigger, better, sooner--scaling up for success. Cancer Epidemiol Biomarkers Prev. 2012 Apr;21(4):571–5. doi: 10.1158/1055-9965.EPI-12-0191.
  6. Kuller LH. Invited commentary. The twenty-first century epidemiologist. Need for different training? Am J Epidemiol. 2012 Aug 30; doi: 10.1093/aje/kws227. [Epub ahead of print]
  7. Spitz MR, Wu X, Mills G. Integrative epidemiology: from risk assessment to outcome prediction. J Clin Oncol. 2005 Jan 10;23(2):267–75. doi: 10.1200/JCO.2005.05.122.
  8. Caporaso NE. Integrative study designs--next step in the evolution of molecular epidemiology. Cancer Epidemiol Biomarkers Prev. 2007 Mar;16(3):365–6. doi: 10.1158/1055-9965.EPI-07-0142.
  9. Ogino S, Stampfer M. Lifestyle factors and microsatellite instability in colorectal cancer: the evolving field of molecular pathological epidemiology. J Natl Cancer Inst. 2010 Mar 17;102(6):365–7. doi: 10.1093/jnci/djq031.
  10. Thomas D. Gene–environment-wide association studies: emerging approaches. Nat Rev Genet. 2010 Apr;11(4):259–72. doi: 10.1038/nrg2764. Review.
  11. Khoury MJ, Wacholder S. Invited commentary: from genome-wide association studies to gene-environment-wide interaction studies--challenges and opportunities. Am J Epidemiol. 2009 Jan 15;169(2):227–30. doi: 10.1093/aje/kwn351. discussion 234–5.
  12. Patel CJ, Bhattacharya J, Butte AJ. An Environment-Wide Association Study (EWAS) on type 2 diabetes mellitus. PLoS One. 2010 May 20;5(5):e10746. doi: 10.1371/journal.pone.0010746.
  13. Snape K, Ruark E, Tarpey P, Renwick A, Turnbull C, Seal S, et al. Predisposition gene identification in common cancers by exome sequencing: insights from familial breast cancer. Breast Cancer Res Treat. 2012 Jul;134(1):429–33. doi: 10.1007/s10549-012-2057-x.
  14. Yang XR, Ng D, Alcorta DA, Liebsch NJ, Sheridan E, Li S, et al. T (brachyury) gene duplication confers major susceptibility to familial chordoma. Nat Genet. 2009 Nov;41(11):1176–8. doi: 10.1038/ng.454.
  15. Rakyan VK, Down TA, Balding DJ, Beck S. Epigenome-wide association studies for common human diseases. Nat Rev Genet. 2011 Jul 12;12(8):529–41. doi: 10.1038/nrg3000.
  16. Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature. 2012 Mar 28;483(7391):570–5. doi: 10.1038/nature11005.
  17. Platz EA, Yegnasubramanian S, Liu JO, Chong CR, Shim JS, Kenfield SA, et al. A novel two-stage transdisciplinary study identifies digoxin as a possible drug for prostate cancer treatment. Cancer Discov. 2011 Jun;1(1):68–77. doi: 10.1158/2159-8274.CD-10-0020.
  18. Cirulli ET, Goldstein DB. Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat Rev Genet. 2010 Jun;11(6):415–25. doi: 10.1038/nrg2779. Review.
  19. Kazma R, Bailey JN. Population-based and family-based designs to analyze rare variants in complex diseases. Genet Epidemiol. 2011;35(Suppl 1):S41–7. doi: 10.1002/gepi.20648.
  20. Khoury MJ, Freedman AN, Gillanders EM, Harvey CE, Kaefer C, Reid BC, et al. Frontiers in cancer epidemiology: a challenge to the research community from the Epidemiology and Genomics Research Program at the National Cancer Institute. Cancer Epidemiol Biomarkers Prev. 2012 Jul;21(7):999–100. doi: 10.1158/1055-9965.EPI-12-0525.
  21. Kelemen LE, Sellers TA, Vachon CM. Can genes for mammographic density inform cancer etiology? Nat Rev Cancer. 2008 Oct;8(10):812–23. doi: 10.1038/nrc2466.
  22. Kumar V, Gu Y, Basu S, Berglund A, Eschrich SA, Schabath MB, et al. Radiomics: the process and the challenges. Magn Reson Imaging. 2012 Nov;30(9):1234–48. doi: 10.1016/j.mri.2012.06.010.
  23. Sellers TA, Caporaso N, Lapidus S, Petersen GM, Trent J. Opportunities and barriers in the age of team science. Cancer Causes Control. 2006 Apr;17(3):229–37. doi: 10.1007/s10552-005-0546-5.
  24. Mechanic LE, Chen HS, Amos CI, Chatterjee N, Cox NJ, Divi RL, et al. Next generation analytic tools for large scale genetic epidemiology studies of complex diseases. Genet Epidemiol. 2012;36:22–35. doi: 10.1002/gepi.20652.
  25. Ogino S, King EE, Beck AH, Sherman ME, Milner DA, Giovannucci E. Interdisciplinary education to integrate pathology and epidemiology: towards molecular and population-level health science. Am J Epidemiol. 2012 Aug 30; doi: 10.1093/aje/kws226. [Epub ahead of print]
  26. Gallo V, Egger M, McCormack V, Farmer PB, Ioannidis JP, Kirsch-Volders M, et al. STROBE Statement. STrengthening the Reporting of OBservational studies in Epidemiology - Molecular Epidemiology (STROBE-ME): an extension of the STROBE Statement. PLoS Med. 2011 Oct;8(10):e1001117. doi: 10.1371/journal.pmed.1001117. Epub 2011 Oct 25.
  27. Janssens AC, Ioannidis JP, van Duijn CM, Little J, Khoury MJ; GRIPS Group. Strengthening the reporting of Genetic RIsk Prediction Studies: the GRIPS Statement. PLoS Med. 2011 Mar;8(3):e1000420. doi: 10.1371/journal.pmed.1000420.
  28. Khoury MJ, Gwinn M, Ioannidis JP. The emergence of translational epidemiology: from scientific discovery to population health impact. Am J Epidemiol. 2010 Sep 1;172(5):517–24. doi: 10.1093/aje/kwq211.
