Transactions of the American Clinical and Climatological Association. 2013;124:84–93.

The Epigenetic Basis of Common Human Disease

Andrew P Feinberg
PMCID: PMC3715917  PMID: 23874013

I am delighted to talk to this distinguished body about epigenetics, a field that did not exist at the American Clinical and Climatological Association's founding 128 years ago. Yet the antecedent of this field, Darwinian evolution, had at that time just emerged and had transformed biology. Much of what is driving our field is coming to terms with the role of Darwinian natural selection in a shifting environment, and how that plasticity relates to common human disease. So, just as we have been rediscovering the nuances of Darwinism throughout its history, the field of epigenetics provides a modern way to relate natural selection to our environment and to disease, while remaining thoroughly grounded in the history of biology and medicine over the last century.

EPIGENETICS

What is epigenetics? This question is a special case of the question that underlies the efforts of all geneticists, classical or not, namely: What is the basis of phenotypic variation? That is a relatively easy question to answer when one compares a species to its closest relatives. The basis of species differences is completely comprehensible on the basis of variation in DNA sequence. Not that we understand all of the details yet, but the genomes are sequenced and ultimately we will be able to fully understand these differences based on the sequence information that was already obtained.

A much more difficult question is: What is the basis of phenotypic differences among the various organs of the body? Here the differences in phenotype are much greater than those between species; for example, a brain and a heart differ far more than the heart of a human differs from the heart of a chimpanzee. Yet the DNA sequence of the brain cell and the heart cell is identical, and thus it is not the DNA sequence that tells these cells what to do differently. It is some other information, and that is the substrate of epigenetics. Not that the DNA sequence doesn't define the epigenetic machinery in the first place; of course it does. But to understand the phenotype of different cell types, one must understand the differences in epigenetic information between those cell types.

Formally, epigenetics is defined as alterations in DNA or associated factors, other than the sequence itself, that carry information and are replicated during cell division. Another distinguishing characteristic of epigenetic information is its susceptibility to environmental modification. In a classic experiment, Wolff showed that dietary exposure to methionine, an essential precursor for DNA methylation, can modify a particular variant locus that leads to the Agouti mutation in mice. Exposed mice showed a coat color change as well as obesity (1).

DNA methylation is a major epigenetic target and the main focus of this presentation. It is a chemical modification of the nucleotide cytosine, at the 5-position counting from the attachment point to deoxyribose (Figure 1). DNA methylation is copied faithfully during DNA replication when it occurs at the dinucleotide CpG. That is because, after semiconservative replication, each parent strand carries meCpG while the newly synthesized daughter strand carries only unmethylated CpG. An enzyme, DNA methyltransferase I, recognizes this hemimethylated DNA and transfers a methyl group to the daughter strand from S-adenosyl-methionine, whose ultimate source is dietary methionine, as noted above. We now know that many environmental agents influence the epigenome, including diet, smoking, radiation exposure, toxicants, and aging itself (2). As a dramatic example, investigators have shown that older monozygotic twins are substantially different epigenetically compared to younger twins (3).

Fig. 1.

DNA Methylation. The 5-position of cytosine is modified with a CH3 group. Hemimethylated DNA is copied by DNA methyltransferase I to the daughter strand during semiconservative DNA replication. The methyl donor is S-adenosyl-methionine, which is derived from dietary methionine.
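
The copying step in Figure 1 can be illustrated with a toy model. The sketch below is a deliberate simplification rather than a biochemical simulation: each strand is a list of booleans (one per CpG site), and the function names `replicate` and `maintain` are hypothetical labels chosen for illustration.

```python
from typing import List, Tuple

Strand = List[bool]  # True = methylated CpG, False = unmethylated CpG


def replicate(parent_top: Strand, parent_bottom: Strand) -> List[Tuple[Strand, Strand]]:
    """Semiconservative replication: each parental strand pairs with a
    newly synthesized strand that starts out completely unmethylated."""
    new_for_top = [False] * len(parent_top)
    new_for_bottom = [False] * len(parent_bottom)
    return [(parent_top, new_for_top), (parent_bottom, new_for_bottom)]


def maintain(template: Strand, daughter: Strand) -> Strand:
    """Toy maintenance methyltransferase: wherever the template strand is
    methylated and the daughter is not (a hemimethylated site), methylate
    the daughter. Sites unmethylated on the template are left alone."""
    return [d or t for t, d in zip(template, daughter)]


# Hypothetical pattern: a methylated CpG is methylated on both parental strands.
top = [True, False, True, True, False]
bottom = [True, False, True, True, False]

duplexes = replicate(top, bottom)
restored = [(template, maintain(template, daughter)) for template, daughter in duplexes]

for template, daughter in restored:
    # After maintenance, each daughter strand carries the parental pattern.
    print(template == daughter)
```

In this model the methylation pattern survives cell division only because the hemimethylated state marks exactly which daughter sites to modify, which is the property that makes CpG methylation heritable.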

EPIGENETIC EPIDEMIOLOGY

In 2004, the epidemiologist M. Daniele Fallin and I proposed a model in which epigenetics, genetics, and environmental health could be integrated to create a new field of epigenetic epidemiology (4). Our reasoning was that epigenetic changes do not occur in isolation, or in competition with genetic change, but the two are highly related. We suggested that environmental exposure and genetic variation could be integrated by modifying the epigenome. This is in addition to purely genetic effects that have no epigenetic contribution (Figure 2). An advantage of our model is that it also affords an opportunity for quantitative phenotypes to arise without necessarily requiring hundreds or thousands of individual genetic variants. For example, a DNA polymorphism is discrete, i.e., it is there or it is not, whereas DNA methylation is a quantitative number in a given tissue measurement. Thus, even a single site of DNA methylation variability could give a wide range of continuous values of phenotype, in comparison to the many loci for the same effect that would be required from sequence variation.

Fig. 2.

Epigenetic epidemiology. This new field promises to integrate environmental exposure, DNA variation, and time (aging). It can lead to quantitative phenotypes that contribute to common diseases.

To pursue this idea, we were fortunate to secure funding from an innovation grant from the National Human Genome Research Institute, Centers of Excellence in Genomic Science (CEGS). Under the auspices of this Center, we have developed new tools for genome-scale analysis of the epigenome and for integrating that information with genetic and phenotypic data. This approach has required the close collaboration of independent investigators from the fields of medicine, biostatistics, epidemiology, computer science, and molecular biology. The technologies that have emerged from our center include CHARM, an array-based method for genome-scale analysis of DNA methylation that avoids much of the bias toward what was already known about gene-local epigenetic variants; relatively low-cost whole-genome bisulfite sequencing; and novel approaches to parallel computing. The work on our CEGS grant also led directly to widely used commercial methylation arrays that do not require advanced biochemical or genetic techniques, making them accessible to laboratories grounded in other methods, such as those of established clinical investigators.

REVISITING A CLASSIC PARADIGM: CANCER EPIGENETICS

In the first whole-genome DNA methylation sequencing experiment comparing human cancer to normal control tissue, we made a surprising discovery that resonates with my earlier work, which inaugurated the field of cancer epigenetics. We discovered large hypomethylated blocks, many of them more than 1 megabase in size, in human colon cancer when compared with normal colonic mucosa from the same patients (5). Note in Figure 3 the large hypomethylated blocks and the extreme hypervariability of DNA methylation within them (5). This hypervariability is apparent when comparing patient to patient, but also when comparing the data along the block for each individual patient, suggesting a fundamental disruption in the mechanism for maintaining epigenetic integrity in these cells.

Fig. 3.

Large hypomethylated blocks in human colorectal cancer. Blue represents normal samples, and red the cancers. The Y-axis is methylation level. Approximately 1 megabase of DNA is shown on the X-axis. (Reprinted from Nature Genetics (5) with permission.)

This hypomethylation involves approximately one half of the genome and contains one third of single-copy genes. It is not due to selective hypomethylation of repetitive sequences, as previous literature has intimated; that explanation is ruled out by the comprehensive sequencing done here (5). The block hypomethylation is also functionally important and underlies tumor cell heterogeneity. Thus, all of the most hypervariably expressed genes in cancer lie within these blocks. They include most of the genes that are responsible for tumor invasion and metastasis, such as matrix metalloproteinases. It was also a gratifying result because it explained the observation I made as a postdoctoral fellow with Bert Vogelstein, reported in Nature in 1983, of widespread hypomethylation of the human genome in cancer (6). At the time, of course, we had no way to know that this hypomethylation involved the large but discrete regions described above.

There are also mechanistic clues that come from this whole-genome approach. It turns out that the hypomethylated blocks correspond to a nuclear structure described by us and others as LOCKs, blocks, or LADs, which are large regions of the genome that undergo coordinate chromatin methylation, i.e., of histones, notably histone H3 lysine 9 dimethylation and histone H3 lysine 27 trimethylation (7–9). These regions are also associated with the nuclear lamina, i.e., they form a structure of heterochromatin with a particular nuclear localization. The data described above suggest that this fundamental mechanism of nuclear organization is disrupted in cancer. Again, the results resonate with historic findings: Theodor Boveri himself described, a century ago, the loss of clear nuclear structural organization in tumors, founding the field of cancer genetics (10).

These recent observations also have practical translational implications. The mathematician Andrew Teschendorff from University College London applied our methylation variability idea to predict, from biopsy specimens of nonmalignant tissues, which patients will go on to develop cancer. He examined the methylation profiles in cervical biopsy specimens and found that those with a variable pattern of DNA methylation, as described above, were much more likely to advance to cervical cancer than those with a relatively stable pattern. This variability was much more predictive than mean differences in DNA methylation (11).
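
The contrast between variability and mean differences can be made concrete with simulated numbers. The sketch below uses entirely hypothetical data, not the cervical biopsy measurements, and a plain ratio of variances as a stand-in for a formal test of differential variability; the group names, sample sizes, and spreads are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def simulate(n: int, mean: float, sd: float) -> list:
    """Draw n simulated methylation levels, clipped to the valid range [0, 1]."""
    return [min(1.0, max(0.0, rng.gauss(mean, sd))) for _ in range(n)]


# Hypothetical data: both groups share the same average methylation at a site,
# but the group that later progresses is far more variable.
stable = simulate(200, mean=0.5, sd=0.02)
progressed = simulate(200, mean=0.5, sd=0.15)

mean_difference = statistics.mean(progressed) - statistics.mean(stable)
variance_ratio = statistics.variance(progressed) / statistics.variance(stable)

print(f"difference in means: {mean_difference:+.3f}")
print(f"ratio of variances:  {variance_ratio:.1f}")
```

A comparison of means would see almost nothing at such a site, whereas the variance ratio separates the two groups clearly, which is why a variability-based marker can outperform a mean-based one.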

A NOVEL PARADIGM: RHEUMATOID ARTHRITIS

We have recently undertaken the study of rheumatoid arthritis (RA) as a model for testing our approach to epigenetic epidemiology. RA is a chronic systemic autoimmune disease, and also a complex genetic disease. It affects up to 1% of the population and is twice as frequent in women as in men. The concordance rate for monozygotic twins is only 15%, suggesting a strong epigenetic component, i.e., not explained by DNA sequence. Traditional genetic studies such as genome-wide association can explain only approximately 20% of the cause of the disease.

To address RA from a genetic and epigenetic basis, we examined 354 cases and 337 controls that had already been genetically tested as part of the EIRA study in Sweden. We used the Illumina Infinium HumanMethylation450 BeadChip array, to which we had contributed design elements from our earlier work. This study was performed by Yun Liu, a genetics postdoctoral fellow, Dani Fallin noted above, and Martin Aryee, a faculty biostatistician. The study was performed jointly and equally with a Swedish team led by Tomas Ekstrom, Lars Klareskog, and Leonid Padyukov at the Karolinska Institute. This work was recently published in Nature Biotechnology (12) and its findings are summarized here.

Because we were studying blood cells, there were two major problems that needed to be overcome (Figure 4). The first was that DNA from blood comes from mixed cell types. We used a method published by Houseman and colleagues that lets one infer the differential count from the measurement of DNA methylation at genes across the genome (13). We then performed a cell type adjustment, accounting for these differences, to reduce confounding by cell type composition. The second problem was that one must distinguish cause from effect in epigenetic studies. This is different from classical genetic analysis because you are born with your genome. Thus, the methylation differences we see could be a consequence of the disease rather than playing a causal role. To address this problem, we framed the question by asking to what degree methylation differences arose downstream from genetic variation, and yet lay upstream of the RA phenotype. We followed a series of steps that come from the epidemiological “causal inference” literature (14) but had not previously been applied to epigenetics. These included showing the following: 1) that genotype and phenotype are associated; 2) that genotype is associated with methylation after adjusting for case-control status; 3) that methylation is associated with RA after adjusting for genotype; and, most importantly, 4) that the genotype-phenotype association is abrogated after adjusting for methylation.
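
The idea of inferring cell composition from methylation can be sketched for the simplest possible case. The code below is not the published method (13), which handles many cell types at once; it assumes only two cell types with known reference profiles and solves for the mixing fraction by least squares. The reference values, the CpG sites, and the function name `estimate_fraction` are all hypothetical.

```python
def estimate_fraction(mixture, profile_a, profile_b):
    """Least-squares estimate of the fraction of cell type A in a two-type
    mixture, assuming mixture = p * profile_a + (1 - p) * profile_b."""
    numerator = sum((m - b) * (a - b) for m, a, b in zip(mixture, profile_a, profile_b))
    denominator = sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))
    p = numerator / denominator
    return min(1.0, max(0.0, p))  # a proportion must lie in [0, 1]


# Hypothetical reference methylation levels at five cell-type-specific CpG sites.
monocyte = [0.90, 0.10, 0.80, 0.20, 0.70]
lymphocyte = [0.10, 0.85, 0.15, 0.90, 0.20]

# Build a noise-free whole-blood sample that is 30% monocyte, 70% lymphocyte.
true_fraction = 0.30
blood = [true_fraction * m + (1 - true_fraction) * l for m, l in zip(monocyte, lymphocyte)]

estimated = estimate_fraction(blood, monocyte, lymphocyte)
print(round(estimated, 3))
```

Once a fraction like this has been estimated for every sample, it can enter the case-control model as a covariate, so that a methylation difference is not mistaken for a disease effect when it merely reflects a shift in cell counts.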

Fig. 4.

Practical problems in epigenetic epidemiology. Two major hurdles must be overcome. First, one must correct for the fact that DNA from blood represents an epigenetically mixed cell type. Second, one must use statistical tools to show that DNA methylation mediates the link between genotype and phenotype. The same approach can be used to analyze potential environmental causes as well. (Reprinted from Nature Biotechnology (12) with permission.)
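
The four conditions for mediation can be walked through on simulated data. The sketch below makes several simplifying assumptions that differ from the actual study: the phenotype is a continuous liability score rather than case-control status, the data are simulated so that methylation fully mediates the genetic effect, and each step inspects a regression slope instead of running a formal hypothesis test. All variable names, effect sizes, and helper functions are hypothetical.

```python
import random
import statistics


def slope(y, x):
    """Ordinary least-squares slope of y regressed on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(y, x):
    """Residuals of y after removing its linear dependence on x."""
    b = slope(y, x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def adjusted_slope(y, x, z):
    """Slope of y on x after adjusting both for z (Frisch-Waugh-Lovell)."""
    return slope(residuals(y, z), residuals(x, z))


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 2000

# Simulated causal chain: genotype -> methylation -> disease liability.
genotype = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]  # 0, 1 or 2 risk alleles
methylation = [0.4 + 0.1 * g + rng.gauss(0, 0.05) for g in genotype]
liability = [5.0 * m + rng.gauss(0, 0.2) for m in methylation]

step1 = slope(liability, genotype)                         # genotype ~ phenotype
step2 = adjusted_slope(methylation, genotype, liability)   # genotype ~ methylation | phenotype
step3 = adjusted_slope(liability, methylation, genotype)   # methylation ~ phenotype | genotype
step4 = adjusted_slope(liability, genotype, methylation)   # genotype ~ phenotype | methylation

print(f"1) genotype-phenotype slope:                     {step1:.3f}")
print(f"2) genotype-methylation slope, given phenotype:  {step2:.3f}")
print(f"3) methylation-phenotype slope, given genotype:  {step3:.3f}")
print(f"4) genotype-phenotype slope, given methylation:  {step4:.3f}")
```

Because liability here depends on genotype only through methylation, the first three slopes are clearly nonzero while the fourth collapses toward zero. If methylation were instead a consequence of disease, the fourth slope would be expected to stay close to the first.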

The result of this analysis was that we found two regions of methylation variation, both within the MHC, and both within a large gene of unknown function termed C6orf10 (12). We could show that this methylation change lay between, or mediated, genetic predisposition and RA. The implication is that this methylation in turn regulates other genes within the MHC cluster. The results were replicated in an independent set of 12 cases and 12 controls of purified cells, and the effect also appeared to be monocyte-specific (12). Thus, we suspect that the reactivity of monocytes is a major contributor to RA. We do not yet know which genes are undergoing genetically driven epigenetic regulation because there are more than 20 candidate genes in linkage disequilibrium with C6orf10. To address these issues, we are currently performing detailed genomic and epigenomic sequencing of this region in patients as well as controls, along with detailed RNA expression analysis.

Although many questions remain open, this work is the first example of a genome-scale analysis of both genetic and epigenetic variation in common disease. The same tools can be brought to bear to understand how the epigenome can modify environmental, rather than genetic, risk, such as smoking, which is the other arm of our epigenetic epidemiology model.

SUMMARY

Epigenetics was historically a niche field of interest to biologists studying model organisms. For the last three decades, epigenetics has moved into the heart of cancer biology. Now the field promises to bridge the gap in our understanding of the relationships among genetic risk, environmental exposure, and common disease. It is natural that these first steps were made in the study of an autoimmune disease because blood cells are a natural surrogate for the tissue of interest, but the same ideas can be applied to other disorders, particularly those that may be age-related, including diabetes and cardiovascular disease. Our ultimate hope is that entirely new modalities for therapy will be discovered that are targeted at the regulatory sequences, or the protein effectors that bind them. This would be a more “surgical” (dare I use that word in this audience?) approach to biological response modification than nonspecific immunosuppressors in autoimmune disease, or cytotoxic drugs in oncology. Again, I am most grateful to be able to add my small contribution to this extraordinary body of scholarship of the Association.

Footnotes

Potential Conflicts of Interest: None disclosed.

DISCUSSION

Nestler, Richmond: Absolutely wonderful, but if I understood correctly, hypomethylation is what you want to avoid in most cases. So, when we're born, are we saturated with methylation?

Feinberg, Baltimore: When we are born, we actually increase methylation with age. That aging slide I showed you was increase in DNA methylation. Some increases in methylation are probably bad; some decreases in methylation are probably bad. It depends on the gene. For example, increased methylation and silencing of genes could reduce things like plasticity, ability to secrete collagen, wound healing. In cancer, the predominant change across the genome is the loss of DNA methylation which can turn genes on allowing cells to invade and metastasize and so forth, so it depends on the disease.

Quesenberry, Providence: Great talk. There is a new area evolving, I think, that really relates to this and that's extracellular vesicles and their ability to continuously and totally change cell fate in what we believe is in an epigenetic fashion. I'd be interested in your comments.

Feinberg, Baltimore: Well I didn't know about that but let's talk later. That sounds really interesting.

Quesenberry, Providence: There is a new Society of Extracellular Vesicles.

Feinberg, Baltimore: I would be very interested to hear about it.

Sacher, Cincinnati: Absolutely brilliant talk. If I remember from my basic biochemistry, folate is involved in methyl transfer, right?

Feinberg, Baltimore: Yes. It's a cofactor in the formation of S-adenosylmethionine, the methyl donor.

Sacher, Cincinnati: Is there any specific relationship? Could you conceive that folate, either in excess or with deprivation, can be involved in some of these influences?

Feinberg, Baltimore: Folate deficiency probably is a significant problem in terms of loss of DNA methylation and cancer risk and so forth, and probably some of the effect in terms of preventing neural tube defects is related to this sort of pathway, so it looks like it's probably very important. I think there are a lot of wonderful reasons that women are taking folate even prior to conception, and that is probably having very salutary effects. I don't know of any evidence where increased folate would be a problem.

REFERENCES

  • 1. Wolff GL, Kodell RL, Moore SR, Cooney CA. Maternal epigenetics and methyl supplements affect agouti gene expression in Avy/a mice. FASEB J. 1998;12:949–57.
  • 2. Feil R, Fraga MF. Epigenetics and the environment: emerging patterns and implications. Nat Rev Genet. 2011;13:97–109. doi: 10.1038/nrg3142.
  • 3. Fraga MF, Ballestar E, Paz MF, Ropero S, Setien F, Ballestar ML, et al. Epigenetic differences arise during the lifetime of monozygotic twins. Proc Natl Acad Sci U S A. 2005;102:10604–9. doi: 10.1073/pnas.0500398102.
  • 4. Bjornsson HT, Fallin MD, Feinberg AP. An integrated epigenetic and genetic approach to common human disease. Trends Genet. 2004;20:350–8. doi: 10.1016/j.tig.2004.06.009.
  • 5. Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, McDonald OG, et al. Increased methylation variation in epigenetic domains across cancer types. Nat Genet. 2011;43:768–75. doi: 10.1038/ng.865.
  • 6. Feinberg AP, Vogelstein B. Hypomethylation distinguishes genes of some human cancers from their normal counterparts. Nature. 1983;301:89–92. doi: 10.1038/301089a0.
  • 7. Reddy KL, Feinberg AP. Higher order chromatin organization in cancer. Semin Cancer Biol. 2012;23:109–15. doi: 10.1016/j.semcancer.2012.12.001.
  • 8. Wen B, Wu H, Shinkai Y, Irizarry RA, Feinberg AP. Large histone H3 lysine 9 dimethylated chromatin blocks distinguish differentiated from embryonic stem cells. Nat Genet. 2009;41:246–50. doi: 10.1038/ng.297.
  • 9. Zhou VW, Goren A, Bernstein BE. Charting histone modifications and the functional organization of mammalian genomes. Nat Rev Genet. 2011;12:7–18. doi: 10.1038/nrg2905.
  • 10. Boveri T. The Origin of Malignant Tumors. Boveri M, translator. London: Baillière, Tindall and Cox; 1929.
  • 11. Teschendorff AE, Widschwendter M. Differential variability improves the identification of cancer risk markers in DNA methylation studies profiling precursor cancer lesions. Bioinformatics. 2012;28:1487–94. doi: 10.1093/bioinformatics/bts170.
  • 12. Liu Y, Aryee MJ, Padyukov L, Fallin MD, Hesselberg E, Runarsson A, et al. Epigenome-wide association data implicate DNA methylation as an intermediary of genetic risk in rheumatoid arthritis. Nat Biotechnol. 2013;31:142–7. doi: 10.1038/nbt.2487.
  • 13. Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012;13:86. doi: 10.1186/1471-2105-13-86.
  • 14. Millstein J, Zhang B, Zhu J, Schadt EE. Disentangling molecular relationships with a causal inference test. BMC Genet. 2009;10:23. doi: 10.1186/1471-2156-10-23.

Articles from Transactions of the American Clinical and Climatological Association are provided here courtesy of American Clinical and Climatological Association
