Author manuscript; available in PMC: 2013 Nov 7.
Published in final edited form as: Chem Soc Rev. 2012 Jul 27;41(21):6916–6930. doi: 10.1039/c2cs35104h

5-Hydroxymethylcytosine – the elusive epigenetic mark in mammalian DNA

Edita Kriukienė 1, Zita Liutkevičiūtė 1, Saulius Klimašauskas 1
PMCID: PMC3467341  NIHMSID: NIHMS401846  PMID: 22842880

Abstract

Over the past decade, epigenetic phenomena have claimed a central role in cell regulatory processes and proved to be important factors for understanding complex human diseases. One of the best understood epigenetic mechanisms is DNA methylation. In the mammalian genome, cytosines (C) were long known to exist in two functional states: unmethylated or methylated at the 5-position of the pyrimidine ring (5mC). Recent studies of genomic DNA from the human and mouse brain, neurons and from mouse embryonic stem cells found that a substantial fraction of 5mC in CpG dinucleotides is converted to 5-hydroxymethyl-cytosine (hmC) by the action of 2-oxoglutarate- and Fe(II)-dependent oxygenases of the TET family. These findings provided important clues to a long-elusive mechanism of active DNA demethylation and bolstered a fresh wave of studies in the area of epigenetic regulation in mammals. This review is dedicated to a critical assessment of the most popular techniques with respect to their suitability for analysis of hmC in mammalian genomes. It also discusses the most recent data on biochemical and chemical aspects of the formation and further conversion of this nucleobase in DNA and its possible biological roles in cell differentiation, embryogenesis and brain function.

1. Introduction

Post-replicative modification of DNA: a higher (epigenetic) layer of information

Every single living cell in each tissue of a living organism carries a full set of genes (the genome), which contain the information required for its functioning and ability to reproduce itself. The key function of the genome is the storage, replication and transmission of the genetic information. The genome is composed of long polymeric strands of randomly interchanging purine and pyrimidine nucleotides: adenine, guanine, cytosine and thymine (A, G, C, T) (Fig. 1). A meaningful and timely reading of the genome, which consists of billions of recurring G:C or A:T base pairs, in different types of cells is possible by using a sort of (epi)genetic “bookmarking” system, a mechanism that allows living organisms to adapt to the changing environment and dynamically reprogram the fate of each cell. A key player in this process is the phenomenon called DNA methylation. DNA methylation is established by the action of DNA methyltransferase enzymes, which transfer a methyl group from the ubiquitous cofactor S-adenosyl-L-methionine (SAM) onto DNA. In bacteria and archaea, DNA methyltransferases modify nucleobases by replacing a hydrogen atom with a bulkier methyl group in the exocyclic amino group of adenine or cytosine or in the intracyclic C5-position of cytosine yielding, correspondingly, N6-methyladenine (6mA), N4-methylcytosine (4mC) or 5-methylcytosine (5mC) (Fig. 1). Structural and mechanistic aspects of DNA methylation have been studied in detail. DNA cytosine-C5 methyltransferases differ from the amino-specific enzymes in that their reaction involves the formation of a covalent intermediate between a cysteine residue of the enzyme and the C6-atom of the cytosine ring (Fig. 2A).

Fig. 1. Base pairing and postreplicative modifications in DNA

Fig. 2. Covalent catalysis by cytosine-5 transferases. Enzymatic transfer of single carbon units to pyrimidine-5 centres requires covalent addition of a nucleophile at the C6 position of the pyrimidine ring. DNA cytosine-5 methyltransferases catalyse the transfer of a methyl group from the cofactor S-adenosyl-L-methionine (SAM) (A) or in vitro addition of exogenous formaldehyde (B) to the C5-position of their target cytosine residues in DNA producing 5mC or hmC, respectively. (C) 2'-deoxycytidine-5'-monophosphate (dCMP) is converted into 5-hydroxymethyl-dCMP (dhmCMP) by a bacteriophage-borne deoxycytidylate hydroxymethylase (CHase), which (i) transfers a methylene group from methylenetetrahydrofolate and (ii) adds the hydroxyl anion to the transferred methylene group, producing dhmCMP and tetrahydrofolate. The hydroxymethylated nucleotide is incorporated into bacteriophage DNA and glucosylated by α- and β-glucosyltransferases (AGT and BGT) to yield 5-glucosyloxymethylcytosine (glc-hmC).5

Note that the biological methylations do not alter the coding specificity of the target nucleobases, preserving the original genetic content of the genome (Fig. 1). On the other hand, they all occur in the major groove of the DNA helix (Fig. 1) where they can be accessed and interpreted as “steric” signals (bumps) by specialized cellular proteins, enzymes or large multicomponent complexes. These features make such modified bases well suited to serve as epigenetic marks in biological signaling, which provides an additional layer of information encoded in the genome. All three types of DNA methylation found in microorganisms occur sequence-specifically. A unique DNA methylation pattern (a combination of methylated sequences) imposed by restriction-modification systems serves as a species “self code”, whereas DNA with a non-matching modification pattern is eliminated by an accompanying restriction endonuclease. In higher eukaryotes, the sole methylation product is 5mC; the methylation occurs not only in a sequence-specific but also in a locus-specific manner. In mammalian genomes (especially in somatic cells), methylation of cytosine predominantly occurs at CpG dinucleotides, whereas a fraction of 5mC is associated with non-CG contexts in embryonic stem cells.1 CpG motifs are under-represented in mammals and tend to cluster in high-density regions referred to as CpG islands. Extensive studies of 5mC distribution estimated that the majority (70–80%) of CpGs are methylated, except for those localized in CpG islands. Traditionally, 5mC marks localized at CpG islands are regarded as important transcriptional silencers at gene promoters. Three major types of DNA methyltransferases are active on mammalian genomes (Fig. 3A). Initial methylation patterns are thought to be established by the so-called de novo DNA methyltransferases Dnmt3a and Dnmt3b, whereas preservation of the CpG methylation marks across cell divisions is carried out by the maintenance methyltransferase Dnmt1.
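The under-representation and clustering of CpG motifs described above can be made concrete with a small calculation. The sketch below computes the commonly used observed/expected CpG ratio of a sequence, (number of CpG × length) / (number of C × number of G). The function name and the toy sequences are hypothetical illustrations and are not taken from this review; no particular threshold for calling a CpG island is implied.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one strand of a DNA sequence.

    Expected CpG count assumes C and G occur independently:
    expected = (#C * #G) / length.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0  # no CpG is possible without both C and G
    n_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cpg * len(seq) / (n_c * n_g)


if __name__ == "__main__":
    # Hypothetical toy sequences: a CpG-dense stretch and a CpG-free stretch.
    print(cpg_obs_exp("CGCGCGCG"))
    print(cpg_obs_exp("CAGTCAGT"))
```

A ratio well below 1 indicates depletion of CpG relative to base composition, whereas CpG-dense regions give values close to or above 1.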

Fig. 3. Schematic representation of the human DNA methyltransferase and Tet protein families. (A) Dnmts share a conserved catalytic domain (MTase), but differ in their N-terminal regulatory regions. Dnmt1 contains a PCNA binding domain, a pericentric heterochromatin targeting sequence (TS), a CXXC domain, and two bromo adjacent homology domains (BAH). Dnmt3a and 3b comprise a PWWP domain named after a conserved Pro-Trp-Trp-Pro motif and an ATRX-Dnmt3-Dnmt3L (ADD) domain also known as the PHD (plant homeodomain) domain. (B) Tet1,2,3 proteins contain a cysteine-rich (Cys) region and a double-stranded β-helix (DSBH) fold which is characteristic of the 2OG-Fe(II) oxygenases and is required for the catalytic activity. Tet1 and Tet3 also contain a CXXC domain.

Cytosine methylation typically leads to strong and heritable gene silencing. The methylation patterns in the genome establish a self-code (epigenotype) of different tissues and, together with histone modifications, modulate tissue-specific transcriptional programmes in a process called epigenetic regulation. The epigenetic role of 5-methylcytosine has been extensively studied and is widely accepted to be crucial in a variety of cellular processes including development, gene regulation, differentiation, genomic imprinting, silencing of transposable elements, X-chromosome inactivation and others.2 Underlining its importance and heritability, 5mC is often called the fifth base of DNA.

Although DNA methylation patterns can be established and maintained through generations, they may undergo dynamic changes and can be reversible in a genome-wide or locus-specific manner. The mechanism of demethylation and the enzymes implicated in this process were elusive until recently, when the seemingly firm position of 5mC as the untouchable epigenetic mark in DNA was shattered by the discovery of 5-hydroxymethylcytosine (hmC) in 2009.3, 4 That year dramatically changed the future of epigenetic research.

2. Occurrence of 5-hydroxymethylated nucleobases

The presence of 5-hydroxymethylcytosine in DNA was first observed in certain bacteriophages. In T-even phages, 5-hydroxymethylated cytosine is incorporated into the genome during DNA synthesis, and is subsequently modified by phage α- and β-glucosyltransferases, creating a highly glucosylated DNA containing 5-glucosyloxymethylcytosine (glc-hmC) residues. The precursor nucleotide, 2’-deoxy-5-hydroxymethylcytidine-5’-monophosphate (dhmCMP), is produced by enzymatic addition of a methylene group at the 5-position of dCMP. This reaction is mechanistically similar to DNA C5-methylation as it involves the formation of a covalent intermediate between a cysteine residue of the enzyme and the C6-position of cytosine (reviewed in 5) (Fig. 2C).

A similar modified base J (5-(β-D-glucosyl)oxymethyluracil) (Fig. 1) is present in DNA of flagellated protozoa of the order Kinetoplastida (which includes the parasites Trypanosoma brucei, Leishmania sp. and others) and the closely related unicellular alga Euglena gracilis.6 It replaces up to 1% of thymines and is found in repetitive sequences, mostly telomeric repeats. The first step of its biosynthetic pathway (Fig. 4A) is the oxidation of thymine to hmU by the JBP1/JBP2 proteins, which are members of a large family of Fe2+- and 2-oxoglutarate-dependent oxygenases. In the second step, a glucose moiety is added by an as yet unidentified glucosyltransferase. Although the biological role of base J is unclear, it was demonstrated to be essential in some of the species (reviewed in 6).

Fig. 4. Reactions of Fe(II)/2-oxoglutarate dependent oxygenases in nucleotide metabolism and epigenetic regulation. (A) Production of J base in kinetoplastids involves enzymatic oxidation of specific thymine residues in DNA by J-binding proteins 1 or 2 (JBP1,2), resulting in 5-hydroxymethyluridine (hmU), which undergoes addition of a glucose moiety catalyzed by an unidentified glucosyltransferase (GTase).6 (B) Mammalian Tet1,2,3 proteins catalyse oxidation of 5mC in DNA producing hmC, and then fC or caC, depending on reaction conditions and other factors. (C) A thymidine salvage pathway in fungi. Thymine-7-hydroxylase (T7H) carries out three consecutive oxidation reactions of thymine methyl group to produce 5-carboxyuracil (iso-orotate), which is enzymatically decarboxylated to uracil by iso-orotate decarboxylase (IODC). (D) Bacterial DNA repair enzyme AlkB is able to carry out oxidative demethylation of 1-methyladenine (1mA) and 3-methylcytosine (3mC) in DNA and RNA by releasing the unmodified bases and the methyl group as formaldehyde. (E) Histone demethylases (DMTase) remove the methyl groups from specific Lys (or Arg) in histones. (F) Proposed general mechanism of Fe(II)/2-oxoglutarate oxygenases.

In higher eukaryotes, chemical oxidation of 5mC and thymine was thought to be the major source of these hydroxymethylated nucleobases (Fig. 5). hmC can be formed at 5mC sites in response to oxidative stress in vitro.7 Oxidation of thymine residues in DNA to hmU was considered to be an important source of endogenous oxidative DNA damage. 5mC is slightly more reactive toward hydroxyl radical attack than is thymine and it was predicted that in human cells, ~20 5mC residues would be oxidized to hmC per cell per day (8 and references therein).

Fig. 5. Hydroxyl radical induced oxidative damage of 5mC in DNA.7, 19

The presence of hmC in genomic DNA extracted from the brains of adult mice, rats and frogs was first reported in the early seventies.9 However, no confirmation of this finding was obtained in other labs.10 Because hmC was assumed to have originated from spontaneous oxidative damage of DNA during sample handling, the significance of this finding was not appreciated for almost 40 years. In 2009, two groups independently re-confirmed the existence of hmC in mouse brain cells and mouse embryonic stem (ES) cells.3, 4 By using thin layer chromatography (TLC), Kriaučionis and Heintz found in multiple isolates that hmC constitutes 0.6% of total nucleotides in Purkinje cells and ~0.2% in granule cells.3 Rao and co-workers approached the question from a different direction: based on knowledge of the above-mentioned trypanosome proteins JBP1 and JBP2, they identified their mammalian counterparts, the Ten-Eleven-Translocation (TET) proteins Tet1, Tet2 and Tet3, as potential modifiers of 5mC to hmC.4 The three homologs contain the common features of 2-oxoglutarate (2OG)- and Fe(II)-dependent oxygenases in their C-terminal part (Fig. 3B). In the N-terminal part, the Tet proteins possess a CXXC domain, a binuclear Zn-chelating domain found in certain chromatin-associated proteins such as Dnmt1 methyltransferase, and other elements which mediate interactions with multiple components in the cell.11, 12 Importantly, Tet1, Tet2 and Tet3 were shown to be capable of converting 5mC to hmC in vitro and in vivo (Fig. 4B). Altogether, this evidence convincingly demonstrated that hmC is indeed an endogenous biological component of genomic DNA, rather than an in vitro artifact of chemical DNA oxidation.
Furthermore, it was shown that the Tet proteins further oxidize the formed hmC to 5-formylcytosine (fC) and 5-carboxycytosine (caC), pointing to the idea that hmC is an intermediate in the controversial and long-sought pathway of active demethylation.13, 14 The discovery of hmC and its derivatives sparked studies to re-evaluate the reactions catalyzed by known cytosine-modifying enzymes and to map the genomic distribution of these modified cytosine bases in various cell types and tissues.

On the other hand, some chemical evidence points to other possible sources of hmC in DNA besides the Tet-mediated pathway. For example, recent in vitro experiments with representative AdoMet-dependent C5-MTases showed that these enzymes catalyze covalent addition of exogenous formaldehyde to the C5-position of their target cytosine residues in DNA yielding hmC (Fig. 2B).15 These reactions occur in the absence of the AdoMet cofactor and require the presence of the catalytic cysteine residue in the enzyme. A hint of potential involvement of MTases in hmC dynamics is found in eukaryotic mitochondria. The mitochondrial isoform of the Dnmt1 gene contains an alternative translation start site that encodes a leader peptide responsible for the import of the Dnmt1 protein into mitochondria. In contrast, the Tet proteins lack similar leader sequences, and apparently are excluded from the mitochondrial DNA.16 These observations indirectly point to Tet-independent formation of hmC in certain cases, which awaits experimental confirmation.

3. Methods for production and analysis of hmC in DNA

3.1 Production of hmC in DNA in vitro

Production of DNA substrates containing hmC residues is required for studies of chemo-enzymatic transformations of this modified nucleotide as well as for the development and validation of new analytical techniques. Chemical synthesis of hmC-containing oligonucleotides was developed by the Sommers group in the nineties,8 and is currently used by the majority of commercial suppliers. The 5-hydroxymethyl group of hmC is protected with a 2-cyanoethyl group during nucleotide coupling. However, we and others17 have observed that deprotection of the 5-hydroxymethyl group is often incomplete using standard protocols (overnight, 60°C, conc. aq. NH3);8 moreover, we find that typical commercially supplied oligonucleotides contain 20–50% of hmC residues in the 2-cyanoethyl-protected form (Liutkevičiūtė and Klimašauskas, unpublished observations), and much harsher treatment with sodium hydroxide is required for complete deprotection. This shortcoming prompted the development of alternative protection strategies, which permit efficient deprotection under milder although non-standard conditions.17, 18 The performance of these strategies has yet to be verified on a massive production scale.

Enzymatic incorporation of hmC into DNA is possible using commercial 2’-deoxynucleoside-5’-triphosphate (dhmCTP) in a variety of DNA polymerase-dependent protocols including PCR. This approach replaces all C nucleotides with hmC in newly synthesized DNA strands. Sequence-specific incorporation of hmC into DNA duplexes is possible using a recently discovered atypical chemo-enzymatic reaction of DNA cytosine-5 methyltransferases (Fig. 2B),15 which catalyze the coupling of formaldehyde to their target cytosine residues in vitro. Although the reaction is highly specific for methyltransferase target sites, the efficiency of hydroxymethylation may depend on the enzyme used. Obviously, the Tet proteins could be used for production of hmC DNA from methylated DNA in vitro and in vivo, but the reaction should be tightly controlled to avoid further oxidation products.

3.2 Methods for analysis of genomic hmC

A myriad of methods have been developed over the past several years to investigate DNA methylation profiles across large DNA regions and entire genomes. However, all such methods are binary, i.e. designed to distinguish only the two epigenetic states of cytosine: methylated versus unmodified. Therefore, following the discovery of hmC, all the existing methods needed to be reevaluated for their ability to discriminate hmC and 5mC in DNA. Many of the past methylation studies relied on bisulfite sequencing.20 In the presence of bisulfite, unmodified cytosine readily forms a 5,6-dihydro-6-sulfonyl adduct which undergoes hydrolytic deamination to uracil, and thus appears as T in DNA sequencing (Fig. 6A). 5mC is very stable to bisulfite-promoted deamination and is subsequently read as normal C (Fig. 6B). It turns out that bisulfite conversion cannot discriminate between 5mC and hmC.20-22 hmC appears as C, since the bisulfite attack predominantly occurs at the 5-hydroxymethyl group yielding a hydrolytically stable 5-sulfonylmethylcytosine (smC, also called cytosine 5-methylenesulfonate) (Fig. 6C).20 In contrast, the products of hmC oxidation, caC and fC, are interpreted as unmodified C (reads in the T lane) (Fig. 6D-E),14, 23 owing to their transient bisulfite-induced conversion to C. Recently, two methods have been proposed that permit detection of hmC in bisulfite sequencing after chemical or enzymatic modifications of hmC.23, 24
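The bisulfite read-outs described in this paragraph can be summarized as a small lookup table. The following is a minimal sketch in Python; the dictionary and function names are hypothetical, and the table simply restates the read-outs given in the text and in Fig. 6 rather than modelling any reaction kinetics or incomplete conversion.

```python
# Base read in sequencing after bisulfite treatment, per cytosine form,
# as stated in the text (Fig. 6A-E).
BISULFITE_READOUT = {
    "C": "T",    # deaminated to uracil, read as T
    "5mC": "C",  # stable to bisulfite-promoted deamination, read as C
    "hmC": "C",  # converted to stable smC, read as C
    "fC": "T",   # read in the T lane, like unmodified C
    "caC": "T",  # read in the T lane, like unmodified C
}


def indistinguishable_groups() -> dict:
    """Group cytosine forms by the base they are read as after bisulfite."""
    groups = {}
    for form, read_base in BISULFITE_READOUT.items():
        groups.setdefault(read_base, []).append(form)
    return groups


if __name__ == "__main__":
    for read_base, forms in indistinguishable_groups().items():
        print(f"read as {read_base}: {', '.join(forms)}")
```

Grouping by read-out makes the limitation explicit: standard bisulfite sequencing yields only two classes, so 5mC cannot be separated from hmC, nor unmodified C from fC and caC.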

Fig. 6. Reaction products and DNA sequencing reads after sodium bisulfite treatment of C (A), 5mC (B), hmC (C), caC (D) and fC (E). C, caC and fC undergo hydrolytic deamination to uracil in the presence of bisulfite to read as T in DNA sequencing.

Another popular tool in 5mC analysis is the use of methyl-sensitive CG-specific restriction endonucleases that differentially cleave unmethylated and methylated target sites. Most of the restriction enzymes do not clearly discriminate between 5mC and hmC,22 although some hmC-specific restriction endonucleases have recently been described.25, 26 Obviously, such analyses would be further complicated by the presence of fC and caC, along with hmC and 5mC, at the CpG sites.

Several techniques have been developed or adapted for detection or mapping of hmC in DNA. Thin layer chromatography (TLC) of radiolabeled nucleotides had previously been widely used for analysis of RNA and DNA modifications.27 Such analyses can be performed in a standard biochemical lab. Owing to the high specific activity and high decay energy of the 32P or 33P radionuclides, autoradiographic detection of modified nucleotides on TLC plates requires little material. Two radiolabeling strategies have been used for analysis of hmC in DNA. In one such approach, named “nearest neighbour analysis”, single-stranded DNA breaks are randomly introduced by DNaseI, and then DNA polymerase and [α-33P]-dGTP are used to incorporate the radioactively labeled G nucleotide at the DNA breaks wherever C is present in the opposite strand. Then DNA is enzymatically hydrolysed to yield 3’-nucleotides such that the radiolabeled phosphate group appears on the 5’-neighbouring nucleotide.3 Alternatively, the modified cytosine in CG dinucleotides can be enriched by using R.MspI or R.TaqI restriction endonucleases, which cleave CCGG or TCGA sites, respectively, between the two pyrimidines regardless of whether C, 5mC or hmC is present in the second position. The produced DNA fragments with 5'-terminal CG sites are 33P-labeled and DNA is degraded to 5'-NMPs for TLC.4, 13 The sensitivity of the latter approach is generally higher due to the increased selectivity of 33P-labeling (only cytosines in the CCGG or TCGA context are seen), but this also limits the coverage of analysis to a fraction of CG sites.
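The coverage limitation of the restriction-based labelling strategy can be illustrated with a short calculation. The sketch below counts how many CG dinucleotides in a sequence lie within a CCGG (MspI) or TCGA (TaqI) site and are therefore visible to this approach. The function name and the example sequence are hypothetical; only one strand is scanned, which suffices here because both recognition sites are palindromic.

```python
import re


def cg_sites_covered(seq: str) -> tuple:
    """Return (covered, total): CG dinucleotides inside CCGG or TCGA sites,
    and all CG dinucleotides, on the given strand."""
    seq = seq.upper()
    all_cg = {m.start() for m in re.finditer("CG", seq)}
    covered = set()
    for site in ("CCGG", "TCGA"):
        # A lookahead finds every occurrence; the CG sits at offset 1 in the site.
        for m in re.finditer(f"(?={site})", seq):
            covered.add(m.start() + 1)
    return len(covered), len(all_cg)


if __name__ == "__main__":
    # Hypothetical sequence with one CCGG, one TCGA and one other CG.
    covered, total = cg_sites_covered("ACCGGTTCGAACGT")
    print(f"{covered} of {total} CG sites are covered")
```

In real genomic sequence, CG sites outside these two tetranucleotide contexts are invisible to the assay, which is the trade-off between selectivity and coverage noted above.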

A complete nucleoside composition analysis of unlabeled DNA samples can be performed using high-performance liquid chromatography (HPLC). DNA is enzymatically hydrolysed to nucleosides, which are resolved by reversed-phase liquid chromatography. Notably, the hmC 2’-deoxynucleoside elutes closely following, and is partially obscured by, the major peak of 2’-deoxycytidine. Similar elution times on reversed-phase columns and similar UV spectra of C and hmC may complicate detection of hmC in routine analyses of mammalian DNA using UV detection.13, 28 Higher selectivity and sensitivity can be achieved with modern mass spectrometry detectors, and reliable quantitation of the nucleosides is possible using synthetic stable-isotope labeled internal standards.29 These methods are best suited for global assessment of hmC, fC and caC levels in genomic DNA; however, they require specialized equipment and expertise, and thus cannot be performed in a typical biochemical laboratory.

Methods based on affinity enrichment were widely used to detect genome-wide localization of hmC (summarized in Fig. 7). Such methods rely on selective binding of short hmC containing DNA fragments (200–800 bp) to hmC-specific antibodies or other hmC-binding proteins permitting their physical extraction from the rest of DNA for analysis using quantitative PCR, DNA microarrays or sequencing. Polyclonal and monoclonal antibodies have been raised against hmC itself or the product of bisulfite treatment of hmC, 5-sulfonylmethylcytosine (smC).30-32 Similar to its predecessor, methylated-DNA immunoprecipitation (MeDIP), hMeDIP (hydroxymethylated-DNA immunoprecipitation) accounts for a large fraction of hmC profiling studies performed during the past three years. On the other hand, the conclusions of these studies are not entirely concordant.33, 34 A typical shortcoming of antibody-based pull down approaches is a density-dependent capture bias. Another possible limitation is cross-reactivity with methylated and unmodified cytosines, as these bases bear very few chemical differences for discrimination. Antibodies raised against smC or glc-hmC might be more specific and effective for the identification of hmC, however, neither one is currently commercially available.

Fig. 7. Analytical strategies for labelling and enrichment of hydroxymethylated DNA.

A particularly useful analytical modification of hmC is its enzymatic glucosylation to a much bulkier and distinctive residue, 5-glucosyloxymethylcytosine, using T4 β-glucosyltransferase (BGT) and the uridine-5’-diphospho-D-glucose (UDP-glc) cofactor. The BGT reaction is robust and highly selective for the 5-hydroxymethyl group of hmC (or hmU). Such treatment can be used to attach tritium-labeled glucose moieties from UDP-[3H]glucose to DNA, permitting direct quantification of hmC by scintillation counting.35 Moreover, the JBP1 protein, which naturally recognizes the J base in kinetoplastids, can selectively bind DNA fragments containing glucosylated hmC. This interaction can be exploited for selective isolation and analysis of hmC-containing DNA as discussed above.36 Another simple approach combines glucosylation of hmC and MspI restriction endonuclease digestion. R.MspI cleaves the CCGG target site if the second cytosine is unmethylated, methylated or hydroxymethylated, but glucosylation of hmC residues renders the sites resistant to MspI cleavage. Thus, glc-hmC DNA can be enriched and analysed using qPCR,37 microarrays or next-generation sequencing. Although the analysis is restricted to sparsely distributed tetranucleotide target sites, it uniquely permits single-nucleotide resolution mapping of hmC residues in the genome.
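The logic of the glucosylation-plus-MspI assay can be written down as a simple decision rule. This is a minimal sketch that encodes only the cleavage behaviour stated in the text; the function names and the example site map are hypothetical, and complete glucosylation and digestion are assumed.

```python
def mspi_cleaves(second_c_state: str) -> bool:
    """Whether R.MspI cuts a CCGG site, given the state of the second cytosine."""
    if second_c_state in {"C", "5mC", "hmC"}:
        return True   # cut regardless of methylation or hydroxymethylation
    if second_c_state == "glc-hmC":
        return False  # glucosylated hmC blocks cleavage
    raise ValueError(f"unknown cytosine state: {second_c_state}")


def sites_protected_after_bgt(sites: dict) -> list:
    """Positions of CCGG sites left uncut after BGT treatment and MspI digestion.

    `sites` maps a site position to the state of its second cytosine
    before treatment. BGT is assumed to glucosylate every hmC.
    """
    after_bgt = {pos: ("glc-hmC" if state == "hmC" else state)
                 for pos, state in sites.items()}
    return sorted(pos for pos, state in after_bgt.items()
                  if not mspi_cleaves(state))


if __name__ == "__main__":
    # Hypothetical CCGG sites and their modification states.
    example = {10: "C", 250: "hmC", 900: "5mC"}
    print(sites_protected_after_bgt(example))
```

Only sites that originally carried hmC survive digestion, so an intact CCGG site after treatment reports hmC at that exact position.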

A further advance in enrichment strategies is associated with chemical capture of glucosylated hmC residues.32 Oxidation of the glucose moiety with NaIO4 creates two reactive aldehyde groups which can be subsequently modified by commercially available aldehyde-reactive probes containing biotin, allowing selective enrichment of hmC-containing DNA.32 To avoid the tedious oxidation step, chemically modified analogues of the cofactor UDP-glc containing an azide group (UDP-6-N3-glc)38 or a keto group (UDP-6-keto-glc)39 can be used. These methods proved efficient for genome-wide profiling of hmC. An alternative chemo-enzymatic method that requires no prior glucosylation of hmC has recently been proposed by Liutkevičiūtė et al.40 It was found that DNA C5-MTases can, surprisingly, catalyse the attachment of alkylthiol/alkylselenol moieties with functional (amino, thiol) groups to hmC residues located at the target position. The attached functional group can be chemically ligated to biotin for selective enrichment of hmC-containing DNA. The method can potentially analyse all CG dinucleotides using M.SssI methyltransferase, but has not yet been validated in genome-wide experiments. The potential of single-base resolution genome analysis has recently been demonstrated41 by combining selective chemical labelling of hmC with single-molecule real-time (SMRT) sequencing. Nanopore sequencing has also been shown to directly discriminate hmC from 5mC without any further chemical modifications of the base.42, 43 The latter technique was demonstrated to work on model DNA substrates (proof-of-principle) and awaits further validation in large-scale genomic studies.

4. Biological roles of hmC

Despite many efforts to elucidate the distribution and involvement of hmC and Tet proteins in many cellular processes, the biological function of hmC is still under extensive debate. The relative abundance of hmC and its generation from the precursor 5mC points to the roles of this new mark in modulating 5mC-dependent gene regulation. DNA methylation is considered to be a bi-directional and dynamically regulated process. The dynamic nature of the genome is well observed during embryogenesis, or upon rapid reactivation of previously silenced genes in response to changes in extrinsic signals. The involvement of hmC in these cellular processes is discussed below.

4.1. DNA demethylation

In theory, decreasing levels of 5mC in genomic DNA can be generated through a passive or an active DNA demethylation pathway. In the first scenario, the methylation marks are “passively” diluted from DNA in a replication-dependent manner, in the absence of the methylation maintenance activity in the newly synthesized daughter strands (Fig. 8). Alternatively, 5mC can be “actively” converted to unmodified C by a demethylase activity in the framework of the same DNA strand regardless of DNA replication. The existence of active DNA demethylation has been well documented in plants. In Arabidopsis thaliana, a group of DNA glycosylases named Demeter excise 5mC by cleaving the N-glycosidic bond, resulting in an apyrimidinic (AP) site in the DNA strand. Then, AP lyases and AP endonucleases form a single nucleotide gap that is subsequently filled by the action of DNA polymerases and ligases. However, to date no such glycosylases have been proven to act directly on 5mC in mammals (reviewed in 44). The reality of active demethylation in vertebrates had long remained elusive, giving much ground for widespread skepticism, but the picture became much clearer with the discovery of hmC in DNA. A classic example of active DNA demethylation is the global loss of 5mC in paternal DNA after fertilization in mammals, while the erasure of the methylation mark in maternal DNA proceeds via passive DNA demethylation (reviewed in 44). The loss of 5mC from the paternal genome in the fertilized egg correlates with an increase in hmC levels, whereas the female pronucleus remains methylated and contains low levels of hmC. It was shown recently that Tet3 facilitates the loss of 5mC, which proceeds before the first cell division and goes through the Tet3-dependent conversion of 5mC to hmC and later to both fC and caC, followed by replication-dependent dilution.45-48 Notably, the further oxidized forms of 5mC are relatively stable and persist to at least the 4-cell stage.
Further firm evidence for active demethylation can be found in non-dividing cells, such as neurons in the brain, which excludes a passive demethylation scenario in these tissues. Indeed, rapid demethylation has been observed at the promoters of Bdnf (brain derived neurotrophic factor, which is important for adult neural plasticity) and Fgf1 (fibroblast growth factor 1) in postmitotic neurons as part of a physiological response to electroconvulsive stimulation.49 It has been shown that Tet1 may contribute to this process by initiating the conversion of 5mC to hmC, which is followed by deamination to hmU and subsequent replacement with C through the base excision repair (BER) pathway. Rapid demethylation of 5mC in T lymphocytes in response to interleukin-2 stimulation50 provides yet another example.

Fig. 8. Mechanisms of de novo and maintenance methylation, hydroxymethylation and replication-dependent dilution of epigenetic marks in genomic DNA. Methylation patterns are initially established by the de novo DNA methyltransferases Dnmt3a and Dnmt3b. After DNA replication and cell division, the 5mC marks are recreated (maintained) in daughter cells by the maintenance methyltransferase, Dnmt1, which predominantly acts on hemimethylated CpG sites in DNA. Enzymatic conversion of 5mC to hmC by Tet proteins creates hydroxymethylated CpG sites which are poorly recognized by Dnmt1 and do not support Dnmt1-dependent methylation of the daughter strands, leading to gradual loss of the epigenetic mark in subsequent rounds of DNA replication (passive demethylation pathway).
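The replication-dependent dilution described for Fig. 8 can be illustrated with a toy calculation. The sketch below assumes a deliberately simple model that is not taken from this review: every parental strand keeps its mark, and each newly synthesized strand acquires the mark with a fixed probability (`maintenance`) when its template carries it. The function and parameter names are hypothetical.

```python
def mark_fraction(rounds: int, maintenance: float = 0.0,
                  start: float = 1.0) -> float:
    """Fraction of DNA strands carrying the mark at a CpG site after
    a number of replication rounds.

    maintenance = 1.0 models fully efficient Dnmt1 maintenance;
    maintenance = 0.0 models a mark that is not copied at all
    (for example hmC that is poorly recognized by Dnmt1).
    """
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    if not 0.0 <= maintenance <= 1.0:
        raise ValueError("maintenance must be between 0 and 1")
    fraction = start
    for _ in range(rounds):
        # Old strands keep the mark; new strands gain it with
        # probability `maintenance` if the template strand is marked.
        fraction = fraction * (1.0 + maintenance) / 2.0
    return fraction


if __name__ == "__main__":
    for n in range(5):
        print(n, mark_fraction(n), mark_fraction(n, maintenance=1.0))
```

With no maintenance the marked fraction halves at every round, whereas perfect maintenance keeps it constant; intermediate values give a slower decay.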

4.2 Chemistry of DNA demethylation

In contrast to the well-studied biology of DNA methylation in mammals, the enzymatic mechanism of active demethylation had long remained elusive and controversial (reviewed in 44, 51). The fundamental chemical problem for direct removal of the 5-methyl group from the pyrimidine ring is the high stability of the C5–CH3 bond in water under physiological conditions. To get around the unfavorable nature of the direct cleavage of the bond, a cascade of coupled reactions can be used. For example, certain DNA repair enzymes can reverse N-alkylation damage to DNA via a two-step mechanism, which involves an enzymatic oxidation of N-alkylated nucleobases (N3-alkylcytosine, N1-alkyladenine) to the corresponding N-(1-hydroxyalkyl) derivatives (Fig. 4D). These intermediates then undergo spontaneous hydrolytic release of an aldehyde from the ring nitrogen to directly generate the original unmodified base. Demethylation of biological methyl marks in histones occurs through a similar route (Fig. 4E) (reviewed in 52). This illustrates that oxygenation of the methylated products leads to a substantial weakening of the C-N bonds. However, it turns out that hydroxymethyl groups attached to the 5-position of pyrimidine bases are nevertheless chemically stable and long-lived under physiological conditions.

From a biological standpoint, the generated hmC represents a kind of cytosine in which the original 5-methyl group is no longer present, but the exocyclic 5-substituent is not removed either. How is this chemically stable epigenetic state of cytosine resolved? Notably, hmC is not recognized by methyl-CpG binding domain (MBD) proteins, such as the transcriptional repressor MeCP2, MBD1 and MBD2,21, 53 suggesting the possibility that conversion of 5mC to hmC is sufficient for the reversal of the gene silencing effect of 5mC. Even in the presence of maintenance methylases such as Dnmt1, hmC would not be maintained after replication (i.e. it would be passively removed) (Fig. 8)53, 54 and would be treated as “unmodified” cytosine (with the difference that it cannot be directly re-methylated without prior removal of the 5-hydroxymethyl group). It is reasonable to assume that, although produced from a primary epigenetic mark (5mC), hmC may play its own regulatory role as a secondary epigenetic mark in DNA (see examples below).

Although this scenario is operational in certain cases, substantial evidence indicates that hmC may be further processed in vivo to ultimately yield unmodified cytosine (active demethylation). It has been shown recently that Tet proteins have the capacity to further oxidize hmC, forming fC and caC in vivo (Fig. 4B),13, 14 and small quantities of these products are detectable in genomic DNA of mouse ES cells, embryoid bodies and zygotes.13, 14, 28, 45 Similarly, enzymatic removal of the 5-methyl group in the so-called thymidine salvage pathway of fungi (Fig. 4C) is achieved by thymine-7-hydroxylase (T7H), which carries out three consecutive oxidation reactions, converting the methyl group to hydroxymethyl and then to formyl and carboxyl groups, yielding 5-carboxyuracil (or iso-orotate). Iso-orotate is finally processed by a decarboxylase to give uracil (reviewed in 44, 52). To date, no orthologous decarboxylase or deformylase activity has been described that removes the oxidation products of 5mC in DNA in vivo, which would provide the final link in a full demethylation pathway (Fig. 9). Where should we look for possible candidates? A useful hint is provided by DNA C5-MTases, which are not only capable of coupling formaldehyde to cytosine (Fig. 2B), but can also promote conversion of hmC to C with release of formaldehyde in vitro.15 The MTase-directed reaction proceeds via a covalent intermediate at C6 (Fig. 10A), resembling the light-induced two-step conversion of hmC to cytosine (Fig. 10B)19 or the bisulfite-mediated decarboxylation of caU and deformylation of fC (Fig. 10C, D).23, 55 These chemical precedents suggest that certain DNA C5-MTases or some dedicated enzymes may in principle perform the removal of the oxidized groups (5-hydroxymethyl, 5-formyl or 5-carboxyl) to give unmodified cytosine residues.28, 44 Although these reactions provide plausible chemical precedents for direct enzymatic exchange of one-carbon units at the C5-atom of pyrimidines, their significance in active DNA demethylation in vivo has not yet been demonstrated.

Fig. 9.


Formation and removal of epigenetic marks in mammalian DNA. Cytosine (C) is converted to 5-methylcytosine (5mC) by the action of endogenous DNA MTases of the Dnmt1 and Dnmt3 families (green pathway). Several mechanisms for DNA demethylation, in which 5mC is converted back to C, have been proposed. Horizontal arrows represent oxidation-based pathways performed by Tet proteins: the methyl group of 5mC is consecutively oxidized to hydroxymethyl, formyl and carboxy groups, forming 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC) and 5-carboxycytosine (caC), respectively. Bent plain arrows show deamination-based pathways, where hmC is deaminated to 5-hydroxymethyluracil (hmU) in the presence of AID/APOBEC family deaminases, and direct base excision repair (BER) pathways involving the TDG, MBD4 and SMUG1 glycosylases, which all lead to transient formation of apyrimidinic (AP) sites in DNA. Dashed arrows denote the newly discovered hydroxymethylation and dehydroxymethylation reactions performed by cytosine-5 methyltransferases in vitro and putative enzymes (deformylase and decarboxylase) which could directly remove the formyl and carboxy groups from fC and caC, respectively.

Fig. 10.


Removal of 5-hydroxymethyl, 5-formyl or 5-carboxyl groups is facilitated by transient nucleophilic addition at C6 of the pyrimidine ring. (A) DNA cytosine-5 methyltransferase directed conversion of hmC to C residues in DNA.15 (B) Light-induced dehydroxymethylation of cytosine in DNA.19 (C) Decarboxylation of caU nucleoside55 and (D) deformylation of fC in DNA23 in the presence of bisulfite.

Notably, further oxidation of the hydroxymethyl group to a formyl or carboxyl group substantially alters the electronic properties of the nucleobase (charge distribution, stability of the N-glycosidic bond, tautomeric properties).56, 57 This may facilitate the excision of the modified cytosine by some DNA repair glycosylases. One such enzyme is thymine-DNA glycosylase (TDG), whose primary function was thought to be the repair of T-G mismatches produced in DNA upon sporadic hydrolytic deamination of 5mC. Recently it was found that TDG can directly excise fC and caC, while leaving 5mC and hmC untouched.14, 57 The removal of fC and caC from a G:fC or G:caC pair in vitro proceeds at rates exceeding or similar to that for a T-G mispair.57 Based on these observations and structural insights,58 a model of active demethylation involving iterative 5mC oxidation by Tet proteins coupled with TDG-mediated base excision repair (BER) has been proposed (Fig. 9). However, this mechanism currently leaves a number of questions unanswered. First of all, it is unclear how the cell deals with the many lesions and AP sites that would appear after BER-mediated demethylation in zygotes or in CpG islands. To date, none of the analyses has demonstrated that fC and caC are present in promoters undergoing demethylation. Furthermore, He et al.14 demonstrated that 5mC and hmC are almost fully (90%) converted to caC by Tet1 and Tet2 without appearance of fC, whereas Ito et al.13 report that fC accumulates relative to caC. In both studies, Tet3 did not perform hmC oxidation effectively. Since it is known that only Tet3 is expressed in zygotes and oocytes, this raises the question of whether Tet3 alone is responsible for the appearance of fC and caC in zygotes. Altogether, these results indicate that the efficiency and the final product of the hmC oxidation steps performed by Tet proteins depend on specific conditions that are not yet fully understood.
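The substrate logic of the proposed Tet/TDG pathway can be summarized in a few lines of code. This is a minimal sketch that encodes only what is stated above: Tet proteins oxidize 5mC stepwise to hmC, fC and caC, and TDG excises fC and caC but not 5mC or hmC. The names `TET_OXIDATION`, `TDG_SUBSTRATES` and `oxidation_path_to_tdg_substrate` are hypothetical, and the model ignores kinetics, sequence context and the open questions about Tet3 and product ratios.

```python
from typing import Dict, List, Set

# Stepwise Tet-mediated oxidation of the C5 substituent, as described in the text.
TET_OXIDATION: Dict[str, str] = {"5mC": "hmC", "hmC": "fC", "fC": "caC"}

# Bases reported to be excised directly by TDG (5mC and hmC are left untouched).
TDG_SUBSTRATES: Set[str] = {"fC", "caC"}


def oxidation_path_to_tdg_substrate(base: str) -> List[str]:
    """Return the chain of bases from `base` to the first TDG-excisable form."""
    if base not in TET_OXIDATION and base not in TDG_SUBSTRATES:
        raise ValueError(f"{base!r} is not part of the modeled oxidation cascade")
    path = [base]
    # Keep oxidizing until the current base is one that TDG can excise.
    while path[-1] not in TDG_SUBSTRATES:
        path.append(TET_OXIDATION[path[-1]])
    return path


if __name__ == "__main__":
    for start in ("5mC", "hmC", "fC", "caC"):
        print(start, "->", " -> ".join(oxidation_path_to_tdg_substrate(start)))
```

In this simplified picture, 5mC needs at least two Tet-mediated oxidation steps and hmC at least one before TDG-initiated base excision repair can act.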

Another debated pathway for hmC processing is based on the ability of some DNA glycosylases to excise hmU, which might arise via enzymatic deamination of hmC (Fig. 9).49 Interestingly, some DNA glycosylases, such as single-strand-selective monofunctional uracil-DNA glycosylase 1 (SMUG1) and MBD4, have no significant activity for excision of caC or hmC,14 but they efficiently remove the hmU base, a deamination product of hmC.53, 59 TDG was also shown to exhibit excision activity against hmU-G mispairs in dsDNA.59 Enzymatic deamination of cytosine by activation-induced deaminase (AID) is an important process in the generation of antibody diversity (reviewed in 44), which involves massive but localised mutagenesis of DNA through the deamination of cytosine to uracil by AID/apolipoprotein B mRNA editing enzyme complex (APOBEC) family proteins. Consistent with this scenario, overexpression of AID/APOBEC deaminases in neural cells promotes the decay of hmC and accumulation of 5-hydroxymethyluracil (hmU) in DNA.49 hmU can be further excised by the TDG, SMUG1 or MBD4 glycosylases, as mentioned above. This mechanism relies on the assumption that the AID/APOBEC deaminases can effectively deaminate hmC in duplex DNA in vivo. However, in vitro studies indicate that these enzymes show a strong preference for unmodified cytosines located in single-stranded DNA. Therefore, the deamination-mediated pathway still requires strong biochemical evidence (44, 51 and references therein).

4.3. Regulatory roles of hmC

Assessment of the global amounts of hmC in mouse and human demonstrates a clear tissue dependency. The highest amounts of hmC are found in the brain (0.15–0.6% of total nucleotides or ~10–40% of 5mC),4, 38, 60 although the levels vary in different parts and even different cell types of the brain.3, 29 Significant amounts of hmC (0.01–0.2%) are also found in other tissues such as breast, kidney, or heart, although the reported numbers vary among studies.22, 60-62 The observed discrepancy may derive from technical difficulties in the accurate determination of small amounts of this modified component, but may also reflect inherent biological variation and dynamics of hmC in DNA. This is in contrast to the fairly steady levels of detectable 5mC (0.6–1.5% and 0.7–1% of all nucleotides in human and mouse organs, respectively).60, 61 Interestingly, it was shown that not only the total amount but also the locus-specific distribution of hmC is tissue-specific.63 In ES cells, hmC is found at a level of 0.04% of all nucleotides, and in the human cell lines HeLa and HEK293FT hmC is reduced to 0.008%.38
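The brain values above are quoted in two units, as a percentage of total nucleotides and as a percentage of 5mC. The following sketch shows the conversion between them. The function name is hypothetical, and pairing the brain hmC range with the upper end of the reported human 5mC range (1.5% of all nucleotides) is an assumption made here for illustration; the source does not state which 5mC value underlies the ~10–40% figure.

```python
def hmc_relative_to_5mc(hmc_percent: float, mc_percent: float) -> float:
    """Express hmC abundance as a percentage of 5mC.

    Both inputs are percentages of total nucleotides in genomic DNA.
    """
    if mc_percent <= 0:
        raise ValueError("5mC abundance must be positive")
    return 100.0 * hmc_percent / mc_percent


if __name__ == "__main__":
    # Assumed pairing: brain hmC range against 1.5% 5mC (upper end of the human range).
    for hmc in (0.15, 0.6):
        print(f"hmC {hmc}% of nucleotides ~ {hmc_relative_to_5mc(hmc, 1.5):.0f}% of 5mC")
```

With this pairing the two ends of the brain range correspond to roughly 10% and 40% of 5mC, in line with the figures quoted in the text.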

4.3.1. hmC in embryonic stem cells

Since comparatively high levels of hmC were detected in mouse and human ES cells, extensive recent studies have focused on elucidating the role of hmC and Tet1 in these cells. Some key features were uncovered, suggesting involvement of hmC in transcriptional regulation and maintenance of pluripotency of these cells. In the past year, a series of genome-wide profiling studies of hmC and Tet1-binding sites have been performed using a variety of techniques.11, 31, 32, 34, 64-67 Despite some inconsistencies between the reported data, the studies revealed important characteristics that are unique to ES cells. It was found that both Tet1 and hmC are enriched within gene bodies (specifically at exons) and at transcription start sites (TSS) and promoters. At the promoter level, hmC maps to regions with intermediate and high CpG content, i.e. CpG islands, which are usually depleted of 5mC and are related to actively transcribed genes. A surprising revelation was that the transcriptional activator effect of Tet1 and hmC was less extensive than their repressive function. hmC was found to be enriched at genes that are associated with developmental regulation and are kept in a transcriptionally ‘poised’ state, such that they can be rapidly switched on or off depending on the differentiation pathway. These are bivalent gene promoters of lineage-specific transcription factors that are repressed by the Polycomb repressive complex 2 (PRC2). The fact that the bivalent promoters are rich in hmC but depleted in 5mC is even more surprising, since the latter modification is normally involved in gene silencing; it suggests that hmC might act as an independent repressive mark at gene promoters in ES cells. Identification of specific hmC binding proteins that interact with hmC but not 5mC would provide direct evidence for an independent regulatory role of hmC. 
The most recent work of Yildirim et al.68 suggested that this role might be performed by one of the four methyl-CpG binding proteins, MBD3. Although MBD3 has a so-called methyl-CpG binding domain (MBD), it binds methylated DNA at least two orders of magnitude more weakly than other MBD proteins.53 Mapping of MBD3 in ES cells showed that it is strongly enriched at Tet1-bound and hmC-rich Polycomb target genes and that MBD3 localization requires active Tet1, suggesting that Tet1-mediated hydroxymethylation might play a role in MBD3 recruitment in vivo.68 However, in vitro MBD3 binding assays show no clear preference of the protein for hmC-containing DNA as compared to methylated or unmodified DNA.53, 68 Thus it is currently not clear whether MBD3 binds hmC residues directly in vivo or requires Tet1.

The levels of hmC and of Tet1/Tet2 transcripts are relatively high in ES cells, whereas Tet3 is expressed at very low levels; following differentiation, Tet1/Tet2 are downregulated, while the expression of Tet3 increases.69, 70 These observations pointed to the idea that Tet1/Tet2 and hmC could participate in regulating the pluripotency and differentiation potential of the cells.70 Indeed, hmC-enriched regions frequently map to pluripotency-related transcription factor binding sites, and some of these transcription factors appear to control the transcription of Tet1 and Tet2. Moreover, several pluripotency-related transcription factors, such as Nanog, Tcl1, and Esrrb, are downregulated upon Tet1 depletion.71 However, Dawlaty et al. showed that in Tet1-knockout cells neither pluripotency nor the expression of the pluripotency markers was affected,72 suggesting that Tet1 is not the key player in pluripotency maintenance. Therefore, more studies are needed to elucidate whether the hydroxylation of 5mC by the Tet proteins is required for differentiation of pluripotent cells.

4.3.2. hmC and brain function

Similar genome-wide mapping of hmC in mouse and adult human cerebellum and human brain frontal lobe tissue revealed some important features of hmC distribution in the brain.30, 38, 73 Unlike 5mC, which is abundant all over the genome, hmC was enriched in gene bodies and in regions proximal to transcription start sites (TSS) as well as transcription termination sites (TTS) of highly expressed genes, pointing to a role in maintenance of gene expression. The genomic distribution of hmC in the brain differs from that in ES cells, although the two have some common features. hmC is more broadly distributed throughout the gene bodies of active genes in the brain than in ES cells. In contrast to ES cells, hmC in the brain is largely depleted from TSS and is enriched in intragenic regions and intragenic CG of intermediate CpG content, thus largely resembling the profile of 5mC. It is likely that the enrichment of hmC in gene bodies is a general feature of hmC, whereas its occurrence at promoters may be characteristic of pluripotent cells. Apart from the association with the bodies of actively transcribed genes, the repeat elements SINE (short interspersed nuclear element) and mouse LTR (long terminal repeat) also showed enrichment for hmC. This is quite surprising, as DNA methylation is critical at repetitive elements and serves a role in modulating repeat-mediated genomic instability. However, somatic retrotransposition of LINEs has been observed in the brain, suggesting that hydroxymethylation of transposable elements may have some functions in neurogenesis (73 and the references therein).

The importance of hmC in brain development and aging was highlighted by studies of hmC dynamics in mouse cerebellum and hippocampus.38, 73 It was found that hmC levels increase at different stages of development. A set of genes that acquire the hmC mark during aging has been identified in mouse cerebellum, and many of these genes are implicated in hypoxia, angiogenesis and age-related neurodegenerative disorders. Since the oxidation of 5mC to hmC by the Tet proteins requires oxygen, the above-mentioned relation to hypoxia raises the possibility that changes in hmC levels may be related to mechanisms of oxygen sensing and regulation.

4.3.3. hmC and human disease

A link between hmC and neuronal function was highlighted by studying MeCP2-associated disorders.73 The MeCP2 protein (methyl-CpG-binding protein 2) is a transcription factor whose loss-of-function mutations cause Rett syndrome (an autism disorder characterized by severe deterioration of neuronal function after birth).73 It was found that MeCP2 protects methylated DNA from Tet1-dependent formation of hmC in vitro.53, 73 In mouse models of Rett syndrome, MeCP2 deficiency led to an increased level of hmC and, conversely, a decrease was observed in MeCP2-overexpressing animals. Variation in MeCP2 dosage leads to overlapping, but distinct, neuropsychiatric disorders, suggesting that a proper balance of genomic 5mC and hmC is crucial for normal brain function.

The role of Tet proteins and hmC has also been studied in the context of hematopoiesis and cancer. Aberrant DNA methylation is a hallmark of cancer, and cancer cells often display global hypomethylation and promoter hypermethylation.74 Hence, it is tempting to assume that loss-of-function mutations of the Tet proteins may contribute to cancer development. The Tet1 gene was originally identified through its translocation in acute myeloid leukemia (AML).75, 76 Later, many studies identified somatic Tet2 mutations in patients with a variety of myeloid malignancies, including myelodysplastic syndromes (MDS), chronic myelomonocytic leukemia (CMML), acute myeloid leukemias and many others (77 and references therein). Studies of leukemia cases found lower hmC levels in genomic DNA derived from patients carrying Tet2 mutations as compared with healthy controls. Since depletion of the Tet protein should protect 5mC sites from oxidation, it was quite surprising to detect global hypomethylation at CpG sites in myeloid tumors carrying Tet2 mutations. In contrast, Figueroa et al. demonstrated that Tet2 mutations in AML patients are predominantly associated with a DNA hypermethylation phenotype.78 Differences in hmC profiles in various myeloid malignancies indicate that Tet2 may control DNA methylation indirectly, perhaps via recruitment of one or more DNA methyltransferases.

Another example of a likely involvement of hmC in cancer is presented by recent analyses of the isocitrate dehydrogenase (IDH) genes. IDH catalyzes the conversion of isocitrate to 2-oxoglutarate (2OG). Interestingly, mutations in the isocitrate dehydrogenase genes IDH1/2 were identified in AML patients that lead to the accumulation of 2-hydroxyglutarate (2-HG) in cells.79 This metabolite impairs the catalytic activity of many 2-oxoglutarate-dependent enzymes, including Tet proteins, by competing with their co-substrate 2OG. Thus, these results suggest that deficiencies in both IDH and Tet genes contribute to cancer development via a common disease mechanism that leads to altered 5mC and hmC patterns. Alternatively, it was demonstrated that Tet2 and hmC are required for regulation of normal hematopoiesis.77, 78, 80 Loss-of-function mutations of Tet2 may reactivate a stem-cell state characterized by general hypomethylation and genomic instability.81 Indeed, Tet2-null mice display an increase in hematopoietic stem cell numbers,80, 81 as shown by increased expression of stem cell marker genes. Thus, Tet and hmC play a critical role in regulating normal hematopoietic differentiation, which is in contrast to their role in ES cells (the maintenance of the pluripotent state), indicating that hmC is involved in distinct functions in different cell types.

5. Conclusions

History repeats itself

Since the discovery of 5mC and then 6mA in DNA back in the late forties and early fifties,82, 83 it had long been known that C and A can exist in two chemical states: methylated or unmethylated. It took nearly forty years to realize that C can be methylated in two distinct ways, following the discovery of the 4mC modification (see Fig. 1) in microbial DNA in 1983.84 This unexpected finding provoked vigorous reevaluation of the effects of methylation on the physical and chemical properties of DNA, on its interactions with DNA binding proteins and, in particular, on the ecology of restriction-modification systems in microorganisms. However, 4mC was not detectable in samples of mammalian DNA, and the idea of monotypic cytosine modification in vertebrate genomes thrived for another 26 years, despite reports of the presence of hmC in the brain in the early seventies.9 This “premature” finding failed to turn into a discovery, largely due to the commonly accepted perception of 5mC as the sole modified (epigenetic) base in eukaryotes on the one hand, and due to the association of hmC with oxidative DNA damage (an in vitro artifact) on the other. One technical factor is that, in DNA composition analysis by reversed-phase HPLC, the core technique, the minor peak of the hmC 2’-deoxynucleoside elutes closely after, and is substantially obscured by, the major peak of dCyd, so that increased care is required to interpret the UV traces of HPLC chromatograms adequately. It is thus no surprise that hmC was discovered using TLC analysis of labeled nucleotides, a less technically advanced method that is nonetheless well suited to this case. However, the major factor was the readiness to accept a new modification in light of the prolonged and controversial search for mechanisms of DNA demethylation, which resolved the mounting pressure in the epigenetics community. 
The discovery of a missing link was greeted with much enthusiasm and created an immense wave of studies in the new area, in which the multiplicity of epigenetic states carried by cytosine is a key principle. Three years after its discovery, hmC is commonly accepted as the sixth base of DNA. The role of hmC as a product of 5mC oxidation en route to unmodified cytosine is now firmly established, thereby transforming our understanding of the regulatory role of the well-established primary epigenetic mark, DNA methylation. Important insights into the regulatory roles of hmC as a secondary epigenetic mark have also been obtained. The numerous emerging chemical fates and interactions of the new base observed in vivo leave its concrete functional roles to be established in future studies.

References

  • 1.Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo Q-M, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR. Nature. 2009;462:315–322. doi: 10.1038/nature08514. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Bird A. Genes Dev. 2002;16:6–21. doi: 10.1101/gad.947102. [DOI] [PubMed] [Google Scholar]
  • 3.Kriaučionis S, Heintz N. Science. 2009;324:929–930. doi: 10.1126/science.1169786. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Tahiliani M, Koh KP, Shen YH, Pastor WA, Bandukwala H, Brudno Y, Agarwal S, Iyer LM, Liu DR, Aravind L, Rao A. Science. 2009;324:930–935. doi: 10.1126/science.1170116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Warren RA. Annu Rev Microbiol. 1980;34:137–158. doi: 10.1146/annurev.mi.34.100180.001033. [DOI] [PubMed] [Google Scholar]
  • 6.Borst P, Sabatini R. Annu Rev Microbiol. 2008;62:235–251. doi: 10.1146/annurev.micro.62.081307.162750. [DOI] [PubMed] [Google Scholar]
  • 7.Wagner JR, Cadet J. Acc Chem Res. 2010;43:564–571. doi: 10.1021/ar9002637. [DOI] [PubMed] [Google Scholar]
  • 8.Tardy-Planechaud S, Fujimoto J, Lin SS, Sowers LC. Nucleic Acids Res. 1997;25:553–559. doi: 10.1093/nar/25.3.553. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Penn NW, Suwalski R, O'Riley C, Bojanowski K, Yura R. Biochem J. 1972;126:781–790. doi: 10.1042/bj1260781. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Kothari RM, Shankar V. J Mol Evol. 1976;7:325–329. doi: 10.1007/BF01743628. [DOI] [PubMed] [Google Scholar]
  • 11.Xu YF, Wu FZ, Tan L, Kong LC, Xiong LJ, Deng J, Barbera AJ, Zheng LJ, Zhang HK, Huang S, Min JR, Nicholson T, Chen TP, Xu GL, Shi Y, Zhang K, Shi YG. Mol Cell. 2011;42:451–464. doi: 10.1016/j.molcel.2011.04.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Zhang H, Zhang X, Clark E, Mulcahey M, Huang S, Shi YG. Cell Res. 2010;20:1390–1393. doi: 10.1038/cr.2010.156. [DOI] [PubMed] [Google Scholar]
  • 13.Ito S, Shen L, Dai Q, Wu SC, Collins LB, Swenberg JA, He C, Zhang Y. Science. 2011;333:1300–1303. doi: 10.1126/science.1210597. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.He YF, Li BZ, Li Z, Liu P, Wang Y, Tang QY, Ding JP, Jia YY, Chen ZC, Li L, Sun Y, Li XX, Dai Q, Song CX, Zhang KL, He C, Xu GL. Science. 2011;333:1303–1307. doi: 10.1126/science.1210944. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Liutkevičiūtė Z, Lukinavičius G, Masevičius V, Daujotytė D, Klimašauskas S. Nat Chem Biol. 2009;5:400–402. doi: 10.1038/nchembio.172. [DOI] [PubMed] [Google Scholar]
  • 16.Shock LS, Thakkar PV, Peterson EJ, Moran RG, Taylor SM. Proc Natl Acad Sci U S A. 2011 doi: 10.1073/pnas.1012311108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Munzel M, Globisch D, Trindler C, Carell T. Org Lett. 2010;12:5671–5673. doi: 10.1021/ol102408t. [DOI] [PubMed] [Google Scholar]
  • 18.Dai Q, Song CX, Pan T, He C. J. Org. Chem. 2011;76:4182–4188. doi: 10.1021/jo200566d. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Privat E, Sowers LC. Chem. Res. Toxicol. 1996;9:745–750. doi: 10.1021/tx950182o. [DOI] [PubMed] [Google Scholar]
  • 20.Huang Y, Pastor WA, Shen Y, Tahiliani M, Liu DR, Rao A. PLoS One. 2010;5:e8888. doi: 10.1371/journal.pone.0008888. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Jin SG, Kadam S, Pfeifer GP. Nucleic Acids Res. 2010;38:e125. doi: 10.1093/nar/gkq223. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Nestor C, Ruzov A, Meehan RR, Dunican DS. Biotechniques. 2010;48:317–319. doi: 10.2144/000113403. [DOI] [PubMed] [Google Scholar]
  • 23.Booth MJ, Branco MR, Ficz G, Oxley D, Krueger F, Reik W, Balasubramanian S. Science. 2012;336:934–937. doi: 10.1126/science.1220671. [DOI] [PubMed] [Google Scholar]
  • 24.Yu M, Hon GC, Szulwach KE, Song C-X, Zhang L, Kim A, Li X, Dai Q, Shen Y, Park B, Min J-H, Jin P, Ren B, He C. Cell. 2012;149:1368–1380. doi: 10.1016/j.cell.2012.04.027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Szwagierczak A, Brachmann A, Schmidt CS, Bultmann S, Leonhardt H, Spada F. Nucleic Acids Res. 2011;39:5149–5156. doi: 10.1093/nar/gkr118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Wang H, Guan S, Quimby A, Cohen-Karni D, Pradhan S, Wilson G, Roberts R, Zhu Z, Zheng Y. Nucleic Acids Res. 2011 doi: 10.1093/nar/gkr607. doi: 10.1093/nar/gkr1607. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Kuchino Y, Hanyu N, Nishimura S. Methods Enzymol. 1987;155:379–396. doi: 10.1016/0076-6879(87)55026-1. [DOI] [PubMed] [Google Scholar]
  • 28.Pfaffeneder T, Hackner B, Truss M, Munzel M, Muller M, Deiml CA, Hagemeier C, Carell T. Angew Chem Int Ed Engl. 2011;50:7008–7012. doi: 10.1002/anie.201103899. [DOI] [PubMed] [Google Scholar]
  • 29.Munzel M, Globisch D, Bruckl T, Wagner M, Welzmiller V, Michalakis S, Muller M, Biel M, Carell T. Angew Chem Int Ed Engl. 2010;49:5375–5377. doi: 10.1002/anie.201002033. [DOI] [PubMed] [Google Scholar]
  • 30.Jin SG, Wu X, Li AX, Pfeifer GP. Nucleic Acids Res. 2011;39:5015–5024. doi: 10.1093/nar/gkr120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Ficz G, Branco MR, Seisenberger S, Santos F, Krueger F, Hore TA, Marques CJ, Andrews S, Reik W. Nature. 2011;473:398–402. doi: 10.1038/nature10008. [DOI] [PubMed] [Google Scholar]
  • 32.Pastor WA, Pape UJ, Huang Y, Henderson HR, Lister R, Ko M, McLoughlin EM, Brudno Y, Mahapatra S, Kapranov P, Tahiliani M, Daley GQ, Liu XS, Ecker JR, Milos PM, Agarwal S, Rao A. Nature. 2011;473:394–397. doi: 10.1038/nature10102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Matarese F, Carrillo-de Santa Pau E, Stunnenberg HG. Mol Syst Biol. 2011;7:562. doi: 10.1038/msb.2011.95. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Wu H, D'Alessio AC, Ito S, Wang ZB, Cui KR, Zhao KJ, Sun YE, Zhang Y. Genes Dev. 2011;25:679–684. doi: 10.1101/gad.2036011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Szwagierczak A, Bultmann S, Schmidt CS, Spada F, Leonhardt H. Nucleic Acids Res. 2010;38:e181. doi: 10.1093/nar/gkq684. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Robertson AB, Dahl JA, Vagbo CB, Tripathi P, Krokan HE, Klungland A. Nucleic Acids Res. 2011;39:e55. doi: 10.1093/nar/gkr051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Davis T, Vaisvila R. J Vis Exp. 2011:e2661. doi: 10.3791/2661. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Song CX, Szulwach KE, Fu Y, Dai Q, Yi C, Li X, Li Y, Chen CH, Zhang W, Jian X, Wang J, Zhang L, Looney TJ, Zhang B, Godley LA, Hicks LM, Lahn BT, Jin P, He C. Nat Biotechnol. 2010;29:68–72. doi: 10.1038/nbt.1732. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Song CX, Sun Y, Dai Q, Lu XY, Yu M, Yang CG, He C. ChemBioChem. 2011;12:1682–1685. doi: 10.1002/cbic.201100278. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Liutkevičiūtė Z, Kriukienė E, Grigaitytė I, Masevičius V, Klimašauskas S. Angew Chem Int Ed Engl. 2011;50:2090–2093. doi: 10.1002/anie.201007169. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Song CX, Clark TA, Lu XY, Kislyuk A, Dai Q, Turner SW, He C, Korlach J. Nat Methods. 2012;9:75–77. doi: 10.1038/nmeth.1779. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Wallace EV, Stoddart D, Heron AJ, Mikhailova E, Maglia G, Donohoe TJ, Bayley H. Chem Commun (Camb) 2010;46:8195–8197. doi: 10.1039/c0cc02864a. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Wanunu M, Cohen-Karni D, Johnson RR, Fields L, Benner J, Peterman N, Zheng Y, Klein ML, Drndic M. J Am Chem Soc. 2010 doi: 10.1021/ja107836t. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Wu SC, Zhang Y. Nat Rev Mol Cell Biol. 2010;11:607–620. doi: 10.1038/nrm2950. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Inoue A, Shen L, Dai Q, He C, Zhang Y. Cell Res. 2011;21:1670–1676. doi: 10.1038/cr.2011.189. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Iqbal K, Jin SG, Pfeifer GP, Szabo PE. Proc Natl Acad Sci U S A. 2011;108:3642–3647. doi: 10.1073/pnas.1014033108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Wossidlo M, Nakamura T, Lepikhov K, Marques CJ, Zakhartchenko V, Boiani M, Arand J, Nakano T, Reik W, Walter J. Nat Commun. 2011;2:241. doi: 10.1038/ncomms1240. [DOI] [PubMed] [Google Scholar]
  • 48.Gu TP, Guo F, Yang H, Wu HP, Xu GF, Liu W, Xie ZG, Shi LY, He XY, Jin SG, Iqbal K, Shi YJG, Deng ZX, Szabo PE, Pfeifer GP, Li JS, Xu GL. Nature. 2011;477:606–610. doi: 10.1038/nature10443. [DOI] [PubMed] [Google Scholar]
  • 49.Guo JU, Su YJ, Zhong C, Ming GL, Song HJ. Cell. 2011;145:423–434. doi: 10.1016/j.cell.2011.03.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Bruniquel D, Schwartz RH. Nat Immunol. 2003;4:235–240. doi: 10.1038/ni887. [DOI] [PubMed] [Google Scholar]
  • 51.Bhutani N, Burns DM, Blau HM. Cell. 2011;146:866–872. doi: 10.1016/j.cell.2011.08.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Simmons JM, Müller TA, Hausinger RP. Dalton Trans. 2008;14:5132–5142. doi: 10.1039/b803512a. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Hashimoto H, Liu Y, Upadhyay AK, Chang Y, Howerton SB, Vertino PM, Zhang X, Cheng X. Nucleic Acids Res. 2012 doi: 10.1093/nar/gks155. doi: 10.1093/nar/gks1155. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Valinluck V, Sowers LC. Cancer Res. 2007;67:946–950. doi: 10.1158/0008-5472.CAN-06-3123. [DOI] [PubMed] [Google Scholar]
  • 55.Isono K, Asahi K, Suzuki S. J Am Chem Soc. 1969;91:7490–7505. doi: 10.1021/ja01054a045. [DOI] [PubMed] [Google Scholar]
  • 56.Bennett MT, Rodgers MT, Hebert AS, Ruslander LE, Eisele L, Drohat AC. J Am Chem Soc. 2006;128:12510–12519. doi: 10.1021/ja0634829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Maiti A, Drohat A. J Biol Chem. 2011;286:35334–35338. doi: 10.1074/jbc.C111.284620. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Zhang L, Lu X, Lu J, Liang H, Dai Q, Xu G-L, Luo C, Jiang H, He C. Nat Chem Biol. 2012;8:328–330. doi: 10.1038/nchembio.914. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Cortellino S, Xu JF, Sannai M, Moore R, Caretti E, Cigliano A, Le Coz M, Devarajan K, Wessels A, Soprano D, Abramowitz LK, Bartolomei MS, Rambow F, Bassi MR, Bruno T, Fanciulli M, Renner C, Klein-Szanto AJ, Matsumoto Y, Kobi D, Davidson I, Alberti C, Larue L, Bellacosa A. Cell. 2011;146:67–79. doi: 10.1016/j.cell.2011.06.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Li W, Liu M. J Nucleic Acids. 2011;2011 doi: 10.4061/2011/870726. 10.4061/2011/870726. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Globisch D, Munzel M, Muller M, Michalakis S, Wagner M, Koch S, Bruckl T, Biel M, Carell T. PLoS One. 2010;5:e15367. doi: 10.1371/journal.pone.0015367.
  • 62.Terragni J, Bitinaite J, Zheng Y, Pradhan S. Biochemistry. 2012;51:1009–1019. doi: 10.1021/bi2014739.
  • 63.Nestor CE, Ottaviano R, Reddington J, Sproul D, Reinhardt D, Dunican D, Katz E, Dixon JM, Harrison DJ, Meehan RR. Genome Res. 2012;22:467–477. doi: 10.1101/gr.126417.111.
  • 64.Williams K, Christensen J, Pedersen MT, Johansen JV, Cloos PAC, Rappsilber J, Helin K. Nature. 2011;473:343–348. doi: 10.1038/nature10066.
  • 65.Wu H, D'Alessio AC, Ito S, Xia K, Wang Z, Cui K, Zhao K, Eve Sun Y, Zhang Y. Nature. 2011;473:389–393. doi: 10.1038/nature09934.
  • 66.Szulwach KE, Li XK, Li YJ, Song CX, Han JW, Kim S, Namburi S, Hermetz K, Kim JJ, Rudd MK, Yoon YS, Ren B, He C, Jin P. PLoS Genetics. 2011;7:e1002154. doi: 10.1371/journal.pgen.1002154.
  • 67.Stroud H, Feng S, Morey Kinney S, Pradhan S, Jacobsen S. Genome Biol. 2011;12:R54. doi: 10.1186/gb-2011-12-6-r54.
  • 68.Yildirim O, Li R, Hung J, Chen PB, Dong X, Ee L, Weng Z, Rando OJ, Fazzio TG. Cell. 2011;147:1498–1510. doi: 10.1016/j.cell.2011.11.054.
  • 69.Koh KP, Yabuuchi A, Rao S, Huang Y, Cunniff K, Nardone J, Laiho A, Tahiliani M, Sommer CA, Mostoslavsky G, Lahesmaa R, Orkin SH, Rodig SJ, Daley GQ, Rao A. Cell Stem Cell. 2011;8:200–213. doi: 10.1016/j.stem.2011.01.008.
  • 70.Ito S, D'Alessio AC, Taranova OV, Hong K, Sowers LC, Zhang Y. Nature. 2010;466:1129–1133. doi: 10.1038/nature09303.
  • 71.Wu H, Zhang Y. Cell Cycle. 2011;10:2428–2436. doi: 10.4161/cc.10.15.16930.
  • 72.Dawlaty MM, Ganz K, Powell BE, Hu YC, Markoulaki S, Cheng AW, Gao Q, Kim J, Choi SW, Page DC, Jaenisch R. Cell Stem Cell. 2011;9:166–175. doi: 10.1016/j.stem.2011.07.010.
  • 73.Szulwach KE, Li X, Li Y, Song CX, Wu H, Dai Q, Irier H, Upadhyay AK, Gearing M, Levey AI, Vasanthakumar A, Godley LA, Chang Q, Cheng X, He C, Jin P. Nat Neurosci. 2011;14:1607–1616. doi: 10.1038/nn.2959.
  • 74.Esteller M. N Engl J Med. 2008;358:1148–1159. doi: 10.1056/NEJMra072067.
  • 75.Lorsbach RB, Moore J, Mathew S, Raimondi SC, Mukatira ST, Downing JR. Leukemia. 2003;17:637–641. doi: 10.1038/sj.leu.2402834.
  • 76.Ono R, Taki T, Taketani T, Taniwaki M, Kobayashi H, Hayashi Y. 2002.
  • 77.Ko M, Huang Y, Jankowska AM, Pape UJ, Tahiliani M, Bandukwala HS, An J, Lamperti ED, Koh KP, Ganetzky R, Liu XS, Aravind L, Agarwal S, Maciejewski JP, Rao A. Nature. 2010;468:839–843. doi: 10.1038/nature09586.
  • 78.Figueroa ME, Abdel-Wahab O, Lu C, Ward PS, Patel J, Shih A, Li Y, Bhagwat N, Vasanthakumar A, Fernandez HF, Tallman MS, Sun Z, Wolniak K, Peeters JK, Liu W, Choe SE, Fantin VR, Paietta E, Löwenberg B, Licht JD, Godley LA, Delwel R, Valk PJM, Thompson CB, Levine RL, Melnick A. Cancer Cell. 2010;18:553–567. doi: 10.1016/j.ccr.2010.11.015.
  • 79.Klose RJ, Kallin EM, Zhang Y. Nat Rev Genet. 2006;7:715–727. doi: 10.1038/nrg1945.
  • 80.Li Z, Cai X, Cai C-L, Wang J, Zhang W, Petersen BE, Yang F-C, Xu M. Blood. 2011;118:4509–4518. doi: 10.1182/blood-2010-12-325241.
  • 81.Ko M, Bandukwala HS, An J, Lamperti ED, Thompson EC, Hastie R, Tsangaratou A, Rajewsky K, Koralov SB, Rao A. Proc Natl Acad Sci U S A. 2011;108:14566–14571. doi: 10.1073/pnas.1112317108.
  • 82.Hotchkiss R. J Biol Chem. 1948;175:315–332.
  • 83.Dunn DB, Smith JD. Biochem J. 1958;68:627–636. doi: 10.1042/bj0680627.
  • 84.Janulaitis A, Klimašauskas S, Petrušyte M, Butkus V. FEBS Lett. 1983;161:131–134. doi: 10.1016/0014-5793(83)80745-5.
