Author manuscript; available in PMC: 2021 Apr 15.
Published in final edited form as: J Mol Biol. 2019 Oct 15;432(6):1661–1673. doi: 10.1016/j.jmb.2019.09.013

Detection of DNA modifications by sequence-specific transcription factors

Jie Yang 1, Xing Zhang 1, Robert M Blumenthal 2,*, Xiaodong Cheng 1,*
PMCID: PMC7156337  NIHMSID: NIHMS1546954  PMID: 31626807

Abstract

The establishment, detection, and alteration or elimination of epigenetic DNA modifications are essential to controlling gene expression in organisms ranging from bacteria to mammals. The DNA methylations occurring at cytosine and adenine are carried out by SAM-dependent methyltransferases. Successive oxidations of 5-methylcytosine (5mC) by Tet dioxygenases generate 5-hydroxymethyl (5hmC), 5-formyl (5fC), and 5-carboxyl (5caC) derivatives; thus DNA elements with multiple methylation sites can have a wide range of modification states. In contrast, oxidation of N6-methyladenine by homologs of E. coli AlkB removes the methyl group directly. Both Tet and AlkB enzymes are 2-oxoglutarate- and Fe(II)-dependent dioxygenases. DNA binding proteins decode the modification status of specific genomic regions. This article centers on two families of sequence-specific transcription factors: bZIP (basic leucine-zipper) proteins, exemplified by the AP-1 and CEBPβ recognition of 5mC; and bHLH (basic helix-loop-helix) proteins, exemplified by MAX and TCF4 recognition of 5caC. We discuss the impact of template strand DNA modification on the activities of DNA and RNA polymerases, and the varied tendencies of modifications to alter base pairing and their interactions with DNA repair enzymes.

Keywords: DNA cytosine methylation, DNA adenine methylation, modification-responsive transcription factors, basic helix-loop-helix (bHLH) proteins, basic leucine-zipper (bZIP) proteins

Introduction

Distribution of DNA methylation.

DNA methylation in bacteria and archaea is common, occurring at ring carbon C5 of cytosine and at the exocyclic amino groups of cytosine (at N4) and adenine (at N6) [1, 2] (Fig. 1a–1c). The great majority of bacterial and archaeal DNA methyltransferases (MTases) are associated with restriction-modification systems, where the MTase protects a cell’s own DNA from digestion by its restriction endonuclease [3, 4]. Restriction-modification systems are important for defense against bacteriophage predation [5, 6], though they play other roles as well [7, 8]. Bacterial ‘orphan’ MTases (which are not paired with a restriction endonuclease as part of a restriction-modification system [9]) are sometimes involved in chromosome replication, DNA repair, and epigenetic gene regulation [10]. Examples of such regulatory orphan MTases include the DNA adenine MTase (Dam) in Escherichia coli (Gammaproteobacteria) and the cell cycle-regulated DNA MTase (CcrM) in Caulobacter crescentus (Alphaproteobacteria), which are, respectively, responsible for maintenance adenine methylation of GATC or GANTC sequences (N = any nucleotide) immediately after their replication [11, 12]. Deletion of ccrM in C. crescentus is lethal [13], while deletion of dam in Escherichia coli compromises methylation-directed mismatch repair and affects initiation of chromosome replication [10, 14, 15]. Dam also modulates the binding of RNA polymerase and transcription factors [16–18]. In contrast, deletion of the orphan DNA cytosine MTase (Dcm) in E. coli causes no obvious phenotypic change [19], but may affect G:T mismatch repair through short-patch repair [20]. In addition, some MTases that are associated with restriction-modification systems can affect gene expression, for example through phasevarions [21–23].

Fig. 1.

(a-c) Three types of DNA methylation occurring at carbon C5 of the cytosine ring, or the exocyclic amino groups of cytosine (at N4) or adenine (at N6). Note that thymine (T), like 5mC, contains a methyl group at pyrimidine ring carbon C5. (d-g) Enzymatic reactions carried out by Fe(II)- and 2-oxoglutarate-dependent dioxygenases: E. coli AlkB (d) and its mammalian homologs (e), Jumonji protein lysine demethylases (f), and Tet enzymes (g). (h) Examples of proteins involved in binding of modified cytosine bases. For additional examples, see [141] in this issue.

The focus of this review is on mammals, but it is worth briefly noting that DNA methylation is widespread in plants [24], and variable in fungi [25]. In the largest phylum of Animalia, the arthropods, DNA methylation is also variable. For example, DNA 5-methylcytosine (5mC) plays important roles in honeybees [26], but is present at barely detectable levels in Diptera (flies and mosquitoes) [27]. Drosophila was found to contain both 5mC [28] and N6-methyladenine (N6mA) [29]; however, both observations are controversial [30–32].

In mammals the epigenetic DNA methylation marks are generated and maintained by the DNA cytosine-C5 MTases Dnmt1 and the Dnmt3 family at CpG and/or CpA dinucleotides [33, 34]. 5mC was first reported in mammalian DNA nearly 70 years ago [35], whereas DNA adenine methylation was reported only recently. Low levels of N6mA in DNA have been reported in mouse [36, 37], human [38, 39], and human malignant brain tumor glioblastoma [40]. Other studies have failed to detect N6mA in mammalian genomes [41–43]. There is also uncertainty over identification of the mammalian enzyme(s) responsible for generating the methyl marks: DNA adenine MTase(s) have not yet been convincingly established.

Demethylation of DNA.

Mouse Alkbh1 and Alkbh4, homologs of E. coli AlkB (which removes certain alkyl adducts from both DNA and RNA [44]), are demethylases for N6mA [36, 37]. Dam MTases are commonly present among enteric bacteria and some of their bacteriophages [45]. In the 1960s, the DNA of T-even bacteriophages was found to contain glucosylated 5hmC in place of normal cytosine [46]. Nearly 40 years later, 5hmC was found in the genome of higher organisms [47, 48], and the source turned out to be oxidation of 5mC. Ten-eleven translocation (Tet) dioxygenases can, in three consecutive oxidation reactions, successively oxidize 5mC to 5hmC, then 5fC, and finally 5caC [48–50]. Significantly, the Tet-mediated 5mC oxidations use the same enzymatic 2-oxoglutarate- and Fe(II)-dependent reactions as do N-methyl nucleic acid demethylases including E. coli AlkB, its mammalian homologs, and the Jumonji histone lysine demethylases, and many other enzymes [51] (Fig. 1d–1g). The mechanisms of these dioxygenase demethylases of N-methylation involve the formation of an N-hydroxymethyl intermediate, followed by the spontaneous release of formaldehyde. In contrast, formaldehyde is not generated during Tet-mediated 5mC hydroxylation, presumably because cytosine ring carbon atom C5 is a relatively inert carbon. On the other hand, the carbon-carbon bond in 5hmC can be broken in vitro more readily than in 5mC, via nucleophilic attack at the ring C6 carbon by DNA cytosine-C5 MTases [52] or by exogenous thiols [53]. Nevertheless, 5hmC stays as a stable modification in the cell, or is further modified to 5fC and 5caC in Tet-mediated oxidation reactions, producing other stable modifications that are increasingly chemically distinct from 5mC.

Experiments using synthetic isotope- and (R)-2′-fluorine-labeled deoxyC and deoxy5fC derivatives, incorporated into the genomic DNA of cultured mammalian cells, showed that 5fC conversion to cytosine occurs in vivo by direct cleavage of the C-C bond [54], though the putative catalytic factor(s) is unknown. DNA 5fC has been reported to function in nucleosome organization by forming a reversible covalent Schiff base bond to lysines of histone proteins [55]. This is potentially very important, since lysine residues are often involved in DNA binding, though such covalent lysine-5fC adducts have not been reported for any other DNA binding proteins in vivo or in vitro. However, indirect support for such covalent linkages comes from observation of DNA-histone crosslinks between lysine residues in histone tails and the 5-position methyl group of thymine, following oxidatively induced rearrangement of a phenyl selenide derivative of thymine [56].

Modification-specific transcription factors

Several classes of mammalian transcription regulators have been characterized for methylation-responsive interactions with DNA [57–59] (Fig. 1h). These DNA binding proteins, for instance MBD proteins (see article by Bird in this issue), C2H2 zinc finger proteins (see article by Buck-Koehntop), SRA domain proteins (see article by Wong), and homeodomain proteins [57], act in response to diverse modification states of cytosine, functioning as epigenetic sensors to direct downstream outcomes. At least two classes of DNA binding proteins (C2H2 zinc finger proteins and MBD) use the same approach, a methyl-Arg-Gua triad, to contact methylated CpG dinucleotides (5mCpG) [59, 60]. This same triad also serves to mediate recognition of TpG, because, like 5mC, thymine has a methyl group attached to the pyrimidine ring carbon C5 (Fig. 1c). The human tumor suppressor protein p53 utilizes a methyl-Arg-Gua triad to interact with the TpG dinucleotide within its recognition sequence TGCC(C/T). Replacing TpG with 5mCpG increases p53 binding affinity, whereas exchange with unmodified CpG lowers p53 binding [58]. This fits a general tendency for TpG to be relatively well recognized by 5mCpG-specific transcription factors. In fact, targeted 5mC-to-T modification by APOBEC proteins [61] might serve during differentiation as a means of “permanently methylating” given nucleotides [62, 63]. Here, we focus on two transcription factor families: basic helix-loop-helix (bHLH) proteins MAX and TCF4 in recognition of 5caC and basic leucine-zipper (bZIP) proteins AP-1 and CEBPβ in recognition of 5mC.

Basic helix-loop-helix (bHLH) transcription factors.

Enhancer-box (E-box) sequences have the consensus motif of 5’-CANNTG-3’ (where N = any nucleotide). The bHLH transcription factors bind to E-boxes and recruit coactivator or corepressor complexes to regulate gene expression. A subset of these E-proteins recognizes sequence motifs having a central CpG dinucleotide (5’-CACGTG-3’); examples include the oncogenic MYC and its binding partner MAX [64]. The central CpG and the two outer CpA dinucleotides (on opposite strands) are conventional DNA methylation sites. MAX binds an unmodified E-box element, and methylation of the central CpG prevents its binding [65], whereas oxidation to 5caC restores its binding to the level of unmodified cytosine [66]. Arg36 of MAX uses a 5caC-Arg-Gua triad to interact with 5caC (Fig. 2a–2b). The sequence conservation between the bHLH domains of MAX and its binding partners, particularly the invariant residue corresponding to Arg36 (Fig. 2c), implies that the ability to bind the two carboxylate groups of symmetrically modified 5caC:G base pairs in the central CpG may also hold for most (if not all) MAX heterodimers.
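
The consensus described above can be turned into a simple sequence scan. The following is a minimal sketch, not a method from the cited studies: the function name `find_eboxes` and the test sequence are hypothetical, and the scan only reports where 5’-CANNTG-3’ occurs and whether the central dinucleotide is CpG (the MYC/MAX-type 5’-CACGTG-3’ site). It says nothing about modification state or actual protein binding.

```python
import re

# E-box consensus from the text: 5'-CANNTG-3' (N = any nucleotide).
# A lookahead is used so that overlapping matches are also reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_eboxes(seq):
    """Return (0-based start, motif, has_central_CpG) for each E-box in seq."""
    seq = seq.upper()
    hits = []
    for match in EBOX.finditer(seq):
        motif = match.group(1)
        # Positions 3-4 of the hexamer form the central dinucleotide.
        hits.append((match.start(), motif, motif[2:4] == "CG"))
    return hits

# Hypothetical sequence with one CACGTG site and one CAGGTG site.
example = "TTCACGTGAACGCAGGTGTT"
for start, motif, central_cpg in find_eboxes(example):
    print(start, motif, central_cpg)
```

Because the E-box is palindromic only at its outer CA/TG positions, a scan of one strand finds sites on both strands; the central NN dinucleotide is reported as read on the scanned strand.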

Fig. 2.

Recognition of E-box sequences by bHLH transcription factors. (a-b) MAX Arg36 recognizes the central 5caCpG using two 5caC-Arg-Gua triads. (c-d) Sequence alignments of basic regions of (c) MAX-like and (d) TCF4-like transcription factors. (e-f) TCF4 uses Arg569 and Arg576 to recognize, respectively, two negatively-charged 5caCpG motifs immediately outside of an E-box.

Like MAX, the bHLH protein TCF4 can heterodimerize with other family members. Unlike MAX, the TCF4 binding sites are enriched with E-box motifs having variable central sequences (CANNTG; N = any nucleotide) [67–69]. However, 15-25% of TCF4 binding sequences have CpG or CpA at locations outside of but adjacent to the E-box [63] (Fig. 2e). The rate of recurrence of Cp(G/A) at these locations is greater than expected (6.25%), suggesting a preference of TCF4 binding for these sites. Indeed, 5caC in a CpG dinucleotide immediately outside of an E-box (5’-CG-CAGGTG-3’) increased TCF4 binding [70]. The acquired interactions include two positively charged residues of TCF4 (Arg569 and Arg576) that interact, respectively, with two negatively charged 5caC bases of fully modified CpG sites (Fig. 2f). The gained electrostatic interactions and hydrogen bonds lead to favorable binding of 5caC DNA.
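
The 6.25% baseline is not derived in the text. One reading, offered here as an assumption, is that it is the chance of finding one specific dinucleotide (such as CpG) at a fixed position when all four bases are equally likely and independent: (1/4)² = 1/16. The sketch below computes that value; the function name and the uniform base frequencies are illustrative choices, and a real genome with skewed base composition or CpG depletion would give a different baseline.

```python
from fractions import Fraction

def expected_dinucleotide_frequency(base_freqs, dinucleotide):
    """Probability of a specific dinucleotide, assuming independent bases."""
    first, second = dinucleotide
    return base_freqs[first] * base_freqs[second]

# Assumption: uniform base composition (each base at 25%).
uniform = {base: Fraction(1, 4) for base in "ACGT"}

p_cg = expected_dinucleotide_frequency(uniform, "CG")
print(p_cg, float(p_cg) * 100)  # 1/16 6.25
```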

Interestingly, Arg569 and Arg576 of TCF4, particularly Arg576, are conserved in the TCF4-related subfamily members, but not in the MAX-related subfamily (Fig. 2d). The corresponding residues in MAX are a lysine (which would at least preserve appropriate charge-charge interactions) and a leucine. Similarly, the residue in TCF4 corresponding to MAX Arg36 is a hydrophobic valine. Thus, two different bHLH proteins utilize unique positively charged arginine residues for interaction with 5caC residing at different locations of E-box sequences, either the central CpG or the CpG immediately next to the E-box. It is interesting that, among the TCF4-like transcription factors, the residue interacting with the cytosine immediately adjacent to the E-box (Arg569) is less conserved than the residue interacting with the cytosine one base farther away (Arg576; Fig. 2d–2e), which might allow the protein to recognize the modified CpA/TpG site at the corresponding positions. At least one of these proteins (TWIST1, having Gln in place of Arg576) does still recognize the bases flanking the E-box [71]; however, the effects of Arg569 substitution on the binding response to DNA modifications have not, to our knowledge, been determined for TWIST1 or any other TCF4-like proteins.

Basic leucine-zipper (bZIP) transcription factors.

Activator protein 1 (AP-1) is a classic bZIP family of DNA binding proteins involved in gene regulation and control of cell proliferation, oncogenesis, and apoptosis [72, 73]. Like TCF4 and MAX, AP-1 is a dimeric complex involving homodimers and heterodimers of the Fos, Jun, MAF (musculoaponeurotic fibrosarcoma), and ATF (activating transcription factor) proteins [73]. AP-1 (Fos or Jun) binds a 7-bp sequence of TPA-response element (TRE: 5’-TGAGTCA-3’) or a methylated response element (meTRE: 5’-MGAGTCA-3’ where M = 5mC) with a 5’ TpG dinucleotide being substituted by a methylated CpG [74, 75]. Both TRE and meTRE motifs contain two methyl groups (from the 5’ end) at nucleotide positions 1 and 5, generating four methyl groups symmetrically located at base pair positions 1, 3, 5, and 7 (Fig. 3a). The conserved di-alanine in the AP-1 dimer makes van der Waals contact with these methyl groups (Ala265 and Ala266 in Jun; Fig. 3b) [62].
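
The positions of the four methyl groups follow directly from the two sequences once both strands are considered. The sketch below is an illustration under stated assumptions: `methyl_positions` is a hypothetical helper, M denotes 5mC as in the text, and the bottom strand is inferred purely by Watson-Crick complementarity, so it contributes a methyl group only where it carries a thymine (opposite a top-strand adenine).

```python
# Bases carrying a methyl group at ring carbon C5: thymine (T) and 5mC (M).
METHYL_BASES = {"T", "M"}

# Watson-Crick partners; M (5mC) pairs with G like unmodified C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "M": "G"}

def methyl_positions(top_strand):
    """Return sorted 1-based base-pair positions with a C5 methyl on either strand.

    top_strand is written 5'->3'. The partner base at each position is taken
    as the plain complement, so the bottom strand is assumed unmodified.
    """
    positions = set()
    for i, base in enumerate(top_strand, start=1):
        if base in METHYL_BASES:
            positions.add(i)           # methyl on the top strand
        if COMPLEMENT[base] == "T":
            positions.add(i)           # thymine methyl on the bottom strand
    return sorted(positions)

print(methyl_positions("TGAGTCA"))  # TRE
print(methyl_positions("MGAGTCA"))  # meTRE
```

Both elements yield positions 1, 3, 5, and 7: positions 1 and 5 come from the top strand, and positions 3 and 7 from thymines on the complementary strand.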

Fig. 3.

Methyl-dependent and spatially-specific DNA recognition by bZIP transcription factors. (a) Aligned DNA response elements with spatially-equivalent methyl groups from T or M (5mC). (b-c) Recognition of the four spatially-constrained methyl groups of meTRE by the AA dipeptide of a Jun dimer (b), or of meZRE by the AS dipeptide of a Zta dimer (c). (d) Recognition of the six spatially-constrained methyl groups by the AV dipeptide of a C/EBPβ dimer. (e) The V285A mutant of C/EBPβ permits Arg289 to interact with T2-Arg-G3 triad. (f) Sequence alignments of basic regions AA dipeptide (Jun-like) or AV dipeptide (C/EBPβ-like).

A bZIP transcription factor (Zta) of Epstein–Barr Virus, a human-specific B cell-infecting gamma-herpesvirus [76, 77], provided the first example of a sequence-specific transcription factor that reverses epigenetic silencing and activates gene transcription by preferentially binding methylated cytosine bases within a specific sequence [78]. The unmethylated virion genome becomes heavily methylated during the latent stage of the virus cycle [79–81]. The Zta homodimer preferentially recognizes methylated promoter elements (known as meZRE), examples of which are 5’-TGAGMCA-3’ and 5’-TGAGMGA-3’, and activates the early lytic cycle [78, 82–84]. Like the AP-1 binding site, the meZRE elements contain two methyl groups at nucleotide positions 1 and 5 (Fig. 3a), with a methylated cytosine replacing one of the inner thymine residues of the AP-1 element, generating the four spatially equivalent methyl groups. As in the case of AP-1, the corresponding Zta residues alanine 185 and serine 186 make van der Waals contact with these methyl groups (Fig. 3c) [62]. Serine at that position (Ser186) is unique to Zta among proteins in this family. The side chain hydroxyl oxygen atom of Ser186 contacts the two meZRE half sites differently, allowing the Zta dimer to bind asymmetric sequences, whereas the corresponding Ala266 of AP-1 Jun protein has no such flexibility.

Besides binding the 7-bp TRE motif (5’-TGA-G-TCA-3’), Fos or Jun proteins form heterodimers with ATF to interact with the 8-bp sequence of the cAMP-response element (CRE: 5’-TGA-CG-TCA-3’) [73]. In addition, CRE-binding proteins (CREB) also recognize the CRE elements [85, 86]. The difference between TRE and CRE elements is a one base-pair expansion next to the central C:G base pair to generate a CpG site (Fig. 3a). The extended central CpG potentially permits the CpG modification to have a modulating role (e.g., see MAX above). This was tested using an artificial reporter construct, in which the promoter included one CRE element incorporating modifications of the central cytosine in a hemi-methylated CpG (5’-TGA-XG-TCA-3’/3’-ACT-GM-AGT-5’, where M = 5mC) [87]. The results showed that modification of the top strand (X = 5mC) and its oxidized forms (X = 5hmC, 5fC and 5caC) all mildly impaired CREB binding and decreased gene expression [87]. The fact that all of these modification states inhibited binding suggests that steric effects, rather than charge or polar interactions, are dominant in this case.

Examination of chromatin immunoprecipitation sequencing data (by the ENCODE consortium [88]) from RNA polymerase II associated transcription factors revealed that many bZIP transcription factors (and TCF4-like bHLH proteins) have variable binding sites corresponding to the central CpG or reduced to a single G:C base pair, but maintain a conserved outer CpA/TpG sequence [89]. This observation suggests an essential modification-dependent CpA recognition on the part of bZIP transcription factors.

Another group of bZIP proteins, the C/EBP (CCAAT/enhancer binding proteins)-related transcription factors, have a consensus binding sequence of 5’-TTG-CG-CAA-3’, with a central CpG dinucleotide flanked by two CpA sites. Methylation of the Cp(G/A) sites produces a DNA motif with every pyrimidine having a methyl group at the ring carbon C5 position of thymine or 5mC. However, the modifications at CpA and CpG sites do not contribute equally to the binding affinity, measured with C/EBPβ, with the CpA modifications having a dominant effect [89]. Significantly, as with meTRE and meZRE elements, CpA methylation would restore the four spatially equivalent methyl groups at base-pair positions 1, 3, 5 and 7 (Fig. 3a). Further, the C/EBP binding sequence has additional methyl groups at base-pair positions 2 and 6 (T:A). In the corresponding Ala-Ala position, the C/EBP family has a unique alanine-valine dipeptide (Fig. 3f). The larger side chain of valine allows C/EBPβ V285 to interact with both methyl groups at positions 2 and 3 (Fig. 3d). When valine 285 was mutated to alanine (V285A) to mimic the conserved Ala-Ala dipeptide found in many members of the bZIP family (Fig. 3f), the variant unexpectedly showed a binding preference for CpA methylation over the unmodified sequence by a factor of 90. The smaller side chain of the V285A variant allows arginine 289 to interact with the TpG dinucleotide at bp positions 2 and 3 [89] (Fig. 3e). This T2-Arg-G3 triad for the C/EBP recognition sequence is unique among the DNA elements shown in Fig. 3a.

Recognition of DNA modifications by polymerases

The impact of cytosine modification of the DNA template has been examined for both RNA polymerase and DNA polymerase [90, 91]. Human DNA polymerase β is not detectably affected by 5mC, 5hmC, or 5fC in the DNA template, though the enzymatic efficiency of dGTP incorporation opposite 5caC in the template position is decreased ~20-fold relative to unmodified cytosine [91]. A positively charged lysine 280 and polar residue asparagine 37 in the polymerase active site directly interact with the carboxylate moiety of 5caC (Fig. 4a). The lysine-5caC interaction is similar to that observed for the DNA binding CXXC domain of mouse Tet3 (Fig. 4b) [92, 93], the enzyme whose dioxygenase activity generates 5caC modification. In contrast, both 5caC and 5fC in the DNA template slow down transcriptional elongation by RNA polymerase II (Pol II), which uses a glutamine to interact with the carboxylate group of 5caC (Fig. 4c) [90]. The polar but uncharged glutamine interaction with 5caC is similar to that observed for Wilms tumor protein WT1 (Fig. 4d) [94]. These observations suggest that sequence-specific transcription factors, Tet dioxygenase itself, and polymerases function as epigenetic sensors for DNA modifications.

Fig. 4.

Recognition of 5caC by (a) DNA polymerase β, (b) Tet3, (c) RNA polymerase II, and (d) Wilms tumor protein WT1.

The altered interactions between 5caC and DNA/RNA polymerases might be related to the mutagenic potential of 5caC and 5fC [95, 96]. Both 5fC and 5caC display an intra-base hydrogen bonding interaction between the exocyclic nitrogen atom at N4 and their formyl or carboxyl oxygen atoms, respectively. This intra-base hydrogen bonding is seen both in the protein-bound state [94] and in the protein-free state [97], and may alter the hydrogen bonding strength of the C:G base pair [97–99]. It is worth noting that the DNA repair enzyme TDG, named for its thymine DNA glycosylase activity, cleaves 5caC and 5fC when either is paired with a guanine, but not 5hmC or 5mC [100]. The tendency for mismatches to appear opposite 5caC and 5fC (but not 5hmC or 5mC) is possibly associated with different excision rates by TDG, though TDG has a much faster rate on G:U mismatches [101, 102].

A potential link between 5hmC and DNA repair involves a protein named for its activity as an embryonic stem cell-specific 5hmC binding protein (HMCES). It was initially discovered as a 5hmC binder in a proteomics experiment, using double-stranded oligonucleotides of (GAT)2•(GAC)4•GAT sequence as a bait to pull down proteins from embryonic cell lysates [103]. HMCES represents the mammalian ortholog of the SOS response-assisted peptidase (SRAP) family, which exists in organisms from bacteria and yeast to human [104]. The HMCES SRAP domain was also reported to incise 5hmC-containing duplexes as an autopeptidase-coupled nuclease [105]. However, the SRAP-associated nuclease activity was not observed in a recent independent study [106]. Instead, the SRAP domain proteins of both E. coli and human were seen to form covalent bonds to abasic sites in single-strand DNA [106–108]. We note that a 5hmC DNA glycosylase activity in mammalian tissue was reported over 30 years ago [109]. If such glycosylase activity exists, it could excise a 5hmC base and generate an abasic site for subsequent binding by HMCES.

To detect genomic 5fC, in vitro chemical labeling with malononitrile or 1,3-indanedione has been used to generate two unnatural cytosine bases [110]. The resulting 5fC adducts are recognized by DNA polymerases as thymine instead of cytosine, causing C-to-T mutations during DNA replication [111].

N6-methyladenine in mammalian DNA

As described above, N6-methyladenine (N6mA) has recently been observed in mammalian DNA, and its functions remain elusive. Unlike cytosine modification at the ring carbon atom C5, N6mA is on the exocyclic amino group that is involved in Watson-Crick base pairing hydrogen bonds (Fig. 1a). Indeed, incorporation of N6mA into a DNA template caused RNA polymerase II (Pol II) pausing due to lower stability and slower kinetics of base pairing, and the stability of the UTP base pairing with N6mA is compromised [112] (Fig. 5a). The site-specific pausing at N6mA in the kinetics of incorporation of thymine into the newly synthesized DNA strand by a modified phage polymerase underlies the ability of SMRT sequencing to detect methylated adenines [113]. Not surprisingly, like TDG, human and mouse homologs of E. coli repair enzyme AlkB (Alkbh1 and Alkbh4) are sequence-independent DNA N6mA demethylases [36, 37, 40].

Fig. 5.

Recognition of N6mA by (a) RNA polymerase II, and (b-d) YTH domains.

Currently, it is unknown how N6mA DNA methyl marks in mammals are generated, read or altered. In addition to its known activity of glutamine methylation of eukaryotic release factor eRF1 [114–116], mammalian HemK2 has been reported to be a DNA adenine-N6 MTase [38] (renamed as N6AMT1), as well as a histone H4 lysine-12 MTase [117] (renamed as KMT9). We recently analyzed the in vitro activities of HemK2/KMT9/N6AMT1, in complex with Trm112 (named after tRNA methylation protein), on the three substrates side-by-side and found that human HemK2-Trm112 is only active on protein glutamine and lysine, but not active on DNA [118].

The other potential candidate N6mA DNA MTases include members of a functionally diverse MT-A70 family of SAM-dependent MTases [119]. A DNA adenine MTase complex in ciliates (a group of protozoans) consists of two MT-A70 proteins and methylates double-strand DNA [120]. (Both 5-methylcytosine and N6-methyladenine are found in unicellular eukaryotes [121].) Mouse Mettl4 was recently reported to be responsible for N6-methyladenine deposition in genic elements, corresponding with transcriptional silencing [37] (though the in vitro enzymatic activity of Mettl4 was not reported). In addition, the human Mettl3-Mettl14 heterodimer possesses methyl transfer activity on adenine of single-stranded mRNA ([122] and references therein), whereas human Mettl5-Trm112 might be responsible for 18S rRNA adenine methylation [123]. We emphasize that many nucleic acid modifying enzymes are able to act on both DNA and RNA, including members of the AlkB family that directly dealkylate DNA and RNA [124], and family members of the Apobec cytidine deaminases [125]. Tet2, one of the ten-eleven translocation proteins initially discovered as DNA 5mC dioxygenases [48, 126], mediates oxidation of 5mC in mRNA [127, 128]. We expect the characterization of potential mammalian DNA N6mA MTase(s) to be an active research area for the coming years.

Regarding N6mA-specific binding, the better-characterized recognition of N6mA in RNA provides a possible readout mechanism. The N6mA-specific RNA-binding YTH domain, a conserved 100–150-residue polypeptide [129], binds the N6mA methyl group in an aromatic ‘cage’ [130–134] (Fig. 5b). In addition, YTH-ssRNA complexes could form dimers through RNA:RNA base stacking [134] (Fig. 5c) or base pairing (Fig. 5d). This unusual dimer assembly via the nucleic acids could have relevance for RNA-protein structures in general.

Conclusions and Future Directions

In the era of epigenomics, many of the challenges of interpreting DNA methylation states of both cytosines and adenines themselves, in addition to effects on transcription factor binding or polymerase activity, have been summarized [135, 136]. One of the greatest such challenges is deconvoluting tissue-specific and disease-specific methylation patterns, and varied degrees of methylation even within a specific tissue. In addition, new and more sensitive detection techniques, genomic sample preparation, and potential contamination from methylated bacterial DNA can all introduce inaccuracies in the measurement of rare noncanonical DNA modifications in metazoan genomes [32, 137]. These challenges are further complicated by subtle differences intrinsic to transcription factors themselves in their interactions with DNA modifications within a particular sequence [59, 138]. Many transcription factors bind to DNA with high affinity while recognizing sequences with high variability within sequence motifs [139, 140]. This property can be traced to the ability of specific protein residues to adopt alternative conformations to establish versatile hydrogen bonds with some DNA bases, but not with others [139, 140].

We can add to the above the effects of modification states. A motif such as an E-box sequence, with four methylatable cytosines on two different strands, could, counting Tet effects, have 20 site-level states (each of C, 5mC, 5hmC, 5fC, and 5caC at each of the four positions), or 5^4 = 625 states if all combinations across the four positions are counted. On top of that, there can be flanking methylatable sites, each one adding fivefold diversity in possible states. Finally, different transcription factor heteromers can respond to many of these states in distinct ways. Such a diversity of states complicates the task of scientists in understanding the functional consequences of protein interactions with modified DNA elements, but such complexity is likely to be necessary for the complex processes of differentiation and homeostasis in mammals, and other organisms as well.
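
A short enumeration makes the counting explicit. Taking the five cytosine states one site at a time gives 5 × 4 = 20 site-level possibilities, while joint combinations over four sites number 5^4 = 625, and each flanking site multiplies that joint count by five. The sketch below is only an illustration of this arithmetic: the function name `count_states` is hypothetical, and the calculation assumes that every site can adopt every state independently, which may not hold in vivo.

```python
from itertools import product

# The five cytosine states named in the text.
STATES = ["C", "5mC", "5hmC", "5fC", "5caC"]

def count_states(n_sites):
    """Count joint modification states by explicit enumeration."""
    return sum(1 for _ in product(STATES, repeat=n_sites))

site_level = len(STATES) * 4   # five states at each of four sites, counted singly
core = count_states(4)         # joint combinations over the four E-box cytosines
with_flank = count_states(5)   # one additional flanking methylatable site

print(site_level, core, with_flank)  # 20 625 3125
```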

Research Highlights:

  • Basic leucine-zipper (bZIP) proteins, exemplified by AP-1 and CEBPβ recognition of 5mC.

  • Basic helix-loop-helix (bHLH) proteins, exemplified by MAX and TCF4 recognition of 5caC.

  • 5caC and N6mA of template strand DNA modification have impact on the activities of DNA and RNA polymerases.

  • Modifications that alter base pairing strength can affect interaction with DNA repair enzymes.

  • DNA elements with multiple methylation sites can have a wide range of modification states.

Acknowledgements –

The National Institutes of Health (grant GM049245) and the Cancer Prevention and Research Institute of Texas (RR160029) supported the work in the Cheng Laboratory. We thank the members of the Cheng laboratory for discussion. X.C. is a CPRIT Scholar in Cancer Research.

Footnotes

Conflict of interest statement - The authors declare no conflict of interest.

