Author manuscript; available in PMC: 2018 Feb 1.
Published in final edited form as: Nat Chem Biol. 2016 Dec 5;13(2):181–187. doi: 10.1038/nchembio.2250

Mutations along a TET2 active site scaffold stall oxidation at 5-hydroxymethylcytosine

Monica Yun Liu 1,2, Hedieh Torabifard 3, Daniel J Crawford 1,2, Jamie E DeNizio 1,2, Xing-Jun Cao 2, Benjamin A Garcia 2, G Andrés Cisneros 3, Rahul M Kohli 1,2
PMCID: PMC5370579  NIHMSID: NIHMS854104  PMID: 27918559

Abstract

Ten-eleven translocation (TET) enzymes catalyze stepwise oxidation of 5-methylcytosine (mC) to yield 5-hydroxymethylcytosine (hmC) and the rarer bases 5-formylcytosine (fC) and 5-carboxylcytosine (caC). Stepwise oxidation obscures how each individual base forms and functions in epigenetic regulation and prompts the question of whether TET enzymes primarily serve to generate hmC, or whether they are adapted to produce fC and caC as well. By mutating a single, conserved active site residue in human TET2, Thr1372, we uncovered enzyme variants that permit oxidation to hmC but largely eliminate fC/caC. Biochemical analyses, combined with molecular dynamics simulations, elucidated an active site scaffold that is required for WT stepwise oxidation and that, when perturbed, explains the mutants’ hmC-stalling phenotype. Our results suggest that the TET2 active site is shaped to enable higher-order oxidation and provide the first TET variants that could be used to probe the biological functions of hmC separately from fC and caC.



The discovery of ten-eleven translocation (TET) enzymes transformed the known repertoire of epigenetic DNA modifications1. TET enzymes catalyze the oxidation of 5-methylcytosine (mC), the mainstay of the epigenome, into three additional bases: 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxylcytosine (caC)1–6. Mounting evidence suggests that these oxidized mC (ox-mC) bases stably populate mammalian genomes, aid in DNA demethylation, and potentially encode unique epigenetic information7–11. The central questions now facing the field involve the functions of each individual base and the mechanisms governing their formation.

The overall catalytic mechanism of TET enzymes (TET1–3 in mammals) has been largely inferred from related proteins in the Fe(II)/α-ketoglutarate (α-KG)-dependent family of dioxygenases, such as AlkB12. Enzymes in this family couple decarboxylation of α-KG with substrate oxidation via a transient Fe(IV)-oxo intermediate, with succinate and CO2 as byproducts. TET enzymes apply this general mechanism to not one but three stepwise reactions, raising the question of whether these enzymes are specialized for one particular step of oxidation, or for three-step oxidation as a whole. Moreover, stepwise oxidation obscures the function of individual ox-mC’s, creating a need to break the linkage between steps in order to study each base in isolation.

The first step of oxidation, conversion of mC to hmC, has so far drawn the most attention, as it best explains the physiological levels of cytosine modifications: in the human genome, mC accounts for approximately 0.6–1% of all bases, hmC is typically 1–5% of mC, and fC and caC are at least 1–2 orders of magnitude rarer than hmC10. Consistent with these observations, biochemical studies have shown that mC substrate is preferred over hmC and fC, with 2- to 5-fold differences in KM and kcat reported for human TET2 (ref. 13). Crystal structures did not reveal substrate-specific interactions that could explain these differences13,14, but computational modeling suggested that hydrogen abstraction is more efficient on mC than on hmC and fC, which adopt unfavorable conformations13,15. Together, these studies portray TET enzymes as predominantly serving to generate hmC; in this case, decreased capacity for further oxidation would help to maintain stable levels of hmC for epigenetic functions. Indeed, most functional studies on ox-mC bases have focused on hmC in health and disease, with fC/caC considered as fairly negligible.
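As a rough check on these abundances, the sketch below multiplies the quoted ranges to express hmC, fC, and caC as fractions of all bases. It uses only the ranges stated above. The helper name and the reading of "1–2 orders of magnitude rarer" as 10-fold and 100-fold bounds are illustrative assumptions, not values from the cited studies.

```python
# Ranges quoted in the text, written as fractions rather than percent.
MC_OF_ALL_BASES = (0.006, 0.01)  # mC is ~0.6-1% of all bases
HMC_OF_MC = (0.01, 0.05)         # hmC is typically 1-5% of mC


def multiply_ranges(a, b):
    """Multiply two (low, high) ranges of non-negative fractions."""
    return (a[0] * b[0], a[1] * b[1])


# hmC as a fraction of all bases.
hmc_of_all = multiply_ranges(MC_OF_ALL_BASES, HMC_OF_MC)

# "At least 1-2 orders of magnitude rarer than hmC": treat 10-fold and
# 100-fold as illustrative upper bounds for fC/caC (an assumption).
fc_cac_upper_10x = hmc_of_all[1] / 10
fc_cac_upper_100x = hmc_of_all[1] / 100

print(f"hmC per base: {hmc_of_all[0]:.1e} to {hmc_of_all[1]:.1e}")
print(f"fC/caC per base (10x rarer): at most {fc_cac_upper_10x:.1e}")
print(f"fC/caC per base (100x rarer): at most {fc_cac_upper_100x:.1e}")
```

On these numbers hmC would occupy on the order of one base in several thousand to tens of thousands, and fC/caC would be rarer still, which is the scale the detection methods described later have to reach.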

However, this view does not explain why fC and caC are present at all, and it contrasts with evidence for the importance of higher-order oxidation. Most notably, fC and caC, but not hmC, are substrates for base excision by thymine DNA glycosylase (TDG); the resulting abasic site can be repaired to regenerate unmodified cytosine5,16,17. This is the leading candidate pathway for active DNA demethylation7. Apart from being intermediates in demethylation, fC/caC potentially also function as stable epigenetic marks. Genomic sequencing has mapped fC/caC to gene regulatory regions separate from hmC10, and proteomic analysis has described distinct “reader” proteins for each ox-mC base18,19. Furthermore, mouse Tet2 is capable of iterative oxidation: it can catalyze multiple rounds of oxidation upon a single encounter with mC-containing DNA, without releasing the hmC-containing DNA strand20. Although the prevalence of genomic hmC implies that most encounters are not iterative, this mechanism could allow TET enzymes to generate fC and caC marks without first accumulating hmC. Together, these studies encourage the alternate view that TET enzymes are specialized for making not only hmC but fC and caC as well—even that conversion of hmC to fC could be the key “committed” step to DNA demethylation.

To resolve these competing views of TET function, one question comes to the fore: whether TET enzymes are adapted to facilitate higher-order oxidation. The mC-to-hmC step is most favored, but if fC and caC serve important functions, mechanisms should be in place to permit their formation. These mechanisms remain largely unknown. They could be extrinsic to TET; for example, other proteins could recruit TET enzymes or regulate their activity. However, intrinsic features, especially structure-function support for higher-order oxidation, would suggest an enzyme specifically shaped to generate not one but three epigenetic bases.

We examined the active site of human TET2 for potential structure-function determinants of stepwise oxidation. In the crystal structures of TET2 bound to DNA, the enzyme is truncated to the minimal regions necessary for catalytic activity (hTET2-CS, residues 1129–1936 Δ1481–1843) (Fig. 1a)13,14. The target nucleobase is everted out of the DNA duplex and occupies a tunnel-like space in the active site, with the 5-modified group pointing toward the α-KG analogue and Fe(II) (Fig. 1b). Although the residues that form this tunnel have no obvious interaction with the 5-modified groups13,14, we hypothesized that they could impact the progress of stepwise oxidation by hydrogen bonding or steric interactions. We therefore targeted two conserved residues located close to the 5-methyl group (Fig. 1a,b). By substituting all 20 amino acids at these positions, notably Thr1372, we uncovered a relationship between the side chain properties and stepwise oxidation activity, including variants that stall oxidation at hmC, with little to no fC/caC formed. Molecular dynamics simulations, coupled with biochemical analyses, revealed that a conserved Thr1372-Tyr1902 active site scaffold is required for efficient fC/caC formation, providing the first evidence that wild-type TET2 is specifically shaped to enable higher-order oxidation. We further show that mutations along this core scaffold can reconfigure active site interactions to stall oxidation at hmC, which opens opportunities to test the importance of hmC versus fC/caC in biological and pathological systems.

Figure 1.


Thr1372 and Val1900 were targeted for their potential role in TET2-catalyzed cytosine oxidation. (a) Schematic of the hTET2-CS construct (drawn to scale, adapted from ref. 14). The two Cys-rich domains are shown in pink and purple, and the double-stranded β-helix (DSBH) domain is in green; residues are numbered as in the complete hTET2 protein. Both Thr1372 and Val1900 are conserved across mouse and human TET proteins. (b) Structure of the hTET2-CS active site (PDB 4NM6) highlighting the targets for mutagenesis, Thr1372 and Val1900. The mC base flips into the active site pocket, pointing toward Fe(II) and the α-KG analogue N-oxalylglycine. Shown are the nearest distances between the residues and the 5-methyl carbon.

RESULTS

Saturation mutagenesis at Thr1372

We interrogated the active site of human TET2 by performing saturation mutagenesis, which can comprehensively capture structure-function relationships at a particular residue. Using the hTET2-CS construct, we generated plasmids encoding all 20 natural amino acids at either the Thr1372 or Val1900 positions. The plasmids were transiently transfected into HEK293T cells, and genomic DNA (gDNA) was purified from the cells after 48 hr. Using dot blotting to assess the qualitative pattern of genomic cytosine modifications, we found that the Val1900 position is fairly tolerant to mutation, with a variety of mutants showing WT-like stepwise oxidation or reduced overall activity, while bulky and charged residues largely inactivate the enzyme (Supplementary Results, Supplementary Fig. 1a).

We focused our attention on the Thr1372 mutants. TET2 overexpression was confirmed to be uniform by Western blot of cell lysates, with only T1372P having slightly reduced expression (Supplementary Fig. 1b). Dot blotting showed that, more so than for Val1900, mutations at Thr1372 produced distinctive patterns of cytosine oxidation, which cluster based on the biochemical properties of the side chain (Fig. 2a). Replacing Thr1372 with a proline, positively charged (H, K, R), or bulkier hydrophobic residue (I, F, L, M, W, Y) renders TET2 inactive. Only the T1372S mutant, which preserves the side chain hydroxyl group, exhibits WT-like activity. Smaller residues (A, C, G) are proficient at oxidation to fC and caC, but at reduced levels compared to WT. Most remarkably, the acidic or related polar residues (D, E, N, Q) and the nearly isosteric valine permit WT-like formation of hmC but no fC or caC, as detected by dot blot. Given this stalling of oxidation at hmC, Thr1372 appeared to play a unique role in stepwise oxidation.
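The grouping of dot-blot phenotypes by side chain can be written down as a small lookup, shown here as a minimal sketch. The class labels and the function name are our own shorthand for the qualitative patterns described above; they are not terms or code from the study.

```python
WT_RESIDUE = "T"
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# Qualitative dot-blot classes for Thr1372 substitutions, as described in the
# text. The labels are shorthand chosen for this sketch.
DOT_BLOT_CLASSES = {
    "inactive": set("P") | set("HKR") | set("IFLMWY"),
    "WT-like": set("S"),
    "reduced fC/caC": set("ACG"),
    "hmC only": set("DENQ") | set("V"),
}


def classify(residue):
    """Return the dot-blot class reported for a residue at position 1372."""
    if residue == WT_RESIDUE:
        return "WT"
    for label, members in DOT_BLOT_CLASSES.items():
        if residue in members:
            return label
    raise ValueError(f"unknown residue: {residue!r}")


# Sanity check: the four classes cover all 19 substitutions exactly once.
covered = [r for members in DOT_BLOT_CLASSES.values() for r in members]
assert sorted(covered) == sorted(AMINO_ACIDS - {WT_RESIDUE})

for aa in "SAEVI":
    print(f"T1372{aa}: {classify(aa)}")
```

The coverage check confirms that the four described groups account for every one of the 19 non-threonine residues, so no substitution is left unclassified by the dot-blot description.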

Figure 2.


Screen for mutant activity. (a) Dot blots for mC, hmC, fC, and caC in 400 ng of genomic DNA isolated from transfected HEK293T cells. DNA from cells transfected with WT hTET2-CS or empty vector was spotted first, followed by the Thr1372 mutants in alphabetical order (uncropped image in Supplementary Fig. 10a). Further analysis of mutant phenotypes focused on variants that were capable of oxidation at least to hmC. (b) Genomic levels of mC, hmC, fC, and caC modifications produced by catalytically active Thr1372 mutants, quantified by LC-MS/MS as the percent of total C modifications. Mutants are approximately presented in decreasing order of activity, from WT-like T1372S, to A/C/G that form highly oxidized bases at reduced levels, to E/Q/N/D/V that largely stall at hmC. Shown are the mean and s.d. from independent experiments (WT n = 7, vec n = 6, mutants n = 3, T1372I n = 2).

Nucleoside LC-MS/MS quantifies range of mutant activity

We quantified the cellular activity of all Thr1372 mutants capable of oxidizing at least to hmC. The gDNA was degraded to component nucleosides and analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS) (Supplementary Fig. 2). In 0.1 μg of HEK293T gDNA, limits of detection in the low femtomole range enabled reliable quantification of 1 in 10³–10⁴ of all cytosines. While the total modified cytosine bases (mC + ox-mCs) were similar across all conditions, the distribution of specific modifications differed significantly. In vector-transfected cells, ox-mC products are minimal: 1.6 ± 1.0% of total cytosine modifications are hmC, with no fC or caC detected (Fig. 2b). Cells overexpressing WT hTET2-CS contain 15.2 ± 2.8% hmC, 6.0 ± 1.9% fC, and 5.7 ± 1.8% caC, demonstrating robust TET-dependent oxidation at a genomic level.

The mutants exhibit a gradient of activity reflected in the fraction of genomic ox-mC bases (Fig. 2b). T1372S is the only mutant with WT-like levels of fC and caC, and hmC levels slightly higher than WT. T1372A/C/G mutants generate WT-like levels of hmC but only one-third to one-half as much fC and barely detectable caC. Further down the activity gradient, the E/Q/N/D/V mutants produce hmC at levels at least half that of WT, but fC and caC are near or below detection limits, consistent with the dot blotting results. Among this group, T1372E appears to have the highest activity with WT-like hmC levels and <1% fC, while T1372V is lowest, generating half as much hmC but no fC. Finally, the slightly bulkier T1372I mutant resembles the vector control, underscoring the steric constraints at this position. Thus, the LC-MS/MS results more clearly elucidated the patterns seen on dot blot, showing a spectrum of activity among the Thr1372 mutants correlating with the side chain properties, with E/Q/N/D/V mutants stalling oxidation at hmC.

Computational modeling reveals Thr1372–Tyr1902 scaffold

To probe potential mechanisms behind the mutants’ effects, we turned to classical molecular dynamics (MD) simulations of all the active Thr1372 variants. We drew from our experience with AlkB21,22 to model WT hTET2-CS and the Thr1372 mutants bound to each of the four cytosine derivatives (see Supplementary Figs. 5–9, Supplementary Tables 2–8 for details). Our simulations were based on the crystal structure of TET2 in complex with DNA containing mC (PDB 4NM6)14, using α-KG and an Fe(II) surrogate (Mg(II)). Our WT models with hmC and fC proved mostly consistent with the more recently published structures of TET2 with these bases13; we observe all the key interactions between the enzyme, α-KG, active site metal ion, and DNA substrate for varying durations across our simulations. Furthermore, energy decomposition analysis (EDA) and the root-mean-square deviation (RMSD) comparing the simulations to the reference crystal structure show that the cytosine bases stably occupy the active site across time in all our models.

The hmC models in particular revealed distinct patterns of active site interactions in WT, A/C/G, and E/Q/N/D/V mutants, consistent with hmC being the fulcrum of the observed stalling effect. These patterns helped us to define a key structural scaffold in the WT enzyme that is required for efficient stepwise oxidation. This WT active site scaffold consists of a Thr1372–Tyr1902 hydrogen bond that critically supports optimal non-bonded interactions between Tyr1902 and the substrate cytosine base (Fig. 3a). The Thr1372–Tyr1902 hydrogen bond is observed in 65% of the simulation time (average over five runs of 50 ns each), and the total non-bonded interaction energy between these residues is −3.37 kcal/mol (Fig. 3b). Tyr1902, thus oriented by Thr1372, shows significant non-bonded interaction with the hmC base (−6.10 kcal/mol). This core scaffold is present across all WT models bound to mC/hmC/fC/caC and remains fully intact in the T1372S mutant, consistent with this mutant’s WT-like activity in cells.
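To make the "percentage of simulation time" metric concrete, the following sketch computes a hydrogen bond occupancy per run from per-frame geometry and then averages over runs. The distance and angle cutoffs (3.5 Å and 120°) are common geometric defaults assumed here for illustration; the criteria actually used in the simulations are not stated in this section, and the input numbers are made-up toy data.

```python
def hbond_occupancy(distances, angles, max_dist=3.5, min_angle=120.0):
    """Percent of frames in which a donor-acceptor pair counts as H-bonded.

    distances: donor-acceptor distance per frame (angstroms)
    angles: donor-H-acceptor angle per frame (degrees)
    The cutoffs are assumed defaults, NOT values taken from the paper.
    """
    if len(distances) != len(angles) or not distances:
        raise ValueError("need equal-length, non-empty per-frame series")
    bonded = sum(
        1 for d, a in zip(distances, angles) if d <= max_dist and a >= min_angle
    )
    return 100.0 * bonded / len(distances)


def mean_occupancy(runs):
    """Average the per-run occupancy over independent simulation runs."""
    per_run = [hbond_occupancy(d, a) for d, a in runs]
    return sum(per_run) / len(per_run)


# Toy data for two short "runs" (hypothetical numbers, four frames each).
run1 = ([2.8, 3.0, 4.1, 2.9], [160.0, 150.0, 170.0, 100.0])
run2 = ([3.1, 3.6, 2.7, 3.0], [155.0, 165.0, 140.0, 130.0])

print(f"run 1: {hbond_occupancy(*run1):.1f}%")
print(f"run 2: {hbond_occupancy(*run2):.1f}%")
print(f"mean:  {mean_occupancy([run1, run2]):.1f}%")
```

Averaging per-run occupancies, rather than pooling all frames, weights each independent run equally, which matches the "average over runs" wording used for the reported values.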

Figure 3.


Molecular dynamics modeling reveals a critical Thr1372-Tyr1902 scaffold that is disrupted in the low-efficiency and hmC-dominant mutants. (a) Selected snapshots from MD simulations highlighting key active site components and hydrogen bonds. In WT enzyme (and T1372S), Thr1372 forms a hydrogen bond (black arrow) with Tyr1902, which orients Tyr1902 for optimal non-bonded interactions with the substrate. Low-efficiency mutants such as T1372A disrupt this scaffold, while hmC-dominant mutants such as T1372E/V not only disrupt the scaffold but also elicit new hydrogen bonds (red arrows) with the 5-hydroxymethyl group of hmC. (b) Simplified scheme of interactions between key residues and hmC, as determined by MD. Hydrogen bonds (dashed lines) are quantified as percentage of simulation time observed. The values are an average over 2–5 simulation runs of 50 ns each (see Methods). Non-bonded interactions are indicated in gray and total energies of interaction are given in kcal/mol. (Additional modeling data in Supplementary Figs. 5–9 and Supplementary Tables 2–8.)

All the other mutants eliminate the Thr1372–Tyr1902 hydrogen bond, perturbing the interaction between Y1902 and the substrate base, with a corresponding loss of enzymatic activity. For the A/C/G mutants, loss of the Thr1372–Tyr1902 scaffold appears to weaken interactions between misaligned active site components, as exemplified by T1372A (Fig. 3a,b). Combined with the gDNA results, we term the A/C/G phenotype “low-efficiency,” since these mutants permit higher-order oxidation but at reduced levels compared to WT.

In our modeling, the E/Q/N/D/V mutants go a step further: they not only eliminate the Thr1372–Tyr1902 scaffold but also elicit new hydrogen bonds specifically with hmC. These new interactions, not present in WT models, position hmC in a different orientation relative to Tyr1902 (Fig. 3a,b). For instance, in T1372E, the Glu1372 hydrogen bonds directly with the 5-hydroxymethyl group for 88% of the simulation time (average over two runs of 50 ns each). Direct hydrogen bonding to hmC is also observed in T1372D and Q, whereas in T1372N and V, the new hydrogen bond is between hmC and other nearby residues (Supplementary Fig. 5, Supplementary Tables 3b, 4b). For example, T1372V elicits an hmC-Asp1384 hydrogen bond (38% of simulation time, average over two runs of 50 ns each). We suggest that the loss of the Thr1372–Tyr1902 scaffold, together with new interactions specific to hmC, could contribute to the unique stalling phenotype of T1372E/Q/N/D/V mutants, which we term “hmC-dominant.”

Biochemical characterization of TET2 variants

With results from cells and MD showing that side chain properties can define WT, low-efficiency, and hmC-dominant phenotypes, we subjected the TET variants to rigorous comparison in vitro. We first used driving conditions to compare the maximum extent of the variants’ activity and then used limiting conditions to compare the reactivity on mC versus hmC. Representative hTET2-CS variants—WT and T1372S, A, E, and V—were expressed and purified from Sf9 insect cells (Supplementary Fig. 3a). To drive oxidation forward, we reacted excess enzyme with limiting substrate: 27-bp oligonucleotides containing a single reactive mC, hmC, or fC duplexed to an unmodified complementary strand. The reaction products were quantified by LC-MS/MS and the results corroborated by three complementary, chemoenzymatic assays (Supplementary Fig. 3b,c).

In reactions with 20 nM mC-containing duplexes, 30 μg/mL (maximally 0.57 μM) of WT, T1372S, and T1372A convert nearly all substrate to oxidized products in 30 min (Fig. 4a). However, while WT and T1372S advance efficiently through stepwise oxidation, turning over ~93% of substrate to fC and caC, T1372A lags behind, forming predominantly hmC (30%) and fC (54%) and only 13% caC. This aligns with the gDNA and modeling results, indicating that low-efficiency mutants are capable of oxidation to caC but at reduced levels compared to WT.
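The unit conversion behind these driving conditions can be sketched as follows. The code derives the molar mass implied by equating 30 μg/mL with 0.57 μM and the resulting enzyme-to-substrate ratio. Both function names are hypothetical. "Maximally" is taken here to mean that all purified protein is counted as enzyme, which is an assumption.

```python
def implied_molar_mass(mass_conc_ug_per_ml, molar_conc_um):
    """Molar mass (g/mol) implied by a mass and a molar concentration.

    ug/mL equals mg/L and uM equals umol/L, so the ratio is in mg/umol;
    multiplying by 1000 converts mg/umol to g/mol.
    """
    return mass_conc_ug_per_ml / molar_conc_um * 1000.0


def molar_excess(enzyme_um, substrate_nm):
    """Fold molar excess of enzyme (uM) over substrate (nM)."""
    return enzyme_um * 1000.0 / substrate_nm


mass = implied_molar_mass(30.0, 0.57)
excess = molar_excess(0.57, 20.0)

print(f"implied molar mass: {mass / 1000:.1f} kDa")
print(f"enzyme:substrate ratio: about {excess:.1f} to 1")
```

Under this reading the enzyme is present at well over a tenfold molar excess, which is what allows the reactions to report the maximum extent of oxidation rather than initial rates.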

Figure 4.


Biochemical characterization of purified hTET2 mutants. (a) TET2 variants (30 μg/mL) were reacted with 20 nM dsDNA substrates containing mC, hmC, or fC for 30 min. The reaction products were purified, degraded to nucleosides, and quantified by LC-MS/MS. WT and T1372E were also generated in the full catalytic domain of TET2 (FCD and FE, respectively) to confirm that the phenotypes are the same as in the hTET2-CS constructs. Mean values are plotted (n = 2), and error bars represent the range. (b) Time course for reactions of 30 μg/mL purified TET2 on 25 nM mC substrates. Mean values are plotted (WT n = 3, mutants n = 2), and error bars represent the range.

The hmC-dominant T1372E and V mutants show noticeably reduced activity on mC (54% and 76% of mC substrate remaining), and oxidation products are strongly restricted to hmC, with 4% and 1% conversion to fC, respectively (Fig. 4a). Compared to the gDNA results, where the levels of hmC produced by the E/V mutants are within 2-fold of WT (Fig. 2b), this indicates that other factors can likely tune the activity of TET2 and/or the levels of hmC in cells. Importantly, the patterns of oxidation and hmC stalling hold true in cells and in vitro. T1372E is observed to be slightly more active than T1372V, consistent with the gDNA results and suggesting a trade-off between more hmC production and better stringency of stalling. Time course analysis further demonstrates that overall reactivity on mC decreases from WT to the low-efficiency T1372A, and the hmC-dominant E/V mutants fail to produce significant fC even after 3 hours (Fig. 4b, Supplementary Fig. 3d). To validate that the hmC-dominant phenotype is not restricted to the truncated CS form of the protein, we also generated the T1372E mutation in the full catalytic domain of TET2 (hTET2-FCD, residues 1129–2002) and noted similar results (Fig. 4a).

When all available substrate is hmC, WT and T1372S again convert >93% of substrate to fC and caC. T1372A produces 65% fC/caC, while T1372E and T1372V are able to produce only 8% and 3% fC, respectively. When starting with fC substrate under the same conditions, WT enzymes convert about half of fC to caC, corroborating that the final step of oxidation is the least efficient13,23. T1372A generates 19% caC, ~1/3 of the WT level, while E/V mutants make <3% caC, near or below the detection limits of our assays. These results strongly support our model that the Thr1372–Tyr1902 scaffold is required for WT TET2 activity. Loss of the active site scaffold decreases the activity of low-efficiency mutants and has a more severe effect on hmC-dominant mutants, which do not make significant fC/caC even under driving reaction conditions.

Since TET2 is known to prefer mC over hmC, we next turned to enzyme-limiting conditions to distinguish whether the decrease in overall activity alone was sufficient to explain the restriction of oxidation products to hmC. We compared the reactivity of WT, T1372A, and T1372E mutants on mC versus hmC by titrating enzyme against 745-bp substrates fully modified with mC or hmC. We chose to simplify our kinetic analysis to measure total oxidation products (i.e. substrate consumed), since iterative oxidation links the kinetics of each oxidation step in ways not easily dissected20. By this analysis, WT TET2 consumes 2.9 ± 0.2 nmol of mC substrate per mg enzyme per minute, while activity on hmC decreases 2.6-fold to 1.1 ± 0.1 nmol/mg/min (Table 1, Supplementary Fig. 4). This mild decrease in activity on hmC is consistent with previously published observations13. The T1372A mutant displays similar activity on mC and is only 5.5-fold slower in hmC-to-fC conversion, in line with this mutant’s capacity for less efficient higher-order oxidation. By contrast, relative to the most proficient WT reaction, the T1372E mutant is 5.9-fold slower in mC-to-hmC conversion but 48-fold slower in hmC-to-fC conversion. Thus, the hmC-dominant mutant exhibits decreased activity overall, but the usual mild preference for mC substrate is not sufficient to explain the larger loss of activity on hmC, which underlies the stalling effect.

Table 1.

Activity of representative TET2 variants on mC and hmC. Values are mean ± s.e.m. from three independent experiments.

Substrate consumed (nmol/mg/min)
Substrate | WT | T1372A | T1372E | Y1902F | T1372A/Y1902F
mC | 2.9 ± 0.2 | 2.9 ± 0.1 | 0.48 ± 0.02 | 0.29 ± 0.03 | 1.0 ± 0.1
hmC | 1.1 ± 0.1 | 0.51 ± 0.03 | 0.059 ± 0.006 | 0.079 ± 0.025 | 0.20 ± 0.02
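The fold changes discussed in the text can be approximately recomputed from the mean rates in Table 1, as in the sketch below. Because the table values are rounded, the recomputed ratios differ slightly from the quoted ones (for example, about 49-fold rather than 48-fold for T1372E on hmC), presumably because the published figures were derived from unrounded rates. Variable and function names are our own.

```python
# Mean rates from Table 1, in nmol substrate consumed per mg enzyme per min.
RATES = {
    "WT": {"mC": 2.9, "hmC": 1.1},
    "T1372A": {"mC": 2.9, "hmC": 0.51},
    "T1372E": {"mC": 0.48, "hmC": 0.059},
    "Y1902F": {"mC": 0.29, "hmC": 0.079},
    "T1372A/Y1902F": {"mC": 1.0, "hmC": 0.20},
}

# The text compares every reaction to the most proficient one: WT on mC.
REFERENCE = RATES["WT"]["mC"]


def fold_slower(variant, substrate):
    """Fold decrease in rate relative to the WT mC-to-hmC reaction."""
    return REFERENCE / RATES[variant][substrate]


def mc_preference(variant):
    """Ratio of a variant's own rate on mC to its rate on hmC."""
    return RATES[variant]["mC"] / RATES[variant]["hmC"]


for variant in RATES:
    mc = fold_slower(variant, "mC")
    hmc = fold_slower(variant, "hmC")
    pref = mc_preference(variant)
    print(f"{variant:>14}: mC {mc:5.1f}x  hmC {hmc:5.1f}x  mC/hmC {pref:4.1f}")
```

The last column separates the two effects: a variant that was merely slower overall would keep the WT mC/hmC ratio of roughly 2.6, so a larger ratio indicates an additional, hmC-specific loss of activity.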

Tyr1902 mutagenesis strongly supports our model

Our MD simulations suggested that active site scaffold mutations could introduce aberrant interactions that contribute to hmC stalling. We were cognizant of the challenges to modeling new interactions with classical MD and therefore subjected this model to an independent test: mutating the other scaffold residue, Tyr1902, to Phe. Our modeling predicts that Y1902F would liberate Thr1372 to form a hydrogen bond directly with hmC (18% of simulation time, average over two runs of 50 ns each), potentially favoring an hmC-dominant phenotype (Fig. 5a). Taking the hypothesis one step further, by adding a T1372A mutation to Y1902F, our modeling predicts that the T1372A/Y1902F double mutant could rescue activity by alleviating the aberrant hydrogen bonding interaction.

Figure 5.


T1372A/Y1902F double mutant rescues the hmC-dominant phenotype by reconfiguring active site interactions. (a) Our modeling predicts that in the Y1902F single mutant, Thr1372 would hydrogen bond instead with hmC, producing an hmC-dominant phenotype. Addition of a T1372A mutation to Y1902F would remove hydrogen bonding, which is predicted to restore activity. The values shown are an average over 2–3 simulation runs of 50 ns each (see Methods). (b) Reaction of 30 μg/mL purified mutants on 20 nM mC substrate, analyzed by LC-MS/MS. Mean values are plotted (n = 2), and error bars represent the range. As predicted by our model, Y1902F mimics hmC-dominant mutants, with relatively low activity on mC and little fC formed. The double mutant (TA/YF) restores activity to resemble the T1372A single mutant. (c) To highlight fC and caC in the reaction products, the purified oligos were treated with recombinant TDG. After alkaline-mediated cleavage at the resulting abasic sites, denaturing PAGE was used to separate intact oligos containing mC and hmC from cleaved oligos that contained fC and caC (uncropped image in Supplementary Fig. 10b).

To test these predictions, we compared the activities of purified T1372A, Y1902F, and T1372A/Y1902F enzymes in vitro. The results strikingly confirmed our predictions. Compared to the WT mC-to-hmC reaction, the Y1902F single mutant is 9.9-fold slower in mC-to-hmC conversion and 36-fold slower in hmC-to-fC conversion (Table 1, Supplementary Fig. 4). Addition of the second T1372A mutation partially restores activity, so that the double mutant is only 2.8-fold slower in mC-to-hmC conversion and 14-fold slower in hmC-to-fC conversion. Under driving conditions, the Y1902F mutant leaves 38% of mC substrate unreacted, with products consisting of 49% hmC, 13% fC, and no caC (Fig. 5b)—similar to T1372E/V but with less stringent stalling at hmC. The introduction of a second mutation in the T1372A/Y1902F double mutant rescues activity, such that 97% of mC substrate is consumed, like the T1372A single mutant.

To complement these LC-MS/MS results, rather than digesting the reaction products to nucleosides, we treated the intact oligonucleotides with purified TDG followed by DNA gel electrophoresis to differentiate strands containing mC or hmC from strands containing fC or caC (Fig. 5c). While Y1902F shows only trace generation of fC/caC, the addition of the second mutation in T1372A/Y1902F restores stepwise oxidation and mirrors the results for T1372A. Thus, our structural modeling correctly predicts the biochemical behavior of the Y1902F and T1372A/Y1902F mutants, strongly supporting both the requirement of the Thr1372–Tyr1902 scaffold for WT stepwise oxidation and the contribution of aberrant active site interactions to the hmC-dominant phenotype.

DISCUSSION

TET-catalyzed stepwise oxidation populates the mammalian epigenome with three ox-mC bases, making it critical to dissect how each individual base forms and functions. Previous studies have elucidated various biases in favor of the first oxidation step, mC-to-hmC conversion, implying that TET enzymes may be primarily adapted for making hmC, with fC/caC as rare oxidative “overflow” products. However, in light of evidence for the importance of fC/caC in active DNA demethylation and as stable epigenetic marks, we asked whether TET enzymes bear structural features that specifically support fC/caC formation. We have now shown that a conserved Thr1372-Tyr1902 active site scaffold is required for efficient higher-order oxidation by human TET2, suggesting that the enzyme is shaped to enable production of not only hmC but fC/caC as well. We further uncover Thr1372 mutations that effectively abrogate higher-order oxidation by disrupting the active site scaffold; these are the first human TET variants that dissociate the steps of oxidation, providing a new tool to directly test the functions of hmC versus fC/caC.

As a structure-function determinant in TET2, the Thr1372–Tyr1902 scaffold invites comparison to known TET homologues. The Thr-Tyr pair is perfectly conserved across mouse and human TET1, 2, and 3 (Supplementary Fig. 1c), raising the possibility that corresponding mutations in TET1 and 3 could likewise tune TET activity. Notably, while a large number of TET mutations have been identified in various malignancies24,25, we are not aware of any mutations at the Thr1372 or Tyr1902 positions. As an example in more distant homologues, the trypanosomal J-binding protein JBP1 is predicted to have a Thr-Tyr pair while JBP2 has Ser-Tyr (Supplementary Fig. 1c). JBP1 and 2 are capable of oxidizing thymine to both 5-hydroxymethyluracil (hmU) and 5-formyluracil, though a glucosyltransferase normally diverts hmU to form base J as part of the trypanosome’s mechanism for immune evasion26,27.

A particularly intriguing exception to the conserved Thr1372–Tyr1902 scaffold is the Naegleria Tet-like protein NgTet1, which is also capable of higher-order oxidation. Using a structure-based algorithm28 recently borne out by crystal structures23, we found that Thr1372 and Tyr1902 align with Ala212 and Phe295 in NgTet1, respectively (Supplementary Fig. 1c), making NgTet1 analogous to our T1372A/Y1902F double mutant. In NgTet1, it was proposed that these residues form a hydrophobic pocket that accommodates hmC as it rotates from a product to a substrate conformation for further oxidation23. The active site mutations A212V/N, which could sterically hinder hmC binding within this pocket, were found to partially stall hmC oxidation. In human TET2, the Ala-Phe double mutant only permits low-efficiency stepwise oxidation, suggesting that the Thr-Tyr dyad may have evolved to fine-tune efficient fC/caC generation. By leveraging this scaffold, our results offer the first variants that produce distinct stepwise oxidation patterns in human TET enzymes.

Our combined computational and biochemical approach shows how T1372E/Q/N/D/V mutants could reconfigure active site interactions to produce the hmC-dominant phenotype, characterized by moderate loss of overall catalytic activity as well as a specific decrement at the hmC-to-fC step. To account for the additional loss of activity on hmC, our modeling most prominently implicates new hydrogen bonding to hmC in these mutants. Our calculations correctly predict the hmC-dominant behavior of the Y1902F single mutant, as well as rescued activity in the T1372A/Y1902F double mutant. Indeed, it is quite unusual that the addition of a second mutation rescues activity of the first, helping to bolster our mechanistic model. We note, however, that other related mechanisms could also play a role and are not mutually exclusive with this model. Such mechanisms include restriction of substrate/product rotation15,23, changes in protein dynamics (Supplementary Fig. 9), and/or altered accessibility of the active site to DNA and α-KG—all of which could occur in concert with aberrant hydrogen bonding to hmC. These possibilities reflect the complex dynamics of TET-DNA interactions, which remain priorities for future research. Importantly, independent of the mechanism of action, the hmC-dominant Thr1372 mutants fill the need for experimental tools to dissect the individual steps of mC oxidation.

These new TET variants potentially allow for the first direct studies of the epigenetic functions of hmC as distinct from fC and caC. Until now, functional studies have by necessity been all-or-none, showing that loss of one or more TET isozymes can produce diverse phenotypes. Examples range from inability of TET triple-knockout mouse embryonic fibroblasts to undergo reprogramming29, to cancer cell proliferation with loss of TET1 or TET2 (refs. 30, 31), to neonatal lethality in TET3-deleted mice32, among others. In many cases, reintroduction of a single active TET isozyme can fully rescue the phenotype. Such systems provide ideal opportunities to introduce low-efficiency and hmC-dominant TET variants to probe whether hmC alone is sufficient to rescue the defect, whether fC/caC are required, or whether interacting enzymes such as TDG are actually the key players.

These in vivo applications will bring new challenges as well, such as examining the mutants’ activity under more physiological conditions. Our study illustrates one limit to predicting cellular outcomes based on biochemical properties: although all the mutants perturb the enzyme’s reactivity in vitro, in HEK293T cells the amount of hmC generated can be close to WT. Many explanations are possible, including that HEK293T cell overexpression likely represents a non-steady state system, in which all cytosine modifications reach unusually high levels with limited means of removing these marks. It will be interesting to see whether normal cells, expressing endogenous TET enzymes, maintain a homeostatic level of ox-mC bases. It will also be important to determine whether the mutant phenotypes in TET2 translate to other TET isoforms, which is needed both for applying the mutants in various biological systems and for helping to address whether TET1/2/3 have similar or distinct mechanisms of action. Finally, given recently reported structures of TET2 bound to ox-mC bases13, we envision that chemical biology approaches, including additional mutagenesis or unnatural modifications along the Thr1372–Tyr1902-cytosine scaffold, could further hone selectivity for particular bases and potentially uncover TET variants that stringently stall at fC as well or accelerate conversion to caC.

ONLINE METHODS

Saturation cassette mutagenesis

A codon-optimized hTET2-CS construct (residues 1129–1936 Δ1481–1843) was designed with an N-terminal FLAG tag and unique restriction sites flanking the Thr1372 and Val1900 codons, purchased as a gene block from Integrated DNA Technologies (IDT), and cloned into a pLEXm vector for mammalian expression. Thirty-eight pairs of complementary oligos encoding all amino acid substitutions at both positions (as well as the Y1902F mutation) were ordered, annealed, and cloned by cassette mutagenesis in place of the WT sequence (Supplementary Table 1). Mutations were confirmed by gene sequencing and/or digestion at a unique restriction site within the oligo.

TET2 overexpression in HEK293T cells

HEK293T cells (mycoplasma tested and verified by ATCC) were cultured in Dulbecco’s Modified Eagle Medium (DMEM) with GlutaMAX (Thermo Fisher Scientific) and 10% fetal bovine serum (Sigma). Cells were transfected with WT or mutant hTET2-CS, or an empty vector control, using Lipofectamine 2000 (Thermo) according to the manufacturer’s protocol. Media was changed 24 h after transfection, cells were harvested by trypsinization 48 h after transfection and resuspended in phosphate-buffered saline, and genomic DNA (gDNA) was purified from four-fifths of the collected cells using the DNeasy Blood & Tissue Kit (Qiagen).

Western blot for FLAG-tagged hTET2-CS

One-fifth portion of the transfected cells was lysed using CytoBuster Protein Extraction Reagent (EMD Millipore). The clarified lysates were diluted 50-fold into CytoBuster and run on two 8% SDS-PAGE gels, with WT sample as a standard on each gel. To further standardize the blots, the gels were cut at the 70-kDa marker, so that the upper half contained the Hsp90 control band and the bottom half hTET2-CS. The Hsp90 halves of both gels were transferred together onto a single PVDF membrane, and the two TET halves were transferred onto another membrane, using an iBlot Gel Transfer Device (Thermo). Membranes were blocked for 2 h at room temperature with 5% (w/v) milk in Tris-buffered saline with 0.1% (v/v) Tween-20 (TBST), washed 3× with TBST, blotted with primary 1:10,000 anti-FLAG M2 (Sigma, cat. no. F1804) or 1:1,000 anti-Hsp90α/β (Santa Cruz Biotechnology, cat. no. sc-13119) antibodies at 4 °C overnight, washed, blotted with secondary 1:5,000 goat anti-mouse-HRP (Santa Cruz Biotechnology, cat. no. sc-2005) for 2 h, washed, and imaged with Immobilon Western Chemiluminescent HRP Substrate (Millipore) on a Fujifilm LAS-1000 imager with 30-s exposures.

Dot blot for cytosine modifications in gDNA

Purified gDNA from HEK293T cells was diluted to 10 ng/μL in Tris-EDTA (TE) buffer, pH 8.0. To this was added ¼ volume of 2 M NaOH/50 mM EDTA. The DNA was denatured for 10 min at 95 °C and transferred quickly to ice, followed by addition of 1:1 ice cold 2 M ammonium acetate. Sequi-Blot PVDF membranes (Bio-Rad) were cut to size, wet with MeOH and equilibrated in TE buffer, then assembled into a 96-well Bio-Dot microfiltration apparatus (Bio-Rad). Each well was washed with 400 μL TE drawn through with gentle vacuum, and 400 ng of gDNA was loaded, followed by another TE wash. Membranes were blocked for 2 h in 5% milk/TBST, washed 3× with TBST, and blotted at 4 °C overnight with primary antibodies against each modified cytosine (Active Motif)—1:5,000 mouse anti-mC (cat. no. 39649); 1:10,000 rabbit anti-hmC (cat. no. 39769); 1:5,000 rabbit anti-fC (cat. no. 61223); 1:5,000 rabbit anti-caC (cat. no. 61225). Blots were then washed, incubated with secondary 1:2,000 goat anti-mouse-HRP or 1:5,000 goat anti-rabbit-HRP (Santa Cruz Biotechnology, cat. no. sc-2004) for 2 h, washed, and imaged as described above.

Nano LC-MS/MS analysis of gDNA

Based on published protocols33, we adapted and optimized LC-MS/MS methods for our systems. To quantify genomic levels of cytosine modifications in HEK293T cells, 20 μg of purified gDNA was concentrated by ethanol precipitation and degraded to component nucleosides with 20 U DNA Degradase Plus (Zymo) in 20 μL at 37 °C overnight. A 150 μm × 17 cm precolumn and 100 μm × 26 cm analytical reverse phase column were made from fused-silica tubing (New Objective) with a Kasil frit: the column was dipped into a 1:3 formamide:Kasil 1624 potassium silicate solution (PQ Corporation), polymerized at 100 °C overnight and trimmed to ~3 mm. Using a pressure injection cell, the columns were packed with Supelcosil LC-18-S resin (Sigma). Using this column setup equilibrated in Buffer A1 (0.1% formic acid in H2O), the nucleoside mixture was diluted 10-fold into 0.1% formic acid, and 1 μL was injected onto an Easy-nLC 1000 (Thermo) nano LC. The sample was desalted for 5 min over the precolumn, nucleosides resolved using a gradient of 0–30% of Buffer B1 (0.1% formic acid in acetonitrile) over 30 min at a flow rate of 600 nL/min, and tandem MS/MS performed by positive ion mode electrospray ionization on a Q Exactive hybrid quadrupole-orbitrap mass spectrometer (Thermo), with a spray voltage of 2.9 kV, capillary temperature of 275 °C, and normalized collision energy of 30%. Mass transitions were mC 242.11→126.066 m/z, hmC 258.11→124.051, fC 256.09→140.046, caC 272.09→156.041, and T 243.10→127.050. Standard curves were generated from standard nucleosides (Berry & Associates) ranging from 10 μM to 5 nM (10 pmol to 5 fmol total) (Supplementary Fig. 2). The sample peak areas were fit to the standard curve to determine amounts of each modified cytosine in the gDNA sample and expressed as the percent of total cytosine modifications in each sample.
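As a rough illustration of the quantification step described above, the following sketch fits a standard curve (peak area versus amount injected) by ordinary least squares, inverts it for a sample, and expresses each base as a percentage of total cytosine modifications. The peak areas, the choice of a simple linear fit, and all function names are assumptions made for illustration; they are not the study's data or analysis code.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_area(area, slope, intercept):
    """Invert the standard curve to get the amount (fmol) for a peak area."""
    return (area - intercept) / slope


def percent_of_total(amounts):
    """Express each base as a percentage of all modified cytosines."""
    total = sum(amounts.values())
    return {base: 100.0 * amt / total for base, amt in amounts.items()}


# Hypothetical standards: amount injected (fmol) within the 5 fmol-10 pmol range.
STANDARD_FMOL = [5, 50, 500, 5000, 10000]
# Made-up peak areas; each nucleoside would normally have its own curve.
STANDARD_AREA = {base: [2.0 * f for f in STANDARD_FMOL]
                 for base in ("mC", "hmC", "fC", "caC")}

# Hypothetical sample peak areas for one gDNA digest.
SAMPLE_AREA = {"mC": 16000.0, "hmC": 3000.0, "fC": 600.0, "caC": 400.0}

amounts = {}
for base, area in SAMPLE_AREA.items():
    slope, intercept = fit_line(STANDARD_FMOL, STANDARD_AREA[base])
    amounts[base] = amount_from_area(area, slope, intercept)

percentages = percent_of_total(amounts)
```

Fitting a separate curve per nucleoside matters in practice because ionization efficiency differs between bases; the shared response factor here only keeps the made-up numbers easy to follow.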

Molecular dynamics simulations

Forty-four molecular dynamics (MD) simulations were carried out on WT and all experimentally tested mutants (T1372S/C/A/E/Q/N/D/V, Y1902F, T1372A/Y1902F) with all four cytosine derivatives (mC/hmC/fC/caC), α-KG, and Fe(II)/Mg(II) (see Supplementary Figs. 5–9 and Supplementary Tables 2–8 for details). All structures were modeled based on WT hTET2-CS bound to mC-containing DNA (PDB 4NM6)14. Initially, the PDB structure was evaluated with MOLPROBITY34 to check all possible rotamers, followed by hydrogen atom addition to every system with the Leap program35 using the ff99SB parameter set36 and solvation in a truncated octahedral box of TIP3P water37. In addition, protonation states of titratable residues were tested with PropKa 3.0 (refs. 38–40), which confirmed that the default ionization at pH 7 was correct for all residues. Both coordinated histidines were protonated on ND1. All systems were explicitly neutralized with potassium counterions, which were added to the system using the Leap program. The final system size was ~60,000 total atoms with 17–21 counterions. All structures were minimized with 3,000 steps of conjugate gradient, followed by gradual warm-up to 300 K using Langevin dynamics with a collision frequency of 1.0 ps−1 in the NVT ensemble for 100 ps. All simulations were performed with the GPU version of the pmemd program in AMBER12 (ref. 36). The force field parameters for all cytosine derivatives (developed in house), α-KG, Fe(II)/Mg(II), and Zn that are not available in the default ff99SB set are provided in Supplementary Data Set 1. The iron cation was approximated by using Mg(II) parameters based on the precedent established by our previous studies on AlkB21,22; this approximation was also validated again for our systems (Supplementary Fig. 7, Supplementary Tables 3, 4, 8)41,42.
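The count of forty-four systems follows from pairing each enzyme variant with each cytosine derivative. The short sketch below enumerates that matrix; the variable names are illustrative only, and replicate runs of the same system are not counted.

```python
from itertools import product

# Enzyme variants named in the text: WT plus ten mutants.
T1372_SUBSTITUTIONS = ["S", "C", "A", "E", "Q", "N", "D", "V"]
VARIANTS = (["WT"]
            + [f"T1372{aa}" for aa in T1372_SUBSTITUTIONS]
            + ["Y1902F", "T1372A/Y1902F"])

# The four cytosine derivatives modeled in the active site.
BASES = ["mC", "hmC", "fC", "caC"]

# One simulated system per (variant, base) pair: 11 x 4 = 44.
SYSTEMS = list(product(VARIANTS, BASES))
```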

Once the systems achieved the target temperature, production MD simulations were performed using Langevin dynamics with a collision frequency of 1.0 ps−1 in the NPT (isothermal–isobaric) ensemble with the Berendsen barostat using a 2-ps relaxation time at 300 K. The production length for each simulation was 50 ns; snapshots were saved every 10 ps, and all snapshots were subjected to subsequent analysis (see below). Values reported are generally a time average over calculations from all snapshots. The most relevant simulations were performed 2–5 times for 50 ns each, with the results averaged across all simulations (the number of simulations for each system is denoted in Supplementary Table 2). All systems were simulated using the AMBER ff99SB force field with a 1-fs step size and a 9-Å cutoff for non-bonded interactions. SHAKE was used for all the simulations, and the smooth particle mesh Ewald (PME) method43 was employed to treat long-range Coulomb interactions. Hydrogen bond, root mean square deviation (RMSD), and distance analyses on trajectories were carried out using the CPPTRAJ module44 available in the AMBER 12 suite, and the trajectories were visualized with the VMD program45. Hydrogen bond analysis criteria were 1) angles over 120 degrees and 2) O-H distances less than 3 Å (default cpptraj settings). RMSD and distance analyses are presented in Supplementary Figures 7–8.
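To make the geometric hydrogen bond criteria concrete, the sketch below applies the two quoted cutoffs (angle over 120 degrees, distance under 3 Å) to a single donor/hydrogen/acceptor triplet and computes the fraction of snapshots in which the bond is present. It is a standalone illustration, not the CPPTRAJ implementation; measuring the distance from the hydrogen to the acceptor and the angle at the hydrogen are assumptions, as are all function names.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points (Å)."""
    return math.dist(a, b)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [ai - bi for ai, bi in zip(a, b)]
    v2 = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     max_dist=3.0, min_angle=120.0):
    """Apply the two geometric cutoffs quoted in the text.

    Assumption: the distance is hydrogen-to-acceptor and the angle is
    donor-H...acceptor.
    """
    return (distance(hydrogen, acceptor) < max_dist
            and angle_deg(donor, hydrogen, acceptor) > min_angle)


def occupancy(frames):
    """Fraction of snapshots in which the hydrogen bond is present."""
    hits = sum(is_hydrogen_bond(d, h, a) for d, h, a in frames)
    return hits / len(frames)


# 50 ns of production sampled every 10 ps gives 5,000 snapshots per run.
N_SNAPSHOTS = (50 * 1000) // 10
```

Averaging the per-snapshot result over a trajectory, as `occupancy` does, is one simple way to arrive at a time-averaged value of the kind reported in the supplementary tables.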

Additional analyses to investigate intermolecular interactions in the active site were carried out by noncovalent interaction analysis (NCI) and energy decomposition analysis (EDA). NCI is a visualization tool to identify non-covalent interactions between molecules46. The results obtained from the NCI analysis consist of surfaces between the interacting molecules. These surfaces are assigned specific colors to denote the strength and characteristic of the interactions: green surfaces denote weak interactions (e.g. van der Waals), blue surfaces strong attractive interactions (e.g. hydrogen bonds), and red surfaces strong repulsive interactions. The NCI calculations were performed with the NCI-Plot program47. We focused on the hmC systems, and a representative snapshot from every system was subjected to NCI analysis. In all cases, the hmC substrate was considered as a ligand interacting with a spherical region of 10 Å around the binding site. All calculations were obtained with a step size of 0.2 Å for the cube and a cutoff of 5 Å for the calculation of the interactions between the nucleotides and the active site. The NCI analyses for a selected snapshot of WT and each mutant in the presence of hmC are presented in Supplementary Fig. 5. We further examined the WT and T1372A/E/V mutants in the presence of mC and fC; these NCI analyses are presented in Supplementary Fig. 6. The snapshots for NCI plots were selected to highlight the most frequent interactions relevant to the underlying mechanism.

All EDA calculations were carried out with an in-house FORTRAN90 program to determine the nonbonded interactions (Coulomb and VdW interactions) for all the residues48–50. The average non-bonded interaction between a particular cytosine derivative and every other residue, ΔE_int, is approximated by ΔE_int = ⟨ΔE_i⟩, where i represents an individual residue, ΔE_i represents the nonbonded interaction (Coulomb or VdW) between residue i and the particular cytosine derivative, and the angle brackets represent averages over the complete production ensemble obtained from the MD simulations. This analysis has been previously employed for QM/MM and MD simulations to study a number of protein systems21,22,51–54. The EDA results for all protein residues with mC/hmC/fC/caC are presented in Supplementary Table 2, and specific non-bonded interactions are shown in Supplementary Table 3. Hydrogen bond analyses for WT and all mutants with all cytosine bases are shown in Supplementary Tables 4–7. As noted, the above-described analyses were performed on each individual snapshot over each individual simulation, and the reported data consist of the averages over all the simulations for each system.
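The ensemble average of a residue's nonbonded interaction with the cytosine derivative can be sketched as follows. This is not the in-house FORTRAN90 program; it is a minimal Python illustration that assumes point-charge Coulomb and 12-6 Lennard-Jones terms summed over atom pairs, an approximate unit conversion constant, and hypothetical function names and input layout.

```python
COULOMB_CONSTANT = 332.06  # kcal·Å/(mol·e²); approximate conversion factor


def coulomb(q1, q2, r):
    """Coulomb energy (kcal/mol) between charges q1, q2 (e) at distance r (Å)."""
    return COULOMB_CONSTANT * q1 * q2 / r


def lennard_jones(epsilon, r_min, r):
    """12-6 VdW energy with well depth epsilon at the minimum distance r_min."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


def residue_interaction(pairs):
    """Sum Coulomb and VdW terms over atom pairs for one residue in one snapshot.

    Each pair is (q_i, q_j, epsilon_ij, r_min_ij, r_ij), where i is an atom of
    the residue and j is an atom of the cytosine derivative.
    """
    e_coul = sum(coulomb(qi, qj, r) for qi, qj, _, _, r in pairs)
    e_vdw = sum(lennard_jones(eps, rmin, r) for _, _, eps, rmin, r in pairs)
    return e_coul, e_vdw


def ensemble_average(snapshots):
    """Average the per-snapshot (Coulomb, VdW) energies over all snapshots."""
    n = len(snapshots)
    totals = [residue_interaction(pairs) for pairs in snapshots]
    return (sum(t[0] for t in totals) / n, sum(t[1] for t in totals) / n)
```

Calling `ensemble_average` once per residue, with one list of atom pairs per saved snapshot, would give per-residue averages of the kind tabulated by the EDA.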

Purification of hTET2 variants from Sf9 insect cells

WT and select hTET2-CS mutants were subcloned into a pFastBac1 vector for expression in Sf9 insect cells as described previously33. WT and T1372E were also generated in the full catalytic domain (hTET2-FCD, residues 1129–2002). Proteins were expressed for 24 h, and the cell pellet from a 500-mL culture was resuspended in lysis buffer (50 mM HEPES, pH 7.5, 300 mM NaCl, 0.2% (v/v) NP-40) with cOmplete, EDTA-free Protease Inhibitor Cocktail (Roche, 1 tablet/10 mL) and 10 U/mL of Benzonase Nuclease (Millipore). Cells were lysed by one freeze-thaw cycle followed by passage through a 20-gauge and then a 25-gauge needle. The lysate was cleared by centrifugation at 20,000g for 30 min, and the supernatant was passed through a 0.2-μm syringe filter. A 250-μL column of anti-FLAG M2 affinity gel (Sigma) was prepared per manufacturer instructions and equilibrated in lysis buffer. The filtered lysate was applied twice to the column under gravity flow, and bound protein was washed with 10 mL then 2 × 5 mL of wash buffer (50 mM HEPES, pH 7.5, 150 mM NaCl, 15% (v/v) glycerol). Elutions of 250 μL were collected in wash buffer containing 100 μg/mL 3× FLAG peptide (Sigma), with each elution incubated on the column for 5 min before collection, until no protein was detected by Bio-Rad Protein Assay and SDS-PAGE. Fractions were pooled, DTT added to 1 mM, and aliquots flash frozen in liquid nitrogen and stored at −80 °C.

TET reactions in vitro

For reactions under “driving” conditions, purified TET2 enzymes were reacted with fluorescein (FAM)-labeled, 27-bp oligonucleotides containing a central reactive site (5′-GTA TCT AGT TCA ATC XGG TTC ATA GCA FAM-3′, X = mC, hmC, or fC), duplexed with a complementary strand containing an unmodified CpG. Protein concentrations were measured by the Bio-Rad Protein Assay and standardized by diluting in elution buffer. A mixture of 20–25 nM duplexed DNA, 50 mM HEPES, pH 6.5, 100 mM NaCl, 1 mM α-ketoglutarate, 1 mM DTT, and 2 mM sodium ascorbate was pre-warmed to 37 °C. Immediately before the reaction, fresh ammonium iron(II) sulfate (Sigma) was added to 75 μM, and at time t = 0, TET2 was added to a final concentration of 30 μg/mL (maximally 0.57 μM of hTET2-CS and 0.30 μM of hTET2-FCD). Reaction volumes were typically 200–350 μL. After incubation at 37 °C for 30 min (or at designated time points), the reactions were quenched by addition of 8 volumes of 100% ethanol with 2 volumes of Oligo Binding Buffer (Zymo). Reaction products were purified using the Zymo Oligo Clean & Concentrator kit, eluted in LC-MS grade H2O, and analyzed by LC-MS/MS and/or enzyme-coupled assays33.
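The relationship between the 30 μg/mL mass concentration and the quoted molar upper bounds is a simple unit conversion, sketched below. The molecular weights are not stated in the text; the values used here are back-calculated from the quoted concentrations and should be read as assumptions, as should the function name.

```python
def ug_per_ml_to_micromolar(conc_ug_per_ml, mw_g_per_mol):
    """Convert a mass concentration (μg/mL) to a molar concentration (μM)."""
    grams_per_liter = conc_ug_per_ml * 1e-3   # μg/mL is numerically mg/L
    molar = grams_per_liter / mw_g_per_mol    # mol/L
    return molar * 1e6                        # μM


# Molecular weights back-calculated from the quoted values (assumptions).
MW_CS = 52_600     # g/mol, roughly what 30 μg/mL = 0.57 μM implies
MW_FCD = 100_000   # g/mol, roughly what 30 μg/mL = 0.30 μM implies

conc_cs = ug_per_ml_to_micromolar(30, MW_CS)
conc_fcd = ug_per_ml_to_micromolar(30, MW_FCD)
```

The molar values are upper bounds ("maximally") because a total-protein assay counts any co-purifying protein as enzyme.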

For enzyme titration experiments, substrates were generated by PCR using 5-methyl- or 5-hydroxymethyl-dCTP and standard protocols for Taq polymerase. Each 745-bp amplicon contained a total of 391 modified cytosines (280 in CpG context) and was purified by gel extraction. Reaction conditions were the same as above, except for using 80 ng of PCR substrates and 1.856–72.5 μg/mL of enzyme in a 25-μL reaction. Following randomized analysis by LC-MS/MS, the percentage of total oxidation products (i.e. substrate consumed) was converted to nanomoles based on the known composition of the substrate. Plots were generated of total oxidation products versus enzyme concentration (Supplementary Fig. 4), and the slopes from linear regression were compiled in Table 1.
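The conversion from percent substrate consumed to nanomoles, followed by the regression against enzyme concentration, can be sketched as below. The average mass of 650 g/mol per base pair is a common rule of thumb rather than a value from the text, and the titration data points and function names are invented for illustration.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; rule-of-thumb value (assumption)


def modified_cytosine_nmol(dna_ng, length_bp=745, modified_per_amplicon=391):
    """Nanomoles of modified cytosine in a given mass of PCR substrate."""
    amplicon_nmol = dna_ng / (length_bp * AVG_BP_MASS)  # ng / (g/mol) = nmol
    return amplicon_nmol * modified_per_amplicon


def product_nmol(percent_oxidized, dna_ng=80.0):
    """Convert percent of substrate consumed into nanomoles of product."""
    return percent_oxidized / 100.0 * modified_cytosine_nmol(dna_ng)


def least_squares_slope(x, y):
    """Slope of the least-squares line of y versus x (free intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical titration: enzyme concentration (μg/mL) and percent oxidized.
ENZYME_UG_PER_ML = [2.0, 5.0, 10.0, 20.0]
PERCENT_OXIDIZED = [2.0, 5.0, 10.0, 20.0]

products = [product_nmol(p) for p in PERCENT_OXIDIZED]
slope = least_squares_slope(ENZYME_UG_PER_ML, products)  # nmol per (μg/mL)
```

Under these assumptions, 80 ng of substrate carries roughly 0.065 nmol of modified cytosine, which sets the scale for the slopes.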

Chemoenzymatic assays of TET activity

We designed three chemoenzymatic assays to probe for specific cytosine modifications33. Concentrated, purified reaction products representing 50 μL of the TET reaction (up to 1.25 pmol) were used for each assay.

To distinguish mC-containing oligos, the restriction enzyme MspI (NEB) was used, which normally cleaves CCGG sites containing C, mC, or hmC, with partial activity on fC and no activity on caC6. A combination of aldehyde reactive probe (ARP) (Thermo) and T4 β-glucosyltransferase (βGT) (NEB) was used to protect fC and hmC, respectively, from MspI cleavage, leaving only mC susceptible. The reaction products, along with controls, were treated first with 4.4 μM ARP in 6 mM HEPES, pH 5.0 (10 μL total volume), incubated at 37 °C overnight, then diluted into 20 μL with 1× CutSmart Buffer (NEB), 2 mM uridine diphosphoglucose (UDP-Glc) and 1:25 volume of βGT for 30 min at 37 °C. To this mixture was added 50 U MspI in 1× CutSmart Buffer and digestion carried out at 37 °C for >2 h.

To visualize the extent of higher-order oxidation to fC and caC, the reaction products were treated with 25-fold molar excess of thymine DNA glycosylase (TDG) purified as described below, in TDG buffer (20 mM HEPES, pH 7.5, 100 mM NaCl, 0.2 mM EDTA, 2.5 mM MgCl2) for 2–4 h at 37 °C. After the reaction, 1:1 volume of 0.3 M NaOH/0.03 M EDTA was added and the mixture incubated at 85 °C for 15 min to cleave oligos at abasic sites. The TDG mutant N191A, which was previously found to excise fC and not caC55, was also purified and used in the same manner to identify fC specifically.

As the final step of all three chemoenzymatic processes, the samples were mixed 1:1 with formamide containing bromophenol blue loading dye, loaded onto a 7 M urea/20% acrylamide/1× TBE gel prewarmed to 50 °C, and imaged for FAM fluorescence on a Typhoon 9200 variable mode imager.
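The logic of the three readouts described above can be summarized as a lookup of which base at the reactive site leads to a cleaved oligo under each treatment. The sketch below encodes only what the text states (MspI activity, ARP/βGT protection, and the substrate preferences of WT and N191A TDG); the data structures and function names are illustrative assumptions.

```python
BASES = ["mC", "hmC", "fC", "caC"]

# MspI activity on each base at the CCGG site, as stated in the text.
MSPI_ACTIVITY = {"mC": "full", "hmC": "full", "fC": "partial", "caC": "none"}

# Bases excised by each TDG variant, as stated in the text.
TDG_SUBSTRATES = {"WT": {"fC", "caC"}, "N191A": {"fC"}}


def mspi_cleaves_after_protection(base):
    """MspI assay: ARP blocks fC and βGT blocks hmC before digestion."""
    protected = {"fC", "hmC"}
    if base in protected:
        return False
    return MSPI_ACTIVITY[base] == "full"


def tdg_cleaves(base, enzyme="WT"):
    """TDG assay: excision followed by alkaline cleavage at the abasic site."""
    return base in TDG_SUBSTRATES[enzyme]


mspi_positive = [b for b in BASES if mspi_cleaves_after_protection(b)]
tdg_positive = [b for b in BASES if tdg_cleaves(b, "WT")]
n191a_positive = [b for b in BASES if tdg_cleaves(b, "N191A")]
```

Read together, the three assays report on mC (MspI after protection), fC plus caC (WT TDG), and fC alone (N191A TDG); hmC is not cleaved in any of the three.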

LC-MS/MS analysis of reaction products

Concentrated, purified reaction products representing 200 μL of the TET reaction (up to 5 pmol) were degraded to component nucleosides with 1 U DNA Degradase Plus (Zymo) in 10 μL at 37 °C overnight. The nucleoside mixture was diluted 10-fold into 0.1% formic acid, and 20 μL were injected onto an Agilent 1200 Series HPLC with a 5 μm, 2.1 × 250 mm Supelcosil LC-18-S analytical column (Sigma) equilibrated to 50 °C in Buffer A2 (5 mM ammonium formate, pH 4.0). The nucleosides were separated in a gradient of 0–10% Buffer B2 (4 mM ammonium formate, pH 4.0, 20% (v/v) methanol) over 7 min at a flow rate of 0.5 mL/min. Tandem MS/MS was performed by positive ion mode ESI on an Agilent 6460 triple-quadrupole mass spectrometer, with gas temperature of 175 °C, gas flow of 10 L/min, nebulizer at 35 psi, sheath gas temperature of 300 °C, sheath gas flow of 11 L/min, capillary voltage of 2,000 V, fragmentor voltage of 70 V, and delta EMV of +1,000 V. Collision energies were optimized to 10 V for mC, fC, and T; 15 V for caC; and 25 V for hmC. MRM mass transitions and data analysis were as described above.

Purification of hTDG from E. coli

We adapted a published protocol56 to express and purify WT and N191A TDG from BL21(DE3) cells. 1-L cultures were grown to OD ~0.6, cooled gradually to 16 °C, induced with 0.25 mM IPTG at OD ~0.8, and grown for another 4 h. Cells were collected by centrifugation, resuspended in 20 mL TDG lysis buffer (50 mM NaPhos, pH 8.0, 300 mM NaCl, 25 mM imidazole) with protease inhibitors, and lysed by four passes on a microfluidizer. The lysate was cleared by centrifugation at 20,000g for 20 min, then passed through a 0.22-μm syringe filter. A 1-mL column of HisPur cobalt resin (Thermo) was equilibrated in TDG lysis buffer, and the lysate bound by two applications to the column under gravity flow. The column was washed three times with 5 mL of TDG lysis buffer containing 1 M NaCl, then three times with 5 mL of regular TDG lysis buffer. Elutions of 1 mL each were collected in TDG lysis buffer containing increasing concentrations of imidazole: 50, 100, 150, 200, 250, and 500 mM imidazole. Elutions were evaluated by SDS-PAGE and dialyzed overnight at 4 °C into TDG storage buffer (20 mM HEPES, pH 7.5, 100 mM NaCl, 1 mM DTT, 0.5 mM EDTA, 1% (v/v) glycerol). Final protein concentrations were measured with the Bio-Rad Protein Assay and aliquots stored at −80 °C.

Acknowledgments

We thank B. Niedziolka and the Wistar Institute Protein Expression Facility for help with protein expression in Sf9 cells and all members of our labs for insightful discussions. Computing time from Wayne State C&IT and additional mass spectrometry resources from I. Blair’s lab are gratefully acknowledged. This work was supported by the Rita Allen Foundation Scholar Award to R.M.K. and NIH grants (R01 GM110174 to B.A.G., R01 GM108583 to G.A.C., and F30 CA196097 to M.Y.L.).

Footnotes

AUTHOR CONTRIBUTIONS

R.M.K., G.A.C., M.Y.L., and H.T. conceived the experiments. R.M.K., M.Y.L., D.J.C., J.E.D., X.J.C., and B.A.G. were involved in design and optimization of biochemical and cellular experiments, which were performed and analyzed by M.Y.L., J.E.D., and R.M.K. For MD simulations, G.A.C., H.T., M.Y.L., and R.M.K. were involved in design of experiments, which were performed and analyzed by H.T. and G.A.C. The manuscript was written by M.Y.L., R.M.K., H.T., and G.A.C. and edited by all authors.

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

References

  • 1.Tahiliani M, et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science. 2009;324:930–935. doi: 10.1126/science.1170116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Kriaucionis S, Heintz N. The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science. 2009;324:929–930. doi: 10.1126/science.1169786. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Ito S, et al. Role of Tet proteins in 5mC to 5hmC conversion, ES-cell self-renewal and inner cell mass specification. Nature. 2010;466:1129–1133. doi: 10.1038/nature09303. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Pfaffeneder T, et al. The Discovery of 5-Formylcytosine in Embryonic Stem Cell DNA. Angew Chem Int Ed Engl. 2011;50:7008–7012. doi: 10.1002/anie.201103899. [DOI] [PubMed] [Google Scholar]
  • 5.He YF, et al. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science. 2011;333:1303–1307. doi: 10.1126/science.1210944. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Ito S, et al. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science. 2011;333:1300–1303. doi: 10.1126/science.1210597. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Kohli RM, Zhang Y. Tet, TDG and the dynamics of DNA demethylation. Nature. 2013;502:472–479. doi: 10.1038/nature12750. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Bachman M, et al. 5-Hydroxymethylcytosine is a predominantly stable DNA modification. Nat Chem. 2014;6:1049–1055. doi: 10.1038/nchem.2064. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Bachman M, et al. 5-Formylcytosine can be a stable DNA modification in mammals. Nat Chem Biol. 2015;11:555–557. doi: 10.1038/nchembio.1848. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Wu H, Zhang Y. Charting oxidized methylcytosines at base resolution. Nat Struct Mol Biol. 2015;22:656–661. doi: 10.1038/nsmb.3071. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Liu MY, DeNizio JE, Schutsky EK, Kohli RM. The expanding scope and impact of epigenetic cytosine modifications. Curr Opin Chem Biol. 2016;33:67–73. doi: 10.1016/j.cbpa.2016.05.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Zheng G, Fu Y, He C. Nucleic acid oxidation in DNA damage repair and epigenetics. Chem Rev. 2014;114:4602–4620. doi: 10.1021/cr400432d. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Hu L, et al. Structural insight into substrate preference for TET-mediated oxidation. Nature. 2015;527:118–122. doi: 10.1038/nature15713. [DOI] [PubMed] [Google Scholar]
  • 14.Hu L, et al. Crystal structure of TET2-DNA complex: insight into TET-mediated 5mC oxidation. Cell. 2013;155:1545–1555. doi: 10.1016/j.cell.2013.11.020. [DOI] [PubMed] [Google Scholar]
  • 15.Lu J, et al. A computational investigation on the substrate preference of ten-eleven-translocation 2 (TET2) Phys Chem Chem Phys. 2016;18:4728–4738. doi: 10.1039/c5cp07266b. [DOI] [PubMed] [Google Scholar]
  • 16.Maiti A, Drohat AC. Thymine DNA Glycosylase Can Rapidly Excise 5-Formylcytosine and 5-Carboxylcytosine: Potential Implications for Active Demethylation of CpG Sites. J Biol Chem. 2011;286:35334–35338. doi: 10.1074/jbc.C111.284620. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Weber AR, et al. Biochemical reconstitution of TET1-TDG-BER-dependent active DNA demethylation reveals a highly coordinated mechanism. Nat Commun. 2016;7:10806. doi: 10.1038/ncomms10806. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Iurlaro M, et al. A screen for hydroxymethylcytosine and formylcytosine binding proteins suggests functions in transcription and chromatin regulation. Genome Biol. 2013;14:R119. doi: 10.1186/gb-2013-14-10-r119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Spruijt CG, et al. Dynamic Readers for 5-(Hydroxy)Methylcytosine and Its Oxidized Derivatives. Cell. 2013;152:1146–1159. doi: 10.1016/j.cell.2013.02.004. [DOI] [PubMed] [Google Scholar]
  • 20.Crawford DJ, et al. Tet2 Catalyzes Stepwise 5-Methylcytosine Oxidation by an Iterative and de novo Mechanism. J Am Chem Soc. 2016;138:730–733. doi: 10.1021/jacs.5b10554. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Fang D, Lord RL, Cisneros GA. Ab initio QM/MM calculations show an intersystem crossing in the hydrogen abstraction step in dealkylation catalyzed by AlkB. J Phys Chem B. 2013;117:6410–6420. doi: 10.1021/jp403116e. [DOI] [PubMed] [Google Scholar]
  • 22.Fang D, Cisneros GA. Alternative Pathway for the Reaction Catalyzed by DNA Dealkylase AlkB from Ab Initio QM/MM Calculations. J Chem Theory Comput. 2014;10:5136–5148. doi: 10.1021/ct500572t. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Hashimoto H, et al. Structure of Naegleria Tet-like dioxygenase (NgTet1) in complexes with a reaction intermediate 5-hydroxymethylcytosine DNA. Nucleic Acids Res. 2015;43:10713–10721. doi: 10.1093/nar/gkv870. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Abdel-Wahab O, et al. Genetic characterization of TET1, TET2, and TET3 alterations in myeloid malignancies. Blood. 2009;114:144–147. doi: 10.1182/blood-2009-03-210039. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Scourzic L, Mouly E, Bernard OA. TET proteins and the control of cytosine demethylation in cancer. Genome Med. 2015;7:9-015-0134-6. doi: 10.1186/s13073-015-0134-6. eCollection 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Cliffe LJ, et al. JBP1 and JBP2 are two distinct thymidine hydroxylases involved in J biosynthesis in genomic DNA of African trypanosomes. Nucleic Acids Res. 2009;37:1452–1462. doi: 10.1093/nar/gkn1067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Bullard W, Lopes da Rosa-Spiegler J, Liu S, Wang Y, Sabatini R. Identification of the glucosyltransferase that converts hydroxymethyluracil to base J in the trypanosomatid genome. J Biol Chem. 2014;289:20273–20282. doi: 10.1074/jbc.M114.579821. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Pei J, Kim BH, Grishin NV. PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 2008;36:2295–2300. doi: 10.1093/nar/gkn072. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Hu X, et al. Tet and TDG mediate DNA demethylation essential for mesenchymal-to-epithelial transition in somatic cell reprogramming. Cell Stem Cell. 2014;14:512–522. doi: 10.1016/j.stem.2014.01.001. [DOI] [PubMed] [Google Scholar]
  • 30.Lian CG, et al. Loss of 5-hydroxymethylcytosine is an epigenetic hallmark of melanoma. Cell. 2012;150:1135–1146. doi: 10.1016/j.cell.2012.07.033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Neri F, et al. TET1 is a tumour suppressor that inhibits colon cancer growth by derepressing inhibitors of the WNT pathway. Oncogene. 2015;34:4168–4176. doi: 10.1038/onc.2014.356. [DOI] [PubMed] [Google Scholar]
  • 32.Gu TP, et al. The role of Tet3 DNA dioxygenase in epigenetic reprogramming by oocytes. Nature. 2011;477:606–610. doi: 10.1038/nature10443. [DOI] [PubMed] [Google Scholar]
  • 33.Liu MY, DeNizio JE, Kohli RM. Quantification of Oxidized 5-Methylcytosine Bases and TET Enzyme Activity. Methods Enzymol. 2016;573:365–385. doi: 10.1016/bs.mie.2015.12.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Chen VB, et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr. 2010;66:12–21. doi: 10.1107/S0907444909042073. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Schafmeister CEAF, Ross WS, Romanovski V. The leap module of AMBER. University of California; San Francisco: 1995. [Google Scholar]
  • 36.Case DA, et al. The Amber biomolecular simulation programs. J Comput Chem. 2005;26:1668–1688. doi: 10.1002/jcc.20290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. J Chem Phys. 1983;79:926. [Google Scholar]
  • 38.Dolinsky TJ, Nielsen JE, McCammon JA, Baker NA. PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations. Nucleic Acids Res. 2004;32:W665–7. doi: 10.1093/nar/gkh381. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Dolinsky TJ, et al. PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res. 2007;35:W522–5. doi: 10.1093/nar/gkm276. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Olsson MH, Sondergaard CR, Rostkowski M, Jensen JH. PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J Chem Theory Comput. 2011;7:525–537. doi: 10.1021/ct100578z. [DOI] [PubMed] [Google Scholar]
  • 41.Bradbrook GM, et al. X-Ray and molecular dynamics studies of concanavalin-A glucoside and mannoside complexes Relating structure to thermodynamics of binding. J Chem Soc, Faraday Trans. 1998;94:1603–1611. [Google Scholar]
  • 42.Oda A, Yamaotsu N, Hirono S. New AMBER force field parameters of heme iron for cytochrome P450s determined by quantum chemical calculations of simplified models. J Comput Chem. 2005;26:818–826. doi: 10.1002/jcc.20221. [DOI] [PubMed] [Google Scholar]
  • 43.Essmann U, et al. A smooth particle mesh Ewald method. J Chem Phys. 1995;103:8577–8593. [Google Scholar]
  • 44.Roe DR. & Cheatham, T.E.,3rd. PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. J Chem Theory Comput. 2013;9:3084–3095. doi: 10.1021/ct400341p. [DOI] [PubMed] [Google Scholar]
  • 45.Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J Mol Graph. 1996;14:33–38. doi: 10.1016/0263-7855(96)00018-5.
  • 46.Johnson ER, et al. Revealing noncovalent interactions. J Am Chem Soc. 2010;132:6498–6506. doi: 10.1021/ja100936w.
  • 47.Contreras-Garcia J, et al. NCIPLOT: a program for plotting non-covalent interaction regions. J Chem Theory Comput. 2011;7:625–632. doi: 10.1021/ct100641a.
  • 48.Graham SE, Syeda F, Cisneros GA. Computational Prediction of Residues Involved in Fidelity Checking for DNA Synthesis in DNA Polymerase I. Biochemistry. 2012;51:2569–2578. doi: 10.1021/bi201856m.
  • 49.Elias AA, Cisneros GA. Computational study of putative residues involved in DNA synthesis fidelity checking in Thermus aquaticus DNA polymerase I. Adv Protein Chem Struct Biol. 2014;96:39–75. doi: 10.1016/bs.apcsb.2014.06.003.
  • 50.Dewage SW, Cisneros GA. Computational analysis of ammonia transfer along two intramolecular tunnels in Staphylococcus aureus glutamine-dependent amidotransferase (GatCAB). J Phys Chem B. 2015;119:3669–3677. doi: 10.1021/jp5123568.
  • 51.Cui Q, Karplus M. Catalysis and specificity in enzymes: a study of triosephosphate isomerase and comparison with methyl glyoxal synthase. Adv Protein Chem. 2003;66:315–372. doi: 10.1016/s0065-3233(03)66008-0.
  • 52.Marti S, et al. Preorganization and reorganization as related factors in enzyme catalysis: the chorismate mutase case. Chemistry. 2003;9:984–991. doi: 10.1002/chem.200390121.
  • 53.Senn HM, O’Hagan D, Thiel W. Insight into enzymatic C-F bond formation from QM and QM/MM calculations. J Am Chem Soc. 2005;127:13643–13655. doi: 10.1021/ja053875s.
  • 54.Cisneros GA, et al. Reaction mechanism of the epsilon subunit of E. coli DNA polymerase III: insights into active site metal coordination and catalytically significant residues. J Am Chem Soc. 2009;131:1550–1556. doi: 10.1021/ja8082818.
  • 55.Maiti A, Michelson AZ, Armwood CJ, Lee JK, Drohat AC. Divergent mechanisms for enzymatic excision of 5-formylcytosine and 5-carboxylcytosine from DNA. J Am Chem Soc. 2013;135:15813–15822. doi: 10.1021/ja406444x.
  • 56.Morgan MT, Bennett MT, Drohat AC. Excision of 5-halogenated uracils by human thymine DNA glycosylase. Robust activity for DNA contexts other than CpG. J Biol Chem. 2007;282:27578–27586. doi: 10.1074/jbc.M704253200.

