Abstract
The epigenetic control of genes by the methylation of cytosine resulting in 5-methylcytosine (5mC) has fundamental implications for human development and disease. Analysis of alterations in DNA methylation patterns is an emerging tool for cancer diagnostics and prognostics. Here we report that two thermostable DNA polymerases, namely the DNA polymerase KlenTaq derived from Thermus aquaticus and the KOD DNA polymerase from Thermococcus kodakaraensis, are able to extend 3′-mismatched primer strands more efficiently from 5mC than from unmethylated C. We advanced this feature by generating a DNA polymerase mutant with further improved 5mC/C discrimination properties and applied it successfully in a novel methylation-specific PCR approach directly from untreated human genomic DNA.
Keywords: DNA methylation, DNA polymerases, enzyme engineering, polymerase chain reaction
Methylation of cytosines is a major epigenetic mark and the most abundant DNA modification in vertebrates.[1] Methylated cytosines are found as symmetrical 5-methylcytosine (5mC) in the dinucleotide CpG within promoter regions,[2] and 75 % of CpG dinucleotides are methylated throughout the mammalian genome. In recent years, it has become evident that promoter methylation is crucial for activating and silencing gene expression.[3] Moreover, dynamic changes of methylation patterns are important for the development of mammals.[4] They control, for example, X-inactivation,[5] genomic imprinting,[5] and the development of primordial germ cells.[6] Furthermore, alterations in DNA methylation can be an integral event in the onset of diseases like cancer.[7] The discovery that particular hypo- or hypermethylation events are unique for human malignancy[8] renders 5mC a promising biomarker for cancer diagnosis. Indeed, many DNA-methylation-based biomarkers have been evaluated in cancer research.[9] Accordingly, several methods for the detection of 5mC in the human genome are employed.[10] The main approaches rely on endonuclease digestion,[11] affinity enrichment,[12] and bisulfite conversion.[13] The most common method to obtain single-nucleotide resolution is based on treatment of the sample with bisulfite, resulting in the conversion of cytosine to uracil whereas 5mC remains unchanged;[14] this is followed by DNA sequencing or PCR amplification. All available methods, however, have significant drawbacks.
While affinity enrichment falls short in yielding information about individual CpG dinucleotides, endonuclease- and bisulfite-based methods are prone to false-positive results due to incomplete conversion.[15] The necessity of implementing pretreatment and purification steps results in a higher risk of contamination[16] as well as loss of DNA sample.[17] Taken together, the laborious, time-consuming,[18] and error-prone nature of DNA methylation assays is a significant barrier to their adoption in practical clinical diagnostics. Here, we demonstrate the feasibility of a superior DNA methylation profiling approach directly from genomic samples.
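For orientation, the bisulfite readout described above (unmethylated C is converted to U, whereas 5mC is left unchanged) can be sketched in a few lines of Python. This is only an idealized illustration that assumes complete conversion; the function name, the example sequence, and the encoding of methylated positions as a set of indices are assumptions made here and are not taken from the article.

```python
def bisulfite_convert(seq, methylated_positions):
    """Idealized bisulfite treatment of a single DNA strand.

    seq: strand written 5'->3' using the letters A, C, G, T.
    methylated_positions: 0-based indices of cytosines that carry a
        5-methyl group (hypothetical encoding chosen for this sketch).
    Unmethylated C becomes U; 5mC and all other bases are unchanged.
    """
    converted = []
    for index, base in enumerate(seq):
        if base == "C" and index not in methylated_positions:
            converted.append("U")  # unmethylated cytosine is deaminated
        else:
            converted.append(base)  # 5mC (and A, G, T) stay as they are
    return "".join(converted)


# Hypothetical 8-nt strand in which only the cytosine at index 1 is methylated.
example = bisulfite_convert("ACGTCCGA", {1})
```

Incomplete conversion, the source of false positives mentioned above, would correspond to some unmethylated cytosines remaining as C in the output.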
In order to examine the behavior of two thermostable DNA polymerases when encountering a primer bearing a mismatch at its 3′-terminus opposite either C or 5mC, we investigated the extension from four primers differing in their 3′-terminal nucleotide (A-, C-, G-, or T-primer) paired with two different oligonucleotide templates that carry either a C or a 5mC opposite the primer end (Figure 1 b). To this end, we performed single-nucleotide-incorporation experiments followed by analysis through denaturing polyacrylamide gel electrophoresis (PAGE) and visualization using autoradiography. Figure 1 c shows extensions of the four different primers by the sequence family A DNA polymerase KlenTaq.[19] Whereas there is no difference in extending from C or 5mC when the matched G-primer is used, the A- and T-primers are more efficiently extended in the case of the 5mC template. Under the chosen conditions, the extension for the A-primer was 56 % when paired with the 5mC template in comparison to 18 % when paired with the C template. The primer bearing C at its 3′-terminus was not extended for either template. Along these lines, we also tested an exonuclease-deficient variant of a DNA polymerase from another sequence family, the family B KOD DNA polymerase (KOD exo-)[20] originating from Thermococcus kodakaraensis. We found that discrimination between the mispaired C and 5mC templates is also observed with KOD exo-, albeit to a lower extent than with KlenTaq (47 % extension of the A-primer when paired with 5mC template as compared to 34 % with C template; Figure 1 d).
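To make the comparison between the two wild-type enzymes concrete, the extension yields quoted above can be turned into a simple ratio. The following Python sketch uses only the percentages reported in the text; the "discrimination ratio" (yield on 5mC divided by yield on C) is an illustrative metric introduced here and is not a quantity defined in the article.

```python
# Extension of the mismatched A-primer (%), as reported in the text above.
extension_percent = {
    "KlenTaq": {"5mC": 56, "C": 18},
    "KOD exo-": {"5mC": 47, "C": 34},
}


def discrimination_ratio(yields):
    """Ratio of primer extension on the 5mC template to that on the C template.

    This ratio is an illustrative metric chosen for this sketch, not one
    defined in the article.
    """
    return yields["5mC"] / yields["C"]


ratios = {
    enzyme: round(discrimination_ratio(yields), 2)
    for enzyme, yields in extension_percent.items()
}
```

With these numbers the ratio is roughly 3.1 for KlenTaq and roughly 1.4 for KOD exo-, consistent with the statement that KOD exo- discriminates to a lower extent.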
Next, we aimed at generating DNA polymerase mutants with enhanced discrimination. We inspected previously published KlenTaq and KOD exo- crystal structures with bound primer–template complexes,[21] looking for amino acids that might be able to interfere with the mismatched cytosine, with the aim of replacing them with sterically more demanding residues. We came to the conclusion that the template binding cleft of KlenTaq is already packed with sterically demanding amino acids. For KOD exo-, however, we identified a glycine (G498) in immediate proximity to the mismatched cytosine (Figure 2 a). Notably, G498 is located in the template binding site where it is able to interact with the phosphate backbone of the nucleotide paired to the primer’s 3′-terminal nucleobase (Figure 2 a). We reasoned that mutating G498 into a sterically more demanding amino acid would result in a more crowded template binding site, which might affect discrimination between C and 5mC upon extension from mismatched primer strands. In fact, we found that mutating G498 into methionine results in a KOD exo- variant that exhibits the desired properties. The KOD exo- G498M mutant was obtained by site-directed mutagenesis followed by expression in E. coli BL21 as previously described.[21b, 22] The purified KOD exo- wild-type and KOD G498M enzymes were analyzed by SDS PAGE (Figure 2 b) and compared using the very same reaction conditions and enzyme concentrations. First, we applied KOD G498M in primer-extension experiments as described above. The engineered KOD G498M features considerably increased discrimination between the C and 5mC template as compared to that of the wild-type enzyme when the mismatched A-primer was used (44 % extension rate for 5mC as compared to 20 % for C) (Figure 3 a). This effect is even more pronounced for full-length primer extension.
Here, 58 % of the applied primer was extended to the full-length product on the 5mC template as compared to 12 % on the C template (Figure 3 b). Interestingly, when the mismatched reaction was performed with template C a significant pausing of the enzyme was observed after incorporation of one nucleotide (Figure 3 b), a result that was not observed to this extent with the 5mC template. We determined steady-state kinetics[23] for primer extension by the incorporation of a single nucleotide (Table S1). All three enzymes were found to be equally efficient for either template when the matched G-primer was used (Table S1, Figure S1, and Figure S2). When comparing extension from the mismatched C template to the mismatched 5mC template for KOD exo- wild-type, we found that the discrimination is mainly based on differences in Km. In contrast, for KOD G498M as well as for KlenTaq wild-type we found that the differences are mainly based on kcat (Table S1, Figure S1, and Figure 3 c,d).
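For readers less familiar with steady-state analysis, the roles of Km and kcat in such a comparison can be written out as follows. This is the standard Michaelis–Menten treatment, given here as a sketch; the symbols v (initial velocity), [E]_0 (total enzyme concentration), and D (discrimination factor) are notation assumed for this illustration, and no values from Table S1 are implied.

```latex
v = \frac{k_{\mathrm{cat}}\,[E]_0\,[\mathrm{dNTP}]}{K_{\mathrm{m}} + [\mathrm{dNTP}]},
\qquad
D = \frac{\left(k_{\mathrm{cat}}/K_{\mathrm{m}}\right)_{\mathrm{5mC}}}
         {\left(k_{\mathrm{cat}}/K_{\mathrm{m}}\right)_{\mathrm{C}}}
```

In this notation, a difference in Km between the two templates changes D through the denominators of the efficiency terms, whereas a difference in kcat changes D through the numerators.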
We examined the potential to exploit these differences in the catalytic efficiency of processing 5mC and C in PCR experiments on human genomic DNA (gDNA). To this end, we analyzed a particular CpG site in the promoter region of NANOG in HeLa gDNA. NANOG is an epigenetically regulated gene which is associated with pluripotency of cells[24] and was found to be hypomethylated in its promoter region in metastatic human liver cancer cells.[25] Previous evaluation of the analyzed methylation site by bisulfite sequencing characterized it as unmethylated in HeLa cells.[25] As a control for full methylation we employed HeLa genomic DNA that was enzymatically methylated in vitro with a CpG methylase. We designed forward primers bearing either G or A at their 3′-termini opposite the cytosine of interest and a reverse primer binding 43 nt downstream so that PCR amplification should deliver an 86 bp amplicon (Figure S3). PCR employing the wild-type KOD exo- or KlenTaq enzymes did not show any differentiation in PCR efficiency. However, the use of KOD G498M DNA polymerase instead revealed that this variant is indeed capable of discriminating between methylated and unmethylated cytosine (Figure 4). While real-time PCR and melting curves are similar for untreated and CpG-methylated DNA when matched primers are used (i.e., terminating with G), employment of the mismatched primer (i.e., terminating with A) leads to delayed amplification as well as reduced endpoint fluorescence when the untreated DNA is compared to the CpG-methylated DNA (Figure 4 a). Melting curves of the amplified DNA confirm this decrease in PCR efficiency for the non-methylated template (Figure 4 b). Finally, quantitative analysis of the reaction mixtures by agarose gel electrophoresis substantiates these findings. As seen in Figure 4 c, the yield of a specific amplicon for the methylated template is reasonable considering we employed a mismatched primer.
For the unmethylated template, however, the yield is significantly reduced. Moreover, KOD G498M is highly selective for the desired amplicon when the methylated template is used, whereas a faint byproduct band appears when the unmethylated template is used. Primer dimers are formed in PCR without HeLa genomic DNA but do not emerge in the presence of template.
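The primer design described above, with forward primers ending in either G (matched) or A (mismatched) opposite the cytosine of interest, can be sketched as follows. This is a minimal illustration under stated assumptions: the template sequence, the primer length, and the function names are hypothetical and are not the NANOG sequences or primers used in this work (those are given in Figure S3).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def allele_primers(template, c_index, length=20):
    """Build matched (3'-G) and mismatched (3'-A) primers for one cytosine.

    template: the strand that carries the cytosine of interest, 5'->3'.
    c_index: 0-based position of that cytosine in `template`.
    length: primer length in nucleotides (hypothetical default).
    Both primers anneal to `template` so that their 3'-terminal
    nucleotide sits opposite the cytosine of interest.
    """
    if template[c_index] != "C":
        raise ValueError("c_index must point at a cytosine")
    binding_site = template[c_index:c_index + length]
    if len(binding_site) < length:
        raise ValueError("template too short for the requested primer length")
    matched = reverse_complement(binding_site)  # 3'-terminal base is G
    mismatched = matched[:-1] + "A"             # 3'-terminal base swapped to A
    return matched, mismatched


# Hypothetical template with the cytosine of interest at index 3.
matched, mismatched = allele_primers("TTACGGATCCAGTCATGCAAGCTTGGA", 3, length=10)
```

The two primers differ only in their 3′-terminal nucleotide, so any difference in amplification between them reflects how the polymerase handles the terminal base pair or mismatch.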
So far, detection of 5mC with single-base resolution has required the troublesome conversion of DNA by bisulfite or some other manipulation prior to analysis. Here we demonstrate the feasibility of a PCR system sensing 5mC directly from genomic DNA. The approach is based on the differential extension of mismatched primer strands by two well-studied DNA polymerases depending on whether the primer terminates opposite a template 5mC or C. KlenTaq was investigated as a prominent member of sequence family A and an exonuclease-deficient variant of KOD DNA polymerase as a representative of the sequence family B.[26] Both DNA polymerases are thermostable and commonly used in many core biotechnological applications. The discrimination between C and 5mC was greater with KlenTaq than with KOD exo-. Interestingly, KlenTaq wild-type exhibits a more sterically crowded environment close to the relevant template site in comparison to KOD exo-. This might well explain the enhanced discriminating properties of KlenTaq wild-type compared to KOD exo- wild-type. However, we were able to enhance the discrimination of KOD exo- by increasing the steric crowding of the template binding site in close vicinity to the residues bearing the C or 5mC of interest. Primer extensions in the presence of all four dNTPs indicate that KOD exo- G498M discriminates between C and 5mC not only upon extension by one nucleotide but also when the mismatch is already one position distal from the primer terminus (Figure 3 b). Furthermore, KOD G498M is capable of discriminating between C and 5mC in PCR from a genomic DNA target. On this basis, the methylation state of a single nucleotide in the entire human genome can be ascertained by a single PCR step.
As crystal structures of mismatched primer template complexes bound to DNA polymerases have not yet been described, one can only speculate about the origin of the 5mC/C discrimination. It has been proposed that when a mismatch is encountered, active misalignment of catalytic residues in the DNA polymerase results in reduced incorporation rate and therefore higher specificity of the enzyme.[27] Thus, discrimination could derive from differences in the gained substrate binding energy by misalignment of catalytic residues. Thereby, the enzyme would be kept in an inactive conformation more efficiently when bound to mismatched C rather than mismatched 5mC, resulting in a favored extension of 5mC. The differential states might be due to steric interference between the enzyme and the additional 5-methyl group in 5mC.
In the past, the possibility of different extension efficiencies was discussed as a mechanism for the increase of the mutation rates of 5mC in vivo. In 1992, Shen et al. investigated three DNA polymerases for their ability to extend mismatched primer strands at the C and 5mC position.[28] They found significant differences in mismatch extension only when using AMV reverse transcriptase. However, no significant discrimination was observed for a sequence family A DNA polymerase (i.e., the exonuclease-deficient Klenow fragment of E. coli DNA polymerase I) and a sequence family B enzyme (i.e., Drosophila DNA polymerase α). As mentioned, the enzymes investigated herein belong to sequence family A (KlenTaq) and B (KOD exo-) and show discrimination. These findings suggest that subtle changes in the enzyme scaffold might alter C/5mC discrimination. Thus, future investigations will aim at evolving new DNA polymerase mutants with even more enhanced discrimination for application in methylation-specific PCR approaches.[29]
Supporting information for this article is available on the WWW under http://dx.doi.org/10.1002/anie.201403745.
References
- [1] Bird AP. Nucleic Acids Res. 1980;8:1499–1504. doi: 10.1093/nar/8.7.1499.
- [2] Ehrlich M, Wang R. Science. 1981;212:1350–1357. doi: 10.1126/science.6262918.
- [3] Jones PA, Takai D. Science. 2001;293:1068–1070. doi: 10.1126/science.1063852.
- [4] Li E, Bestor TH, Jaenisch R. Cell. 1992;69:915–926. doi: 10.1016/0092-8674(92)90611-f.
- [5] Li E. Nat. Rev. Genet. 2002;3:662–673. doi: 10.1038/nrg887.
- [6a] Ooi S, Wolf D, Hartung O, Agarwal S, Daley G, Goff S, Bestor T. Epigenet. Chromatin. 2010;3:17. doi: 10.1186/1756-8935-3-17.
- [6b] Weber M, Schübeler D. Curr. Opin. Cell Biol. 2007;19:273–280. doi: 10.1016/j.ceb.2007.04.011.
- [7a] Estécio MRH, Gallegos J, Vallot C, Castoro RJ, Chung W, Maegawa S, Oki Y, Kondo Y, Jelinek J, Shen L, Hartung H, Aplan PD, Czerniak BA, Liang S, Issa J-PJ. Genome Res. 2010;20:1369–1382. doi: 10.1101/gr.107318.110.
- [7b] Jones P, Baylin S. Nat. Rev. Genet. 2002;3:415–428. doi: 10.1038/nrg816.
- [8a] Esteller M, Garcia-Foncillas J, Andion E, Goodman SN, Hidalgo OF, Vanaclocha V, Baylin SB, Herman JG. N. Engl. J. Med. 2000;343:1350–1354. doi: 10.1056/NEJM200011093431901.
- [8b] Esteller M, Silva JM, Dominguez G, Bonilla F, Matias-Guiu X, Lerma E, Bussaglia E, Prat J, Harkes IC, Repasky EA, Gabrielson E, Schutte M, Baylin SB, Herman JG. J. Natl. Cancer Inst. 2000;92:564–569. doi: 10.1093/jnci/92.7.564.
- [9a] Heyn H, Esteller M. Nat. Rev. Genet. 2012;13:679–692. doi: 10.1038/nrg3270.
- [9b] Laird P. Nat. Rev. Cancer. 2003;3:253–266. doi: 10.1038/nrc1045.
- [10] Laird P. Nat. Rev. Genet. 2010;11:191–203. doi: 10.1038/nrg2732.
- [11] Kaput J, Sneider TW. Nucleic Acids Res. 1979;7:2303–2322. doi: 10.1093/nar/7.8.2303.
- [12] Cross SH, Charlton JA, Nan X, Bird AP. Nat. Genet. 1994;6:236–244. doi: 10.1038/ng0394-236.
- [13] Frommer M, McDonald LE, Millar DS, Collis CM, Watt F, Grigg GW, Molloy PL, Paul CL. Proc. Natl. Acad. Sci. USA. 1992;89:1827–1831. doi: 10.1073/pnas.89.5.1827.
- [14] Hayatsu H, Wataya Y, Kai K, Iida S. Biochemistry. 1970;9:2858–2865. doi: 10.1021/bi00816a016.
- [15] Harrison J, Stirzaker C, Clark SJ. Anal. Biochem. 1998;264:129–132. doi: 10.1006/abio.1998.2833.
- [16] Genereux DP, Johnson WC, Burden AF, Stöger R, Laird CD. Nucleic Acids Res. 2008;36:e150. doi: 10.1093/nar/gkn691.
- [17] Grunau C, Clark SJ, Rosenthal A. Nucleic Acids Res. 2001;29:e65. doi: 10.1093/nar/29.13.e65.
- [18] Clark SJ, Statham A, Stirzaker C, Molloy PL, Frommer M. Nat. Protoc. 2006;1:2353–2364. doi: 10.1038/nprot.2006.324.
- [19a] Barnes WM. Gene. 1992;112:29–35. doi: 10.1016/0378-1119(92)90299-5.
- [19b] Korolev S, Nayal M, Barnes WM, Di Cera E, Waksman G. Proc. Natl. Acad. Sci. USA. 1995;92:9264–9268. doi: 10.1073/pnas.92.20.9264.
- [20a] Hashimoto H, Nishioka M, Fujiwara S, Takagi M, Imanaka T, Inoue T, Kai Y. J. Mol. Biol. 2001;306:469–477. doi: 10.1006/jmbi.2000.4403.
- [20b] Takagi M, Nishioka M, Kakihara H, Kitabayashi M, Inoue H, Kawakami B, Oka M, Imanaka T. Appl. Environ. Microbiol. 1997;63:4504–4510. doi: 10.1128/aem.63.11.4504-4510.1997.
- [20c] Nishioka M, Mizuguchi H, Fujiwara S, Komatsubara S, Kitabayashi M, Uemura H, Takagi M, Imanaka T. J. Biotechnol. 2001;88:141–149. doi: 10.1016/s0168-1656(01)00275-9.
- [21a] Betz K, Malyshev DA, Lavergne T, Welte W, Diederichs K, Dwyer TJ, Phillip O, Romesberg FE, Marx A. Nat. Chem. Biol. 2012;8:612–614. doi: 10.1038/nchembio.966.
- [21b] Bergen K, Betz K, Welte W, Diederichs K, Marx A. ChemBioChem. 2013;14:1058–1062. doi: 10.1002/cbic.201300175.
- [22a] Obeid S, Blatter N, Kranaster R, Schnur A, Diederichs K, Welte W, Marx A. EMBO J. 2010;29:1738–1747. doi: 10.1038/emboj.2010.64.
- [22b] Betz K, Streckenbach F, Schnur A, Exner T, Welte W, Diederichs K, Marx A. Angew. Chem. 2010;122:5308–5311. doi: 10.1002/anie.200905724; Angew. Chem. Int. Ed. 2010;49.
- [23a] Boosalis MS, Petruska J, Goodman MF. J. Biol. Chem. 1987;262:14689–14696.
- [23b] Creighton S, Huang MM, Cai H, Arnheim N, Goodman MF. J. Biol. Chem. 1992;267:2633–2639.
- [23c] Petruska J, Goodman MF, Boosalis MS, Sowers LC, Cheong C, Tinoco I. Proc. Natl. Acad. Sci. USA. 1988;85:6252–6256. doi: 10.1073/pnas.85.17.6252.
- [24a] Yu J, Vodyanik MA, Smuga-Otto K, Antosiewicz-Bourget J, Frane JL, Tian S, Nie J, Jonsdottir GA, Ruotti V, Stewart R, Slukvin II, Thomson JA. Science. 2007;318:1917–1920. doi: 10.1126/science.1151526.
- [24b] Tay Y, Zhang J, Thomson AM, Lim B, Rigoutsos I. Nature. 2008;455:1124. doi: 10.1038/nature07299.
- [25] Wang XQ, Ng RK, Ming X, Zhang W, Chen L, Chu ACY, Pang R, Lo CM, Tsao SW, Liu X, Poon RTP, Fan ST. PLoS One. 2013;8:e72435. doi: 10.1371/journal.pone.0072435.
- [26] Braithwaite DK, Ito J. Nucleic Acids Res. 1993;21:787–802. doi: 10.1093/nar/21.4.787.
- [27] Tsai Y-C, Johnson KA. Biochemistry. 2006;45:9675–9687. doi: 10.1021/bi060993z.
- [28] Shen J-C, Creighton S, Jones PA, Goodman MF. Nucleic Acids Res. 1992;20:5119–5125. doi: 10.1093/nar/20.19.5119.
- [29a] Gloeckner C, Kranaster R, Marx A. Curr. Protoc. Chem. Biol. 2010;2:89–109. doi: 10.1002/9780470559277.ch090183.
- [29b] Gloeckner C, Sauter KBM, Marx A. Angew. Chem. 2007;119:3175–3178; Angew. Chem. Int. Ed. 2007;46.
- [29c] Gieseking S, Bergen K, Di Pasquale F, Diederichs K, Welte W, Marx A. J. Biol. Chem. 2011;286:4011–4020. doi: 10.1074/jbc.M110.176826.
- [29d] Chen T, Romesberg FE. FEBS Lett. 2014;588:219–229. doi: 10.1016/j.febslet.2013.10.040.