Spontaneous methylation of DNA by endogenous methylating agents generates a variety of genotoxic adducts, the most prevalent of which is N7-methylguanine (m7G).1 Apart from its occurrence as a natural lesion, m7G generated via deliberate DNA methylation has seen widespread use as a probe of protein–DNA interactions2 and as a key component of the Maxam–Gilbert DNA sequencing method, which enabled the first experimental determination of gene sequences. Most organisms express DNA glycosylase enzymes that locate m7G residues and excise them (eq 1); the best studied of these are E. coli AlkA3 and human Aag,4 enzymes that are functionally equivalent but structurally dissimilar.
[Eq 1: excision of m7G from DNA by a DNA glycosylase; scheme image not reproduced]
The molecular basis for recognition of m7G residues amid a million-fold excess of normal G, and for the initiation of base excision repair by AlkA and Aag, is poorly understood, due in no small part to the lack of an efficient, scalable method by which to incorporate m7G into DNA; the extreme instability of m7G residues toward both the basic and acidic conditions employed in solid-phase DNA synthesis has heretofore precluded use of that method. Our laboratory previously developed a chemoenzymatic method for introducing m7G into DNA,5 but that method is not applicable to the scale and sequence requirements of crystallographic studies. Here we report site-specific incorporation of m7G into DNA via solid-phase synthesis, made possible through a combination of 2′-fluorination of its 2′-deoxyribose moiety to provide acid stability, and mild nonaqueous deprotection to provide base stability. Using this synthetic method in combination with a recently developed crystallographic procedure for obtaining structures of lesions in naked DNA, we report the first structures that reveal the nature of the m7G:C base pair in duplex DNA.
Fluorine substitution at the 2′-position in 2′-deoxy nucleosides is well-known to slow or even abrogate enzymatic catalysis of N-glycosidic bond cleavage, presumably via destabilization of the obligate oxocarbenium ion intermediate.6 Since both enzymatic and nonenzymatic depurination of m7G 2′-deoxyriboside (m7dG) proceed through such an intermediate, we reasoned that 2′-fluorination of m7dG would most likely retard both processes, providing a useful means to incorporate the lesion into DNA and study its enzymatic repair. We chose to 2′-fluorinate m7dG in the arabino (β) configuration because this configuration was believed not to alter sugar puckering in DNA.7
The synthesis of 2′-deoxy-2′-fluoro-N7-methylguanosine (Fm7dG) phosphoramidite 7 started with the fluorination of commercially available ribose derivative 1 with (diethylamino)sulfur trifluoride (DAST) followed by bromination to give the known glycosyl bromide 2 (Scheme 1).8 N-Glycosidation of the bromide 2 with the sodium salt of 2-amino-6-chloropurine gave mainly the desired β-anomer 3, which was treated with NaOBn to install the 6-benzyloxy group with concomitant benzoate deprotection. Phenoxyacetylation of 4 under conditions of transient TMS protection, followed by reductive removal of the benzyl moiety, afforded 5. Regioselective alkylation of 5 with iodomethane provided 2′-deoxy-2′-fluoro-N7-methylguanosine 6 in near quantitative yield. Tritylation of the 5′-OH followed by phosphitylation of the 3′-OH yielded the desired Fm7dG phosphoramidite 7. Resin-bound Fm7dG-containing oligonucleotides 8 were released and deprotected by treatment with K2CO3 in MeOH and were purified by urea-PAGE. A 12-mer version of 8, 5′-GACATGA(Fm7dG)TGCC-3′, was used for crystallographic studies, and a 25-mer, 5′-CGATAGCATCCT(Fm7dG)CCTTCTCTCCAT-3′, for biochemical assays. The masses of the 12-mer and the 25-mer were verified by MALDI-TOF analysis (the 12-mer: calculated 3701, observed 3703; the 25-mer: calculated 7560, observed 7565), and nucleoside composition analysis of both oligonucleotides revealed the presence of Fm7dG in the expected ratio relative to the other four bases (see Supporting Information).
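The length and mass checks reported above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the reported procedure: the helper names `residues` and `mass_error` are hypothetical, the sequences and masses are taken from the text, and the parenthesized Fm7dG is assumed to count as a single residue.

```python
import re

def residues(sequence):
    """Split a 5'->3' sequence into residues.

    A parenthesized name such as (Fm7dG) is treated as one residue
    (an assumption made for this illustration).
    """
    return re.findall(r"\([^)]+\)|[ACGT]", sequence)

def mass_error(calculated, observed):
    """Return (observed - calculated, the same difference in ppm of calculated)."""
    delta = observed - calculated
    return delta, 1e6 * delta / calculated

# Sequences and MALDI-TOF values as quoted in the text.
oligos = {
    "12-mer": ("GACATGA(Fm7dG)TGCC", 3701, 3703),
    "25-mer": ("CGATAGCATCCT(Fm7dG)CCTTCTCTCCAT", 7560, 7565),
}

for name, (seq, calc, obs) in oligos.items():
    delta, ppm = mass_error(calc, obs)
    print(f"{name}: {len(residues(seq))} residues, "
          f"observed - calculated = {delta:+d}, {ppm:.0f} ppm")
```

Both observed masses lie within roughly 0.07% of the calculated values, and the residue counts match the stated oligonucleotide lengths.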
We hybridized the Fm7dG-containing 25-mer with its complementary strand and investigated the ability of AlkA to catalyze glycosidic bond cleavage of this 2′-fluorinated DNA substrate analogue (see Supporting Information). The glycosylase reaction catalyzed by AlkA generates an abasic site, which is degraded by treatment with aqueous NaOH to effect site-specific DNA strand scission. The cleavage product migrates faster than the full-length strand on a polyacrylamide gel. We tested the Fm7dG 25-mer alongside a well-validated AlkA substrate, a 25-mer containing hypoxanthine (HX) in place of Fm7dG; both duplexes contained cytosine opposite the Fm7dG or HX. While the HX duplex DNA was rapidly processed by AlkA, the Fm7dG duplex DNA was inert to AlkA, showing no sign of cleavage even after 18 h.
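For orientation, the partner strand of the 25-mer can be written out programmatically. This is a sketch under stated assumptions: the text specifies only that cytosine sits opposite Fm7dG, so the remaining positions are assumed here to be exact Watson–Crick complements, and the function name `complementary_strand` is hypothetical.

```python
import re

# Pairing table. The (Fm7dG) -> C entry follows the text; the assumption
# that every other position is a perfect Watson-Crick complement is ours.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "(Fm7dG)": "C"}

def complementary_strand(sequence):
    """Return the 5'->3' complementary strand of a 5'->3' sequence."""
    tokens = re.findall(r"\([^)]+\)|[ACGT]", sequence)
    return "".join(PAIR[token] for token in reversed(tokens))

strand = "CGATAGCATCCT(Fm7dG)CCTTCTCTCCAT"
partner = complementary_strand(strand)
print(f"5'-{partner}-3'")
```

Because the lesion occupies position 13 of 25, the cytosine that pairs with it also falls at the central position of the partner strand.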
To obtain a structure of Fm7dG-containing DNA, we employed a host–guest complex crystallization (HGC) system9 in which four molecules of AlkA bind the ends of two duplex DNA molecules.10 AlkA prefers to crystallize in this end-bound mode even with DNA containing cognate lesions in the center of the duplex, as with the Fm7dG-containing 12-mer employed in our studies. AlkA shows no detectable energetic preference to bind DNA containing a cognate lesion versus nonlesion-containing DNA,3e and this therefore provides bias toward crystallization as an end-bound complex, as opposed to a lesion-specific recognition complex. The end-bound crystal packing arrangement leaves the entire middle portion of the duplex nearly devoid of protein contacts, thereby enabling the facile crystallization and structure determination of “naked” lesion-containing DNA molecules, which are ordinarily very difficult to crystallize, especially in the physiologic B-form conformation.
We refined the structure of the AlkA-Fm7dG complex to 2.9 Å resolution and found that it had indeed crystallized as an HGC, with the protein anchoring the DNA ends via the interrogating residue Leu125 (ref 3c) and making limited contacts to the DNA phosphate backbone via its helix–hairpin–helix domain (Figure 1B). The structure reveals, for the first time, the base pairing characteristics of m7G:C in DNA (Figure 1C and D).11 The duplex DNA in the complex has a normal B-form conformation throughout, with an average rise of 3.3 Å per residue and no obvious deformations save for a gentle bend of 16°. The base pairing mode of m7G:C is similar to that of G:C, with similar degrees of propeller-twisting and base-stacking; the rmsd between the two pairs is 0.30 Å, with the main difference being that the distance between O6 (m7dG) and NH2 (dC) in Fm7dG:C is slightly longer than that of G:C (3.4 Å vs ~3.0 Å). The sugar of the Fm7dG has an O4′-endo pucker, instead of the more common C2′-endo pucker; we have recently observed this alternate pucker in several other structures of 2′-β-fluorinated nucleosides in DNA,10 and thus we believe it is the preferred pucker for such analogues.
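The rmsd quoted above is a root-mean-square deviation over matched atoms. A minimal sketch of that calculation follows, assuming the two coordinate sets have already been superimposed and list the same atoms in the same order; the superposition step is omitted, the function name `rmsd` is hypothetical, and the coordinates are toy values, not data from the structure.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes the sets are already superimposed and in the same atom order.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))

# Toy coordinates for illustration only: the second set is the first
# shifted by 2 units along z, so every atom deviates by exactly 2.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0)]
print(rmsd(toy_a, toy_b))
```

In practice the atoms of the two base pairs would first be optimally superimposed, for example with a least-squares fit, before this deviation is evaluated.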
The absence of any DNA structural distortion induced by m7G, together with the observation that the lesion actually has a stabilizing influence on the double helix,5 underscores the challenge faced by AlkA and Aag in conducting an efficient search of the genome for this rare lesion. It is possible that these enzymes, like oxoguanine DNA glycosylase,12 locate intrahelical m7G residues only upon performing a muscular interrogation of the DNA helix to expose the lesions hidden therein.
Acknowledgments
This work was supported by the NIH (CA100742). We thank the staff of beamlines 24-ID and 19-ID of the Advanced Photon Source at Argonne National Laboratory for expert assistance with X-ray data collection.
Footnotes
Supporting Information Available: Synthetic procedures, 1H and HRMS data for all new compounds. This material is available free of charge via the Internet at http://pubs.acs.org.
References
- 1. (a) Lindahl T. Nature. 1993;362:709. doi: 10.1038/362709a0. (b) Sedgwick B. Nat Rev Mol Cell Biol. 2004;5:148. doi: 10.1038/nrm1312.
- 2. (a) Maxam AM, Gilbert W. Methods Enzymol. 1980;65:499. doi: 10.1016/s0076-6879(80)65059-9. (b) Gilbert W, Siebenlist U. Proc Natl Acad Sci USA. 1980;77:123. doi: 10.1073/pnas.77.1.122. (c) Hayashibara KC, Verdine GL. J Am Chem Soc. 1991;113:5104.
- 3. (a) Thomas I, Yang CH, Goldthwait DA. Biochemistry. 1982;21:1162. doi: 10.1021/bi00535a009. (b) Evensen G, Seeberg E. Nature. 1982;296:773. doi: 10.1038/296773a0. (c) Labahn J, Schärer OD, Long A, Ezaz-Nikpay K, Verdine GL, Ellenberger TE. Cell. 1996;86:321. doi: 10.1016/s0092-8674(00)80103-8. (d) Hollis T, Ichikawa Y, Ellenberger T. EMBO J. 2000;19:758. doi: 10.1093/emboj/19.4.758. (e) O’Brien PJ, Ellenberger TE. J Biol Chem. 2004;279:26876. doi: 10.1074/jbc.M403860200.
- 4. (a) Lau AY, Schärer OD, Samson L, Verdine GL, Ellenberger T. Cell. 1998;95:249. doi: 10.1016/s0092-8674(00)81755-9. (b) Lau AY, Wyatt MD, Glassner BJ, Samson LD, Ellenberger T. Proc Natl Acad Sci USA. 2000;97:13573. doi: 10.1073/pnas.97.25.13573. (c) Saparbaev M, Laval J. Proc Natl Acad Sci USA. 1994;91:5873. doi: 10.1073/pnas.91.13.5873.
- 5. Ezaz-Nikpay K, Verdine GL. J Am Chem Soc. 1992;114:6562.
- 6. (a) Marquez VE, Tseng CK-H, Mitsuya H, Aoki S, Kelley JA, Ford H, Jr, Roth JS, Broder S, Johns DG, Driscoll JS. J Med Chem. 1990;33:978. doi: 10.1021/jm00165a015. (b) York JL. J Org Chem. 1981;46:2171. (c) Schärer OD, Verdine GL. J Am Chem Soc. 1995;117:10781. (d) Ikeda H, Fernandez R, Eilk A, Jr, Marquez VE. Nucleic Acids Res. 1998;26:2237. doi: 10.1093/nar/26.9.2237. (e) Sinnott ML. Chem Rev. 1990;90:1171.
- 7. Berger I, Tereshko V, Ikeda H, Marquez VE, Egli M. Nucleic Acids Res. 1998;26:2473. doi: 10.1093/nar/26.10.2473.
- 8. Howell HG, Brodfuehrer PR, Brundidge SP, Benigni DA, Sapino C, Jr. J Org Chem. 1988;53:85.
- 9. Goodwin KD, Lewis MA, Tanious FA, Tidwell RR, Wilson WD, Georgiadis MM, Long EC. J Am Chem Soc. 2006;128:7846. doi: 10.1021/ja0600936.
- 10. Bowman BR, Lee S, Wang S, Verdine GL. Structure. 2008, in press.
- 11. The DNA duplex containing chains G and H (PDB ID: 3D4V) was chosen for graphical representation and detailed analysis, because it showed more complete electron density and substantially lower B-factors than the other DNA duplex (chains E and F) in the asymmetric unit.
- 12. Banerjee A, Yang W, Karplus M, Verdine GL. Nature. 2005;434:612. doi: 10.1038/nature03458.