Abstract
Somatic hypermutation introduces programmed base substitutions into the variable regions of Ig genes to generate high-affinity antibodies. Two motifs, RGYW and WA (R, purine; Y, pyrimidine; W, A or T), have been found to be somatic hypermutation hotspots. Overwhelming evidence suggests that DNA polymerase η (Pol η) is responsible for converting the WA motif to WG by misincorporating dGTP opposite the templating T. To elucidate the molecular mechanism, crystal structures and kinetics of human Pol η substituting dGTP for dATP in four sequence contexts, TA, AA, GA, and CA, have been determined and compared. The T:dGTP wobble base pair is stabilized by Gln-38 and Arg-61, two residues uniquely conserved among Pol η homologs. Weak base pairing of the W (T:A or A:T) at the primer end and its distinct interactions with Pol η lead to misincorporation of G in the WA motif. Between the two WA motifs, our kinetic and structural data indicate that A-to-G mutation occurs more readily in the TA context than in AA. Finally, Pol η can extend the T:G mispair efficiently to complete the mutagenesis.
Keywords: π–cation stacking, A-to-G transition, immunoglobulin
After V(D)J recombination, nascent antibodies produced in B cells usually have low affinity for antigens. Somatic hypermutation (SHM), which generates mutations in the variable region at a frequency far beyond the rate of spontaneous mutations, potentially changes the conformation of the antigen-binding site and can increase antigen recognition by up to 1,000-fold (1, 2). Mutations have been observed at all four bases, but two sequence motifs, RGYW and WA (R, purine; Y, pyrimidine; W, A or T), have been shown to be the mutation hotspots (3). Although the mechanism of SHM and its target selection is incompletely understood, activation-induced cytidine deaminase (AID) (4, 5), which converts cytosine into uracil, for example, in the RGYW motif, initiates SHM by converting a C:G base pair to a U:G mismatch. Removal of the U by uracil–DNA glycosylase (UNG) generates an abasic site in the DNA, which may lead to a variety of base substitutions (6–8). For the WA motif, it is suggested that after AID-dependent deamination of a cytosine, recognition of the U:G mismatch by mismatch repair protein MutSα (MSH2–MSH6 heterodimer, in which MSH stands for MutS Homolog) leads to the recruitment of DNA polymerase η (Pol η), and with an incision made by UNG the ensuing short-patch repair synthesis results in mutations of A:T pairs to G:C (Fig. 1A) (9–11).
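The motif definitions above can be made concrete with a short script. The following is a minimal Python sketch, not part of the original study, that translates the degenerate codes R, Y, and W into a regular expression and lists where a motif occurs on one strand. The function names and the example sequence are hypothetical and chosen only for illustration.

```python
import re

# Degenerate codes as defined in the text: R = purine, Y = pyrimidine, W = A or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}

def motif_to_regex(motif):
    """Translate a motif such as 'RGYW' into a regular-expression pattern."""
    return "".join(IUPAC[base] for base in motif)

def find_hotspots(sequence, motif):
    """Return 0-based start positions of every match, including overlapping ones."""
    # The lookahead makes each match zero-width, so overlapping sites are all reported.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

example = "AGCTTAAGCA"  # hypothetical sequence, for illustration only
print(find_hotspots(example, "WA"))    # [4, 5]
print(find_hotspots(example, "RGYW"))  # [0, 6]
```

The sketch scans only the strand it is given. Sites on the complementary strand would have to be found separately, for example by scanning the reverse complement.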
Fig. 1.
SHM at the WA motif. (A) Model of SHM targeting the WA motif through short-patch DNA synthesis. The initial incision targeting the variable region of Ig genes depends on AID and UNG. Recruitment of Pol η is enhanced by MutSα, which recognizes the U:G mismatch. The template strand is in orange, and the primer strand (subject to mutation) is shown in yellow. The WA motif is highlighted with A in red. (B) Structural determination of the four stages of dGTP misincorporation in this study. The W of the WA motif is highlighted in yellow and the A subject to replacement by G in red. The protein domains of Pol η in contact with DNA and dNTP are outlined and labeled. The two Mg2+ ions in the active site are shown as purple spheres.
Pol η is one of the highly conserved translesion synthesis (TLS) DNA polymerases found in all eukaryotes and is specialized in bypassing UV-induced cyclobutane pyrimidine dimers (CPDs) in an error-free manner (12, 13). Deficiency of Pol η in humans causes a variant form of the cancer predisposition syndrome xeroderma pigmentosum (XP-V) (14). Patients are thousands of times more likely to develop skin cancer from exposure to sunlight. Like all Y-family DNA polymerases, human Pol η has no proofreading 3′–5′ exonuclease activity (15). Purified Pol η is highly mutagenic on normal DNA and prefers to misincorporate dGTP opposite dT, thereby generating A-to-G transitions on the newly synthesized strand (13, 16). The base-substitution spectrum of SHM from XP-V patients is altered, with severely decreased mutations at A:T pairs (11, 17, 18). Similar suppression of A:T to G:C mutations was observed in POLH-deficient mice (19–21). Interestingly, the MSH2–MSH6 heterodimer interacts with Pol η and stimulates its activity in vitro (22), and mutations of A:T base pairs were abolished in SHM when both POLH and MSH2 genes were knocked out (23).
Crystal structures of the polymerase domain of human Pol η (1–432 aa) complexed with different lesion DNA substrates were reported recently (24–27). These structures revealed a uniquely enlarged active site that can readily accommodate two normal template bases, a cis–syn thymine dimer (CPD), or to a certain extent intrastrand cisplatin cross-linked guanines (Pt-GG). In addition, the “molecular splint” of human Pol η stabilizes the upstream DNA duplex in a normal B-form conformation, even in the presence of cross-linked bases by forming numerous salt bridges and hydrogen bonds with the phosphate backbones, thus facilitating primer extension after CPD lesions (24–27). Misincorporation by Pol η, however, has not been investigated in a sequence-dependent manner or at atomic resolution.
To elucidate the molecular mechanism of Pol η in SHM, we set out to determine crystal structures of the polymerase domain of human Pol η (1–432 aa) (24–27) in the process of misincorporating dGTP opposite T in the WA motif (TA or AA) and non-WA sequences (CA or GA) as well as when extending the primer after a T:G mispair. In addition, steady-state kinetic parameters are measured to complement structural observations.
Results
Pol η Prefers to Mutate WA to WG.
We first compare the efficiency of human Pol η incorporating dATP vs. dGTP opposite a template T following a perfectly paired T, A, G, or C at the primer 3′ end (Table S1). The four sequence contexts are labeled as TA, AA, GA, and CA, respectively, where the second nucleotide, an A, represents the correct nucleotide to be incorporated, but it may become G due to misincorporation, for example, in the TA and AA cases (WA motif). The measured KM and kcat indicate that human Pol η inserts the correct base (dATP) with a similar efficiency (less than twofold difference), regardless of the sequence context. For misincorporation, the KM for dGTP increases by ∼10-fold compared with dATP in all four sequence variations, but the catalytic rates (kcat) differ with sequence contexts (Table S1). The kcat is reduced by 3.6- and 5.6-fold in the TA and AA case, respectively, and is reduced by 8.0-fold in the GA case. As for CA, dGTP misinsertion is severely inhibited, and the kcat is reduced 31-fold. Thus, the relative efficiencies of misincorporation at TA, AA, GA, and CA are 1/40, 1/52, 1/106, and 1/321, respectively, of the correct incorporation, making the WA motif twofold to eightfold more susceptible to A-to-G mutation.
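The arithmetic behind these ratios can be checked with a small calculation. The following Python sketch uses only the fold changes quoted in the paragraph above (Table S1 itself is not reproduced here), and its dictionary and function names are hypothetical. Because catalytic efficiency is kcat/KM, the fold loss in efficiency should equal the kcat fold reduction multiplied by the KM fold increase.

```python
# Fold changes for dGTP relative to dATP, as quoted in the text.
kcat_fold_reduction = {"TA": 3.6, "AA": 5.6, "GA": 8.0, "CA": 31.0}
efficiency_fold_reduction = {"TA": 40.0, "AA": 52.0, "GA": 106.0, "CA": 321.0}

def implied_km_fold_increase(context):
    """Efficiency is kcat/KM, so efficiency loss = kcat loss x KM increase."""
    return efficiency_fold_reduction[context] / kcat_fold_reduction[context]

def relative_susceptibility(context_a, context_b):
    """How many times more efficient misincorporation is in context_a than in context_b."""
    return efficiency_fold_reduction[context_b] / efficiency_fold_reduction[context_a]

for ctx in ("TA", "AA", "GA", "CA"):
    # Each implied KM increase should be close to the ~10-fold stated in the text.
    print(ctx, round(implied_km_fold_increase(ctx), 1))

print(round(relative_susceptibility("AA", "GA"), 1))  # smallest WA vs. non-WA gap
print(round(relative_susceptibility("TA", "CA"), 1))  # largest WA vs. non-WA gap
```

Under these numbers the implied KM increase falls between roughly 9-fold and 13-fold in every context, and the two comparisons reproduce the twofold and eightfold limits of the stated range.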
Structures of dGTP Misincorporation Opposite T.
Crystal structures of human Pol η incorporating dATP or its nonreactive analog 2′-deoxyadenosine-5′-[(α,β)-imido]triphosphate (dAMPNPP) opposite a T template have been reported (27, 28). Here we focus on the structures of dGTP misincorporation. Human Pol η (1–432 aa) complexed with DNA and nonhydrolyzable 2′-deoxyguanosine-5′-[(α,β)-imido]triphosphate (dGMPNPP) opposite T after A, T, G, or C at the 3′ primer end was crystallized in the P6₁ space group. These crystals contain one complex per asymmetric unit and are isomorphous to the Pol η ternary complexes with perfectly base-paired DNA and incoming nucleotides (Materials and Methods and Fig. 1B). The structures were refined to resolutions between 1.85 and 2.25 Å (Table 1). The overall protein structures in these misincorporation complexes (AA/G, TA/G, CA/G, and GA/G) are similar to each other and to that of the correct incorporation (T:dATP) complexes, except for a slight closing of the finger and thumb domains as if to squeeze the primer strand and dGMPNPP toward each other (Fig. 2A and Movie S1). Among the misincorporation complexes, there are small but perceptible deviations of a loop (Gln-373–Ser-379) in the little finger (LF) domain (25). The catalytic triad Asp-13, Asp-115, and Glu-116 in the palm domain, which chelates the two Mg2+ ions essential for catalysis, overlays well with that in the ternary complex of dATP incorporation (25). The 7-bp upstream duplex is kept in the straight B form between the thumb and LF domains, as observed previously (27, 28) (Fig. 2A).
Table 1.
Data collection and refinement statistics of Pol η ternary complex
| | TA/G | CA/G | AA/G | GA/G | Extension complex |
| Data collection | | | | | |
| Space group | P6₁ | P6₁ | P6₁ | P6₁ | P6₁ |
| Lattice constants | | | | | |
| a, Å | 98.58 | 98.43 | 98.43 | 98.66 | 99.51 |
| b, Å | 98.58 | 98.43 | 98.43 | 98.66 | 99.51 |
| c, Å | 81.67 | 81.96 | 82.01 | 81.96 | 81.57 |
| Wavelength, Å | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.5418 |
| Resolution, Å | 30.0–2.03 | 30.0–1.85 | 30.0–2.25 | 30.0–1.95 | 30.0–2.60 |
| Rsym, % | 10.2 (61.5) | 8.7 (66.4) | 9.8 (59.5) | 9.3 (42.6) | 13.0 (68.8) |
| I/σI | 12.2 (2.0) | 14.6 (2.0) | 14.5 (3.0) | 19.2 (4.8) | 12.4 (2.4) |
| Completeness, % | 96.3 (97.9) | 98.6 (96.8) | 99.7 (99.7) | 98.6 (100.0) | 99.5 (95.7) |
| Wilson B factor, Å² | 23.1 | 18.2 | 25.8 | 18.2 | 34.1 |
| Redundancy | 4.5 (3.9) | 3.8 (3.1) | 4.4 (4.4) | 6.4 (6.3) | 4.8 (4.7) |
| Refinement | | | | | |
| Resolution, Å | 30.0–2.03 | 30.0–1.85 | 30.0–2.25 | 30.0–1.95 | 30.0–2.60 |
| No. of reflections | 28,419 | 38,030 | 21,434 | 32,381 | 14,145 |
| Rwork/Rfree, % | 16.4/18.7 | 17.2/20.8 | 18.0/22.3 | 16.5/19.9 | 20.8/23.4 |
| No. of atoms | | | | | |
| Protein/DNA | 3,366/386 | 3,366/366 | 3,359/368 | 3,366/363 | 3,279/383 |
| Ligand/ion | 81 | 57 | 46 | 45 | 42 |
| Water | 261 | 374 | 188 | 326 | 98 |
| B factors, Å² | | | | | |
| Protein/DNA | 28.0/30.2 | 18.7/23.6 | 24.1/28.6 | 21.4/23.9 | 37.4/48.5 |
| Ligand/ion | 30.9 | 18.6 | 16.4 | 12.7 | 35.6 |
| Water | 34.3 | 25.1 | 24.0 | 26.2 | 36.6 |
| rms deviations | | | | | |
| Bond length, Å | 0.006 | 0.011 | 0.005 | 0.010 | 0.007 |
| Bond angle, ° | 0.981 | 1.169 | 0.917 | 1.229 | 1.102 |
Data in the highest resolution shell are shown in parentheses.
Fig. 2.
Structures of dGTP misincorporation by Pol η. (A) Structure superposition of Pol η complexed with DNA and nonhydrolyzable dGMPNPP opposite dT in all four sequence contexts. The protein is shown in ribbon diagrams with palm domain in pink, thumb in green, finger in blue, and little finger (LF) in magenta. DNA is shown as stick-and-ladder in yellow (primer) and orange (template). Mg2+ ions (purple spheres) and incoming dGMPNPP (red sticks) are also shown. (B) Correct (T:dATP) and incorrect (T:dGMPNPP) nascent base pair in complex with Pol η. The conserved residues Gln-38 and Arg-61 are shown in blue sticks. (C) Superposition of the Pol η active site in dGTP misincorporation (colored) and normal ternary complexes [gray; PDB ID code 3MR2]. Alterations of primer end and Lys-224 are indicated by red arrows. The water molecule that participates in Mg2+ coordination is shown as a red sphere. A gray double arrow indicates the cation–π stacking. (D) Superposition of the nascent (0) and (−1) base pairs in normal productive (gray) and mispaired nonproductive ternary complexes (colored). The movement of each nucleotide is indicated by red arrows.
A clear deviation in the four Pol η misincorporation complexes is observed at the T:dGMPNPP mispair surrounded by the finger and palm domains (Fig. 2A). As typically observed for a T:G wobble pair with two hydrogen bonds, the templating base T shifts toward the major groove, and the dGMPNPP shifts toward the minor groove (Fig. 2 B–D). Despite the shift of the bases, the triphosphate of dGMPNPP remains coordinated by the two active site Mg2+ ions and situated nearly the same as that of a correctly paired dAMPNPP (Fig. 2 A and C). Two residues, Gln-38 and Arg-61, are uniquely conserved among all Pol η homologs, and both appear to contribute to the misincorporation. Gln-38 interacts with both thymine and guanine bases in the minor groove (Fig. 2B); Arg-61 adopts a rotamer conformation that has not been observed in Pol η–DNA complexes before and makes hydrogen bonds with O4 of dT and O6 and N7 of dGMPNPP in the major groove (Fig. 2B).
The largest structural change in the T:dGMPNPP complexes occurs at the 3′ primer end. Perhaps due to the molecular splint effect of Pol η on the template strand, the 0.7-Å shift of the templating base T toward the major groove leads to a shift of its immediate neighbor upstream (−1 position) in the same direction, which via base pairing induces a displacement of the 3′ primer end (Fig. 2 C and D). The last base of the primer strand is no longer stacked with the incoming dGMPNPP and instead is stacked with the guanidino group of Arg-61. As a result, the 3′-OH shifts >4 Å compared with all Pol η ternary structures solved to date and is hydrogen bonded with a nonbridging oxygen of the α-phosphate of dGMPNPP (Fig. 2C). Concomitantly, a water molecule replaces the 3′-OH as a ligand for the A-site Mg2+, and Lys-224 also switches from interacting with the last phosphate group of the primer strand to interacting with the active-site carboxylate Glu-116 and a water molecule (Fig. 2C). In all four Pol η ternary complexes with the T:dGMPNPP mispair, the dominant conformation has the primer 3′ end misaligned with dGMPNPP and is therefore incompatible with nucleophilic attack.
Unique Structural Features of the WA Motif.
The four structures of Pol η misincorporating dGTP nevertheless differ. After close inspection of the electron density maps, we found that in the TA/G and AA/G structures, which are of the WA motif, there is distinct positive electron density in the Fo − Fc difference map that indicates a second conformation of the 3′ primer end with ∼30% occupancy (Fig. 3A). Although a minor species in the population, the second conformation of the 3′-OH is highly similar to the primer end in the normal ternary complex and is aligned with the α-phosphate of dGMPNPP for the phosphoryltransfer reaction (Fig. 3B). However, in the CA/G or GA/G complexes, the misaligned 3′ end is the only conformational species observed (Fig. 3C). By serendipity, two slightly different DNA sequences were used, one in the TA/G and GA/G and the other in the AA/G and CA/G structures (Table S2). Because each sequence yielded one WA and one non-WA complex, the structural differences observed between the WA and non-WA motif complexes are independent of the remaining DNA sequence.
Fig. 3.
The alternative DNA conformation observed in TA/G and AA/G that is compatible with dGTP incorporation. (A) The primer 3′ nucleotide (Thy) in the TA/G structure. The major (yellow) and the minor (green) conformations of the last nucleotide are superimposed with the 2Fo − Fc (silver; contoured at 1.0σ) and Fo − Fc omit map (green; contoured at 2.5σ), which were calculated without the minor conformation species. (B) The minor conformation of TA/G is compatible with the phosphoryltransfer reaction. Superposition of the minor conformation of the TA/G structure (colored) with the normal ternary complex (gray; PDB ID code 3MR2). (C) Interactions between Arg-61 (blue) and the −1 base pair (yellow, green primer, and orange template) in TA/G, AA/G, CA/G, and GA/G complexes. The incoming dGMPNPP is shown as semitransparent pink sticks. Hydrogen bonds and van der Waals contacts are indicated by color-coded dashes and distances.
Several factors appear to contribute to the minor but productive conformational species found with the WA motif. The first is likely the base-pair strength at the 3′ primer end. Based on the electron density, when the 3′ primer end is aligned with the incoming nucleotide in the productive conformation, its base-pairing partner in the template strand does not appear to have a second conformation to maintain the proper Watson–Crick hydrogen bonds. Therefore, the primer end has to break the hydrogen bonds with the template to assume the minor conformational species. In the CA/G and GA/G cases, the G:C or C:G pair at the 3′ primer end shares three hydrogen bonds despite the base pair being severely buckled (Figs. 3C and 4A). The reduced number of hydrogen bonds in A:T pairs and the buckle between the base pairs (Fig. 4B) may facilitate formation of the second conformational species in the WA motif. Particularly in the TA/G case, the T:A base pair retains only one hydrogen bond due to base pair opening (Fig. 3C).
Fig. 4.
Unique interactions found in TA/G and AA/G complexes. (A) Stereoview of the superposition of the four Pol η misincorporation and the normal ternary complexes. Each complex is color-coded as indicated. The upward shift of the DNA duplex in GA/G, CA/G, and AA/G complexes is obvious after the entire protein is superimposed. (B) Distinct interactions between the LF domain and the DNA found in TA/G and GA/G. The van der Waals contacts are indicated by lines formed by open circles. (C) Side-by-side views of TA/G and GA/G with Pol η shown as gray molecular surface and Ser-62 side chain in blue carbon and red oxygen. The shift of the DNA substrate in GA/G is accommodated by the rotamer change of Ser-62.
The second factor is the interaction of the base at the 3′ end (−1) with the guanidino group of Arg-61 and with the surrounding bases. It has often been observed that Arg interacts well with the major groove side O6 and N7 of guanine (Fig. S1), and such polar interactions between Arg-61 and the guanine base at the 3′ end are apparent (Fig. 3C). As a result, the 3′ guanine in the GA/G structure maintains its base stacking with the upstream neighbors and has the least buckle (Fig. 4A). When the 3′ primer end is a cytosine (CA/G), the base tilts by ∼25° to avoid electrostatic repulsion between Arg-61 and N4 of the base, thus resulting in the cytosine stacking with the guanidinium of Arg-61 instead of its neighboring bases (Figs. 3C and 4A). In both cases, the 3′ base is structurally stable and does not have an alternative conformation. When a thymine is at the primer 3′ end (TA/G), Arg-61 forms favorable polar interactions with O4 of the base, particularly when the T assumes the second conformation that is stacked with its neighboring bases (Fig. 3C). In contrast, the electrostatic repulsion between Arg-61 and N4 most likely prevents the cytosine from adopting the second conformation. When an adenine is at the 3′ primer end, the electrostatic repulsion between its N6 and Arg-61 probably leads to the base offset in both conformations compared with guanine (Fig. 3C). The strong propensity of adenine to form base stacking (29) likely leads to the productive conformation in the AA/G structure, where the 3′ primer end stacks with both dGMPNPP and the upstream base (Fig. 4 A and B).
The third factor is likely the global position of the DNA substrate relative to the polymerase. Among the four misincorporation complexes, the upstream DNA in the CA/G and GA/G structures is shifted along the DNA helical axis toward the incoming nucleotide and the finger domain (Fig. 4A). However, in the TA/G complex, it is more similar to the normal ternary complex, and the DNA in the AA/G complex is in between. Favorable van der Waals interactions between the template strand bases and residues in the LF domain are observed in the WA cases (Fig. 4B), which likely stabilize the DNA in a more native-like conformation. The shift of the DNA substrate also affects the interaction between the finger domain and the downstream single-stranded DNA. In the TA case, the downstream DNA conformation is most similar to the undamaged ternary complexes (Fig. 4 A and B). In the other three cases, despite the same template length as in the undamaged ternary complexes (25, 27), the +1 nucleotide is flipped out and occupies the usual binding site of the +2 nucleotide, and the hydroxyl group of Ser-62 turns to occupy the void (Fig. 4C).
Arg-61 Plays a Key Role in Misincorporation.
Arg-61 adopts a rotamer conformation that forms extensive hydrogen bonds with the T:G mispair to favor dGMPNPP binding. However, Arg-61 also stabilizes the displaced 3′ primer end by cation–π stacking interactions that prevent the polymerization reaction. Only when A or T is at the primer end does the 3′ primer end occasionally revert to the reactive conformation for polymerization, as is evident in the alternative conformations. The arrangement of Arg-61 stabilizing the T:G mismatch and stacking with the base preceding the G is also observed in the postreaction Pol η–product DNA binary complex when a T:G mismatch is at the DNA primer end, but not when it is T:A (Fig. 5A and Table S3). To delineate the hydrogen bonding vs. the base stacking roles of Arg-61 in SHM of the WA motif, we replaced Arg-61 by Lys, which has an amino group to mimic the electrostatic interaction and hydrogen-bonding ability of Arg but has greatly reduced potential to stack with DNA bases (30).
Fig. 5.
Translocation and extension of T:G mispair. (A) Structures of postinsertion Pol η–DNA binary complexes. The wobble nascent base pair of the misincorporation is shown side by side with the normal incorporation looking down the DNA helical axis. The template base is shown in orange and the newly incorporated base in yellow (correct) or red (incorrect). Arg-61 interacts with the mispaired guanine base. (B) Structures of posttranslocation binary complexes. In the T:G mismatch complex (Left), the DNA exhibits two equal populations of translocated (multicolored) and untranslocated (gray) conformations. The Fo − Fc omit map, which was calculated with 100% translocated population and contoured at 2.5σ in green, superimposes well with the untranslocated conformation. (C) Structure of the T:G mismatch extension complex. The 2Fo − Fc map (gray; contoured at 1.0σ) corresponding to the primer end and incoming nucleotide is superimposed with the refined structure. (D) Superposition of the T:G mismatch extension (colored) and normal ternary complex structure (gray; PDB ID code 3MR2). The 3′–OH at the primer end and the α-phosphate of dAMPNPP are aligned in the mismatch extension complex, albeit slightly shifted relative to the active site of Pol η.
KM and kcat of the R61K mutant Pol η in dATP and dGTP incorporation in the TA, AA, and CA sequence contexts were measured (Table S1). When incorporating the correct dATP, KM of R61K is increased by twofold to threefold, and the overall efficiency is reduced by twofold to fourfold compared with wild-type (WT) Pol η in the three sequence contexts tested. This reduction is not surprising because the guanidino group of Arg-61 forms bifurcated hydrogen bonds with the α and β phosphates of the nucleotide (28), and Lys, being shorter than the Arg side chain, is a poor substitute for such interactions. Interestingly, the trend that the WA motifs are more susceptible to dGTP misincorporation is the same with R61K as with WT Pol η, indicating that the positive charge of Arg-61 may be the determinant in influencing the mutability of the WA motif. Surprisingly, the R61K mutant has a threefold to sixfold higher propensity to misincorporate dGTP than WT Pol η, suggesting that stacking of Arg-61 with the primer end actually reduces the misincorporation efficiency (Table S1). Previously, it was shown that the R61A mutation reduces dGTP misinsertion opposite T (27) and increases nucleotide insertion fidelity in general at the cost of reduced catalytic efficiency (31). We deduce that the positive charge of Arg-61 (or of Lys in R61K) must be required to stabilize the T:dGTP mispair (Fig. 2B), thus promoting misincorporation.
Extension of a T:G Mispair by Pol η.
For the A-to-G mutation to persist in somatic cells, it is necessary that the DNA primer be extended after the mismatch during the short-patch DNA synthesis to prevent the misinserted G from being removed by the editing function of a replicative polymerase. Most polymerases are inefficient in mismatch extension (32). To understand how a T:G mismatch is extended, both the translocation step and the primer extension step were examined (Fig. 1B). Human Pol η was cocrystallized with a T:A pair or mismatched T:G at the DNA duplex end as the posttranslocation binary complexes (Fig. 1B and Table S3). These crystals diffracted X-rays to 1.95 Å (T:A) and 2.35 Å (T:G), respectively. In these binary complexes, however, a subpopulation of DNA duplex is observed to remain in the product state rather than fully translocated (Fig. 5B). We suspect that stacking of Arg-61 with the purines at the primer 3′ end may cause the sluggish translocation because with a pyrimidine at the 3′ end there is no sign of untranslocated species (26). The incomplete translocation is more severe with a T:G mismatch than with a T:A base pair (Fig. S2), which may contribute to the 12-fold increase in KM when extending a T:G mismatch compared with normal extension (Table S1).
We have also obtained ternary-complex crystals of Pol η incorporating dAMPNPP after a T:G mismatch, and the structure was determined at 2.6 Å (Table 1). In the presence of an incoming dAMPNPP, the 3′ primer end resumes the near-normal position despite the wobble T:G pair. The reactants are more or less superimposable with the perfectly matched substrates (Fig. 5D). To validate the reaction-ready nature of this structure, we measured the extension efficiency of the T:G mispair in solution by Pol η and showed that the catalytic efficiency (kcat/KM) is reduced by 32-fold compared with normal DNA synthesis (Table S1). Pol η is thus more efficient in primer extension after a mismatched base pair than replicative and B-family TLS polymerases, whose catalytic efficiencies are reduced by 10,000-fold and 100-fold, respectively (32).
Discussion
Pol η binds dGTP tightly during misincorporation with a KM of ∼10 µM regardless of sequence context (Table S1), indicating there is no sequence context preference in the dGTP binding step. The crystal structures of dGTP misincorporation in all four sequence contexts show the common feature that the T:G wobble pair fits well in the active site and interacts snugly with the conserved Gln-38 and Arg-61 in the minor and major groove, respectively (Fig. 2B). These interactions are rather different from those in other Y-family DNA polymerases, for example, Pol ι or Dpo4. The incoming dGTP is unable to stack with the primer 3′ end in the Dpo4 misincorporation ternary structure (33) (Fig. S3A). In the Pol ι case, the mismatched T and G maintain the anti–anti conformation, but the templating base is displaced from its normal position and is not paired with the incoming dGTP (34) (Fig. S3B). Gln-59, which is conserved among Pol ι homologs and equivalent to Gln-38 in human Pol η, forms a hydrogen bond with only the N2 atom of the guanine base but not with the template T. Steady-state kinetic measurement indicates that replacing either Gln-38 or Arg-61 by Ala in Pol η dramatically inhibits the misincorporation as well as bypassing of CPDs (27). In contrast, we find that replacing Arg-61 with Lys increases the misincorporation frequency while reducing the catalytic efficiency. The equivalent of Arg-61 in Pol ι is a Lys (35). Nature through evolution probably has selected Arg-61 and Gln-38 in Pol η to maximize the efficiency and accuracy of UV-lesion bypass. Pol η-dependent dGTP misincorporation at the WA motif in SHM is likely a byproduct that took advantage of the conserved Arg-61 and Gln-38 late in evolutionary history.
The kcat of dGTP misincorporation is significantly reduced compared with the correct incorporation and differs according to the base-pair sequence at the primer end. The catalytic efficiencies of dGTP misincorporation in the TA and AA contexts are higher than in GA and CA, and the relative efficiency is TA > AA > GA >> CA (Table S1). These results correlate well with the published SHM spectrum, which shows that the TA mutations are strongly favored over AA mutations by Pol η (36). The reduced kcat correlates with the displacement of the primer end due to the cation–π interaction mediated by Arg-61. For efficient catalysis, it is essential that the 3′-OH group of the primer end and the α-phosphate of the incoming dNTP be perfectly aligned. Misalignment between primer end and incoming nucleotide, even slightly, will inhibit the nucleotidyl-transfer reaction (25). Arg-61 also provides another barrier in misincorporation by impeding the translocation step as observed in our binary complexes after misincorporation and before the next round of incorporation. Misincorporation is much enhanced in the R61K mutant Pol η. Lys has a shorter side chain than Arg and reduced capability to form cation–π stacking with the primer end (Fig. 2C), and the R61K mutant is accordingly more prone to misincorporate dGTP than WT Pol η (Table S1). Besides the dominant misaligned conformation, the electron density in our structures revealed that there is a second population of primer end in the TA/G and AA/G, but not CA/G or GA/G, complexes that superimposes well with the normal ternary complex and supports the chemistry.
Because the R61K mutant polymerase still favors dGTP misincorporation in the WA motif, we suspect that the cation–π stacking between Arg-61 and the primer end is not the main reason for the WA motif to be an SHM hotspot. The different stability of A:T and T:A vs. G:C and C:G base pairs most likely underlies the high efficiency of dGTP misincorporation in the WA motif. In addition, the stacking propensity of the 3′ base with its neighbors and its electrostatic interactions with Arg-61 may influence whether the 3′-OH can revert to the reactive conformation and also the probability of such reversion. T and A at the primer 3′ end are thus found to be more able than G and C to align with the incoming dGTP and form a productive complex for misincorporation to take place. Together, the strong dGTP binding, even when it is a mismatch for the template T, weaker hydrogen bonding between A and T at the 3′ primer strand end, and efficient mispair extension by Pol η provide the molecular rationale for the conversion of WA motifs to WG during SHM.
Materials and Methods
Crystallization and Structure Determination.
Site-directed mutagenesis, protein expression, and purification of WT or R61K human Pol η (1–432 aa) were performed as described (27). Purified Pol η was stored in 20 mM Tris·HCl (pH 7.5), 450 mM KCl, and 3 mM DTT. After mixing Pol η and DNA at a 1:1.05 molar ratio, the complex was transferred into 20 mM Tris·HCl (pH 7.5), 150 mM KCl, 3 mM DTT, and 5 mM MgCl2 and concentrated to 3 mg/mL Pol η. To make ternary complexes, an appropriate dNTP analog (purchased from Jena Bioscience) was added. Crystals were grown by the hanging-drop vapor-diffusion method with optimized reservoir buffer containing 0.1 M Mes (pH 6.0), 5 mM MgCl2, and 19–21% (wt/vol) MPEG2000 (polyethylene glycol monomethyl ether 2000) (27). The oligos and incoming nucleotides used in crystallization are summarized in Table S2. Crystals grew to maximal dimensions with diffraction quality in 3–7 d. Diffraction data were collected at beamline 22-BM at the Advanced Photon Source, Argonne National Laboratory, processed with HKL2000 (37) or XDS (38), and converted to structure factors by using TRUNCATE (39). Refinement and structure analyses (Table 1 and Table S3) were performed by using COOT (40), PHENIX (41), and PyMOL (www.pymol.org).
Kinetic Measurements.
The primer extension assay and steady-state KM and kcat measurements were carried out by using a 5′-fluorescein-labeled primer and normal template oligonucleotides as described (25). The DNA sequences and nucleotides used in these assays are shown in Table S1. Quantification and curve fitting were also performed as described (32).
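For readers unfamiliar with how steady-state parameters are extracted, the following Python sketch illustrates one textbook approach. It is not the fitting procedure of ref. 32, which is not described here. It assumes Michaelis–Menten kinetics, v = Vmax·[S]/(KM + [S]), and estimates Vmax and KM by linear least squares on the Hanes–Woolf form [S]/v = [S]/Vmax + KM/Vmax. The data are synthetic and noise-free, and all numerical values, including the enzyme concentration, are hypothetical.

```python
def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, KM) by least squares on S/v = S/Vmax + KM/Vmax."""
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx                    # equals 1/Vmax
    intercept = mean_y - slope * mean_x  # equals KM/Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km

# Synthetic data generated from assumed parameters (hypothetical values).
true_vmax, true_km = 2.0, 10.0   # e.g. nM/min and µM
dntp = [1.0, 2.5, 5.0, 10.0, 25.0, 50.0, 100.0]
v = [true_vmax * s / (true_km + s) for s in dntp]

vmax, km = hanes_woolf_fit(dntp, v)
enzyme = 0.01                    # hypothetical enzyme concentration, same units as Vmax
kcat = vmax / enzyme             # turnover number
print(round(vmax, 3), round(km, 3), round(kcat / km, 3))
```

With noisy experimental data, direct nonlinear regression on the Michaelis–Menten equation is generally preferred, because linear transformations distort the error structure.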
Acknowledgments
We thank Drs. R. Craigie and D. Leahy for critical reading of the manuscript. This work was supported by the Intramural Research Program of National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health (Y.Z., M.T.G., C.B., and W.Y.); a Chinese Ministry of Education scholarship (Y.Z.); National Natural Science Foundation of China Grant 31210103904 (to Y.-J.H.); and Grants-in-Aid for Scientific Research from the Ministry of Education (KAKENHI) (to F.H.).
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The atomic coordinates, structure factors, and diffraction data have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 4J9K–4J9S).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1303126110/-/DCSupplemental.
References
- 1. Rajewsky K, Förster I, Cumano A. Evolutionary and somatic selection of the antibody repertoire in the mouse. Science. 1987;238(4830):1088–1094. doi: 10.1126/science.3317826.
- 2. Kim S, Davis M, Sinn E, Patten P, Hood L. Antibody diversity: Somatic hypermutation of rearranged VH genes. Cell. 1981;27(3 Pt 2):573–581. doi: 10.1016/0092-8674(81)90399-8.
- 3. Di Noia JM, Neuberger MS. Molecular mechanisms of antibody somatic hypermutation. Annu Rev Biochem. 2007;76:1–22. doi: 10.1146/annurev.biochem.76.061705.090740.
- 4. Sohail A, Klapacz J, Samaranayake M, Ullah A, Bhagwat AS. Human activation-induced cytidine deaminase causes transcription-dependent, strand-biased C to U deaminations. Nucleic Acids Res. 2003;31(12):2990–2994. doi: 10.1093/nar/gkg464.
- 5. Muramatsu M, Nagaoka H, Shinkura R, Begum NA, Honjo T. Discovery of activation-induced cytidine deaminase, the engraver of antibody memory. Adv Immunol. 2007;94:1–36. doi: 10.1016/S0065-2776(06)94001-2.
- 6. Rada C, Di Noia JM, Neuberger MS. Mismatch recognition and uracil excision provide complementary paths to both Ig switching and the A/T-focused phase of somatic mutation. Mol Cell. 2004;16(2):163–171. doi: 10.1016/j.molcel.2004.10.011.
- 7. Saribasak H, et al. Uracil DNA glycosylase disruption blocks Ig gene conversion and induces transition mutations. J Immunol. 2006;176(1):365–371. doi: 10.4049/jimmunol.176.1.365.
- 8. Schanz S, Castor D, Fischer F, Jiricny J. Interference of mismatch and base excision repair during the processing of adjacent U/G mispairs may play a key role in somatic hypermutation. Proc Natl Acad Sci USA. 2009;106(14):5593–5598. doi: 10.1073/pnas.0901726106.
- 9. Jansen JG, et al. Strand-biased defect in C/G transversions in hypermutating immunoglobulin genes in Rev1-deficient mice. J Exp Med. 2006;203(2):319–323. doi: 10.1084/jem.20052227.
- 10. Masuda K, et al. DNA polymerase theta contributes to the generation of C/G mutations during somatic hypermutation of Ig genes. Proc Natl Acad Sci USA. 2005;102(39):13986–13991. doi: 10.1073/pnas.0505636102.
- 11. Zeng X, et al. DNA polymerase eta is an A-T mutator in somatic hypermutation of immunoglobulin variable genes. Nat Immunol. 2001;2(6):537–541. doi: 10.1038/88740.
- 12. Masutani C, Kusumoto R, Iwai S, Hanaoka F. Mechanisms of accurate translesion synthesis by human DNA polymerase eta. EMBO J. 2000;19(12):3100–3109. doi: 10.1093/emboj/19.12.3100.
- 13. Johnson RE, Washington MT, Prakash S, Prakash L. Fidelity of human DNA polymerase eta. J Biol Chem. 2000;275(11):7447–7450. doi: 10.1074/jbc.275.11.7447.
- 14. Masutani C, et al. The XPV (xeroderma pigmentosum variant) gene encodes human DNA polymerase eta. Nature. 1999;399(6737):700–704. doi: 10.1038/21447.
- 15. Yang W, Woodgate R. What a difference a decade makes: Insights into translesion DNA synthesis. Proc Natl Acad Sci USA. 2007;104(40):15591–15598. doi: 10.1073/pnas.0704219104.
- 16. Matsuda T, Bebenek K, Masutani C, Hanaoka F, Kunkel TA. Low fidelity DNA synthesis by human DNA polymerase-eta. Nature. 2000;404(6781):1011–1013. doi: 10.1038/35010014.
- 17. Faili A, et al. DNA polymerase eta is involved in hypermutation occurring during immunoglobulin class switch recombination. J Exp Med. 2004;199(2):265–270. doi: 10.1084/jem.20031831.
- 18. Zeng X, Negrete GA, Kasmer C, Yang WW, Gearhart PJ. Absence of DNA polymerase eta reveals targeting of C mutations on the nontranscribed strand in immunoglobulin switch regions. J Exp Med. 2004;199(7):917–924. doi: 10.1084/jem.20032022.
- 19. Delbos F, et al. Contribution of DNA polymerase eta to immunoglobulin gene hypermutation in the mouse. J Exp Med. 2005;201(8):1191–1196. doi: 10.1084/jem.20050292.
- 20. Pavlov YI, et al. Correlation of somatic hypermutation specificity and A-T base pair substitution errors by DNA polymerase eta during copying of a mouse immunoglobulin kappa light chain transgene. Proc Natl Acad Sci USA. 2002;99(15):9954–9959. doi: 10.1073/pnas.152126799.
- 21. Rogozin IB, Pavlov YI, Bebenek K, Matsuda T, Kunkel TA. Somatic mutation hotspots correlate with DNA polymerase eta error spectrum. Nat Immunol. 2001;2(6):530–536. doi: 10.1038/88732.
- 22. Wilson TM, et al. MSH2-MSH6 stimulates DNA polymerase eta, suggesting a role for A:T mutations in antibody genes. J Exp Med. 2005;201(4):637–645. doi: 10.1084/jem.20042066.
- 23. Delbos F, Aoufouchi S, Faili A, Weill JC, Reynaud CA. DNA polymerase eta is the sole contributor of A/T modifications during immunoglobulin gene hypermutation in the mouse. J Exp Med. 2007;204(1):17–23. doi: 10.1084/jem.20062131.
- 24. Ummat A, et al. Structural basis for cisplatin DNA damage tolerance by human polymerase η during cancer chemotherapy. Nat Struct Mol Biol. 2012;19(6):628–632. doi: 10.1038/nsmb.2295.
- 25. Zhao Y, et al. Structural basis of human DNA polymerase η-mediated chemoresistance to cisplatin. Proc Natl Acad Sci USA. 2012;109(19):7269–7274. doi: 10.1073/pnas.1202681109.
- 26. Ummat A, et al. Human DNA polymerase η is pre-aligned for dNTP binding and catalysis. J Mol Biol. 2012;415(4):627–634. doi: 10.1016/j.jmb.2011.11.038.
- 27. Biertümpfel C, et al. Structure and mechanism of human DNA polymerase eta. Nature. 2010;465(7301):1044–1048. doi: 10.1038/nature09196.
- 28. Nakamura T, Zhao Y, Yamagata Y, Hua YJ, Yang W. Watching DNA polymerase η make a phosphodiester bond. Nature. 2012;487(7406):196–201. doi: 10.1038/nature11181.
- 29. Kool ET. Hydrogen bonding, base stacking, and steric effects in DNA replication. Annu Rev Biophys Biomol Struct. 2001;30:1–22. doi: 10.1146/annurev.biophys.30.1.1.
- 30. Sathyapriya R, Vishveshwara S. Interaction of DNA with clusters of amino acids in proteins. Nucleic Acids Res. 2004;32(14):4109–4118. doi: 10.1093/nar/gkh733.
- 31. Kondo Y. 2005. Studies of in vivo and in vitro function of DNA polymerase eta. PhD dissertation (Osaka Univ, Osaka).
- 32. Wang F, Yang W. Structural insight into translesion synthesis by DNA Pol II. Cell. 2009;139(7):1279–1289. doi: 10.1016/j.cell.2009.11.043.
- 33. Vaisman A, Ling H, Woodgate R, Yang W. Fidelity of Dpo4: Effect of metal ions, nucleotide selection and pyrophosphorolysis. EMBO J. 2005;24(17):2957–2967. doi: 10.1038/sj.emboj.7600786.
- 34. Kirouac KN, Ling H. Structural basis of error-prone replication and stalling at a thymine base by human DNA polymerase iota. EMBO J. 2009;28(11):1644–1654. doi: 10.1038/emboj.2009.122.
- 35. Alt A, et al. Bypass of DNA lesions generated during anticancer treatment with cisplatin by DNA polymerase eta. Science. 2007;318(5852):967–970. doi: 10.1126/science.1148242.
- 36. Yavuz S, Yavuz AS, Kraemer KH, Lipsky PE. The role of polymerase eta in somatic hypermutation determined by analysis of mutations in a patient with xeroderma pigmentosum variant. J Immunol. 2002;169(7):3825–3830. doi: 10.4049/jimmunol.169.7.3825.
- 37. Otwinowski Z, Minor W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 1997;276:307–326. doi: 10.1016/S0076-6879(97)76066-X.
- 38. Kabsch W. XDS. Acta Crystallogr D Biol Crystallogr. 2010;66(Pt 2):125–132. doi: 10.1107/S0907444909047337.
- 39. Collaborative Computational Project, Number 4. The CCP4 suite: Programs for protein crystallography. Acta Crystallogr D Biol Crystallogr. 1994;50(Pt 5):760–763. doi: 10.1107/S0907444994003112.
- 40. Emsley P, Cowtan K. Coot: Model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60(Pt 12 Pt 1):2126–2132. doi: 10.1107/S0907444904019158.
- 41. Adams PD, et al. PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr. 2010;66(Pt 2):213–221. doi: 10.1107/S0907444909052925.