Summary
A single enzyme active site that catalyzes multiple reactions is a well-established biochemical theme, but how one nuclease site cleaves both DNA strands of a double helix has not been well understood. In analyzing site-specific DNA cleavage by the mammalian RAG1-RAG2 recombinase, which initiates V(D)J recombination, we find that the active site is reconfigured for the two consecutive reactions, and the DNA double helix adopts drastically different structures. For initial nicking of the DNA, a locally unwound and unpaired DNA duplex forms a zipper via alternating inter-strand base stacking, rather than melting as generally thought. The second strand cleavage and formation of a hairpin-DNA product requires a global scissor-like movement of protein and DNA, delivering the scissile phosphate into the rearranged active site.
Keywords: V(D)J recombination, SCID, interstrand base stacking, DNA zipper
Introduction
Many bacterial and eukaryotic transposases contain an RNase H-like (RNH) catalytic core and use a single active site to cleave both DNA strands at the boundary of recognition sequences 1,2. Among them, RAG1-RAG2 (or RAG) cleaves DNA in the immunoglobulin and T-cell receptor loci to initiate the process of V(D)J recombination and generate immune-system diversification in jawed vertebrates 2–4. Each of the two DNA recombination signal sequences, 12RSS and 23RSS, which are composed of a conserved heptamer and nonamer separated by a 12 or 23 bp non-conserved spacer, marks the borders of the antigen receptor V, D or J coding segments (the coding flanks). Upon binding a pair of 12- and 23-RSS DNAs, RAG first nicks one strand of each RSS and then cleaves the second strands by forming DNA hairpins (Fig. 1a–b) 5–7.
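The RSS layout described above can be sketched in a few lines of Python. This is only an illustration. The heptamer sequence is the one quoted later in the text (CACAGTG); the nonamer shown is the commonly cited consensus and is an assumption here, since the text does not spell it out; and the spacer is written as `N` because it is non-conserved. The function name `build_rss` is hypothetical.

```python
HEPTAMER = "CACAGTG"   # conserved heptamer, as written in the text
NONAMER = "ACAAAAACC"  # assumption: commonly cited consensus nonamer, not given in the text


def build_rss(spacer_len: int, spacer_base: str = "N") -> str:
    """Return heptamer + non-conserved spacer + nonamer for a 12RSS or 23RSS."""
    if spacer_len not in (12, 23):
        raise ValueError("spacer length must be 12 or 23 bp")
    return HEPTAMER + spacer_base * spacer_len + NONAMER


rss12 = build_rss(12)
rss23 = build_rss(23)
print(len(rss12), len(rss23))  # 28 39
```

The coding flank would lie immediately 5′ of the heptamer, which is where the top-strand nick is made.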
Fig. 1: Reactions catalyzed by RAG recombinase.
a, Three DNA bound states of mRAG. The top- (nicking) and bottom-strand (hairpinning) scissile phosphates are depicted as red and lilac spheres, respectively. The divalent metal ions in the active site are shown as green spheres. In the NFC, bases forming the DNA “zipper” are shown as red sticks. b, Two types of DNA cleavage mechanism exhibited by RNase H-like transposases. The site of the first DNA nick is marked with red scissors, and the nucleophilic attack is indicated with red arrows. RSS, recombination signal sequence; TIR, terminal inverted repeat of transposable elements. The dashed grey box indicates that only a subset of this type of transposases undergoes hairpin formation. c, DNA designs for generating NFC. Mutations are highlighted in blue (DNA1) and magenta (DNA2). AP stands for abasic analogue (tetrahydrofuran), and CF for coding flank. Subscripts label positions of nucleotides in top (t) and bottom strand (b) of the heptamer. d, Overall structure of NFC (DNA1). Protein domains and 12/23 RSS DNAs are labeled.
Although resistant to structural study for two decades, zebrafish (zRAG) and mouse RAG (mRAG) have in recent years yielded crystal and cryoEM structures of an apo form, and the DNA-bound pre-reaction (PRC) and hairpin-forming complexes (HFC) (Fig. 1a) 4,8–10. These structures reveal how a Y-shaped dimer of RAG1-RAG2 heterodimers pairs asymmetrical 12 and 23RSS DNAs and can undergo large conformational changes. The conserved catalytic core of RAG binds the recombination signal DNA in the same fashion as all RNH-type transposases 11–16. However, RAG nicks the top strand at the 5′ boundary of the RSS first, while all bacterial and many eukaryotic transposases first cut the equivalent of the bottom strand at the 3′ boundary 1,17–19 (Fig. 1b). In both mouse and zebrafish PRC structures 9,10, the bottom strand of the B-form DNA substrate, which is the strand opposite to the one that will be nicked, is juxtaposed to the RAG active site.
Recently reported cryoEM structures of three different nick-forming complexes (NFC) of zebrafish RAG (zRAG) reveal that the top strand is placed in the active site for nicking when RSS DNA is untwisted by 180° 10. However, these structures with one RSS DNA (either 12 or 23RSS) or both untwisted were determined at moderate resolutions (4–5 Å) from a mixture of three PRC complexes (with one or both RSS DNAs bound) in the same cryoEM sample. In all six zRAG structures, the active sites are reported to be fully formed before the DNA is unwound, with the catalytic DDE motif binding two metal ions and situated adjacent to either the top (NFC) or bottom strand (PRC) for DNA cleavage 10. In these PRC structures, it is not clear why zRAG does not nick the bottom strand first.
After obtaining a pure mouse RAG NFC with modified DNA substrates, we have determined the structure up to 3.2 Å resolution and observed a previously unknown DNA zipper structure. The RAG active site, which is only partially formed and non-reactive in PRC, becomes fully assembled and adopts two different configurations for nicking and hairpin forming reactions in NFC and HFC, respectively. This structure completes the jigsaw puzzle of the mechanism determining how the two DNA cleavage reactions take place in a single active site.
Results
Structural characterization of a stable NFC by cryoEM
To avoid a severe preferred orientation problem in cryoEM, we used mRAG1 (aa 265–1040) and RAG2 (aa 1–520), which are longer than the catalytic core (mRAG1 of 384–1008 and mRAG2 of 1–359) used in crystallographic studies 9. Both forms of mRAG are active in DNA cleavage assays 9,20. To eliminate enzymatic activities of mRAG, mutation of the third catalytic residue in the DDE motif, E962Q, was generated for structural analyses 9. Because the previously determined crystal and cryoEM structures of HFC composed of WT core mRAG or the longer mRAG with E962Q mutation are superimposable, we used the longer mRAG proteins in this report, WT in cleavage assays and both WT and E962Q in cryoEM analyses.
A normal substrate pair of 12 and 23RSS (DNA0, Fig. 1c) can be cleaved by mRAG at 37°C but not at 22°C (Extended Data Fig. 1a–b). However, E962Q mutant mRAG-DNA0 complexes were solely in the PRC state even at 37°C as determined by cryoEM (Extended Data Fig. 1c, see Methods). The discrepancy between the effective DNA nicking in solution and the absence of a corresponding NFC complex captured by cryoEM was due to the E962Q mutation, which biased the mRAG-DNA complex toward inactive PRC. When WT mRAG and DNA0 were mixed in the presence of Ca2+ at 37°C, roughly 15% of the resulting complexes had one RSS DNA in the NFC conformation on cryoEM grids, and a few were pure NFC (Methods).
As DNA cleavage is improved by strategically placed abasic sites 21, we constructed DNA substrates with both 12 and 23RSS DNA containing an abasic site in the coding flank (CFb1) (designated as DNA1) or with two additional mutations (Gb1 to T and Tb2 to abasic) in the heptamer (DNA2) (Fig. 1c). DNA1, which was a better substrate than WT DNA0 both for nicking and hairpinning at 22 or 37°C, produced more NFC at 37°C than 22°C when mixed with the E962Q mutant (Extended Data Fig. 1). DNA2 is nicked as efficiently as DNA1 at 22°C, but cannot form hairpins (Extended Data Fig. 1a–b) due to the loss of mRAG1-DNA interactions necessary for the second strand cleavage 9. To the advantage of cryoEM analysis, DNA2 complexed with the E962Q mutant mRAG produced a pure NFC state at 22°C (Extended Data Fig. 1c).
Cryo-EM structures with both 12- and 23-RSS DNAs untwisted by 180° and the top strand in the active site of mRAG for nicking were reconstructed at 3.3 Å with DNA2 and 3.7 Å with DNA1 (Fig. 1d, Table 1), and improved to 3.2 and 3.4 Å, respectively, by applying a two-fold symmetry to the Y-shaped RAG-DNA complex (Extended Data Fig. 2–3). The NFC structure of WT mRAG-DNA0 was also determined at 3.8 Å resolution (Methods). Despite different DNA sequences adjacent to the cleavage site, the two NFC structures of DNA1 and DNA2 are superimposable with an RMSD of 1.1 Å over 1891 pairs of Cα atoms (Extended Data Fig. 4a), thus cross-validating each other. The NFC structure of WT mRAG with DNA0 has the same structural features as those of E962Q mutant RAG with DNA1 or DNA2 (Extended Data Fig. 4, Methods). The RMSD of 878 pairs of Cα atoms (one mRAG heterodimer, excluding the NBD domain) between the WT-DNA0 and E962Q-DNA1 NFC complexes is 0.6 Å, which is similar to the RMSD between the NFC states of DNA1 and DNA2 bound to E962Q mutant RAG (912 pairs of Cα atoms in one mRAG heterodimer and 0.7 Å).
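For readers less familiar with the RMSD values quoted above, the following is a minimal sketch of how an RMSD over paired Cα atoms is computed. It assumes the two structures have already been superimposed and that the atom pairing is known; the superposition step itself is not shown. The function name and the toy coordinates are hypothetical and are not taken from the deposited models.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) over paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three Calpha atoms, each displaced by 1.1 Å along y.
model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_2 = [(x, y + 1.1, z) for x, y, z in model_1]
print(round(rmsd(model_1, model_2), 2))  # 1.1
```

In practice the number of pairs (for example 1891 or 878 above) depends on which residues are present in both models and which domains are excluded.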
Table 1.
Cryo-EM data collection, refinement and validation statistics
| | PRC (DNA0-E962Q) (EMD-20030, PDB 6OEM) | PRC (DNA1-E962Q) (EMD-20031, PDB 6OEN) | NFC (DNA1-E962Q) (EMD-20032, PDB 6OEO) | NFC (DNA2-E962Q) (EMD-20033, PDB 6OER) | 12RSS-NFC/23RSS-PRC (DNA1-E962Q) (EMD-20034, PDB 6OEP) | 12RSS-PRC/23RSS-NFC (DNA1-E962Q) (EMD-20035, PDB 6OEQ) | NFC_C2 (DNA1-E962Q) (EMD-20039) | NFC_C2 (DNA2-E962Q) (EMD-20038) | NFC (DNA0-WT) (EMD-21003, PDB 6V0V) |
|---|---|---|---|---|---|---|---|---|---|
| Data collection and processing | |||||||||
| Magnification | 130,000 | 130,000 | 130,000 | 130,000 | 130,000 | 130,000 | 130,000 | 130,000 | 130,000 |
| Voltage (kV) | 300 | 300 | 300 | 300 | 300 | 300 | 300 | 300 | 300 |
| Electron exposure (e−/Å2) | 57 | 42 | 42 | 50–60 | 42 | 42 | 42 | 50–60 | 45 |
| Defocus range (μm) | −1.4 to −3.0 | −1.4 to −3.0 | −1.4 to −3.0 | −1.4 to −3.0 | −1.4 to −3.0 | −1.4 to −3.0 | −1.4 to −3.0 | −1.4 to −3.0 | −1.2 to −3.0 |
| Pixel size (Å) | 1.07 | 1.07 | 1.07 | 1.07 | 1.07 | 1.07 | 1.07 | 1.07 | 1.06 |
| Symmetry imposed | C1 | C1 | C1 | C1 | C1 | C1 | C2 | C2 | C1 |
| Initial particle images (no.) | 590,590 | 2,619,084 | 2,619,084 | 1,689,209 | 2,619,084 | 2,619,084 | 2,619,084 | 1,689,209 | 1,282,896 |
| Final particle images (no.) | 109,865 | 29,224 | 109,388 | 333,280 | 107,398 | 27,374 | 109,388 | 333,280 | 111,362 |
| Map resolution (Å) | 3.6 | 4.3 | 3.7 | 3.3 | 3.7 | 4.3 | 3.44 | 3.15 | 3.6 |
| FSC threshold | 0.143 | 0.143 | 0.143 | 0.143 | 0.143 | 0.143 | 0.143 | 0.143 | 0.143 |
| Map resolution range (Å) | 3–6 | 4–7 | 3–6 | 3–6 | 3–7 | 4–7 | 3–5 | 2.5–4.5 | 3–7 |
| Refinement | | | | | | | | | |
| Initial model used (PDB code) | 6CIK | 6OEM | 6OER | 5ZE0 | 6OEO, 6OEN | 6OEO, 6OEN | | | 6OEO |
| Model resolution (Å) | 4.1 | 4.4 | 3.8 | 3.4 | 3.8 | 4.4 | | | 3.9 |
| FSC threshold | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | | | 0.5 |
| Map sharpening B factor (Å2) | −50 | −100 | −100 | −80 | −100 | −100 | −100 | −129 | −60 |
| Model composition | | | | | | | | | |
| Nonhydrogen atoms | 19704 | 19662 | 19518 | 19484 | 19205 | 19147 | | | 8193 |
| Protein residues | 2000 | 1998 | 1970 | 1952 | 1909 | 1910 | | | 887 |
| Ligands | 4 | 3 | 6 | 6 | 5 | 4 | | | 3 |
| B factors (Å2) | | | | | | | | | |
| Protein | 114.86 | 131.29 | 89.17 | 85.32 | 96.36 | 120.58 | | | 98.93 |
| Ligand | 96.17 | 100.76 | 102.68 | 97.79 | 114.42 | 116.88 | | | 115.51 |
| R.m.s. deviations | | | | | | | | | |
| Bond lengths (Å) | 0.002 | 0.006 | 0.005 | 0.007 | 0.010 | 0.009 | | | 0.008 |
| Bond angles (°) | 0.473 | 0.980 | 0.716 | 0.799 | 1.019 | 1.053 | | | 1.000 |
| Validation | | | | | | | | | |
| MolProbity score | 2.64 | 2.31 | 2.15 | 2.18 | 1.87 | 1.98 | | | 2.01 |
| Clashscore | 7.52 | 8.05 | 13.82 | 13.51 | 6.20 | 8.58 | | | 10.71 |
| Poor rotamers (%) | 2.01 | 3.72 | 0.06 | 0.12 | 0.37 | 0.62 | | | 0.40 |
| Ramachandran plot | | | | | | | | | |
| Favored (%) | 95.4 | 93.17 | 91.71 | 90.34 | 90.70 | 91.24 | | | 92.61 |
| Allowed (%) | 4.6 | 6.83 | 8.19 | 9.45 | 9.24 | 8.71 | | | 7.28 |
| Disallowed (%) | 0.00 | 0.00 | 0.10 | 0.21 | 0.05 | 0.05 | | | 0.11 |
An untwisted but base-stacked DNA zipper in mouse NFC
In all NFC structures (DNA0, DNA1 or DNA2), DNA untwisting occurs locally in the second and third base pairs of each heptamer (CACAGTG), and the normally cylindrical DNA helix becomes a flat ribbon (Fig. 2a, Supplementary Video 1). Surrounding the 180° untwisting, the first (C/G) and fourth (A/T) base pairs of the heptamer remain hydrogen bonded, and the rest of the RSS is as in the PRC. The coding flank DNA beyond each RSS is rotated by 180° and presents its major groove rather than the minor groove (as in PRC) to contact RAG2 in NFC (Extended Data Fig. 4b, f). The four nucleotides of the unpaired At2Ct3/(Tb2Gb3) (parentheses indicating the complementary strand) form an inter-strand base-stacked zipper in the order of (Tb2)At2(Gb3)Ct3, by untwisting the strands, stretching lengthwise, breaking base pairing and sliding the two strands toward each other with interdigitated base stacking (Fig. 2b, 3a–b). The rise between these cross-stacked single bases is 3.2–3.4 Å, comparable to the base-pair separation in A and B forms, and the heptamer DNA is elongated by >6 Å compared to the PRC. The inter-strand base stacking brings the two nearly parallel phosphosugar backbones 3 Å closer than in B-DNA, forming a “slender waist” in the DNA substrate (Fig. 2a).
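The stacking order and the elongation quoted above can be related by a back-of-the-envelope calculation, sketched below. This is our simplified reading rather than a calculation from the structures: it assumes that two base-paired layers are replaced by four single-base layers with the stated rise of 3.2–3.4 Å, and that nothing else in the heptamer changes length. The function names are hypothetical.

```python
def zipper_order(bottom_bases, top_bases):
    """Interleave bottom- and top-strand bases; bottom-strand bases go in parentheses."""
    parts = []
    for bottom, top in zip(bottom_bases, top_bases):
        parts.append(f"({bottom})")
        parts.append(top)
    return "".join(parts)


def extra_length(n_zipper_bases, n_base_pairs, rise):
    """Extra length (Å) when paired layers become single-base stacked layers."""
    return (n_zipper_bases - n_base_pairs) * rise


order = zipper_order(["Tb2", "Gb3"], ["At2", "Ct3"])
low = extra_length(4, 2, 3.2)   # lower bound of the stated rise
high = extra_length(4, 2, 3.4)  # upper bound of the stated rise
print(order)                     # (Tb2)At2(Gb3)Ct3
print(f"{low:.1f}-{high:.1f} Å")  # 6.4-6.8 Å
```

Under these assumptions the two extra stacked layers add roughly 6.4–6.8 Å, which is in line with the >6 Å elongation reported for the heptamer.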
Fig. 2: Structure of the DNA zipper in NFC.
a, Two orthogonal views of the untwisted and extended DNA in NFC (orange), with the DNA in PRC (semi-transparent grey) superimposed. The (Tb2)At2(Gb3)Ct3 zipper stabilized by the inter-strand base stacking is labeled in red. Arrows highlight the flattened CACA (heptamer) in NFC. The scissile phosphates are shown as red and lilac spheres. The direction of DNA unwinding is marked in orange. The active site is marked by the catalytic DDE motif and two divalent cations. b, A zoom-in view of the (Tb2)At2(Gb3)Ct3 zipper in NFC structure of DNA1. c, Top view of the stacked At2Gb3 bases in NFC structure of DNA2.
Fig. 3: A DNA zipper is associated with untwisted DNA.
a-b, The DNA zipper in the mouse NFC structures with DNA1 or DNA2. The bases forming the zipper are labeled in red letters. c, The melted DNA and flipped-out bases (labeled in red) in zebrafish NFC (PDB: 6DBV). The three bases At2(Gb3)Ct3 forming the zipper in mouse NFC are superimposed and shown as green cartoon. In panels a to c, cryoEM density maps (contoured at 5 σ) are superimposed onto corresponding structures in semi-transparent grey. d, The (T)A(G)C zipper in the wildtype mouse NFC structures. The zipper is highlighted. e, The A(A)T zipper in the Tn1549 transposon structure (PDB: 6EMZ) is shown in orthogonal views.
In the three best-resolved zebrafish NFC structures (one with the 12RSS untwisted, one with the 23RSS untwisted, and an average of the two) 10, the outline of untwisted DNA backbones is similar to that in the mouse NFC, but the first 4 bp in each heptamer (CACAGTG) are modeled as melted and unpaired, and two bases (Tb2 and Gb3) are flipped out of the duplex with no contact by zRAG (Fig. 3c). The cryoEM density maps for these three related structures differ somewhat. Although they were interpreted as supporting the base flipped-out model (PDB: 6DBR) 10, they can also support the zipper DNA (PDB: 6DBV, Fig. 3c). In the proposed zebrafish NFC structure with base pairs melted, it remains unexplained how the DNA becomes lengthened by 6 Å rather than shortened and what stabilizes the flipped-out extrahelical bases.
In the mouse NFC structures, the 6 Å extension of DNA is a result of the zipper formation and is as essential as the 180° unwinding for placing the scissile phosphate in the RAG active site. In the DNA zipper, the bases are stabilized by stacking on each other. The conserved At2 and Gb3 are crucial for zipper formation, and each contributes a positively charged amine group (N6 of At2 and N2 of Gb3) to form cation-π interactions with its stacked neighbors (Fig. 2b–c). The same zipper also forms with native DNA (DNA0) in complex with WT mRAG (Fig. 3d). As the DNA zipper is intact in DNA2 that lacks Tb2 (Fig. 3b), the remaining At2(Gb3)Ct3 (or A(G)C) must be sufficient to maintain the untwisted DNA conformation. Base stacking is well known to be the principal force that stabilizes the DNA double helix 22,23. Here the inter-strand base stacking stabilizes the two untwisted strands and prevents DNA “melting”.
To assess the intrinsic propensity for DNA to form a zipper as in the mouse NFC structures versus a melted and base-flipped out structure reported for zebrafish, we first carried out unbiased molecular dynamics simulations of these two DNA structures (see Methods). We found that simulations started from the zipper structure were stable, remaining ~2 Å all-atom RMSD from the initial coordinates over 120 ns simulation time, regardless of whether the Amber14 24 or CHARMM36 force fields for DNA 25 were used. On the other hand, simulations started from the base-flipped structure quickly diverged from that state, with increasing RMSD from the initial coordinates, and became more zipper-like as evident from the decreasing RMSD to the zipper coordinates (Extended Data Fig. 6). As an additional test, we ran simulations from a canonical B-form DNA and applied a moving bias force to the terminal nucleotides to mimic the 6 Å stretching and unwinding in the NFC. Even with this simple bias, the DNA moves to within 2–3 Å RMSD from the zipper structure, while remaining ~5 Å RMSD from the base-flipped structure.
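The idea of a moving bias can be illustrated with a harmonic restraint whose target distance advances linearly over the run. This is a generic sketch of that concept, not the protocol used here (see Methods for the actual setup). The 6 Å stretch comes from the text; the force constant `k`, the starting distance and the linear schedule are hypothetical, and the unwinding component of the bias is not modeled.

```python
def moving_restraint(distance, step, n_steps, d_start, stretch=6.0, k=10.0):
    """Return (target distance, harmonic bias energy) at a given step.

    stretch: total extension in Å (6 Å, from the text).
    k: force constant in energy units per Å^2 (hypothetical value).
    The target moves linearly from d_start to d_start + stretch (assumed schedule).
    """
    if n_steps <= 0:
        raise ValueError("n_steps must be positive")
    fraction = min(max(step / n_steps, 0.0), 1.0)
    target = d_start + stretch * fraction
    energy = 0.5 * k * (distance - target) ** 2
    return target, energy


# Hypothetical end-to-end distance of 20 Å at the start of the run.
for step in (0, 50, 100):
    target, energy = moving_restraint(20.0, step, 100, d_start=20.0)
    print(step, target, energy)
```

If the DNA does not follow the moving target, the bias energy grows quadratically, which is what pulls the terminal nucleotides apart over the course of the simulation.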
A related inter-digitated base-stacking structure has also been observed in a DNA unwound by ~60° in complex with a tyrosine recombinase (Fig. 3e) 26. There the DNA zipper is formed by A(A)T instead of A(G)C. We suspect that R(R)Y (R for purine and Y for pyrimidine) is favorable for inter-strand base stacking within a DNA zipper structure and can form at RY or YR dinucleotide sequences when DNA is unwound. Indeed, hAT transposases, which are homologous to RAG (Fig. 1b), may rely on the conserved CA or TA sequence at the transposon end 27 to form a DNA zipper for the first cleavage.
Stabilization of the DNA zipper by mRAG
To understand how the RAG protein may influence DNA unwinding, we re-determined the mouse PRC structure by cryoEM at 3.6 Å resolution (see Methods, Extended Data Fig. 7). This cryoEM structure differs from the existing PRC crystal structure by a 20° rotation of the Y stem (consisting of NBD and nonamer regions of DNA) 9 (Extended Data Fig. 4c). The large difference is most likely due to crystallization because the Y stems in the cryoEM structures of PRC and NFC differ by less than 5° (Extended Data Fig. 4b). Apart from the Y stem, the PRC structures are similar (Extended Data Fig. 4c). Free from the effects of crystallization, the cryoEM PRC structure is slightly expanded, as observed previously for the hairpin-forming complex (HFC) 9. As a bonus, the cryoEM density map of mouse PRC reveals the trace of loop L12 (residues 606–617), which connects the first two β strands of the RNH domain (Supplementary Fig. 1, Extended Data Fig. 7h) and extends from one RAG1 subunit towards the other, forming trans-interactions with the RSS DNA on the opposite Y arm (Fig. 4a, 5a).
Fig. 4: Protein-DNA interactions in NFC.
a, Superimposition of cryoEM structures of PRC and NFC. The RAG protein in NFC is shown as semi-transparent green surface presentation. ZnH2 domains and one L12 loop are shown in green (NFC) and light blue (PRC) cartoons. DNAs are shown as yellow and orange (NFC) or grey (PRC) cartoons. Directions of ZnH2 domain and loop L12 movement from PRC to NFC are indicated by arrowheads, and those of 12- and 23-RSS DNA by curved arrows. b,c, ZnH2 domains contact the coding flank and heptamer in the minor groove in PRC (b), but in the major groove in NFC (c). d, ZnH2 domain and loop L12 (blue-green arrow) of RAG1 and the RSS DNA (red arrow) move in opposite directions during the PRC to NFC transition. One CF base (Gua, G) is shown to mark the DNA rotation. e, DNA cleavage activities of WT and R848A mutant mRAG on the normal substrate (DNA0). Mean values and SD were obtained from three independent samples.
Fig. 5: Different configurations of L12 loops in PRC, NFC and HFC.
a-c, The L12 loops bridge the dimer interface between RNH domains in PRC (a), NFC (b) and HFC (c). L12 tracks Ct1At2Ct3 of the heptamer in the major groove in PRC but in the minor groove in NFC. In HFC, each L12 forms both cis and trans interactions with the two RSS DNA substrates. Only one Me2+ ion (green sphere) is observed in each active site in PRC, but two are present in NFC and HFC, when the scissile phosphate (red and lilac spheres, respectively) is brought into the active site. DNA bases are shown as ladders except for those in the heptamer, where bases and sugars are fully shown.
When cryoEM structures of mouse PRC and NFC are superimposed, overall they are indistinguishable from RAG2 at the tip of the Y structure to NBD at the Y stem (Fig. 4a, Extended Data Fig. 4b). The main difference occurs in the ZnH2 domains (aa 793–951) of RAG1 subunits (Fig. 4a, Supplementary Video 2). ZnH2, which is an insertion in the RNH domain and forms an appendage on each Y arm, opens outward upon DNA binding (25° rotation and 13 Å translation) 9,10 and then closes inward in transitioning from PRC to NFC (12° rotation and 6 Å translation). In the PRC before DNA is unwound, R848 in ZnH2 is buried deep in the minor groove of the coding flank abutting the heptamer, and the nearby M849, N850, G851 and N852 track along the first three bases (CAC) of the RSS (Fig. 4b). In passing from PRC to NFC, each ZnH2 domain must traverse a DNA phosphosugar backbone and transition into the major groove. In the NFC structures, sidechains of M849 to N852 still track the backbone of CAC but now insert into the major groove, forming both hydrophobic and charge interactions with the DNA (Fig. 4c). Instead of serving as a “piston” to actively drive DNA untwisting as proposed 10,28, the movement of each ZnH2 domain is in the direction opposite to that of DNA unwinding (Fig. 4a, d).
If ZnH2 were to drive DNA untwisting, R848, which is inserted in the minor groove in PRC (Fig. 4b–c), would play a positive role in untwisting DNA and stabilizing the DNA zipper afterward (Fig. 2b). As RAG is not an ATPase and cannot move any domain in a directional manner, R848 may instead limit Brownian motions of ZnH2 and prevent it moving between DNA grooves. To discern how DNA untwisting is initiated and what role the ZnH2 domain plays, we substituted Ala for R848 and found that without the R848 sidechain, the mutant mRAG is more rather than less active in DNA nicking and hairpinning (Fig. 4e). R848 thus appears to act as a barrier to ZnH2 movement and DNA untwisting, and its inhibitory effect appears to outweigh its role in stabilizing the zipper.
Opposite ZnH2, loop L12, which is present but was not included in the model of the zebrafish structures due to limited resolution of the cryo-EM map 10, contacts the heptamer DNA and also changes conformation from PRC to NFC (Fig. 4a). In PRC, loop L12 contacts the bottom strand of the heptamer mostly on the major groove side (Fig. 5a). Accompanying DNA unwinding, L12 transforms from a stubby hairpin in PRC to an elongated one in NFC (Fig. 5b), and the tip of the β turn (G610 and S611) moves >6.5 Å, again in a direction opposite to DNA unwinding (Supplementary Video 2). In NFC, L12 is entirely in the minor groove, forming close contacts with the top of the DNA zipper (Tb2At2) and stabilizing the phosphate group immediately downstream of the scissile phosphate. The unwound and flattened DNA zipper ((Tb2)At2(Gb3)Ct3) is sandwiched between ZnH2 on the major groove side and L12 on the minor groove side (Fig. 4c). Each RAG-heptamer DNA interface is increased by 370 Å2 from PRC to NFC.
The RAG active site and strand-specific nicking
The RAG1 active site is not fully assembled in PRC, as the third catalytic residue, E962, is 10 Å away from the first two, D600 and D708 (Extended Data Fig. 4e) 9. This is because helix αX (Supplementary Fig. 1), on which E962 is situated, is not properly oriented in the apo form or in PRC, where the scissile phosphate is far away from the active site. In NFC, when the scissile phosphate is placed in the active site upon formation of the DNA zipper, the ZnH2 domain becomes properly closed and the connected helix αX is re-oriented to bring E962(Q) into the active site (Extended Data Fig. 8a, Supplementary Video 1).
When E962 is far from D600 and D708, only one divalent cation (Mg2+ or Mn2+) is observed in the active site of mouse PRC, as revealed by both cryoEM and crystallography. As two divalent cations were modeled into the active site of zebrafish PRC 10, we soaked mouse PRC crystals in a buffer containing 5 mM Mn2+ and collected diffraction data to 3.2 Å resolution. Based on the anomalous signal of Mn2+, we confirmed that only one Mn2+ is in each active site (Extended Data Fig. 8b). The incomplete assembly of the mRAG active site, with its dislocated E962 and absence of the second divalent cation, ensures that the wrong (bottom) strand is not cleaved even when juxtaposed with the active site in PRC.
After hydrolysis of the top strand, the 3′-OH product remains in the active site and serves as the nucleophile for cleaving the bottom strand and forming a hairpin product 9. When transitioning from NFC to hairpin-forming complex (HFC, 2.75 Å) 9, the two Y arms of RAG and the bound DNA pivot around the DDBD in unison by 12–14° toward each other (Fig. 6a–b, Supplementary Videos 3, 4). Balancing the Y arm rotation, the NBD domains and the nonamers on the other side of DDBD (the pivoting point) undergo twice as much rotation (24°) and translation (12 Å) (Fig. 6a). In HFC, the heptamer resumes base pairing and moves sideways as a duplex by ~18 Å, thus moving the bottom strand into the active site. Accompanying the global scissor-like movement, the bottom strand is bent 90° immediately beyond the scissile phosphate for hairpin formation (Fig. 6b–e). A near 90° bend of the bottom strand has also been observed in the HFC structure of Hermes transposase 32. Meanwhile the two L12 loops dissociate from each other, and each moves as much as 17 Å. The β turn of loop L12 maintains contact with the heptamer in trans, while D604 and K618 at the base of L12 help to flip out the first coding-flank base (CFb1) in cis to orient the scissile phosphate for hairpin formation (Fig. 5c, 6d–e).
Fig. 6: Cutting two DNA strands in a single active site.
a, After superimposition of one RNH domain in NFC and HFC, RAG on the other Y-arm is shown as green (NFC) and pink (HFC) molecular surface. RSS DNAs are shown as yellow (top strand) and orange (bottom strand) (NFC) or grey cartoons (HFC). Movements of the protein and RSS DNA are indicated by color-coded arrows. b, The RNH domains of NFC (green) and HFC (pink) are superimposable, as are the scissile phosphates and CFt1 carrying the 3′-OH product of nicking, which is the nucleophile for hairpin formation. The cleavage strands in NFC (yellow) and HFC (orange) have opposite polarity and bind differently to RNH. CFt1 in HFC is shown as light pink sticks. c, A zoom-in view of the catalytic center from panel b. For the nicking and hairpinning reactions, the nucleophiles are on the opposite sides of the scissile phosphates, and E962 assumes different rotamer conformations. The directions of the two reactions are marked by dashed red arrows. d,e, Different protein and DNA structures between NFC (d) and HFC (e). The RNH domain is shown as electrostatic surface (blue and red represent the positive and negative charge potential, respectively), and DNAs are shown as yellow (top strand) and orange (bottom strand) cartoon. The protein surface is changed due to the large movement of loop L12 and reorientation of a few polar sidechains.
Discussion
DNA untwisting and the zipper formation
A remaining question is what causes DNA to untwist by 180° and form a zipper in the absence of an ATPase or other external energy source. It was proposed that RAG recombinase functions as a piston and its domain movement forcefully drives the DNA to unwind 10,28. But during the transition from the PRC to the NFC state in both mouse and zebrafish RAG, the ZnH2 domain and loop L12 move in the opposite direction to DNA unwinding (Fig. 4d), and thus it is unlikely that these protein movements are the cause of DNA unwinding. Moreover, the rest of the Y-shaped RAG dimer remains unchanged between PRC and NFC in both mouse (Extended Data Fig. 4b,e,f) and zebrafish 10. Even though the coding flank DNAs are untwisted by 180° and present different DNA grooves to the protein, both mouse and zebrafish RAGs accommodate DNA changes with the same interface 9,10. A “spring-loaded” motion of the protein might occur, but the cryoEM structures of mouse and zebrafish NFC offer no indication that it takes place. The energy source for DNA untwisting and extension is unclear. We suspect that the CAC sequence in the heptamer plays a key role in DNA untwisting and DNA zipper formation, while domain movement in RAG and engagement of the active site likely stabilize the unwound DNA. In agreement with this DNA-centric view, CAC in the heptamer sequence is conserved in all species undergoing V(D)J recombination, and any mutation in CAC diminishes DNA cleavage and V(D)J recombination 29–31. As DNA nicking and NFC formation are both temperature dependent for mouse and zebrafish RAG 10 (Extended Data Fig. 1), thermal energy likely supports DNA untwisting. Analogous to higher temperature, the abasic sites in DNA1 and DNA2 destabilize the double-helix structure by disrupting base stacking and thereby favor DNA distortion and nicking by RAG.
Two consecutive reactions in one active site
The catalytic RNH domain remains unchanged for DNA nicking by hydrolysis (NFC) and hairpin formation by transesterification (HFC), and the scissile phosphates for the two consecutive reactions are superimposable (Fig. 6b, Extended Data Fig. 8c). But E962 adopts rotamer conformations that differ by 120° in mouse HFC and NFC (Fig. 6c). The rearrangement is essential to avoid clashes with the DNA in HFC and accommodate the opposite polarity of the DNA substrate. Perhaps due to limited resolution, in the zebrafish PRC and NFC structures the E962 equivalent (E984) and αX were modeled identically to the HFC structure, and the active site re-configurations were not noted 10. Nevertheless, the nucleophilic water for nicking and the 3′-OH for hairpin formation are situated on opposite sides of the scissile phosphate in both mouse and zebrafish RAG.
The flexibility of the RAG active site is largely due to two evolutionary changes in the RNase H domain. First, enzymes in the RNase H superfamily usually contain an Asp as the last carboxylate in the catalytic triad 33, but among most RNH-type transposases the Asp is replaced by Glu (E962 equivalent) to form the signature DDE motif 34. We hypothesize that because of the longer Glu side chain and more rotamer possibilities, this substitution allows a single active site in the transposases to change configurations and cleave two anti-parallel DNA strands as observed in the NFC and HFC of RAG. Second, the RNH transposases often acquire an insertion between the second and third catalytic carboxylates right before the last helix (αX in RAG1) in the RNH domain (equivalent to ZnH2 in RAG1) 11,15. The inserted domain helps to bind DNA strands of opposite polarity and enables the single active site to catalyze multiple reactions.
Concluding remarks
RAG recombinase is a specialized RNH transposase, whose primary function is to generate hairpin ends on coding flank DNA (Fig. 1a–b) for subsequent processing to generate diverse antigen receptors 2,3,35,36, without transposing the cleaved DNA to a new genomic site. Nevertheless, catalysis of multiple hydrolytic and trans-esterification reactions on double helical DNA is a general feature for all RNH-type transposases. Unlike RAG, the majority of these transposases cleave the bottom strand first (Fig. 1b) and thus probably do not require DNA untwisting by 180° or zipper formation. But, as with RAG, cleaving two antiparallel DNA strands in a single active site will depend on structural rearrangements of both the DNA substrate and the catalytic residues during each reaction cycle.
Methods
Cell lines
HEK293T cells were originally obtained from Thermo Fisher Scientific and maintained as stock in the Yang laboratory. None of the cell lines used were authenticated or tested for mycoplasma contamination.
Protein and DNA preparation
The mRAG proteins, which comprise active (WT or R848A) or catalytically inactive mutant (E962Q) RAG1 (aa 265–1040) and degradation-resistant T490A mutant RAG2 (aa 1–520), were expressed as N-terminal His6-MBP fusions (on both RAG1 and RAG2) in HEK293T cells and purified as previously described 4,9. The extended domains of RAG beyond the catalytic core regions help to reduce the preferred orientation problem on cryoEM grids. In addition to amylose affinity purification, a step of Mono Q anion exchange chromatography improved protein purity and eliminated a trace amount of DNA contamination. The buffer used in amylose affinity purification was 20 mM HEPES (pH 7.4), 500 mM KCl, 5% glycerol, 2 mM DTT, 0.5 mM EDTA. The salt concentration of protein samples coming off the amylose column was lowered to 100 mM before loading onto a Mono Q column (GE Healthcare), which was pre-equilibrated with 20 mM HEPES (pH 7.4), 100 mM KCl, 5% glycerol, 2 mM DTT, 0.5 mM EDTA. mRAG protein was eluted by a linear gradient of 100–500 mM KCl. The purified mRAG protein was buffer-exchanged into a storage buffer containing 20 mM HEPES (pH 7.4), 500 mM KCl, 20% glycerol, 0.1 mM EDTA, 2 mM DTT, concentrated to 6–8 mg/ml, and stored at −80°C. Human HMGB1 (amino acids 1–163) was prepared as reported previously 37.
DNAs of 12- and 23-RSS used for structural analyses and biochemical assays (Supplementary Table 1) were synthesized as ssDNA (Integrated DNA Technologies). Long oligonucleotides (>20 nucleotides) were purified by 8%–15% TBE-urea PAGE in a small gel cassette (Life Technologies). Gel purified oligonucleotides were then loaded onto a Glen Gel-Pak column (Glen Research) and eluted in deionized H2O. dsDNA was annealed in a Thermocycler in annealing buffer containing 20 mM Tris-HCl, pH 8.0, 0.5 mM EDTA, 50 mM NaCl.
DNA cleavage assays
All assays were performed in a reaction buffer containing 25 mM HEPES (pH 7.4), 100 mM KCl, 1 mM DTT, 0.1 mg/ml BSA, and 5 mM MgCl2. A histone H3K4me3 peptide was included to bind and activate the PHD domain in the extended RAG2 20. 50 nM each of 12- and 23-RSS DNAs with a Cy5- or FAM-label on the 20 bp coding flank (DNA0, DNA1 and DNA2 for cleavage assay, and pre-nicked WT substrates for hairpin-formation assay) (Supplementary Table 1) were incubated with 50 nM WT or mutant (R848A) mRAG heterotetramer, 100 nM HMGB1 and 200 nM H3K4me3 peptide (Epicypher) at 22 or 37°C for 0–40 min. Reactions were stopped by adding an equal volume of formamide buffer (95% (v/v) formamide, 12 mM EDTA and 0.3% bromophenol blue) and heating at 95°C for 10 min. Cleavage products were separated by 15% TBE-urea PAGE, then visualized and quantified using a Typhoon PhosphorImager (GE Healthcare). Plots of biochemical data, prepared with GraphPad Prism software (version 7.0), show the mean ± SD from three independent experiments.
CryoEM sample preparation and data collection
The procedure for assembly and purification of mRAG complexed with 12- and 23-RSSs was similar to that described previously 4,9. The purified mRAG (WT or E962Q) contained MBP-tags on both RAG1 and RAG2 subunits. These MBP tags further helped to even out orientation distributions on QUANTIFOIL R 1.2/1.3 (Cu, 300 mesh) grids. MBP-mRAG protein, 12- and 23-RSS DNAs (DNA0, DNA1 or DNA2) (Supplementary Table 1), HMGB1 (aa 1–163) and H3K4me3 peptide were mixed at 1:1.2:1.2:2.4:4 molar ratio in buffer containing 20 mM HEPES (pH 7.4), 100 mM KCl, 5 µM ZnCl2, 1 mM DTT, 5% glycerol and 5 mM divalent cation (Mg2+ in the E962Q-DNA0 complex, and Ca2+ in the E962Q-DNA1, E962Q-DNA2, and WT-DNA0 complexes) and incubated at 37°C for 15 min. Each mixture of mRAG-DNA was then further purified at 4°C by size exclusion chromatography on a Superdex 200 Increase 10/300 GL column (GE Healthcare) in buffer containing 20 mM HEPES (pH 7.3), 100 mM KCl, 1% glycerol, 1 mM DTT, 5 mM divalent cation (Ca2+ or Mg2+). Only samples in the elution peak fractions were pooled and used for cryoEM grid preparation.
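The 1:1.2:1.2:2.4:4 molar ratio can be scaled to working concentrations once an mRAG concentration is chosen. The sketch below is only an illustration: the function name and the 2 µM mRAG concentration in the usage line are hypothetical and are not taken from the protocol.

```python
# Molar ratio stated in the text:
# mRAG : 12RSS : 23RSS : HMGB1 : H3K4me3 peptide = 1 : 1.2 : 1.2 : 2.4 : 4
MOLAR_RATIO = {
    "mRAG": 1.0,
    "12RSS": 1.2,
    "23RSS": 1.2,
    "HMGB1": 2.4,
    "H3K4me3": 4.0,
}


def component_concentrations(mrag_conc_um: float) -> dict:
    """Scale the molar ratio by a chosen mRAG concentration (in µM)."""
    if mrag_conc_um <= 0:
        raise ValueError("mRAG concentration must be positive")
    return {name: ratio * mrag_conc_um for name, ratio in MOLAR_RATIO.items()}


if __name__ == "__main__":
    # Hypothetical example: 2 µM mRAG (value chosen only for illustration).
    for name, conc in component_concentrations(2.0).items():
        print(f"{name}: {conc:.2f} µM")
```

Keeping the ratio in one table makes it easy to rescale the whole mixture when the protein stock concentration changes.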
To capture DNA in the reactive but not yet cleaved NFC state, we incubated a catalysis-deficient (E962Q) mutant mRAG with WT DNA0, mutant DNA1 or DNA2 as substrate (Fig. 1c) at 22 or 37°C for 15 s to 5 min and prepared cryoEM grids at the selected temperature for data acquisition. For grids of PRC with DNA0, the purified sample (0.4 mg/ml) was loaded on C-flat CF-1.2/1.3–4C holey carbon grids (3 µl sample on each grid, same below) at 22°C or 37°C at 100% humidity, blotted for 4 s, and flash-frozen in liquid ethane in a Vitrobot. To prepare grids of NFC with DNA1 or DNA2, the samples (0.4 mg/ml) were first incubated at 22°C or 37°C for 15 s to 5 min and then frozen on QUANTIFOIL R 1.2/1.3 (Cu, 300 mesh) grids. The frozen grids were stored in liquid nitrogen before use.
For structure determination, the frozen grids of the E962Q samples were loaded into a Titan Krios electron microscope operated at 300 kV for automated image acquisition with Leginon 3.1 38 at University of California, Los Angeles (UCLA). Movies were recorded on a Gatan K2 Summit direct electron detector using the super-resolution mode at 130K nominal magnification (calibrated pixel size of 1.07 Å at the sample level, corresponding to 0.535 Å in super-resolution mode) and defocus values ranging from −1.4 to −3.0 µm. During data collection, the total dose was 57 e−/Å2 on E962Q mutant PRC with DNA0, 42 e−/Å2 on E962Q mutant NFC with DNA1, and 50–60 e−/Å2 on E962Q mutant NFC with DNA2. The detailed collection statistics are shown in Table 1.
To determine if the DNA zipper structure exists in WT NFC, the WT mRAG-DNA0 complex was prepared by mixing purified WT mRAG and DNA0 and incubating at 37°C for 5 or 30 min before freezing on QUANTIFOIL R 1.2/1.3 (Cu, 300 mesh) grids. CryoEM data were collected on a Titan Krios electron microscope operated at 300 kV using the SerialEM program at the Multi-Institute Cryo-EM Facility (MICEF) of the National Institutes of Health. Movies were recorded on a Gatan K2 Summit direct electron detector using the super-resolution mode at 130K nominal magnification (calibrated pixel size of 1.06 Å at the sample level, corresponding to 0.53 Å in super-resolution mode), with a total dose of 45 e−/Å2 and defocus values ranging from −1.2 to −3.0 µm.
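The pixel sizes quoted for the two data collections are related by simple arithmetic, sketched below. The helper names are hypothetical. The box-edge calculation assumes that the 280-pixel extraction box described in the next subsection refers to the calibrated (not super-resolution) pixel size, which the text does not state explicitly.

```python
def super_resolution_pixel(physical_pixel_a: float) -> float:
    """Super-resolution mode samples each physical detector pixel 2 x 2."""
    return physical_pixel_a / 2.0


def nyquist_limit(pixel_a: float) -> float:
    """Finest resolvable spacing (in Å) for a given pixel size (in Å)."""
    return 2.0 * pixel_a


def box_edge(box_px: int, pixel_a: float) -> float:
    """Physical edge length (in Å) of a square particle box."""
    return box_px * pixel_a


if __name__ == "__main__":
    for label, pixel in (("UCLA dataset", 1.07), ("NIH MICEF dataset", 1.06)):
        print(
            f"{label}: super-res pixel {super_resolution_pixel(pixel):.3f} Å, "
            f"Nyquist {nyquist_limit(pixel):.2f} Å, "
            f"280 px box {box_edge(280, pixel):.1f} Å"
        )
```

The Nyquist limit computed this way is a hard lower bound on the resolution any reconstruction from these micrographs can report.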
Structure determination and model refinement
All frames in each collected movie were aligned and summed to generate both dose-weighted and dose-unweighted micrographs using MotionCor2 39. The latter were only used for defocus determination. Particles on dose-weighted micrographs were picked using Gautomatch (developed by Dr. K. Zhang; https://www.mrc-lmb.cam.ac.uk/kzhang/Gautomatch) and extracted in RELION-2.1 using a box size of 280 × 280 pixels 40. Using the extracted particles, initial maps were obtained with cryoSPARC 41, and then served as the reference for template-based particle picking in Gautomatch and 3D classification in RELION 42. 2D classification and 3D classification were used to remove contamination and screen for the most homogeneous particles used for in-depth 3D structural analyses. Criteria for selection were integrity or completeness of protein-DNA complexes and well-resolved protein secondary structures and DNA helices. In the mixed PRC and NFC structures of E962Q-DNA1 complexes (PDB: 6OEP and 6OEQ, Table 1), the half with the untwisted RSS DNA is superimposable with the pure NFC structures made of DNA1 or DNA2 (PDB: 6OEO and 6OER), and the other half is superimposable with the pure PRC structures made of DNA0 or DNA1 (PDB: 6OEM and 6OEN). In all PRC and NFC structures, tracings of HMGB1 are similar to but less complete than those in the HFC structures, probably due to the reduced resolution.
For WT mRAG complexed with DNA0 and E962Q with DNA1, a masked 3D classification without alignment was applied on either the 12RSS-side or 23RSS-side to classify different conformations from the datasets 43 (Extended Data Fig. 2 and 5). Among WT RAG-DNA0 complexes, only ~10% of particles contained one untwisted RSS DNA (Extended Data Fig. 5). From the merged NFC particles after initial 3D classification, however, a pure NFC state was isolated by further classification of the 12RSS- and 23RSS-half separately, and averaging of the two NFC halves. We obtained a 3.7 Å averaged map containing a pure NFC state on one half and a mixed state of NFC and PRC on the other half of RAG complexes. The 3.6 Å map for model building of the NFC half was generated by using a soft mask covering the target region during postprocessing in RELION. The trimmed mutant NFC_DNA1 structure (6OEO) was used as the initial model for the WT NFC structure, containing only one RAG1-RAG2 heterodimer and one RSS DNA.
All reported resolutions are based on the “gold standard” refinement procedure and the 0.143 Fourier Shell Correlation (FSC) criterion 44. Local resolution was estimated using ResMap 45. For model building, we used the reported 3.15-Å PRC and 2.75-Å HFC crystal structures as initial models to build cryoEM structures of PRC and NFC, respectively. We first fit the coordinates into the cryoEM map using Chimera, and then manually adjusted and rebuilt the model according to the cryoEM density in COOT 46. Phenix real-space refinement was used to refine the model. MolProbity and EMRinger 47 were used to validate the final model. The refinement statistics are shown in Table 1. The detailed classifications and map qualities of mRAG complexed with DNA1, DNA2 and DNA0 are shown in the Extended Data (Extended Data Fig. 2, 3, 5 and 7, respectively). The RMSD of 878 pairs of Cα atoms (one mRAG heterodimer, excluding the NBD domain) between the WT-DNA0 and E962Q-DNA1 NFC complexes is 0.6 Å, which is similar to the RMSD between the NFC states of DNA1 and DNA2 bound to E962Q mutant RAG (0.7 Å over 912 pairs of Cα atoms in one mRAG heterodimer).
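The 0.143 FSC criterion can be illustrated with a minimal NumPy sketch that correlates two half maps shell by shell in Fourier space and reports the first shell at which the curve drops below the threshold. This is not the RELION implementation used in this work: it omits masking, phase randomization and interpolation between shells, and the function names are hypothetical.

```python
import numpy as np


def fsc_curve(half1: np.ndarray, half2: np.ndarray) -> np.ndarray:
    """Fourier Shell Correlation per integer-radius shell for two cubic maps."""
    if half1.ndim != 3 or half1.shape != half2.shape or len(set(half1.shape)) != 1:
        raise ValueError("expected two cubic 3D maps of identical shape")
    n = half1.shape[0]
    f1 = np.fft.fftn(half1)
    f2 = np.fft.fftn(half2)

    # Integer frequency index along each axis, then the shell index of every voxel.
    freq = np.fft.fftfreq(n) * n
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int).ravel()

    # Shell-wise sums of the cross term and of each map's power.
    cross = np.bincount(shell, weights=(f1 * np.conj(f2)).real.ravel())
    power1 = np.bincount(shell, weights=(np.abs(f1) ** 2).ravel())
    power2 = np.bincount(shell, weights=(np.abs(f2) ** 2).ravel())

    denom = np.sqrt(power1 * power2)
    fsc = np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)
    return fsc[: n // 2]  # keep only shells below the Nyquist shell


def resolution_at_threshold(
    fsc: np.ndarray, box_px: int, pixel_a: float, threshold: float = 0.143
) -> float:
    """Resolution (Å) of the first shell whose FSC falls below the threshold."""
    below = np.nonzero(fsc[1:] < threshold)[0]  # skip the zero-frequency shell
    if below.size == 0:
        return 2.0 * pixel_a  # never crosses: report the Nyquist limit
    first_shell = int(below[0]) + 1
    return float(box_px * pixel_a / first_shell)
```

For a box of `box_px` pixels at `pixel_a` Å per pixel, shell `s` corresponds to a spacing of `box_px * pixel_a / s` Å, which is how the shell index is converted into a resolution.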
To determine the NFC and PRC populations of mRAG complexed with DNA0, DNA1 or DNA2 at 22°C or 37°C, different cryoEM datasets were collected on either a 300 kV Titan Krios electron microscope or a 200 kV FEI Tecnai F20 electron microscope equipped with a Gatan K2 Summit direct electron detector. Motion correction, CTF-estimation and particle picking were done as described above. 2D classifications were done first to remove obvious contaminants. The selected particles from 2D classification were used to refine an initial model generated from cryoSPARC. Because of a substantial positional change of the ZnH2 domain between the pre-reaction and nick-forming state and the 180° rotation of DNA, which results in a switch between major and minor groove, it is easy to distinguish NFC from PRC even in moderate to low-resolution cryoEM reconstructions. Masked 3D classifications as described above were used to classify PRC or NFC conformations on both the 12RSS and 23RSS sides. Then the percentages of PRC or NFC were counted with the sum of the two being 100%.
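The population count described above amounts to tallying the class assigned to each RSS-bound half and normalizing so that PRC and NFC sum to 100%. A minimal sketch follows; the label strings and the example counts are hypothetical and are not data from this study.

```python
from collections import Counter


def state_percentages(labels):
    """Percentage of PRC and NFC among classified halves (sums to 100)."""
    counts = Counter(labels)
    unknown = set(counts) - {"PRC", "NFC"}
    if unknown:
        raise ValueError(f"unexpected class labels: {sorted(unknown)}")
    total = counts["PRC"] + counts["NFC"]
    if total == 0:
        raise ValueError("no classified halves supplied")
    return {
        "PRC": 100.0 * counts["PRC"] / total,
        "NFC": 100.0 * counts["NFC"] / total,
    }


if __name__ == "__main__":
    # Hypothetical assignments for four RSS-bound halves.
    print(state_percentages(["PRC", "PRC", "PRC", "NFC"]))
```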
Molecular simulation of the untwisted DNA
Molecular simulations of the DNA duplex with sequence 5′-ACACAG-3′ were carried out using the GROMACS 5.1.4 simulation code 48 in combination with the PLUMED 2.4.3 plug-in 49. Simulations were run with either the Amber 14 DNA force field with explicit TIP3P water 24 or with the CHARMM 36 DNA force field with the modified CHARMM TIP3P water 25. Sodium and chloride ions were added to a total ionic strength of ~100 mM, such that the net charge of the system was zero. Simulations were run using periodic boundary conditions, with a 6.5 nm truncated octahedron cell. Lennard-Jones interactions were treated with a twin-range cutoff with inner and outer radii of 0.9 and 1.4 nm, respectively, while the electrostatic energy and forces were calculated via Particle Mesh Ewald with a grid spacing of 0.12 nm. All bonds were fixed in length using the LINCS constraint algorithm, and the equations of motion were integrated via a leapfrog algorithm with a 2 fs time step. The temperature was kept constant with a velocity rescaling thermostat 50, while a Parrinello-Rahman barostat 51 was used to maintain the average pressure at 1 bar. Prior to starting each simulation, a short steepest descent energy minimization was run to relieve any close contacts introduced during the setup process.
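As a rough consistency check on the ionic strength, the number of NaCl ion pairs needed for ~100 mM can be estimated from the cell volume. The sketch below assumes that the quoted 6.5 nm is the distance between periodic images of the truncated octahedron (the text does not say which dimension is meant), ignores the volume occupied by the DNA, and does not count the extra Na+ ions needed to neutralize the duplex. The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole


def truncated_octahedron_volume_nm3(image_distance_nm: float) -> float:
    """Volume of a truncated octahedral cell: (4/9) * sqrt(3) * d**3."""
    return 4.0 / 9.0 * math.sqrt(3.0) * image_distance_nm**3


def ion_pairs_for_concentration(volume_nm3: float, conc_molar: float) -> float:
    """Number of salt ion pairs giving the target molar concentration."""
    litres = volume_nm3 * 1e-24  # 1 nm^3 = 1e-24 L
    return conc_molar * AVOGADRO * litres


if __name__ == "__main__":
    volume = truncated_octahedron_volume_nm3(6.5)
    pairs = ion_pairs_for_concentration(volume, 0.100)
    print(f"cell volume ~{volume:.0f} nm^3, ~{pairs:.1f} NaCl pairs for 100 mM")
```

Under these assumptions the estimate comes to roughly a dozen ion pairs, so a difference of one or two ions changes the nominal concentration by several percent in a cell of this size.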
Unbiased simulations were run starting from either the published structure with bases flipped out (PDB: 6DBR) or from the DNA zipper structure determined in the present work, and were run at constant temperature and pressure for ~120 ns with both the Amber 14 and CHARMM 36 force fields. To maintain the DNA distortion, position restraints were applied to the terminal residues on each strand with a force constant of 1000 kJ mol−1 nm−2 in each dimension. Otherwise, the dynamics of the interior residues were completely unrestrained.
A second set of simulations was run by applying a twisting and stretching force to the same DNA duplex in an initially canonical B-DNA structure. The bias was applied only to the terminal residues of each strand, by defining a distance matrix RMSD (DRMSD) 52 coordinate comprising all the heavy atoms of the terminal residues, relative to the experimental structure. This coordinate has a minimum when the relative positions of these atoms are the same as in the experimental structure. A time-dependent umbrella bias of the form $V(d, t) = \tfrac{1}{2} k_{\mathrm{umb}} \left[ d - d_0(t) \right]^2$ was employed, where $d$ is the DRMSD coordinate and $d_0(t)$ is its target value, which was linearly reduced from its value in the initial structure to zero over the course of the 40 ns biased simulation. A force constant $k_{\mathrm{umb}}$ of 10,000 kJ mol−1 nm−2 was used.
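The biased protocol can be sketched in a few lines of NumPy: a DRMSD coordinate computed from all pairwise distances among the selected atoms, a target value that decreases linearly to zero over 40 ns, and a harmonic umbrella energy. This is an illustration rather than the PLUMED input used in this work; the function names are hypothetical, and the harmonic form with a 1/2 prefactor is an assumption based on the usual convention for such restraints.

```python
import numpy as np


def pairwise_distances(coords: np.ndarray) -> np.ndarray:
    """All unique inter-atomic distances for an (N, 3) coordinate array (nm)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(coords), k=1)
    return dist[upper]


def drmsd(coords: np.ndarray, ref_coords: np.ndarray) -> float:
    """Distance-matrix RMSD between a structure and a reference (nm)."""
    delta = pairwise_distances(coords) - pairwise_distances(ref_coords)
    return float(np.sqrt(np.mean(delta**2)))


def target_value(t_ns: float, d_start: float, total_ns: float = 40.0) -> float:
    """Target DRMSD, reduced linearly from d_start at t = 0 to zero at total_ns."""
    fraction = min(max(t_ns / total_ns, 0.0), 1.0)
    return d_start * (1.0 - fraction)


def umbrella_energy(d: float, d_target: float, k_umb: float = 10000.0) -> float:
    """Assumed harmonic bias, 0.5 * k_umb * (d - d_target)**2, in kJ/mol."""
    return 0.5 * k_umb * (d - d_target) ** 2
```

Because DRMSD depends only on internal distances, it is unaffected by overall translation or rotation of the duplex, which makes it a convenient coordinate for driving the terminal residues toward the experimental geometry without fixing them in space.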
Determination of the number of divalent cations in the active site of PRC
Crystals of WT core mRAG1-RAG2 (aa 384–1008 and aa 1–359, respectively) complexed with a nicked 12RSS and intact 23RSS were grown as previously described 9. The complex was assembled in the purification buffer containing 1 mM Ca2+. Dehydration and Mn2+ soaking of the crystals, X-ray diffraction data collection and processing were carried out as described previously 9. Searching for Mn2+ and Zn2+ was performed using AUTOSOL 53,54 based on the anomalous diffraction data, RAG sequence and a structural model of RAG recombinase (PDB: 6CIM) in the absence of metal ions and HMGB chains. Finally, two Zn2+ and two Mn2+ were found in each RAG molecule comprising two RAG1 and two RAG2 subunits (Extended Data Fig. 8b).
Extended Data
Extended Data Fig. 1. Cleavage and cryoEM analysis of DNA substrates in NFC.
a-b, Cleavage efficiencies (nicking and hairpinning) of the three DNA variants by WT mRAG at 22 and 37°C (mean and s.d., n = 3 independent samples). c, Percentage of NFC and PRC (NFC/PRC) in cryoEM 3D classification from samples made of DNA0, DNA1 or DNA2 substrate with E962Q mutant mRAG at 22 and 37°C. Asterisk (*) indicates that the dataset was collected on a Tecnai F20 electron microscope instead of a Titan Krios.
Extended Data Fig. 2. Structure determination of mouse NFC with DNA1 by cryoEM.
a, Flow chart for cryoEM data processing of mRAG complexed with DNA1. The maps labeled in bold red letters were used for final model building. b, A surface presentation of the 3.7 Å NFC (DNA1) map (C1 symmetry). Colors are according to the local resolution estimated by ResMap, and the color scale bar is shown on its right. c, Angular distributions of all particles used for the final three-dimensional reconstruction shown in b. d, The FSC curves of the NFC (DNA1) map (C1). The “gold standard” FSC between two independent halves of the map (black line) indicates a resolution of 3.7 Å, and the blue line is the FSC between the final refined model and the final map. e to i, Representative regions of the C1 map (transparent grey surface). The maps are shown with the final structural models (cartoon or stick) superimposed.
Extended Data Fig. 3. Structure determination of mouse NFC with DNA2 by cryoEM.
a, Flow chart for cryoEM data processing of mRAG complexed with DNA2. The maps labeled in bold red letters were used for final model building. b, A surface presentation of the 3.3 Å NFC (DNA2) map (C1 symmetry). Colors are according to the local resolution estimated by ResMap, and the color scale bar is shown on its right. c, Angular distributions of all particles used for the final three-dimensional reconstruction shown in b. d, The FSC curves of the NFC (DNA2) map (C1). The “gold standard” FSC between two independent halves of the map (black line) indicates a resolution of 3.3 Å, and the blue line is the FSC between the final refined model and the final map. e to i, Representative regions of the C1 map (transparent grey surface). The maps are shown with the final structural models (cartoon or stick) superimposed.
Extended Data Fig. 4. Structural comparisons of mouse PRC and NFC.
a, CryoEM structures of NFC with DNA1 (green) and DNA2 (blue) are superimposable. b, Comparison of mouse PRC and NFC structures. Superposition of cryoEM PRC (red) and NFC (DNA1) (green) structures reveals limited NBD and nonamer movement, which is marked with blue dashed circle (right panel). c, Superposition of crystal (grey) and cryoEM (red) PRC structures reveals the different NBD and nonamer region (circled in red dashes) due to crystal-lattice contacts. d-e, The zoom-in views of the active center and DNA distortions in the superimposed structures shown in a-b. The catalytic DDE motif and two metal ions (a and b) are labeled; the heptamer of RSS DNA is shown in detailed cartoon presentation; the scissile phosphate of top strand is marked by a large ball. In panel d, bases forming the DNA zipper are labeled. f, A zoom-in view of boxed area in panel b. RAG2 interacts with the minor groove in PRC or the major groove in NFC.
Extended Data Fig. 5. Structure determination of mouse WT NFC with DNA0 by cryoEM.
a, Flow chart for cryoEM data processing of WT mRAG complexed with DNA0. Data processing was done using RELION. The 3.6 Å map labeled in red was used for final model building. b, A surface presentation of the 3.6 Å map of WT NFC (DNA0). Colors are according to the local resolution estimated by ResMap, and the color scale bar is shown on its right. c, Angular distributions of all particles used for the final three-dimensional reconstruction. d, The FSC curves of the WT NFC (DNA0) map. The “gold standard” FSC between two independent halves of the map indicates an overall resolution of 3.6 Å. e-h, Representative regions of the map (transparent grey surface). The refined zipper DNA fits the map better (e) than the melted DNA (PDB: 6DBR) (f).
Extended Data Fig. 6. Molecular simulations of the untwisted region of DNA.
a, b, Unbiased simulations were run starting from the base-flipped out structure (PDB: 6DBR) (a), and the zipper structure (b) using the Amber 14 force field. Plotted in each case are the all-atom RMSD to the base-flipped out structure (black) and the zippered structure (red). The structures at the start and end of each run are shown above the RMSD plots. c, d, The analogous results are given for simulations with the CHARMM 36 force field. e, Biased simulations were run from a canonical B-DNA form, in which the terminal residues were driven to mimic the stretched and untwisted DNA observed in the mouse and zebrafish NFC structures. The RMSD and initial and final structures are shown as before.
Extended Data Fig. 7. Structure determination of mouse PRC with DNA0 by cryoEM.
a, Flow chart for the cryoEM data processing of mRAG complexed with DNA0. The maps labeled in bold red letters were used for final model building. b, A surface presentation of the 3.6 Å PRC map (C1 symmetry). Colors are according to the local resolution estimated by ResMap, and the color scale bar is shown on its right. c, Angular distributions of all particles used for the final three-dimensional reconstruction. d, The FSC curves of the PRC (DNA0) map (C1). The “gold standard” FSC between two independent halves of the map (black line) indicates a resolution of 3.6 Å, and the blue line is the FSC between the final refined model and the final map. e to i, Representative regions of the C1 map (transparent grey surface). The maps are shown with the final structural models (cartoon or stick) superimposed. j, The maps of 23RSS in PRC (DNA0), NFC (DNA1) and NFC (DNA2).
Extended Data Fig. 8. Remodeling of the active site in NFC and HFC.
a, Repositioning of αX in the RNH domain during PRC to NFC transition (both are cryoEM structures). E962 is far from the active site in PRC (light blue) but is positioned for catalysis in NFC (green). b, Anomalous X-ray scattering of the PRC crystals confirms that one Mn2+ and one Zn2+ are bound to each RAG1 subunit. The anomalous map is contoured at 3σ in red. The blue 2Fo-Fc map (contoured at 1σ) highlights R848, which is buried in the minor groove. c, The re-configured E962 in HFC (pink) after the first DNA cleavage by nicking (green).
Supplementary Material
The two basepairs (AC/(TG)) undergoing zipper formation are shown as sticks and the rest of DNA is simplified as tube and ladder cartoon. The active site DDE are shown in red sticks, and divalent cations are represented as green spheres. The second Mg2+ ion appears only at the end because it binds in a fully formed active site with substrate in place, but not in PRC. The red sphere on the DNA strand marks the scissile phosphate for nicking, and DNA becomes nicked when two Mg2+ ions are properly bound.
The mRAG protein is shown as semi-transparent molecular surface. ZnH2 domains (green and light blue cartoon) of RAG1 move significantly and two L12 loops (green and light blue coils) become extended. The second metal ion (green sphere) is captured in the active site (marked by D600, D708 and E962 in red sticks) only when E962 and scissile phosphate are in the active configuration. The 12/23RSS DNAs are shown as yellow and orange tube-and-ladders.
One RAG1-RAG2 heterodimer (bound to 12RSS, yellow) is superimposed between the two structures and shown in semi-transparent molecular surface. The second RAG1-RAG2 heterodimer (bound to 23RSS, orange) shown as colored cartoon moves towards RAG-12RSS, delivering the bottom strand on each RSS DNA into the active site. Two L12 loops (green and light blue coil) of RAG1 subunits move away from the RAG1 interface. The configuration of the active center (red stick-and-balls for catalytic residues) except for E962 remains unchanged, and the closing motion positions the scissile phosphates on bottom strands into the active centers. The green spheres represent Mg2+ ions. The second Mg2+ ion appears only when the active site is fully formed, but not when DNA undergoes conformational changes. When both Mg2+ ions are properly bound in the active site, DNA cleavage / transesterification reaction takes place as shown.
The scissile phosphates for nicking and hairpinning of DNA1 are highlighted by red and pink spheres. The active site DDE are shown as red sticks, and two metal ions in the NFC and HFC states are shown as green spheres. When the catalytic carboxylates DDE are not fully aligned or the scissile phosphate is not captured in the active site, such as during conformational changes of RAG and DNA, only one metal ion may be retained.
Acknowledgement
This research was supported by the National Institute of Diabetes and Digestive and Kidney Diseases to M.G. (DK036167), W.Y. (DK036147 and DK036144) and Z.H.Z. (GM071940). The authors acknowledge the use of instruments at the Electron Imaging Center for NanoMachines supported by NIH (1S10RR23057, 1S10OD018111 and U24GM116792), NSF (DBI-1338135 and DMR-1548924) and CNSI at UCLA.
Footnotes
Competing interests
The authors declare no competing interest.
Data availability
The cryoEM structures and associated density maps of the mouse PRC and NFC complexes reported in this paper have been deposited in the PDB and EMDB under accession codes PDB 6OEM to 6OER and 6V0V, and EMD-20030 to EMD-20035, EMD-20038, EMD-20039 and EMD-21003, as specified in Table 1.
Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
References
- 1.Mizuuchi K. Transpositional recombination: mechanistic insights from studies of mu and other elements. Annu Rev Biochem 61, 1011–51 (1992). [DOI] [PubMed] [Google Scholar]
- 2.Gellert M. V(D)J recombination: RAG proteins, repair factors, and regulation. Annu Rev Biochem 71, 101–32 (2002). [DOI] [PubMed] [Google Scholar]
- 3.Schatz DG & Swanson PC. V(D)J recombination: mechanisms of initiation. Annu Rev Genet 45, 167–202 (2011). [DOI] [PubMed] [Google Scholar]
- 4.Kim MS, Lapkouski M, Yang W & Gellert M. Crystal structure of the V(D)J recombinase RAG1-RAG2. Nature 518, 507–11 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Sakano H, Huppi K, Heinrich G & Tonegawa S. Sequences at the somatic recombination sites of immunoglobulin light-chain genes. Nature 280, 288–94 (1979). [DOI] [PubMed] [Google Scholar]
- 6.Lewis SM. The mechanism of V(D)J joining: lessons from molecular, immunological, and comparative analyses. Adv Immunol 56, 27–150 (1994). [DOI] [PubMed] [Google Scholar]
- 7.Lapkouski M, Chuenchor W, Kim MS, Gellert M & Yang W. Assembly Pathway and Characterization of the RAG1/2-DNA Paired and Signal-end Complexes. J Biol Chem 290, 14618–25 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Ru H. et al. Molecular Mechanism of V(D)J Recombination from Synaptic RAG1-RAG2 Complex Structures. Cell 163, 1138–1152 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Kim MS. et al. Cracking the DNA Code for V(D)J Recombination. Mol Cell 70, 358–370 e4 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Ru H. et al. DNA melting initiates the RAG catalytic pathway. Nat Struct Mol Biol 25, 732–742 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Davies DR, Goryshin IY, Reznikoff WS & Rayment I. Three-dimensional structure of the Tn5 synaptic complex transposition intermediate. Science 289, 77–85 (2000). [DOI] [PubMed] [Google Scholar]
- 12.Richardson JM, Colloms SD, Finnegan DJ & Walkinshaw MD. Molecular architecture of the Mos1 paired-end complex: the structural basis of DNA transposition in a eukaryote. Cell 138, 1096–108 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Hare S, Gupta SS, Valkov E, Engelman A & Cherepanov P. Retroviral intasome assembly and inhibition of DNA strand transfer. Nature 464, 232–6 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Montano SP, Pigli YZ & Rice PA. The mu transpososome structure sheds light on DDE recombinase evolution. Nature 491, 413–7 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Hickman AB. et al. Structural basis of hAT transposon end recognition by Hermes, an octameric DNA transposase from Musca domestica. Cell 158, 353–67 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Passos DO. et al. Cryo-EM structures and atomic model of the HIV-1 strand transfer complex intasome. Science 355, 89–92 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Yusa K. piggyBac Transposon. Microbiol Spectr 3, MDNA3–0028-2014 (2015). [DOI] [PubMed] [Google Scholar]
- 18.Lesbats P, Engelman AN & Cherepanov P. Retroviral DNA Integration. Chem Rev 116, 12730–12757 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Nowotny M, Gaidamakov SA, Crouch RJ & Yang W. Crystal structures of RNase H bound to an RNA/DNA hybrid: substrate specificity and metal-dependent catalysis. Cell 121, 1005–16 (2005). [DOI] [PubMed] [Google Scholar]
- 20.Grundy GJ, Yang W & Gellert M. Autoinhibition of DNA cleavage mediated by RAG1 and RAG2 is overcome by an epigenetic signal in V(D)J recombination. Proc Natl Acad Sci U S A 107, 22487–92 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Grundy GJ, Hesse JE & Gellert M. Requirements for DNA hairpin formation by RAG1/2. Proc Natl Acad Sci U S A 104, 3078–83 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Mills JB & Hagerman PJ. Origin of the intrinsic rigidity of DNA. Nucleic Acids Res 32, 4055–9 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Yakovchuk P, Protozanova E & Frank-Kamenetskii MD. Base-stacking and base-pairing contributions into thermal stability of the DNA double helix. Nucleic Acids Res 34, 564–74 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Zgarbova M. et al. Refinement of the Cornell et al. Nucleic Acids Force Field Based on Reference Quantum Chemical Calculations of Glycosidic Torsion Profiles. J Chem Theory Comput 7, 2886–2902 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Hart K. et al. Optimization of the CHARMM additive force field for DNA: Improved treatment of the BI/BII conformational equilibrium. J Chem Theory Comput 8, 348–362 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Rubio-Cosials A. et al. Transposase-DNA Complex Structures Reveal Mechanisms for Conjugative Transposition of Antibiotic Resistance. Cell 173, 208–220 e20 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Atkinson PW. hAT Transposable Elements. Microbiol Spectr 3(2015). [DOI] [PubMed] [Google Scholar]
- 28.Ru H, Zhang P & Wu H. Structural gymnastics of RAG-mediated DNA cleavage in V(D)J recombination. Curr Opin Struct Biol 53, 178–186 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Ramsden DA, McBlane JF, van Gent DC & Gellert M. Distinct DNA sequence and structure requirements for the two steps of V(D)J recombination signal cleavage. EMBO J 15, 3197–206 (1996). [PMC free article] [PubMed] [Google Scholar]
- 30.Hesse JE, Lieber MR, Mizuuchi K & Gellert M. V(D)J recombination: a functional definition of the joining signals. Genes Dev 3, 1053–61 (1989). [DOI] [PubMed] [Google Scholar]
- 31.Hu J. et al. Chromosomal Loop Domains Direct the Recombination of Antigen Receptor Genes. Cell 163, 947–59 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Hickman AB. et al. Structural insights into the mechanism of double strand break formation by Hermes, a hAT family eukaryotic DNA transposase. Nucleic Acids Res 46, 10286–10301 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Nowotny M. Retroviral integrase superfamily: the structural perspective. EMBO Rep 10, 144–51 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Yuan YW & Wessler SR. The catalytic domain of all eukaryotic cut-and-paste transposase superfamilies. Proc Natl Acad Sci U S A 108, 7884–9 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Boboila C, Alt FW & Schwer B. Classical and alternative end-joining pathways for repair of lymphocyte-specific and general DNA double-strand breaks. Adv Immunol 116, 1–49 (2012). [DOI] [PubMed] [Google Scholar]
- 36.Deriano L & Roth DB. Modernizing the nonhomologous end-joining repertoire: alternative and classical NHEJ share the stage. Annu Rev Genet 47, 433–55 (2013). [DOI] [PubMed] [Google Scholar]
- 37.Grundy GJ, Ramon-Maiques S, Dimitriadis EK, Kotova S, Biertumpfel C, Heymann JB, Steven AC, Gellert M & Yang W. Initial stages of V(D)J recombination: the organization of RAG1/2 and RSS DNA in the postcleavage complex. Mol Cell 35 (2009).
- 38.Suloway C. et al. Automated molecular microscopy: the new Leginon system. J Struct Biol 151, 41–60 (2005).
- 39.Zheng SQ, Palovcak E, Armache JP, Verba KA, Cheng Y & Agard DA. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat Methods 14, 331–332 (2017).
- 40.Fernandez-Leiro R & Scheres SHW. A pipeline approach to single-particle processing in RELION. Acta Crystallogr D Struct Biol 73, 496–502 (2017).
- 41.Punjani A, Rubinstein JL, Fleet DJ & Brubaker MA. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat Methods 14 (2017).
- 42.Scheres SH. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J Struct Biol 180, 519–530 (2012).
- 43.Bai XC, Rajendra E, Yang G, Shi Y & Scheres SH. Sampling the conformational space of the catalytic subunit of human γ-secretase. eLife 4, e11182 (2015).
- 44.Swint-Kruse L & Brown CS. Resmap: automated representation of macromolecular interfaces as two-dimensional networks. Bioinformatics 21, 3327–3328 (2005).
- 45.Kucukelbir A, Sigworth FJ & Tagare HD. Quantifying the local resolution of cryo-EM density maps. Nat Methods 11, 63–65 (2014).
- 46.Emsley P, Lohkamp B, Scott WG & Cowtan K. Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66, 486–501 (2010).
- 47.Barad BA, Echols N, Wang RY, Cheng Y, DiMaio F, Adams PD & Fraser JS. EMRinger: side chain-directed model and map validation for 3D cryo-electron microscopy. Nat Methods 12, 943–946 (2015).
- 48.Páll S, Abraham MJ, Kutzner C, Hess B & Lindahl E. Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS. Lecture Notes in Computer Science 8759 (2015).
- 49.Tribello GA, Bonomi M, Branduardi D, Camilloni C & Bussi G. PLUMED 2: New feathers for an old bird. Comput Phys Commun 185 (2014).
- 50.Bussi G, Donadio D & Parrinello M. Canonical sampling through velocity rescaling. J Chem Phys 126, 014101 (2007).
- 51.Parrinello M & Rahman A. Polymorphic transitions in single crystals: A new molecular dynamics method. J Appl Phys 52, 7182–7190 (1981).
- 52.Domanski J, Sansom MSP, Stansfeld PJ & Best RB. Balancing Force Field Protein-Lipid Interactions To Capture Transmembrane Helix-Helix Association. J Chem Theory Comput 14, 1706–1715 (2018).
- 53.Grosse-Kunstleve RW & Adams PD. Substructure search procedures for macromolecular structures. Acta Crystallogr D Biol Crystallogr 59, 1966–73 (2003).
- 54.McCoy AJ, Storoni LC & Read RJ. Simple algorithm for a maximum-likelihood SAD function. Acta Crystallogr D Biol Crystallogr 60, 1220–8 (2004).
Supplementary Materials
The two base pairs (AC/TG) undergoing zipper formation are shown as sticks, and the rest of the DNA is simplified as a tube-and-ladder cartoon. The active-site DDE residues are shown as red sticks, and divalent cations are represented as green spheres. The second Mg2+ ion appears only at the end because it binds to a fully formed active site with the substrate in place, but not in the PRC. The red sphere on the DNA strand marks the scissile phosphate for nicking, and the DNA becomes nicked when two Mg2+ ions are properly bound.
The mRAG protein is shown as a semi-transparent molecular surface. The ZnH2 domains of RAG1 (green and light blue cartoons) move significantly, and the two L12 loops (green and light blue coils) become extended. The second metal ion (green sphere) is captured in the active site (marked by D600, D708 and E962 as red sticks) only when E962 and the scissile phosphate are in the active configuration. The 12RSS and 23RSS DNAs are shown as yellow and orange tube-and-ladder cartoons, respectively.
One RAG1-RAG2 heterodimer (bound to 12RSS, yellow) is superimposed between the two structures and shown as a semi-transparent molecular surface. The second RAG1-RAG2 heterodimer (bound to 23RSS, orange), shown as a colored cartoon, moves towards RAG-12RSS, delivering the bottom strand of each RSS DNA into the active site. The two L12 loops of the RAG1 subunits (green and light blue coils) move away from the RAG1 interface. The configuration of the active center (catalytic residues shown as red ball-and-sticks) remains unchanged except for E962, and the closing motion positions the scissile phosphates of the bottom strands in the active centers. The green spheres represent Mg2+ ions. The second Mg2+ ion appears only when the active site is fully formed, not while the DNA is undergoing conformational changes. When both Mg2+ ions are properly bound in the active site, the DNA cleavage (transesterification) reaction takes place as shown.
The scissile phosphates for nicking and hairpinning of DNA1 are highlighted as red and pink spheres, respectively. The active-site DDE residues are shown as red sticks, and the two metal ions in the NFC and HFC states are shown as green spheres. When the catalytic DDE carboxylates are not fully aligned or the scissile phosphate is not captured in the active site, as during conformational changes of RAG and DNA, only one metal ion may be retained.














