Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2022 Jun 13.
Published in final edited form as: Nat Struct Mol Biol. 2022 Apr 14;29(4):395–402. doi: 10.1038/s41594-022-00756-0

CRISPR-Cas9 bends and twists DNA to read its sequence

Joshua C Cofsky 1,2,3, Katarzyna M Soczek 1,2,3, Gavin J Knott 4, Eva Nogales 1,2,5,6, Jennifer A Doudna 1,2,3,5,6,7,8,*
PMCID: PMC9189902  NIHMSID: NIHMS1807894  PMID: 35422516

Abstract

In bacterial defense and genome editing applications, the CRISPR-associated protein Cas9 searches millions of DNA base pairs to locate a 20-nucleotide, guide-RNA-complementary target sequence that abuts a protospacer-adjacent motif (PAM). Target capture requires Cas9 to unwind DNA at candidate sequences using an unknown ATP-independent mechanism. Here we show that Cas9 sharply bends and undertwists DNA upon PAM binding, thereby flipping DNA nucleotides out of the duplex and toward the guide RNA for sequence interrogation. Cryo-electron-microscopy (EM) structures of Cas9:RNA:DNA complexes trapped at different states of the interrogation pathway, together with solution conformational probing, reveal that global protein rearrangement accompanies formation of an unstacked DNA hinge. Bend-induced base flipping explains how Cas9 “reads” snippets of DNA to locate target sites within a vast excess of non-target DNA, a process crucial to both bacterial antiviral immunity and genome editing. This mechanism establishes a physical solution to the problem of complementarity-guided DNA search and shows how interrogation speed and local DNA geometry may influence genome editing efficiency.

Introduction

CRISPR-Cas9 (clustered regularly interspaced short palindromic repeats, CRISPR-associated) nucleases provide bacteria with RNA-guided adaptive immunity against viral infections1 and serve as powerful tools for genome editing in human, plant and other eukaryotic cells2. The basis for Cas9’s utility is its DNA recognition mechanism, which involves base-pairing of one DNA strand with 20 nucleotides of the guide RNA to form an R-loop. The guide RNA’s recognition sequence, or “spacer,” can be chosen to match a desired DNA target, enabling programmable site-specific DNA selection and cutting3. The search process that Cas9 uses to comb through the genome and locate rare target sites requires local unwinding to expose DNA nucleotides for RNA hybridization, but it does not rely on an external energy source such as ATP hydrolysis4,5. This DNA interrogation process defines the accuracy and speed with which Cas9 induces genome edits, yet the mechanism remains unknown.

Molecular structures of Cas96 in pre- and post-DNA bound states revealed that the protein’s REC (“recognition”) and NUC (“nuclease”) lobes can rotate dramatically around each other, assuming an “open” conformation in the apo Cas9 structure7 and a “closed” conformation in the Cas9:guide-RNA8,9 and Cas9:guide-RNA:DNA R-loop9 structures. Furthermore, single-molecule experiments established the importance of PAMs (5′-NGG-3′) for pausing at candidate targets5,10, and R-loop formation was found to occur through directional strand invasion beginning at the PAM5 (Fig. 1a). However, these findings did not elucidate the actions that Cas9 performs to interrogate each candidate target sequence. These actions, repeated over and over, comprise the slowest phase of Cas9’s bacterial immune function and its induction of site-specific genome editing11. Understanding the mechanism of DNA interrogation is critical to determining how Cas9 searches genomes to find bona fide targets and exclude the vast excess of non-target sequences.

Fig. 1 |. Trapping the Cas9 interrogation complex.

Fig. 1 |

a, Known steps leading to Cas9-catalyzed DNA cleavage. Orange/black arrows indicate direction of guide RNA strand invasion into the DNA helix. Magenta X indicates location of cystamine modification. b, Chemistry of protein:DNA cross-link. c, RNA and DNA sequences used in structural studies. d, Sharpened cryo-EM map (threshold 7σ) and model of the cross-linked Cas9:sgRNA:DNA complex (0 RNA:DNA matches, bent DNA), centered on density contributed by the non-native thioalkane cross-link.

Results and Discussion

Covalent cross-linking of Cas9 to DNA stabilizes the interrogation complex

Evidence that Cas9’s target engagement begins with PAM binding5 implies that during genome search, there exists a transient “interrogation state” in which Cas9:guide RNA has engaged with a PAM but not yet formed RNA:DNA base pairs (Fig. 1a). Cas9:guide-RNA complexes must repeatedly visit the interrogation state at each surveyed PAM, irrespective of the sequence of the adjacent 20-base-pair (bp) candidate complementarity region (CCR). While this state is the key to Cas9’s DNA search mechanism, the interrogation complex has so far evaded structure determination due to its transience, with an estimated lifetime of <30 milliseconds in bacteria11.

To trap the Cas9 interrogation complex, we replaced residue Thr1337 with cysteine in S. pyogenes Cas9, the most widely used genome editing enzyme, and combined this protein with a single-guide RNA (sgRNA)3 and a 30-bp DNA molecule functionalized with an N4-cystamine cytosine modification12 (Fig. 1b). The DNA included a PAM but lacked any complementarity to the sgRNA spacer (Fig. 1c). Reaction of the cysteine thiol with the cystamine creates a protein:DNA disulfide cross-link on the side of the PAM distal to the site of R-loop initiation (Fig. 1b,d). The position of the cross-link was chosen based on previous high-resolution structures of the Cas9:PAM interface9,13 (Extended Data Fig. 1a). Incubation of Cas9 T1337C with sgRNA and the modified DNA duplex resulted in a decrease in electrophoretic mobility for ~70% of the total protein mass under denaturing but non-reducing conditions (Extended Data Fig. 1b), consistent with protein:DNA cross-link formation. The cross-link did not inhibit Cas9’s ability to cleave sgRNA-complementary DNA (Extended Data Fig. 1c), suggesting that the enzyme is not grossly perturbed by the introduced disulfide. More importantly, mechanistic hypotheses revealed by cross-linked complexes can be tested in non-cross-linked complexes.

We subjected the cross-linked interrogation complex to cryo-EM imaging and analysis (Extended Data Figs. 2, 3ab). Ab initio volume reconstruction, refinement and modeling revealed two structural states of the complex. In one, the DNA lies as a linear duplex across the surface of the open form of the Cas9 ribonucleoprotein (Fig. 2a). Remarkably, in the other state, Cas9’s two lobes pinch the DNA into a V shape whose helical arms meet at the site of R-loop initiation, employing a bending mode that underwinds the DNA duplex (Fig. 2b).

Fig. 2 |. Cryo-EM structures of the Cas9 interrogation complex, compared to previously determined crystal structures.

Fig. 2 |

a, Unsharpened cryo-EM map (threshold 4σ) of Cas9 interrogation complex in open-protein/linear-DNA conformation, alongside apo Cas9 crystal structure (PDB 4CMP, 2Fo-Fc, threshold 1.5σ). b, Closed-protein/bent-DNA conformation (threshold 5σ), alongside Cas9:sgRNA crystal structure (PDB 4ZT0, 2Fo-Fc, threshold 1.5σ). Green, NUC lobe; blue, REC lobe domains 1/2; light blue, REC lobe domain 3; orange, guide RNA; magenta, DNA. Maps produced in the present work are displayed with full opacity. REC lobe domain 3 does not appear in the cryo-EM structure in a (see Supplementary Information). Additional classes observed in the closed-protein state are shown in Extended Data Fig. 3a.

The linear-DNA conformation reveals a DNA scanning state of Cas9

In the “linear-DNA” conformation (Fig. 2a), the interface of the DNA with the PAM-interacting domain is similar to that seen in the crystal structure of Cas9:R-loop9 (Extended Data Fig. 3c). However, the REC lobe of the protein is in a position radically different from that observed in all prior structures of nucleic-acid-bound Cas98,9,13,14, having rotated away from the NUC lobe into an open-protein conformation that resembles the apo Cas9 crystal structure7 (Fig. 2a).

Notably, a linear piece of DNA docked into the PAM-binding cleft would result in a severe structural clash4 in either the Cas9:R-loop9 or the Cas9:sgRNA8 crystal structure but only a minor one in the apo Cas9 crystal structure7, which can be relieved by slightly tilting and bending the DNA (Extended Data Fig. 3c). We propose that the open-protein conformation, originally thought to be unique to nucleic-acid-free Cas9, can also be adopted by the sgRNA-bound protein to enable its interaction with linear DNA. Indeed, cryo-EM analysis of the Cas9:sgRNA complex revealed only particles in the open-protein state (Extended Data Fig. 4), indicating that the original crystal structure of Cas9:sgRNA8, which was in a closed-protein state, represented only one possible conformation of the complex that happened to be captured in that crystal form. Single-molecule Förster resonance energy transfer experiments also support the ability of Cas9:sgRNA to access both closed and open conformations15. The linear-DNA/open-protein conformation captured in our cryo-EM structure, then, may represent the conformation of Cas9 during any process for which it must accommodate a piece of linear DNA, such as during sliding10 or initially engaging with a PAM.

The bent-DNA conformation reveals PAM-adjacent DNA unwinding by Cas9

In the “bent-DNA” Cas9 interrogation complex, the protein grips the PAM as in the linear-DNA complex. The CCR (candidate complementarity region), on the other hand, is tilted at a 50° angle to the PAM-containing helix and leans against the REC lobe, which has risen into the same “closed” position as in the Cas9:sgRNA crystal structure (Fig. 2b, 3a). Compared to the high-resolution cryo-EM density contributed by the PAM-containing duplex, CCR density is poorly resolved (Fig. 3a, Extended Data Fig. 3b), reflecting conformational heterogeneity; however, even at low resolution, its distinct helical shape (Fig. 2b, 3a) enabled construction of an atomic model that adheres to B-form DNA constraints between the PAM-distal tip and position +3 of the CCR (Fig. 3b). Connection of this B-form helix to the PAM-containing helix requires backbone distortion and helical underwinding at precisely the position from which an R-loop would be initiated if the sgRNA were complementary to the CCR5 (Fig. 3b,c). Underwinding is accomplished through major groove compression (Fig. 3b), as observed for other protein-induced bends1623.

Fig. 3 |. DNA conformation at the site of bending.

Fig. 3 |

a, Unsharpened cryo-EM map (threshold 5σ) of Cas9 interrogation complex in closed-protein/bent-DNA conformation. Green, NUC lobe; blue, REC lobe; orange, guide RNA; magenta, DNA. b, Bent DNA model. CCR, candidate complementarity region. Yellow, PAM; cyan, target-strand Thy(+1) and Thy(+2). c, DNA (black) and RNA (orange) models, demonstrating proximity of the first DNA:RNA base pairs (cyan) that would form if they were complementary. d, DNA model within sharpened cryo-EM density (threshold 7.5σ). The break in density in the target strand indicates dramatic conformational heterogeneity. The modeled conformation of Thy(+1) and Thy(+2) (cyan), which represents just one possible conformation within a diverse ensemble, was chosen based on permanganate reactivity data and geometric constraints imposed by neighboring nucleotides.

At the distorted bending vertex, a missing wedge of cryo-EM density appears across from non-target-strand nucleotides Ade(+1) and Ade(+2), suggesting that target-strand nucleotides Thy(+1) and Thy(+2) have become unpaired from their partners (Fig. 3d). The overall weakness of density for Thy(+1) and Thy(+2) suggests dramatic mobility, and the modeled conformation of those nucleotides represents a physically plausible member of a diverse conformational ensemble (which also agrees with solution experiments to be discussed shortly). Therefore, in the bent-DNA conformation, two helical arms join at an underwound hinge whose target-strand nucleotides are heterogeneously positioned. DNA disorder is consistent with the function of the Cas9 interrogation complex, which is to flip target-strand nucleotides from the DNA duplex toward the sgRNA to test base pairing potential.

Cas9:sgRNA bends DNA in non-cross-linked complexes

To determine whether unmodified Cas9 can bend DNA, we produced interrogation complexes that lacked the cross-link and tested them in a DNA cyclization assay24 (Fig. 4a). We created a series of 160-bp double-stranded DNA substrates that all bore a “J”-shape due to the inclusion of a special A-tract sequence that forms a protein-independent 108° bend25. Each substrate also included two PAMs spaced by a near-integral number of B-form DNA turns (31 bp). In eleven versions of this substrate, we varied the number of base pairs between the A-tract and the proximal PAM from 21 to 31 bp, effectively rotating the Cas9 binding sites around an entire turn of a B-form DNA helix (Extended Data Fig. 5). If Cas9 bends the DNA, each additional base pair added to the variable (21–31 bp) region will turn the Cas9-induced bend by ~34° with respect to the fixed A-tract bend. The relative direction of the two bends can be discerned from each substrate’s ligase-catalyzed cyclization efficiency, which should increase when the two bends point in the same direction (DNA assumes a “C” shape) and decrease when the two bends point in opposite directions (DNA assumes an “S” shape), as a function of the proximity of the DNA ends to be sealed (Fig. 4a). We measured the cyclization efficiency of each substrate in the absence and presence of Cas9 and an sgRNA lacking homology to either of the two CCR sequences. Consistent with expectations for a bend, the Cas9-dependent enhancement (or reduction) of cyclization efficiency tracked a sinusoidal shape when plotted against the A-tract/PAM spacing, reflecting phase-dependent variation in the end-to-end distance of different substrates (Fig. 4b, Extended Data Fig. 6a, b). Additionally, by interpreting the absolute phase of the cyclization enhancement curve (that is, the spacing value at which the peak occurs, where the two bends point in the same direction) in the context of the known direction of the A-tract bend24,26, we conclude that the bending direction observed in this experiment is the same as that observed in the bent-DNA cryo-EM structure (Fig. 4c, Extended Data Fig. 6b, Supplementary Information).

Fig. 4 |. DNA cyclization efficiency experiments.

Fig. 4 |

a, Substrate structure and experimental pipeline. Black A, A-tract; yellow P, PAM; orange spacer, A-tract/PAM1 distance (21–31 bp). For simplicity, bending is only depicted at a single PAM in the cylindrical volume illustrations. Substrates that are more S-shaped (left side of the cycle diagram/gel) cyclize more slowly than substrates that are C-shaped (right) due to changes in the end-to-end distance of the molecules. When base pairs are added to the spacer, the cyclization efficiency is expected to rise as the Cas9-induced bend becomes aligned with the A-tract bend, then fall as the bends become misaligned again, in a roughly sinusoidal pattern. b, Cas9-dependent cyclization enhancement of eleven substrate variants. Three replicates are depicted. MCE, monomolecular cyclization efficiency; xHRBP, mutated helix-rolling basic patch (K233A/K234A/K253A/K263A); Φ, phase difference from reference bend to the Cas9 WT peak. A protein that does not bend the DNA at all would yield the line y=1. A protein that bends DNA in a different direction would yield an x-shifted sinusoid that peaked at a different value of spacer length. c, Comparison of bending phase difference in the cyclization experiments vs. cryo-EM/crystal structures of DNA bends introduced by Cas9 or CAP (PDB 1CGP). See Supplementary Information for discussion of the CAP-based reference bend analysis. The magenta vector superposed on the aligned helices would point toward the A-tract in the cyclization substrates; in the larger structural diagram, it points out of the page.

Next, we wondered whether local DNA conformations observed in the cross-linked interrogation complex resemble those in the native complex. To characterize DNA distortion with single-nucleotide resolution, we measured the permanganate reactivity of individual thymines in the target DNA strand of a non-cross-linked interrogation complex (Fig. 5a). As anticipated for protein-induced base unstacking27,28, we detected a PAM- and Cas9-dependent increase in permanganate reactivity at Thy(+1) and Thy(+2) (Fig. 5ac, Extended Data Fig. 7a, b). The relationship between permanganate reactivity and Cas9:sgRNA concentration at these thymines suggests that the affinity of Cas9:sgRNA for this sequence is weak (10 μM), as expected for this necessarily transient interaction with off-target DNA (Fig. 5b, Extended Data Fig. 7a). Remarkably, Thy(+1) and Thy(+2) are precisely the nucleotides that appeared to be unpaired in the bent-DNA cryo-EM map of the cross-linked Cas9 interrogation complex (Fig. 3d), which shared the same DNA sequence as the permanganate substrate. These results suggest that Cas9 bends DNA through a backbone distortion that exposes target-strand nucleobases +1 and +2 to solvent and, more generally, that cryo-EM analysis of the cross-linked complex captured meaningful structural features of the native complex.

Fig. 5 |. Permanganate reactivity measurements.

Fig. 5 |

a, Experimental pipeline. The autoradiograph depicts the raw data used to produce the “intact PAM” graph in b. b, Oxidation probability of select thymines as a function of [Cas9:sgRNA]. Data depict a single replicate. Model information and additional replicates/thymines are presented in Extended Data Fig. 7a. CCR, candidate complementarity region. c, Oxidation probability of T(+1) and T(+2) in the presence of the indicated Cas9:sgRNA variant ([Cas9:sgRNA]=16 μM). Three replicates depicted. xPLL, KES(1107–1109)GG; xHRBP, K233A/K234A/K253A/K263A; xPBA, R1333A/R1335A.

A Cas9 conformational rearrangement accompanies DNA bending

The described linear- and bent-DNA conformations present a new model for Cas9 function in which open-protein Cas9:sgRNA first associates with the PAM on linear DNA, then engages a switch to the closed-protein state to bend the DNA and expose its PAM-adjacent nucleobases for interrogation (Supplementary Video 1). Because this transition involves energetically unfavorable base unstacking, we wondered how the unstacked state is stabilized.

Remarkably, the “phosphate lock loop” (Lys1107-Ser1109), which was proposed to support R-loop nucleation by tugging on the target-strand phosphate between the PAM and nucleotide +113, is disordered in the linear-DNA structure but stably bound to the target strand in the bent-DNA structure (Extended Data Fig. 8a, b), highlighting this contact as a potential energetic compensator for the base unstacking penalty. In the permanganate assay, mutation of the phosphate lock loop decreased activity to the level observed without Cas9 or with a Cas9 mutant deficient in PAM recognition (which lacks the PAM-binding arginines, “xPBA”) (Fig. 5c, Extended Data Fig. 7b), indicating that the loop may play a role in DNA bending. However, a negative result in this assay could be attributed either to weakened DNA bending activity or to an overall destabilization of the protein:DNA interaction4.

Another notable structural element is a group of lysines (Lys233/Lys234/Lys253/Lys263, termed here the “helix-rolling basic patch”) on REC2 (REC lobe domain 2) that contact the DNA phosphate backbone (at bp +8 to +13) in both the linear- and bent-DNA structures, an interaction that has not been observed before (Extended Data Fig. 8a, c). Mutation of these lysines attenuated anisotropy in the cyclization assay (Fig. 4b, Extended Data Fig. 6a) and abolished Cas9’s permanganate sensitization activity (Fig. 5c, Extended Data Fig. 7b). Structural modeling of the linear-to-bent transition (Supplementary Video 2) suggests that the helix-rolling basic patch may couple DNA bending to inter-lobe protein rotations similar to those observed in multi-body refinements29 of the cryo-EM images (Supplementary Video 34). Consensus EM reconstructions also revealed large segments of the REC lobe and guide RNA that become ordered upon lobe closure (Supplementary Video 1, Supplementary Information), implying that Cas9 can draw upon diverse structural transitions across the complex to regulate DNA bending.

The bent-DNA state makes R-loop nucleation structurally accessible

We propose that the function of the bent-DNA conformation is to promote local base flipping that can lead to R-loop nucleation. To probe the structure of a complex that has already proceeded to the R-loop nucleation step, we employed the same cross-linking strategy with adjusted RNA and DNA sequences that allow partial R-loop formation (Fig. 1c). Cryo-EM analysis of this construct revealed nucleotides +1 to +3 of the DNA target strand hybridized to the sgRNA spacer (Fig. 6a, Extended Data Fig. 9). In contrast to the disorder that characterized this region in the bent-DNA map, all three nucleotides are well-resolved, apparently stabilized by their hybridization to the A-form sgRNA spacer. The increase in resolution extends to the non-target strand and to the more PAM-distal regions of the CCR, suggesting that the DNA becomes overall more ordered in response to R-loop nucleation. The ribonucleoprotein architecture resembles that of the bent-DNA structure except for slight tilting of REC2, which accommodates a repositioning of the CCR duplex toward the newly formed RNA:DNA base pairs (Fig. 6a). Therefore, unstacked nucleotides in the bent-DNA state can hybridize to the sgRNA spacer with minimal global structural changes, further supporting the bent-DNA structure as a gateway to R-loop nucleation.

Fig. 6 |. Structure of a nascent R-loop and overview model.

Fig. 6 |

a, Left, unsharpened cryo-EM map (threshold 5σ) of Cas9:sgRNA:DNA with 3 RNA:DNA bp. Green, NUC lobe; blue, REC lobe; orange, guide RNA; magenta, DNA. Black vectors indicate differences from the bent-DNA structure with 0 RNA:DNA matches. The primary difference is a rigid-body rotation of REC lobe domain 2. Inset, sharpened cryo-EM map (threshold 8σ) and model. For clarity, only DNA and RNA are shown. Black, candidate complementarity region; yellow, PAM. b, Model for bend-dependent Cas9 target search and capture. Large diagrams depict states structurally characterized in the present work.

Our structures outline a model for a poorly understood aspect of Cas9 function that is fundamental to CRISPR target search and capture (Fig. 6b). First, open-conformation Cas9:sgRNA associates with the PAM of a linear DNA target. By engaging the open-to-closed protein conformational switch, Cas9 bends and twists the DNA to locally unwind the base pairs next to the PAM. If target-strand nucleotides are unable to hybridize to the sgRNA spacer, the candidate target is released, and Cas9 proceeds to the next candidate. If the target strand is sgRNA-complementary, unwound nucleotides initiate an RNA:DNA hybrid that can expand through strand invasion to a full 20-bp R-loop, activating DNA cleavage.

Due to its energetic linkage to base flipping30, DNA bending provides a viable mechanical solution to any biological problem that requires unrestricted access to nucleobases31,32, including the sequence interrogation challenge faced by all DNA-targeting CRISPR systems4,3336. In our key structural snapshot of this process, Cas9 specifically employs a bending mode that involves underwinding (Fig. 3), which may be a topological necessity for the downstream propagation of flipping events, and which could underlie some features of Cas9’s mechanical sensitivity3739. In contrast, certain methyltransferases that flip only one nucleotide at a time can afford to do so without gross helical distortion40,41.

Interestingly, other proteins define the vertex of an underwound DNA bend using intimate contacts to the distorted nucleotides, either to intact base pairs in the case of transcription factors1820,23 or to flipped bases and their estranged partners in the case of base excision repair enzymes16,17,21,22,42. During initial DNA interrogation, Cas9 appears to make no such contacts, instead straddling the bending vertex and relying on mechanical strain to stabilize extrahelical nucleotide conformations without restricting their motion. To catalyze RNA-programmable strand exchange, Cas9 must distort DNA independent of its sequence, likening its functional constraints to those of the filamentous recombinase RecA43, which non-specifically destabilizes candidate DNA via longitudinal stretching4446.

The mechanism illustrated here reveals for the first time the individual steps that comprise the slowest phase of Cas9’s genome editing function11. The energetic tuning of binding, bending, and RNA:DNA hybrid nucleation dictates the speed of target capture—bent/unstacked states must be stable enough to promote fast transitions to RNA-hybridized states, but not so stable that Cas9 wastes undue time on off-target DNA. Thus, the energetic landscape surrounding the states identified in this work will be a crucial subject of study to understand the success of current state-of-the-art genome editors and to inform the engineering of faster ones. Finally, DNA in eukaryotic chromatin is rife with bends, due either to intrinsic structural features of the DNA sequence47 or to interactions with looping proteins48. Because the Cas9-induced DNA bend described here has a well-defined direction that may either match or antagonize incumbent bends, it will be important to test how local chromatin geometry affects Cas9’s efficiency in both dissociating from off-target sequences and opening R-loops on real target sequences.

Methods

Protein expression and purification

Cas9 was expressed and purified as described previously28, with slight modifications. Briefly, protein was expressed from custom pET-based vectors in E. coli BL21 Star(DE3) cells. Cells were sonicated in lysis buffer (50 mM HEPES (pH 7.5), 500 mM NaCl, 1 mM TCEP, 0.5 mM PMSF, 10 tablets/L cOmplete EDTA-free protease inhibitor cocktail (Roche), 0.25 mg/mL chicken egg white lysozyme (Sigma-Aldrich)), and clarified lysate was loaded onto Ni-NTA resin, which was then washed (50 mM HEPES (pH 7.5), 500 mM NaCl, 1 mM TCEP, 5% glycerol, 20 mM imidazole) and eluted (50 mM HEPES (pH 7.5), 500 mM NaCl, 1 mM TCEP, 5% glycerol, 300 mM imidazole). Proteins were cleaved overnight with TEV protease at 4°C without dialysis. The digested protein solution was diluted with one volume of low-salt ion exchange buffer (50 mM HEPES (pH 7.5), 250 mM KCl, 1 mM TCEP, 10% glycerol). Digested protein was purified on a HiTrap Heparin HP affinity column (Cytiva), eluting with high-salt ion exchange buffer (50 mM HEPES (pH 7.5), 1 M KCl, 1 mM TCEP, 10% glycerol). Eluted protein was then purified by size exclusion in protein-purification size exclusion buffer (20 mM HEPES (pH 7.5), 150 mM KCl, 1 mM dithiothreitol (DTT), 10% glycerol) on a Superdex 200 Increase 10/300 GL column (Cytiva). Plasmid/protein sequences and Addgene IDs can be found in the Supplementary Information.

Nucleic acid preparation

All DNA oligonucleotides were synthesized by Integrated DNA Technologies except the cystamine-functionalized target strand, which was synthesized by TriLink Biotechnologies (with HPLC purification). DNA oligonucleotides that were not HPLC-purified by the manufacturer were PAGE-purified in house (unless a downstream preparative step involved another PAGE purification), and all DNA oligonucleotides were stored in water. Duplex DNA substrates were annealed by heating to 95°C and cooling to 25°C over the course of 40 min on a thermocycler. Guide RNAs were transcribed and purified as described previously28, except no ribozyme was included in the transcript. Briefly, in vitro transcription reactions included PCR-assembled DNA template, 40 mM Tris-Cl (pH 7.9 at 25°C), 25 mM MgCl2, 10 mM DTT, 0.01% (v/v) Triton X-100, 2 mM spermidine, 5 mM of each NTP, and 100 μg/mL T7 RNA polymerase. Transcription was allowed to proceed for 2.5 hr at 37°C, after which RNA was purified by urea-PAGE, ethanol-precipitated, and resuspended in RNA storage buffer (0.1 mM EDTA, 2 mM sodium citrate, pH 6.4). All sgRNA molecules were annealed (80°C for 2 min, then moved directly to ice) in RNA storage buffer prior to use. For both DNA and RNA, A260 was measured on a NanoDrop (Thermo Scientific), and concentration was estimated according to extinction coefficients reported previously49. Oligonucleotide sequences can be found in the Supplementary Information.

Cryo-EM construct preparation

DNA duplexes were pre-annealed in water at 10X concentration (60 μM target strand, 75 μM non-target strand). Cross-linking reactions were assembled with 300 μL water, 100 μL 5X disulfide reaction buffer (250 mM Tris-Cl, pH 7.4 at 25°C, 750 mM NaCl, 25 mM MgCl2, 25% glycerol, 500 μM DTT), 50 μL 10X DNA duplex, 25 μL 100 μM sgRNA, and 25 μL 80 μM Cas9. Cross-linking was allowed to proceed at 25°C for 24 hours (0 RNA:DNA matches) or 8 hours (3 RNA:DNA matches). Sample was then purified by size exclusion (Superdex 200 Increase 10/300 GL, Cytiva) in cryo-EM buffer (20 mM Tris-Cl, pH 7.5 at 25°C, 200 mM KCl, 100 μM DTT, 5 mM MgCl2, 0.25% glycerol). Peak fractions were pooled, concentrated to an estimated 6 μM, snap-frozen in 10-μL aliquots in liquid nitrogen, and stored at −80°C until grid preparation. For the Cas9:sgRNA structural construct, which lacked a cross-link, the reaction was assembled with 350 μL water, 100 μL 5X disulfide reaction buffer, 0.45 μL 1 M DTT, 25 μL 100 μM sgRNA, and 25 μL 80 μM Cas9. The complex was allowed to form at 25°C for 30 minutes. The Cas9:sgRNA sample was then size-exclusion-purified and processed as described for the DNA-containing constructs. For Cas9:sgRNA, cryo-EM buffer contained 1 mM DTT instead of 100 μM DTT.

SDS-PAGE analysis

For non-reducing SDS-PAGE, thiol exchange was first quenched by the addition of 20 mM S-methyl methanethiosulfonate (S-MMTS). Then, 0.25 volumes of 5X non-reducing SDS-PAGE loading solution (0.0625% w/v bromophenol blue, 75 mM EDTA, 30% glycerol, 10% SDS, 250 mM Tris-Cl, pH 6.8) were added, and the sample was heated to 90°C for 5 minutes before loading of 3 pmol onto a 4–15% Mini-PROTEAN TGX Stain-Free Precast Gel (Bio-Rad), alongside PageRuler Prestained Protein Ladder (Thermo Scientific). Gels were imaged using the Stain-Free imaging protocol (5-min activation, 3-s exposure) of Bio-Rad Image Lab 5.2.1 on a Bio-Rad ChemiDoc. For reducing SDS-PAGE, no S-MMTS was added, and 5% β-mercaptoethanol (βME) was added along with the non-reducing SDS-PAGE loading solution. For radioactive SDS-PAGE analysis, a 4–20% Mini-PROTEAN TGX Precast Gel (Bio-Rad) was pre-run for 20 min at 200 V (to allow free ATP to migrate ahead of free DNA), run with radioactive sample for 15 min at 200 V, dried (80°C, 3 hours) on a gel dryer (Bio-Rad), and exposed to a phosphor screen, subsequently imaged on an Amersham Typhoon using the Amersham Typhoon Control Software 2.0.0.6 (Cytiva).

Nucleic acid radiolabeling

Standard 5′ radiolabeling was performed with T4 polynucleotide kinase (New England Biolabs) at 0.2 U/μL (manufacturer’s units), 1X T4 PNK buffer (New England Biolabs), 400 nM DNA oligonucleotide, and 200 nM [γ−32P]-ATP (PerkinElmer) for 30 min at 37°C, followed by a 20-min heat-killing incubation at 65°C. Radiolabeled oligos were then buffer exchanged into water using a Microspin G-25 spin column (GE Healthcare). For 5′ radiolabeling of sgRNAs, the 5′ triphosphate was first removed by treatment with Quick CIP (New England BioLabs, manufacturer’s instructions). The reaction was then supplemented with 5 mM DTT and the same concentrations of T4 polynucleotide kinase (New England BioLabs) and [γ−32P]-ATP (PerkinElmer) used for DNA radiolabeling, and the remainder of the protocol was completed as for DNA.

Radiolabeled target-strand cleavage rate measurements

DNA duplexes at 10X concentration (20 nM radiolabeled target strand, 75 μM unlabeled non-target strand) were annealed in water with 60 μM cystamine dihydrochloride (pH 7). A 75-μL reaction was assembled from 15 μL 5X Mg-free disulfide reaction buffer (250 mM Tris-Cl, pH 7.4 at 25°C, 750 mM NaCl, 5 mM EDTA, 25% glycerol, 500 μM DTT), 7.5 μL 600 μM cystamine dihydrochloride (pH 7), 37.5 μL water, 3.75 μL 80 μM Cas9, 3.75 μL 100 μM sgRNA, 7.5 μL 10X DNA duplex. The reaction was incubated at 25°C for 2 hours, at which point the cross-linked fraction had fully equilibrated. To non-reducing or reducing reactions, 5 μL of 320 mM S-MMTS or 80 mM DTT (respectively) in 1X Mg-free disulfide reaction buffer was added. Samples were incubated at 25°C for an additional 5 min, then cooled to 16°C and allowed to equilibrate for 15 min. One aliquot was quenched into 0.25 volumes 5X non-reducing SDS-PAGE solution and subject to SDS-PAGE analysis to assess the extent of cross-linking (for the reduced sample, no βME was added, as the DTT had already effectively reduced the sample). Another aliquot was quenched for reducing urea-PAGE analysis as timepoint 0. DNA cleavage was initiated by combining the remaining reaction volume with 0.11 volumes 60 mM MgCl2. Aliquots were taken at the indicated timepoints for reducing urea-PAGE analysis.

Urea-PAGE analysis

To each sample was added 1 volume of 2X urea-PAGE loading solution (92% formamide, 30 mM EDTA, 0.025% bromophenol blue, 400 μg/mL heparin). For reducing urea-PAGE analysis, 5% βME was subsequently added. Samples were heated to 90°C for 5 minutes, then resolved on a denaturing polyacrylamide gel (10% or 15% acrylamide:bis-acrylamide 29:1, 7 M urea, 0.5X TBE). For radioactive samples, gels were dried (80°C, 3 hr) on a gel dryer (Bio-Rad), exposed to a phosphor screen, and imaged on an Amersham Typhoon (Cytiva). For samples containing fluorophore-conjugated DNA, gels were directly imaged on the Typhoon without further treatment. For unlabeled samples, gels were stained with 1X SYBR Gold (Invitrogen) in 0.5X TBE prior to Typhoon imaging.

Fluorescence and autoradiograph data analysis

Band volumes in fluorescence images and autoradiographs were quantified in Image Lab 6.1 (Bio-Rad). For fluorescence images recorded by the ChemiDoc, Image Lab’s native .scn files were used for quantification. For images recorded by the Typhoon, the Typhoon software’s native .gel files (square root encoded) were used for quantification. Data were fit by the least-squares method in Prism 7 (GraphPad Software).

Cryo-EM grid preparation and data collection

Cryo-EM samples were thawed and diluted to 3 μM (Cas9:sgRNA:DNA) or 1.5 μM (Cas9:sgRNA) in cryo-EM buffer. An UltrAuFoil grid (1.2/1.3-μm, 300 mesh, Electron Microscopy Sciences, catalog no. Q350AR13A) was glow-discharged in a PELCO easiGlow for 15 s at 25 mA, then loaded into an FEI Vitrobot Mark IV equilibrated to 8°C with 100% humidity. From the sample, kept on ice up until use, 3.6 μL was applied to the grid, which was immediately blotted (Cas9:sgRNA:DNA{0 RNA:DNA matches} and Cas9:sgRNA: blot time 4.5 s, blot force 8; Cas9:sgRNA:DNA{3 RNA:DNA matches}: blot time 3 s, blot force 6) and plunged into liquid-nitrogen-cooled ethane. Micrographs for Cas9:sgRNA were collected on a Talos Arctica TEM operated at 200 kV and x36,000 magnification (1.115 Å/pixel), at −0.8 to −2 μm defocus, using the super-resolution camera setting (0.5575 Å/pixel) on a Gatan K3 Direct Electron Detector. Micrographs for Cas9:sgRNA:DNA complexes were collected on a Titan Krios G3i TEM operated at 300 kV with energy filter, x81,000 nominal magnification (1.05 Å/pixel), −0.8 μm to −2 μm defocus, using the super-resolution camera setting (0.525 Å/pixel) in CDS mode on a Gatan K3 Direct Electron Detector. All images were collected using beam shift in SerialEM v.3.8.7 software.

Cryo-EM data processing and model building

Details of cryo-EM data processing and model building can be found in the Supplementary Information.

Permanganate reactivity measurements

DNA duplexes were annealed at 50X concentration (100 nM radiolabeled target strand, 200 nM unlabeled non-target strand) in 1X annealing buffer (10 mM Tris-Cl, pH 7.9 at 25°C, 50 mM KCl, 1 mM EDTA), then diluted to 10X concentration in water. A Cas9 titration at 5X was prepared by diluting an 80 μM Cas9 stock solution with protein-purification size exclusion buffer. An sgRNA titration at 5X was prepared by diluting a 100 μM sgRNA stock solution with RNA storage buffer. For all reactions, the sgRNA concentration was 1.25 times the Cas9 concentration, and the reported ribonucleoprotein concentration is that of Cas9. Reactions were assembled with 11 μL 5X permanganate reaction buffer (100 mM Tris-Cl, pH 7.9 at 25°C, 120 mM KCl, 25 mM MgCl2, 5 mM TCEP, 500 μg/mL UltraPure BSA, 0.05% Tween-20), 11 μL water, 11 μL 5X Cas9, 11 μL 5X sgRNA, 5.5 μL 10X DNA. A stock solution of KMnO4 was prepared fresh in water, and its concentration was corrected to 100 mM (10X reaction concentration) based on 8 averaged NanoDrop readings (ε526 = 2.4 × 103 M−1 cm−1). Reaction tubes and KMnO4 (or water, for reactions lacking permanganate) were equilibrated to 30°C for 15 minutes. To initiate the reaction, 22.5 μL of Cas9:sgRNA:DNA was added to 2.5 μL 100 mM KMnO4 or water. After 2 min, 25 μL 2X stop solution (2 M βME, 30 mM EDTA) was added to stop the reaction, and 50 μL of water was added to each quenched reaction. Samples were extracted once with 100 μL 25:24:1 phenol:chloroform:isoamyl alcohol (pH 8) in 5PRIME Phase Lock Heavy tubes (Quantabio). The aqueous phase was isolated and combined with 10 μL 3 M sodium acetate (pH 5.2), 1 μL GlycoBlue coprecipitant (Invitrogen), and 300 μL ethanol, and left at −20°C for >2 hr. DNA was precipitated by centrifugation, and supernatant was decanted. A second wash was performed with 500 μL 70% ethanol. Pellets were resuspended in 70 μL 10% piperidine and incubated at 90°C for 30 min. Solvent was evaporated in a SpeedVac (ThermoFisher Scientific). Approximate yield was determined by measuring radioactivity of the pellet-containing tube in a benchtop radiation counter (Bioscan QC-4000), and pellets were resuspended in an appropriate volume of loading solution (50% water, 50% formamide, 0.025% w/v bromophenol blue) to normalize signal across samples prior to resolution by denaturing PAGE and autoradiography.

Because piperidine treatment leads to low levels of cleavage at every nucleotide, the exhaustive single-nucleotide ladder could be used to assign band identities, also confirmed by the dark/light pattern (piperidine-catalyzed cleavage at thymines is less efficient than at other nucleotides in the absence of permanganate modification and more efficient in the presence of permanganate modification). Data analysis was performed as follows: let vi denote the volume of band i in a lane with n total bands (band 1 is the shortest cleavage fragment, band n is the topmost band corresponding to the starting/uncleaved DNA oligonucleotide). The probability of cleavage at thymine i is defined as: pcleave,i=vij=invj. Oxidation probability of thymine i is defined as: pox,i = pcleave,i,+ pmpcleave,i,−pm, where +pm indicates the experiment that contained 10 mM KMnO4 and −pm indicates the no-permanganate experiment. An extensive description of this type of analysis can be found in ref. 28.

Preparation of DNA cyclization substrates

Each variant DNA cyclization substrate precursor was assembled by PCR from two amplification primers (one of which contained a fluorescein-dT) and two assembly primers. Each reaction was 400 μL total (split into 4 × 100-μL aliquots) and contained 1X Q5 reaction buffer (New England BioLabs), 200 μM dNTPs, 200 nM forward amplification primer, 200 nM reverse amplification primer, 1 nM forward assembly primer, 1 nM reverse assembly primer, 0.02 U/μL Q5 polymerase. Thermocycle parameters were as follows: 98°C, 30 s; {98°C, 10 s; 55°C, 20 s; 72°C, 15 s}; {98°C, 10 s; 62°C, 20 s; 72°C, 15 s}; {98°C, 10 s; 72°C, 35 s}x25; 72°C, 2 min; 10°C, ∞. PCR products were phenol-chloroform-extracted, ethanol-precipitated, and resuspended in 80 μL water. To this was added 15 μL 10X CutSmart buffer, 47.5 μL water, and 7.5 μL ClaI restriction enzyme (10,000 units/mL, New England BioLabs), and digestion was allowed to proceed overnight at 37°C. Samples were then combined with 0.25 volumes 5X native quench solution (25% glycerol, 250 μg/mL heparin, 125 mM EDTA, 1.2 mg/mL proteinase K, 0.0625% w/v bromophenol blue), incubated at 55°C for 15 minutes, and resolved on a preparative native PAGE gel (8% acrylamide:bis-acrylamide 37.5:1, 0.5X TBE) at 4°C. Fluorescent bands, made visible on a blue LED transilluminator, were cut out, and DNA was extracted, ethanol-precipitated, and resuspended in water.

Cyclization efficiency measurements

Each cyclization reaction contained the following components: 1 μL 10X T4 DNA ligase reaction buffer (New England BioLabs), 2 μL water, 1 μL 10X ligation buffer additives (400 μg/mL UltraPure BSA, 100 mM KCl, 0.1% NP-40), 2 μL 80 μM Cas9 (or protein-purification size exclusion buffer), 2 μL 100 μM sgRNA (or RNA storage buffer), 1 μL 25 nM cyclization substrate, 1 μL T4 DNA ligase (400,000 units/mL, New England BioLabs) (or ligase storage buffer). All reaction components were incubated together at 20°C for 15 minutes prior to reaction initiation except for the ligase, which was incubated separately. Reactions were initiated by combining the ligase with the remainder of the components, allowed to proceed at 20°C for 30 minutes, then quenched with 2.5 μL 5X native quench solution. Samples were then incubated at 55°C for 15 minutes, resolved on an analytical native PAGE gel (8% acrylamide:bis-acrylamide 37.5:1, 0.5X TBE) at 4°C, and imaged for fluorescein on an Amersham Typhoon (Cytiva). Monomolecular cyclization efficiency (MCE) for a given lane is defined as (band volume of circular monomers)/(sum of all band volumes). Bimolecular ligation efficiency (BLE) is defined as (sum of band volumes of all linear/circular n-mers, for n≥2)/(sum of all band volumes). The non-specific degradation products indicated in Extended Data Fig. 6a were not included in the analysis.

Data availability

All data generated or analyzed during this study are included within this manuscript and its supporting information files except for the cryo-EM data/models, which can be accessed as follows: Cas9:sgRNA:DNA (S. pyogenes) with 0 RNA:DNA base pairs, open-protein/linear-DNA conformation (PDB 7S3H, EMD-24823); Cas9:sgRNA:DNA (S. pyogenes) with 0 RNA:DNA base pairs, closed-protein/bent-DNA conformation (PDB 7S36, EMD-24817); Cas9:sgRNA (S. pyogenes) in the open-protein conformation (PDB 7S37, EMD-24818); Cas9:sgRNA:DNA (S. pyogenes) forming a 3-base-pair R-loop (PDB 7S38, EMD-24819). The analyses and figures in this study also draw on previously determined structures with PDB codes 4ZT0, 5FQ5, 5F9R, 4CMP, 1CGP, and 4UN3 as described in the figure legends and the Supplementary Information. For figures containing fluorescence images, autoradiographs, or scatter plots, the original data are available as Source Data files.

Extended Data

Extended Data Fig. 1. Characterization of the Cas9:DNA cross-link.

Extended Data Fig. 1

a, Crystal structure of Cas9:sgRNA:DNA with 20-bp RNA:DNA hybrid formed (PDB 4UN3). In the inset, Arg1333 and Arg1335 recognize the two guanines of the PAM. Green, NUC lobe; blue, REC lobe; orange, guide RNA; black, DNA; yellow, PAM. b, Non-reducing SDS-PAGE (Stain-Free) analysis of cross-linking reactions and controls. Complexes were prepared identically to structural constructs but in smaller volumes and without size exclusion purification. Competitor duplex, where indicated, was added before the cross-linkable duplex at an equivalent concentration. The depicted experiment was performed once, and a prior optimization experiment yielded similar results. c, Top left, non-reducing SDS-PAGE autoradiograph to determine the fraction of DNA cross-linked to Cas9 at t=0. The target strand is radiolabeled. Bottom, reducing urea-PAGE autoradiograph revealing target-strand cleavage kinetics; quantification depicted in top right. The depicted model is y=C1ek1t+(BmaxC)(1ek2t). The depicted experiments were performed once, and a prior optimization experiment yielded similar results.

Extended Data Fig. 2. Cryo-EM sample quality.

Extended Data Fig. 2

a, SDS-PAGE (Stain-Free) analysis of purified proteins and cryo-EM samples. SC, structural construct; SC 1, Cas9:sgRNA:DNA with 0 RNA:DNA matches; SC 2, Cas9:sgRNA; SC 3, Cas9:sgRNA:DNA with 3 RNA:DNA matches. The depicted experiment was performed once (on the samples used for the final cryo-EM grids). b, Reducing urea-PAGE (SYBR-Gold-stained) analysis of purified nucleic acid components and cryo-EM samples. TS, target strand; NTS, non-target strand. The depicted experiment was performed once (on the samples used for the final cryo-EM grids). c, Reducing urea-PAGE autoradiograph of radioactive mimics of structural constructs. sgRNA 1 and non-target strand 1 are those used to create SC 1. sgRNA 2 and non-target strand 2 are those used to create SC 3. sgRNA 3 bears a spacer with 20 nt of complementarity to the DNA target strand. The depicted experiment was performed once, and a prior optimization experiment yielded similar results.

Extended Data Fig. 3. Cryo-EM analysis of Cas9:sgRNA:DNA with 0 RNA:DNA matches.

Extended Data Fig. 3

a, Classes from RELION 3D classification of closed-protein particles (threshold 6σ). The number of particles in each class is indicated next to the class number. In classes 1/2/5/6/7 the DNA is bent next to the PAM (visible for class 1 at lower contour). In classes 3/4/8 the DNA continues along a more linear trajectory for half a turn past the PAM, into the region normally occupied by REC2; in these classes, density in the region of the putative collision (black arrow) is uninterpretable as either protein or DNA, likely due to particle damage, and is thus colored gray. The class used for the final closed-protein/bent-DNA map is class 2, highlighted in yellow. Green, NUC lobe; blue, REC lobe domains 1/2; light blue, REC lobe domain 3; orange, guide RNA; magenta, DNA. b, Details of final cryo-EM maps. c, Linear DNA docked into open-protein/linear-DNA cryo-EM structure and previous Cas9 crystal structures. All structures were aligned to the C-terminal domain of PDB 5F9R; then, the linear DNA was aligned to the PAM-containing duplex of PDB 5F9R. Green, NUC lobe; blue, REC lobe; orange, guide RNA; black, docked DNA; yellow, PAM. The DNA truly belonging to each structure (if present) is depicted in gray.

Extended Data Fig. 4. Cryo-EM analysis of Cas9:sgRNA.

Extended Data Fig. 4

a, Details of cryo-EM analysis. b, Unsharpened cryo-EM map (threshold 5σ) and model of Cas9:sgRNA in open-protein conformation. Green, NUC lobe; blue, REC lobe; orange, guide RNA; gray, unattributed density (REC1 or guide RNA, see Supplementary Information).

Extended Data Fig. 5. Nucleic acid sequences used in DNA cyclization experiments.

Extended Data Fig. 5

Green star/bold T, fluorescein-conjugated dT; circled P, 5′ phosphate; CCR, candidate complementarity region.

Extended Data Fig. 6. Details of DNA cyclization experiments.

Extended Data Fig. 6

a, Fluorescence image and analysis of native PAGE gel resolving ligation products. Gel represents one replicate. Three replicates are plotted on the graphs. The polymeric/cyclized band assignments were made by reference to the relative electrophoretic mobilities observed in Kahn & Crothers, 1992. b, Comparison of Cas9:DNA cyclization data to CAP:DNA cyclization data. The depicted model is y=Asin2π10.45bpx+ϕ0+b, with the following constraints: A>0,b>A. The average of 224° and 212° is reported in Fig. 4c. J, J-factor (defined in Kahn & Crothers, 1992); Φ, phase difference; CBS, CAP-binding site.

Extended Data Fig. 7. Details of permanganate reactivity measurements.

Extended Data Fig. 7

a, Autoradiographs and analysis of all thymines except T(+25), which was insufficiently resolved from neighboring bands. The depicted autoradiographs are replicate 1. Due to systematic variation across replicates, individual replicates are presented on separate graphs and fitted separately. The depicted model is pox=Bmax[Cas9:sgRNA]KD+[Cas9:sgRNA], with KD shared across T(+1) and T(+2). CCR, candidate complementarity region; CI, confidence interval. b, Autoradiograph (same gel as “intact PAM” autoradiograph in a) and analysis of experiments containing variants of Cas9:sgRNA (16 μM), with the “intact PAM” DNA substrate. Graph depicts three replicates.

Extended Data Fig. 8. Structural features potentially relevant to Cas9-induced DNA bending.

Extended Data Fig. 8

a, Location of each feature in the bent-DNA structure. Green, NUC lobe; blue, REC lobe; orange, guide RNA; black, DNA; yellow, PAM. b, Comparison of phosphate lock loop (magenta) in various structures, within sharpened cryo-EM maps (row of three, threshold 8σ) or 2Fo-Fc map (upper right, threshold 1.5σ). Black arrow indicates the eponymous phosphate between nucleotides 0 and +1 of the target strand. c, Comparison of helix-rolling basic patch (magenta) in various structures, within unsharpened cryo-EM maps (first panel, threshold 5σ; second panel, threshold 6σ).

Extended Data Fig. 9. Cryo-EM analysis of Cas9:sgRNA:DNA with 3 RNA:DNA matches.

Extended Data Fig. 9

Details of cryo-EM analysis.

Supplementary Material

Supplementary Information
Video 1
Download video file (21.7MB, mov)
Video 2
Download video file (63.4MB, mov)
Video 3
Download video file (98.9MB, mp4)
Video 4
Download video file (99MB, mp4)

Table 1 |.

Cryo-EM data collection, refinement and validation statistics

Cas9:sgRNA:DNA (S. pyogenes) with 0 RNA:DNA base pairs, open-protein/linear-DNA conformation (EMD-24823) (PDB 7S3H) Cas9:sgRNA:DNA (S. pyogenes) with 0 RNA:DNA base pairs, closed-protein/bent-DNA conformation (EMD-24817) (PDB 7S36) Cas9:sgRNA (S. pyogenes) in the open-protein conformation (EMD-24818) (PDB 7S37) Cas9:sgRNA:DNA (S. pyogenes) forming a 3-base-pair R-loop (EMD-24819) (PDB 7S38)
Data collection and processing
Magnification x81,000 x81,000 x36,000 x81,000
Voltage (kV) 300 300 200 300
Electron exposure (e–/Å2) 50 50 50 50
Defocus range (μm) −0.8 to −2 −0.8 to −2 −0.8 to −2 −0.8 to −2
Pixel size (Å) 1.05 (nom.) / 0.525 (super-res.) 1.05 (nom.) / 0.525 (super-res.) 1.115 (nom.) / 0.5575 (super-res.) 1.05 (nom.) / 0.525 (super-res.)
Symmetry imposed C1 C1 C1 C1
Initial particle images (no.) 83,86,564 83,86,564 15,94,370 5,006,233
Final particle images (no.) 5,45,450 18,121 87,130 17,424
Map resolution (Å) 2.5 3.2 3.2 3.3
 FSC threshold 0.143 0.143 0.143 0.143
Map resolution range (Å)
Refinement
Initial model used (PDB code) 4ZT0, 5FQ5, 5F9R, 4CMP 4ZT0, 5FQ5, 5F9R 4ZT0, 5F9R, 4CMP 4ZT0, 5FQ5, 5F9R
Model resolution (Å) 3.0 3.2 5.3 3.3
 FSC threshold 0.5 0.5 0.5 0.5
Model resolution range (Å)
Map sharpening B factor (Å2)
Model composition
 Non-hydrogen atoms 9,966 13,821 9,392 13,987
 Protein residues 1,030 1,353 1,033 1,353
 Nucleotides 76 131 46 139
 Ligands 1 1 0 1
B factors (Å2)
 Protein 56.34 80.16 109.19 85.61
 Nucleotide 68.92 104.85 225.8 119.07
 Ligand 58.68 89.24 70.37
R.m.s. deviations
 Bond lengths (Å) 0.007 0.007 0.006 0.007
 Bond angles (°) 1.070 0.980 1.037 0.980
Validation
 MolProbity score 1.42 1.36 1.49 1.33
 Clashscore 3.77 2.98 4.66 2.62
 Poor rotamers (%) 0.11 0.16 0.11 0.08
Ramachandran plot
 Favored (%) 96.28 96.07 96.29 95.85
 Allowed (%) 3.72 3.93 3.71 4.15
 Disallowed (%) 0 0 0 0

Acknowledgements

We thank D. Toso and J. Remis at the Cal Cryo facility for technical assistance in data collection. We thank members of the Nogales lab for discussions and advice on EM data processing, especially A.J. Florez Ariza and D. Herbst. We also thank J. Davis and E. Zhong for advice on EM data processing. We thank A. Chintangal for computational support. We thank N. Moriarty for assistance in modeling the thioalkane linker. We thank J. Kuriyan for scientific guidance and comments on the manuscript. We thank P. Pausch and H. Shi for comments on the manuscript. This work was supported by an NSF Graduate Research Fellowship (J.C.C.), an NHMRC Investigator Grant (EL1, 1175568, G.J.K.), the Howard Hughes Medical Institute (J.A.D.), the National Science Foundation (award number 1817593, J.A.D.), the Centers for Excellence in Genomic Science of the National Institutes of Health (award number RM1HG009490, J.A.D.), and the Somatic Cell Genome Editing Program of the Common Fund of the National Institutes of Health (award number U01AI142817-02, J.A.D.). J.A.D. and E.N. are HHMI investigators.

Footnotes

Competing interests

The Regents of the University of California have patents issued and/or pending for CRISPR technologies on which G.J.K and J.A.D. are inventors. J.A.D. is a cofounder of Caribou Biosciences, Editas Medicine, Scribe Therapeutics, Intellia Therapeutics and Mammoth Biosciences. J.A.D. is a scientific advisor to Caribou Biosciences, Intellia Therapeutics, Scribe Therapeutics, Mammoth Biosciences, Algen Biotechnologies, Felix Biosciences and Inari. J.A.D. is a Director at Johnson & Johnson and at Tempus, and has research projects sponsored by Biogen and Apple Tree Partners. The remaining authors declare no competing interests.

References

  • 1.Barrangou R et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712 (2007). [DOI] [PubMed] [Google Scholar]
  • 2.Pickar-Oliver A & Gersbach CA The next generation of CRISPR-Cas technologies and applications. Nat Rev Mol Cell Biol 20, 490–507 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Jinek M et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Mekler V, Minakhin L & Severinov K Mechanism of duplex DNA destabilization by RNA-guided Cas9 nuclease during target interrogation. Proc Natl Acad Sci U S A 114, 5443–5448 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Sternberg SH, Redding S, Jinek M, Greene EC & Doudna JA DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Jiang F & Doudna JA CRISPR-Cas9 Structures and Mechanisms. Annu Rev Biophys 46, 505–529 (2017). [DOI] [PubMed] [Google Scholar]
  • 7.Jinek M et al. Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science 343, 1247997 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Jiang F, Zhou K, Ma L, Gressel S & Doudna JA STRUCTURAL BIOLOGY. A Cas9-guide RNA complex preorganized for target DNA recognition. Science 348, 1477–1481 (2015). [DOI] [PubMed] [Google Scholar]
  • 9.Jiang F et al. Structures of a CRISPR-Cas9 R-loop complex primed for DNA cleavage. Science 351, 867–871 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Globyte V, Lee SH, Bae T, Kim J-S & Joo C CRISPR/Cas9 searches for a protospacer adjacent motif by lateral diffusion. EMBO J 38, e99466 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Jones DL et al. Kinetics of dCas9 target search in Escherichia coli. Science 357, 1420–1424 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Verdine GL & Norman DPG Covalent trapping of protein-DNA complexes. Annu Rev Biochem 72, 337–366 (2003). [DOI] [PubMed] [Google Scholar]
  • 13.Anders C, Niewoehner O, Duerst A & Jinek M Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569–573 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Nishimasu H et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935–949 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Osuka S et al. Real-time observation of flexible domain movements in CRISPR-Cas9. EMBO J 37, (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Bruner SD, Norman DP & Verdine GL Structural basis for recognition and repair of the endogenous mutagen 8-oxoguanine in DNA. Nature 403, 859–866 (2000). [DOI] [PubMed] [Google Scholar]
  • 17.Fromme JC & Verdine GL Structure of a trapped endonuclease III-DNA covalent intermediate. EMBO J 22, 3461–3471 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Kim JL, Nikolov DB & Burley SK Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature 365, 520–527 (1993). [DOI] [PubMed] [Google Scholar]
  • 19.Kim Y, Geiger JH, Hahn S & Sigler PB Crystal structure of a yeast TBP/TATA-box complex. Nature 365, 512–520 (1993). [DOI] [PubMed] [Google Scholar]
  • 20.Love JJ et al. Structural basis for DNA bending by the architectural transcription factor LEF-1. Nature 376, 791–795 (1995). [DOI] [PubMed] [Google Scholar]
  • 21.Slupphaug G et al. A nucleotide-flipping mechanism from the structure of human uracil-DNA glycosylase bound to DNA. Nature 384, 87–92 (1996). [DOI] [PubMed] [Google Scholar]
  • 22.Vassylyev DG et al. Atomic model of a pyrimidine dimer excision repair enzyme complexed with a DNA substrate: structural basis for damaged DNA recognition. Cell 83, 773–782 (1995). [DOI] [PubMed] [Google Scholar]
  • 23.Werner MH, Huth JR, Gronenborn AM & Clore GM Molecular basis of human 46X,Y sex reversal revealed from the three-dimensional solution structure of the human SRY-DNA complex. Cell 81, 705–714 (1995). [DOI] [PubMed] [Google Scholar]
  • 24.Kahn JD & Crothers DM Protein-induced bending and DNA cyclization. Proc Natl Acad Sci U S A 89, 6343–6347 (1992). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Koo HS, Drak J, Rice JA & Crothers DM Determination of the extent of DNA bending by an adenine-thymine tract. Biochemistry 29, 4227–4234 (1990). [DOI] [PubMed] [Google Scholar]
  • 26.Schultz SC, Shields GC & Steitz TA Crystal structure of a CAP-DNA complex: the DNA is bent by 90 degrees. Science 253, 1001–1007 (1991). [DOI] [PubMed] [Google Scholar]
  • 27.Bui CT, Rees K & Cotton RGH Permanganate oxidation reactions of DNA: perspective in biological studies. Nucleosides Nucleotides Nucleic Acids 22, 1835–1855 (2003). [DOI] [PubMed] [Google Scholar]
  • 28.Cofsky JC et al. CRISPR-Cas12a exploits R-loop asymmetry to form double-strand breaks. eLife 9, e55143 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Nakane T, Kimanius D, Lindahl E & Scheres SH Characterisation of molecular motions in cryo-EM single-particle data by multi-body refinement in RELION. Elife 7, e36861 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Ramstein J & Lavery R Energetic coupling between DNA bending and base pair opening. Proc Natl Acad Sci U S A 85, 7231–7235 (1988). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Allan BW et al. DNA bending by EcoRI DNA methyltransferase accelerates base flipping but compromises specificity. J Biol Chem 274, 19269–19275 (1999). [DOI] [PubMed] [Google Scholar]
  • 32.Su T-J, Tock MR, Egelhaaf SU, Poon WCK & Dryden DTF DNA bending by M.EcoKI methyltransferase is coupled to nucleotide flipping. Nucleic Acids Res 33, 3235–3244 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Blosser TR et al. Two distinct DNA binding modes guide dual roles of a CRISPR-Cas protein complex. Mol Cell 58, 60–70 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Hochstrasser ML, Taylor DW, Kornfeld JE, Nogales E & Doudna JA DNA Targeting by a Minimal CRISPR RNA-Guided Cascade. Mol Cell 63, 840–851 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Westra ER et al. CRISPR immunity relies on the consecutive binding and degradation of negatively supercoiled invader DNA by Cascade and Cas3. Mol Cell 46, 595–605 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Xiao Y et al. Structure Basis for Directional R-loop Formation and Substrate Handover Mechanisms in Type I CRISPR-Cas System. Cell 170, 48–60.e11 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Ivanov IE et al. Cas9 interrogates DNA in discrete steps modulated by mismatches and supercoiling. Proc Natl Acad Sci U S A 117, 5853–5860 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Newton MD et al. DNA stretching induces Cas9 off-target activity. Nat Struct Mol Biol 26, 185–192 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Szczelkun MD et al. Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proc Natl Acad Sci U S A 111, 9798–9803 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Klimasauskas S, Kumar S, Roberts RJ & Cheng X HhaI methyltransferase flips its target base out of the DNA helix. Cell 76, 357–369 (1994). [DOI] [PubMed] [Google Scholar]
  • 41.Reinisch KM, Chen L, Verdine GL & Lipscomb WN The crystal structure of HaeIII methyltransferase convalently complexed to DNA: an extrahelical cytosine and rearranged base pairing. Cell 82, 143–153 (1995). [DOI] [PubMed] [Google Scholar]
  • 42.Dalhus B, Laerdahl JK, Backe PH & Bjørås M DNA base repair--recognition and initiation of catalysis. FEMS Microbiol Rev 33, 1044–1078 (2009). [DOI] [PubMed] [Google Scholar]
  • 43.Bell JC & Kowalczykowski SC RecA: Regulation and Mechanism of a Molecular Search Engine. Trends Biochem Sci 41, 491–507 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Chen Z, Yang H & Pavletich NP Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structures. Nature 453, 489–484 (2008). [DOI] [PubMed] [Google Scholar]
  • 45.Yang D, Boyer B, Prévost C, Danilowicz C & Prentiss M Integrating multi-scale data on homologous recombination into a new recognition mechanism based on simulations of the RecA-ssDNA/dsDNA structure. Nucleic Acids Res 43, 10251–10263 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Yang H, Zhou C, Dhar A & Pavletich NP Mechanism of strand exchange from RecA-DNA synaptic and D-loop structures. Nature 586, 801–806 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Koo HS, Wu HM & Crothers DM DNA bending at adenine .thymine tracts. Nature 320, 501–506 (1986). [DOI] [PubMed] [Google Scholar]
  • 48.Garcia HG et al. Biological consequences of tightly bent DNA: the other life of a macromolecular celebrity. Biopolymers 85, 115–130 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Cavaluzzi MJ & Borer PN Revised UV extinction coefficients for nucleoside-5’-monophosphates and unpaired DNA and RNA. Nucleic Acids Res 32, e13 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Information
Video 1
Download video file (21.7MB, mov)
Video 2
Download video file (63.4MB, mov)
Video 3
Download video file (98.9MB, mp4)
Video 4
Download video file (99MB, mp4)

Data Availability Statement

All data generated or analyzed during this study are included within this manuscript and its supporting information files except for the cryo-EM data/models, which can be accessed as follows: Cas9:sgRNA:DNA (S. pyogenes) with 0 RNA:DNA base pairs, open-protein/linear-DNA conformation (PDB 7S3H, EMD-24823); Cas9:sgRNA:DNA (S. pyogenes) with 0 RNA:DNA base pairs, closed-protein/bent-DNA conformation (PDB 7S36, EMD-24817); Cas9:sgRNA (S. pyogenes) in the open-protein conformation (PDB 7S37, EMD-24818); Cas9:sgRNA:DNA (S. pyogenes) forming a 3-base-pair R-loop (PDB 7S38, EMD-24819). The analyses and figures in this study also draw on previously determined structures with PDB codes 4ZT0, 5FQ5, 5F9R, 4CMP, 1CGP, and 4UN3 as described in the figure legends and the Supplementary Information. For figures containing fluorescence images, autoradiographs, or scatter plots, the original data are available as Source Data files.

RESOURCES