Abstract
Cas9 is a CRISPR-associated endonuclease capable of RNA-guided, site-specific DNA cleavage1–3. The programmable activity of Cas9 has been widely utilized for genome editing applications4–6, yet its precise mechanisms of target DNA binding and off-target discrimination remain incompletely understood. Here we report a series of cryo-electron microscopy structures of Streptococcus pyogenes Cas9 capturing the directional process of target DNA hybridization. In the early phase of R-loop formation, the Cas9 REC2 and REC3 domains form a positively charged cleft that accommodates the distal end of the target DNA duplex. Guide–target hybridization past the seed region induces rearrangements of the REC2 and REC3 domains and relocation of the HNH nuclease domain to assume a catalytically incompetent checkpoint conformation. Completion of the guide–target heteroduplex triggers conformational activation of the HNH nuclease domain, enabled by distortion of the guide–target heteroduplex, and complementary REC2 and REC3 domain rearrangements. Together, these results establish a structural framework for target DNA-dependent activation of Cas9 that sheds light on its conformational checkpoint mechanism and may facilitate the development of novel Cas9 variants and guide RNA designs with enhanced specificity and activity.
Subject terms: DNA, Genetic engineering, Cryoelectron microscopy, X-ray crystallography, RNA
Cryo-electron microscopy structures of Streptococcus pyogenes Cas9 in multiple DNA-bound states provide insights on the mechanism of Cas9 activation by target DNA.
Main
Cas9 enzymes rely on a dual guide RNA structure consisting of a CRISPR RNA (crRNA) guide and a trans-activating CRISPR RNA (tracrRNA) coactivator to cleave complementary DNA targets. S. pyogenes Cas9 (SpCas9) has found widespread use as a programmable DNA-targeting tool in genome editing and gene-targeting applications4–6. Target DNA binding by SpCas9 is dependent on the initial recognition of an NGG protospacer-adjacent motif (PAM) downstream of the target site2,7–9, which triggers local DNA strand separation to initiate its directional hybridization with a 20-nt segment in the guide crRNA to form an R-loop structure7,10,11. Target strand (TS) binding is facilitated by structural pre-ordering of nucleotides 11–20 of the crRNA (counting from the 5′ end), termed the seed sequence, in an A form-like conformation8,12. Formation of a complete R-loop leads to the activation of the Cas9 HNH and RuvC nuclease domains to catalyse cleavage of the TS and non-target DNA strand (NTS), respectively2,8,13. Although highly specific, SpCas9 cleaves off-target sites with imperfect complementarity to the guide RNA, often resulting in considerable levels of off-target genome editing14–18. The off-target activity is dependent on the number, type and positioning of base mismatches within the guide–target heteroduplex15,19–21. PAM-proximal mismatches within the seed region are discriminated against by substantially increased dissociation rates11,19,21,22, whereas PAM-distal mismatches are compatible with stable DNA binding13,19,21,23,24. Such off-targets are instead discriminated by a conformational checkpoint mechanism that monitors the integrity of the guide–target duplex to induce conformational activation of the nuclease domains11,13,19,21–24. 
Structural, biophysical and computational studies of SpCas9 have shed light on the mechanism of guide RNA binding, PAM recognition and nuclease activation, revealing that the enzyme undergoes extensive conformational rearrangements throughout these steps. In particular, high-resolution structures of the fully bound target DNA complex of SpCas925–28 have revealed a target-DNA-dependent conformational rearrangement of the Cas9 REC lobe that is necessary for cleavage activation. However, the mechanisms that underpin R-loop formation and off-target discrimination during conformational activation have remained elusive.
Cryo-EM analysis of R-loop formation
To investigate the mechanism of R-loop formation, we initially determined the minimal extent of target DNA complementarity necessary for stable binding using fluorescence-coupled size-exclusion chromatography, revealing that the presence of six complementary nucleotides in the PAM-proximal region of the target DNA heteroduplex is sufficient for stable association with the SpCas9–guide RNA complex (Extended Data Fig. 1). Subsequently, catalytically inactive SpCas9 (dCas9) was reconstituted with a single-molecule guide RNA (sgRNA) and partially matched DNA targets containing 6, 8, 10, 12, 14 and 16 complementary nucleotides upstream of the PAM (Fig. 1a and Extended Data Fig. 2). We analysed the resulting complexes using cryo-electron microscopy (cryo-EM), yielding molecular reconstructions at resolutions of 3.0–4.1 Å (Extended Data Fig. 3 and Extended Data Tables 1 and 2). We additionally determined cryo-EM reconstructions of wild-type SpCas9 bound to 18-nt complementary DNA targets in the presence of 1 mM and 10 mM Mg2+, representing the checkpoint and catalytically active states, respectively (Extended Data Fig. 3 and Extended Data Table 2). Three-dimensional variability analysis29 was used to analyse conformational heterogeneity within each complex (Supplementary Videos 1–8). Most of the detected variability within each complex can be attributed to the PAM-distal duplex and the REC2, REC3 and HNH domains, suggestive of conformational equilibrium sampling. The resulting structural models are representative of the most abundant conformational state of each complex (Extended Data Fig. 4).
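The design of the partially matched targets, in which complementarity to the guide extends a defined number of nucleotides upstream of the PAM, can be illustrated with a minimal Python sketch. The guide sequence and function names are hypothetical and are not taken from Supplementary Table 1. The sketch assumes that the protospacer is written as the non-target strand in the 5′→3′ direction, so that its 3′ end is the PAM-proximal end, and it introduces mismatches simply by swapping each PAM-distal base for its complement.

```python
# Hypothetical illustration: the sequences are invented, not those used in the study.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pam_proximal_match_length(guide_rna: str, protospacer_nts: str) -> int:
    """Count contiguous matching positions, starting at the PAM-proximal (3') end."""
    guide_dna = guide_rna.upper().replace("U", "T")
    target = protospacer_nts.upper()
    if len(guide_dna) != len(target):
        raise ValueError("guide and protospacer must have equal length")
    matched = 0
    for guide_base, target_base in zip(reversed(guide_dna), reversed(target)):
        if guide_base != target_base:
            break
        matched += 1
    return matched


def partially_matched_protospacer(guide_rna: str, n_match: int) -> str:
    """Keep n_match PAM-proximal positions and mismatch every PAM-distal position."""
    guide_dna = guide_rna.upper().replace("U", "T")
    if not 0 <= n_match <= len(guide_dna):
        raise ValueError("n_match must lie between 0 and the guide length")
    split = len(guide_dna) - n_match
    distal = "".join(COMPLEMENT[base] for base in guide_dna[:split])
    return distal + guide_dna[split:]


example_guide = "GACGCAUAAAGAUGAGACGC"  # hypothetical 20-nt guide segment
for n in (6, 8, 10, 12, 14, 16):
    target = partially_matched_protospacer(example_guide, n)
    print(n, target, pam_proximal_match_length(example_guide, target))
```

Counting from the 3′ end mirrors the directionality of R-loop formation described here, in which hybridization starts next to the PAM and proceeds towards the PAM-distal end.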
Extended Data Table 1.
Extended Data Table 2.
Structural superpositions of the partially matched complexes with the guide-RNA-bound binary SpCas9 complex12 provide a framework for the visualization of the DNA-binding mechanism, revealing stepwise domain rearrangements coupled to R-loop formation (Extended Data Fig. 5a). All complexes exhibit almost identical conformations of the bridge helix, REC1, RuvC and PAM-interaction domains, as well as the PAM-proximal double stranded DNA (dsDNA) duplex and the sgRNA downstream (3′ terminal) of the seed region. Conformational differences are observed in the positioning of the REC2, REC3 and the HNH domain relative to the emerging R-loop, consistent with the 3D variability analysis.
R-loop initiation by bipartite seed
The structure of the 6-nucleotide complementary target (6-nt match) complex reveals a 5-bp heteroduplex formed by the sgRNA seed sequence and TS DNA (Fig. 1b). Hybridization beyond the fifth seed sequence nucleotide is precluded by base stacking with the side chain of Tyr450, which was previously observed in the structure of the Cas9–sgRNA binary complex12 (Fig. 1c). Comparisons with the binary complex structure indicate that TS hybridization is associated with the displacement of the REC2 domain out of the central binding channel (Fig. 1b). The PAM-distal duplex part of the DNA substrate is bound in a positively charged cleft formed by the REC2 and REC3 domains (Fig. 1b and Extended Data Fig. 5b), stabilized by interactions of the REC2 residues Ser219, Thr249 and Lys263 with the NTS backbone (Extended Data Fig. 5c), and REC3 residues Arg586 and Thr657 with the TS backbone (Extended Data Fig. 5d). Similar REC lobe conformation and protein contacts with the PAM-distal end of the DNA have been observed in a 3-bp heteroduplex complex described in a recent study30. Consequently, the NTS is positioned parallel to the guide RNA–TS DNA heteroduplex within the central binding channel (Fig. 1b). The 5′-terminal part of the sgRNA appears to be conformationally flexible but residual cryo-EM density suggests its placement in a positively charged cleft located between the HNH and PAM-interaction domains (Extended Data Fig. 5e).
The structure of the 8-nucleotide complementary target (8-nt match) complex reveals that expansion of the R-loop heteroduplex, enabled by unstacking of Tyr450, forces further repositioning of the REC2 and REC3 domains to widen the binding channel as the PAM-distal duplex shifts deeper inside the channel (Figs. 1d and 2a–c and Extended Data Fig. 5f). R-loop propagation and PAM-distal duplex displacement results in the formation of new intermolecular contacts, with Cas9 contacting the PAM-distal duplex backbone through REC2 domain residues Ser217, Lys234 and Lys253, and REC3 residues Arg557 and Arg654 (Extended Data Fig. 5g,h).
Together, these observations suggest that the seed sequence of the Cas9 guide RNA is bipartite and that its hybridization with target DNA proceeds in two steps, consistent with the existence of a short-lived intermediate state observed in FRET studies11,31. To validate the observed interactions, we tested the cleavage activities of structure-based Cas9 mutant proteins in vitro (Extended Data Fig. 6a). Alanine substitution of Tyr450 resulted in substantial reductions of off-target substrate cleavage rates, whereas on-target cleavage remained largely unperturbed (Fig. 1e and Extended Data Fig. 6b). As observed previously32, the effect was more prominent for off-target substrates containing mismatches with the seed region of the guide RNA compared with off-targets containing only PAM-distal mismatches. Together, these results suggest that disruption of seed sequence interactions in the binary Cas9–sgRNA complex and early binding intermediates might exacerbate R-loop destabilization caused by off-target mismatches, resulting in an increased rate of off-target substrate dissociation and thus increased specificity. By contrast, a subset of mutations of DNA-interacting REC2 or REC3 residues resulted in increased off-target cleavage, as did the deletion of the REC2 domain (Extended Data Fig. 6b–e), consistent with single-molecule studies implicating the REC2 domain in Cas9 specificity31. Collectively, these results underscore the importance of specific Cas9–DNA contacts during early steps of R-loop formation for the specificity of Cas9.
R-loop propagation and remodelling
Further guide RNA–TS hybridization to form a 10-bp heteroduplex causes a rearrangement of the REC2 and REC3 domains and repositioning of the PAM-distal DNA duplex into the positively charged central binding channel formed by the REC3, RuvC and HNH domains (Fig. 2a). Here, the PAM-distal dsDNA duplex forms a continuous base stack with the sgRNA–TS heteroduplex (Fig. 2d). The displaced NTS is positioned underneath the HNH domain and continues to run parallel to the extending guide RNA–TS DNA heteroduplex (Extended Data Fig. 7a). X-ray crystallographic analysis of the 10-nt match complex at a resolution of 2.8 Å (Extended Data Table 3) confirmed that the TS and NTS remain hybridized at the PAM-distal end of the DNA substrate (Extended Data Fig. 7b). The PAM-distal duplex is wedged between the REC3 and RuvC domains and the L1 HNH linker (Fig. 2d and Extended Data Fig. 7a,b). The relocation of the PAM-distal duplex causes the REC2 domain to shift closer to the binding channel and occlude the cleavage site in TS DNA (Fig. 2d). This shift also establishes new electrostatic interactions between a negatively charged helix in REC2 (Glu260, Asp261, Asp269, Asp272, Asp273, Asp274 and Asp276) and a positively charged helix in REC3 (Lys599, Arg629, Lys646, Lys649, Lys652, Arg653, Arg654 and Arg655), hereafter referred to as the DDD and RRR helices, respectively (Fig. 2e), which are highly conserved across Cas9 orthologues that contain a REC2 domain (Extended Data Fig. 7c). Cleavage of off-target substrates in vitro was reduced by alanine substitutions of the interacting residues in the REC2 DDD helix, whereas mutations in the REC3 RRR helix only reduced cleavage of the off-target substrate containing a mismatch in the seed region (Extended Data Fig. 6d,e). 
These results suggest that the REC2–REC3 interaction contributes to Cas9 restructuring during R-loop extension; however, the DDD and RRR helices might have additional structural roles during upstream and downstream steps in the DNA-binding mechanism, particularly as the REC3 RRR helix contacts the backbone of the PAM-distal DNA duplex during early stages of target binding (Extended Data Fig. 5d,h).
Extended Data Table 3.
Values in parentheses are for highest-resolution shell. Structure was solved from data collected from two crystals.
The HNH nuclease domain remains docked on the RuvC and PI domains in the 6-, 8- and 10-nt match complexes, with the active site buried at the interdomain interface (Fig. 3a). R-loop extension past the seed region to form a 12-bp heteroduplex does not result in major REC2/3 domain rearrangements, with the PAM-distal duplex remaining coaxially stacked onto the guide RNA–TS DNA heteroduplex throughout the 12-, 14-, 16- and 18-nt match complexes (Fig. 3b). By contrast, the HNH domain becomes disordered along with the surrounding RuvC 1011–1040 and PI 1245–1251 loops in the 12-nt match complex (Fig. 3c). Upon extension of the R-loop heteroduplex to 14 bp, the RuvC and PI loops responsible for HNH docking remain structurally disordered (Fig. 3c and Extended Data Fig. 8a) and residual density is observed for the HNH domain as its L2 linker contacts the guide RNA–TS heteroduplex (Extended Data Fig. 8a). Further extension of the R-loop heteroduplex from 14 to 16 bp causes translocation of the HNH domain towards the guide RNA–TS DNA heteroduplex within the central binding channel (Fig. 3c). Facilitated by the formation of the PAM-distal part of the R-loop, a loop in the RuvC domain (residues 1030–1040) restructures into a helical conformation, establishing interactions with the L2 linker (Extended Data Fig. 8b). This repositions the L2 linker and shifts the HNH domain on top of the heteroduplex, sealing off the central binding channel (Fig. 3c and Extended Data Fig. 8c). The HNH domain remains in a catalytically incompetent orientation, with its active site located around 31 Å away from the scissile phosphate group in the TS.
Conformational checkpoint and activation
Previous studies have shown that substrates containing 4-bp mismatches at the PAM-distal end of the target sequence (positions 17–20) are generally refractory to Cas9 cleavage, whereas substrates containing mismatches at positions 19 and 20 are efficiently cleaved13,23,24,33. The cryo-EM reconstruction of the 18-nt match complex in the presence of 1 mM Mg2+ reveals that the most populated 3D class in the sample represents a pre-cleavage state with an intact TS and disordered NTS (Fig. 4a). Upon extension of the R-loop to 18 bp, the HNH domain continues to assume the catalytically incompetent orientation observed in the 16-nt match complex, whereas the conformation of the REC2 and REC3 domains remains the same as in the 12-, 14- and 16-nt match complexes (Figs. 3b and 4a). The observed conformation is thus consistent with a catalytically inactive checkpoint state inferred from previous biophysical and structural studies23,24,33.
The cryo-EM reconstruction obtained from a sample reconstituted in the presence of 10 mM Mg2+ reveals a catalytically active conformation in which both the TS and the NTS are cleaved at the expected positions (Fig. 4b–f). In contrast to previously reported structures of catalytically active Cas9 enzymes28,34,35, the PAM-proximal part of the cleaved NTS remains bound in the RuvC active site (Fig. 4b,f). In this state, the REC2 domain is shifted away from the TS cleavage site, enabling the HNH domain to undergo a rotation of about 140° to engage the TS scissile phosphate with its active site and catalyse its hydrolysis via a one-metal-ion mechanism (Fig. 4d), in agreement with previous structural data28,34,35. This rearrangement is facilitated by pronounced bending of the PAM-distal region of the guide RNA–TS DNA heteroduplex and a concomitant reorientation of the REC3 domain that preserves interactions with the heteroduplex (Fig. 4c). HNH domain rotation is brought about by restructuring of the L1 and L2 linkers, which results in the widening of the NTS binding cleft and exposure of the RuvC active site (Fig. 4b,e,f). The L1 linker, which is structurally disordered in the 18-nt match checkpoint complex, forms an α-helix and interacts with the minor groove of the guide RNA–TS DNA heteroduplex via multiple hydrogen-bonding interactions (Fig. 4e). The L2 linker helix becomes extended, allowing Phe916 to intercalate between NTS nucleobases by π–π stacking, thereby stabilizing the NTS in the RuvC active site (Fig. 4f). The NTS scissile phosphate is coordinated by two Mg2+ ions, its position consistent with a His983-dependent catalytic mechanism proposed by molecular dynamics simulations36.
A recent complementary study reported the structure of a 17-nt match catalytic complex that exhibits nearly identical HNH domain positioning and bent conformation of the guide RNA–TS DNA heteroduplex as observed in the 18-nt match catalytic complex37, indicating that catalytic activation can occur once a 17-bp heteroduplex is formed. Together, these structural observations provide a rationale for the allosteric coupling of R-loop formation with HNH domain rearrangement and RuvC active site accessibility, in agreement with single-molecule studies showing that PAM-distal end positioning modulates HNH domain conformation33.
Conclusions
In sum, our structural analysis of SpCas9 along its DNA-binding pathway points to a mechanism whereby R-loop formation is allosterically and energetically coupled to domain rearrangements necessary for nuclease domain activation (Extended Data Fig. 9). The initial phase of R-loop formation is facilitated by TS hybridization to a bipartite seed sequence of the guide RNA and interactions of the PAM-distal DNA with the Cas9 REC2 and REC3 domains. The observation of a bipartite seed sequence in the Cas9 guide RNA and a two-step seed hybridization mechanism involving a conformational rearrangement brings parallels with other RNA-guided nucleic acid-targeting systems including the Cascade complex and Argonaute proteins, both of which feature discontinuous seed sequences in their guide RNAs38–41. We identify mutations that destabilize the binding intermediate states and thus increase off-target discrimination, which presents an opportunity for the development of novel high-fidelity SpCas9 variants. As most off-target sequences are only bound but not cleaved19–21,42, these variants could prove useful for applications that rely on the fidelity of Cas9 target binding, such as transcriptional regulation or base editing43. Directional target DNA hybridization is associated with dynamic repositioning of the REC2, REC3 and HNH domains to initially assume a catalytically inactive, checkpoint conformation upon R-loop completion. As conformational activation of the nuclease domains is allosterically controlled by structural distortion of the PAM-distal end of the guide–target heteroduplex and the sensing of its integrity by Cas9, it is precluded by incomplete PAM-distal heteroduplex pairing (<17 bp). Bona fide off-target substrates are able to pass the conformational checkpoint because they maintain heteroduplex integrity despite the presence of PAM-distal mismatches, in agreement with our recent structural data44. 
Furthermore, guide RNA modifications that result in altered heteroduplex conformation have profound effects on Cas9 nuclease activity and specificity45. Together, our structural studies thus highlight the importance of maintaining guide–target complementarity and proper heteroduplex geometry, consistent with biophysical and computational studies showing that the conformation of the R-loop heteroduplex strongly affects off-target binding11,46. These findings thus have important implications for ongoing experimental and computational studies of CRISPR–Cas9 off-target activity, and will inform its further technological development.
Methods
Expression and purification of Cas9 proteins
Wild-type and mutant SpCas9 proteins were expressed in Escherichia coli Rosetta 2 (DE3) (Novagen) for 16 h at 18 °C as fusion proteins with an N-terminal His6–MBP–TEV tag. Bacterial pellets were resuspended and lysed in 20 mM HEPES-KOH pH 7.5, 500 mM KCl, 5 mM imidazole, supplemented with protease inhibitors. Cell lysates were clarified by ultracentrifugation and loaded on a 15 ml Ni-NTA Superflow column (QIAGEN), which was washed with 7 column volumes of 20 mM HEPES-KOH pH 7.5, 500 mM KCl, 5 mM imidazole. Tagged Cas9 was eluted with 10 column volumes of 20 mM HEPES-KOH pH 7.5, 250 mM KCl, 200 mM imidazole. Salt concentration was adjusted to 250 mM KCl and the protein was loaded on a 10 ml HiTrap Heparin HP column (GE Healthcare) equilibrated in 20 mM HEPES-KOH pH 7.5, 250 mM KCl, 1 mM DTT. The column was washed with 5 column volumes of 20 mM HEPES-KOH pH 7.5, 250 mM KCl, 1 mM DTT, and dCas9 was eluted with 15 column volumes of 20 mM HEPES-KOH pH 7.5, 1.5 M KCl, 1 mM DTT, in a 0–50% gradient (peak elution around 500 mM KCl). The His6–MBP tag was removed by TEV protease cleavage overnight at 4 °C with gentle shaking. The untagged protein was concentrated and further purified on a Superdex 200 16/600 gel filtration column (GE Healthcare) in 20 mM HEPES-KOH pH 7.5, 500 mM KCl, 1 mM DTT. Pure fractions were concentrated to 10 mg ml−1, flash frozen in liquid nitrogen and stored at −80 °C.
sgRNA in vitro transcription
The sgRNA was transcribed from a dsDNA template (Supplementary Table 1) in a 5 ml transcription reaction (30 mM Tris-HCl pH 8.1, 25 mM MgCl2, 2 mM spermidine, 0.01% Triton X-100, 5 mM CTP, 5 mM ATP, 5 mM GTP, 5 mM UTP, 10 mM DTT, 1 µM DNA transcription template, 0.5 units inorganic pyrophosphatase (Thermo Fisher), 250 µg T7 RNA polymerase). The transcription reaction was incubated at 37 °C for 5 h, after which the dsDNA template was degraded for 30 min with 15 units of RQ1 DNase (Promega). The transcribed sgRNA was PAGE purified on an 8% denaturing polyacrylamide gel containing 7 M urea, ethanol precipitated and dissolved in DEPC-treated water.
Gel filtration binding assay
The dCas9–guide RNA complex was assembled by incubating 371 pmol dCas9 with 400 pmol of the sgRNA in 20 mM HEPES-KOH pH 7.5, 200 mM KCl, 2 mM MgCl2 for 10 min at room temperature. Then 250 pmol of Cy5-labelled dsDNA substrate was added and incubated another 15 min. The volume was adjusted up to 100 µl with reaction buffer and the mixture was centrifuged to remove possible precipitates. Individual reactions were transferred to a 96-well plate and analysed using a Superdex 200 Increase 5/150 GL gel filtration column (GE Healthcare) attached to an Agilent 1200 Series Gradient HPLC system. The 260 nm, 280 nm and Cy5 signals were exported and plotted as a function of the retention volume in GraphPad Prism 9.
In vitro nuclease activity assays
Cleavage reactions were performed at 37 °C in reaction buffer containing 20 mM HEPES pH 7.5, 250 mM KCl, 5 mM MgCl2 and 1 mM DTT. First, Cas9 protein was pre-incubated with sgRNA in a 1:1.25 ratio for 10 min at room temperature. The protein–RNA complex was rapidly mixed with the dsDNA substrates (containing 5′-ATTO-532 labelled TS) (Supplementary Table 1) to yield final concentrations of 1.67 μM protein and 66.67 nM substrate in a 7.5 µl reaction. Samples were collected at 1 min, 2.5 min, 5 min, 15 min, 45 min, 90 min, 150 min and 24 h. Cleavage was stopped by addition of 2 µl of 250 mM EDTA, 0.5% SDS and 20 μg of proteinase K. Formamide was added to the reactions to a final concentration of 50%, and samples were incubated at 95 °C for 10 min, resolved on a 15% denaturing PAGE gel containing 7 M urea and imaged using a Typhoon FLA 9500 gel imager.
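As a worked check of the reaction stoichiometry stated above, the following short Python sketch converts the final concentrations into molar amounts per 7.5 µl reaction. The helper name is hypothetical; the only inputs are the concentrations and volume given in the text.

```python
def picomoles(concentration_nM: float, volume_ul: float) -> float:
    """Convert a concentration (nM) and a volume (µl) into an amount in pmol."""
    # nM x µl gives femtomoles; dividing by 1000 gives picomoles.
    return concentration_nM * volume_ul / 1000.0


protein_nM = 1670.0    # 1.67 µM Cas9, as stated in the text
substrate_nM = 66.67   # 66.67 nM dsDNA substrate, as stated in the text
reaction_ul = 7.5

protein_pmol = picomoles(protein_nM, reaction_ul)
substrate_pmol = picomoles(substrate_nM, reaction_ul)
molar_excess = protein_nM / substrate_nM

print(f"Cas9: {protein_pmol:.3f} pmol, substrate: {substrate_pmol:.3f} pmol")
print(f"Cas9 is in roughly {molar_excess:.0f}-fold molar excess over substrate")
```

The numbers correspond to about 12.5 pmol of protein and 0.5 pmol of substrate per reaction, so the enzyme is present in roughly 25-fold molar excess over the DNA.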
Statistics and reproducibility
Nuclease activity rate constants (kobs) were extracted from single exponential fits: [Product] = A × (1 − exp(−kobs × t)). kobs data are presented as mean ± s.e.m. (n = 4 independent replicates), obtained by direct fitting of four time-course datasets in GraphPad Prism 9 without calculating individual kobs values. Statistical analysis was performed using a two-sided t-test. The confidence interval used was 95%.
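The single-exponential model above can be fitted with any nonlinear least-squares routine. The fits in this study were performed in GraphPad Prism 9; the following is only a minimal, dependency-free Python sketch of the same model, not the procedure used by Prism. It assumes that the replicate time courses are pooled into one fit, solves for the amplitude A in closed form at each trial kobs, and locates kobs with a coarse logarithmic scan followed by a golden-section search. The function names, search bounds and synthetic data are illustrative assumptions.

```python
import math


def profile_fit(k_obs, times, products):
    """For a fixed k_obs, return (sum of squared residuals, best-fit amplitude A)."""
    basis = [1.0 - math.exp(-k_obs * t) for t in times]
    # With k_obs fixed the model is linear in A, so A has a closed-form solution.
    amplitude = sum(y * b for y, b in zip(products, basis)) / sum(b * b for b in basis)
    sse = sum((y - amplitude * b) ** 2 for y, b in zip(products, basis))
    return sse, amplitude


def fit_single_exponential(times, products, log_k_min=-4.0, log_k_max=2.0, grid_points=121):
    """Fit [Product] = A * (1 - exp(-k_obs * t)); return (k_obs, A)."""
    # Coarse scan over log10(k_obs) to bracket the minimum.
    step = (log_k_max - log_k_min) / (grid_points - 1)
    grid = [log_k_min + i * step for i in range(grid_points)]
    best = min(grid, key=lambda lk: profile_fit(10.0 ** lk, times, products)[0])
    lo, hi = best - step, best + step
    # Golden-section refinement inside the bracket.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if profile_fit(10.0 ** a, times, products)[0] < profile_fit(10.0 ** b, times, products)[0]:
            hi = b
        else:
            lo = a
    k_obs = 10.0 ** ((lo + hi) / 2.0)
    return k_obs, profile_fit(k_obs, times, products)[1]


# Synthetic, noise-free example using the sampling times of the cleavage assay (min).
time_points_min = [1, 2.5, 5, 15, 45, 90, 150, 1440]
true_amplitude, true_k = 0.9, 0.05  # invented values for illustration only
replicate = [true_amplitude * (1.0 - math.exp(-true_k * t)) for t in time_points_min]
pooled_times = time_points_min * 4      # four replicates pooled into a single fit
pooled_products = replicate * 4
k_fit, a_fit = fit_single_exponential(pooled_times, pooled_products)
print(f"k_obs = {k_fit:.4f} per min, A = {a_fit:.3f}")
```

The sketch returns point estimates only; the standard errors and t-tests reported here would require the uncertainty estimates that a dedicated fitting package provides.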
Crystallization and X-ray structure determination
The 10-nt complementary ternary complex of dCas9 was assembled by first incubating dCas9 with the sgRNA in a 1:1.5 molar ratio, and pre-purifying the binary complex on a Superdex 200 16/600 gel filtration column (GE Healthcare) in 20 mM HEPES-KOH pH 7.5, 500 mM KCl, 1 mM DTT. The binary complex was diluted in 20 mM HEPES-KOH pH 7.5, 250 mM KCl, 1 mM DTT to 2.5 mg ml−1 and the partially complementary dsDNA substrate was added in 1:1.5 molar excess. For crystallization, 1 µl of the ternary complex (1.5–2.5 mg ml−1) was mixed with 1 µl of the reservoir solution (0.1 M sodium cacodylate pH 6.5, 0.8–1.2 M ammonium formate, 12–14% PEG4000) and crystals were grown at 20 °C using the hanging drop vapour diffusion setup. Crystals were collected after 3–4 weeks, cryoprotected in 0.1 M sodium cacodylate pH 6.5, 1.0 M ammonium formate, 13% PEG4000, 20% glycerol, 2 mM MgCl2, and flash-cooled in liquid nitrogen. Diffraction data were measured at beamline PXIII of the Swiss Light Source (Paul Scherrer Institute, Villigen, Switzerland) at a temperature of 100 K and processed using the autoPROC and STARANISO package with anisotropic cut-off47. Phases were obtained by molecular replacement using the Phaser module of the Phenix package48, with the NUC lobe of PDB entry 5FQ5 as the initial search model. The crystals belonged to the P1 space group and contained two copies of the complex in the asymmetric unit.
Cryo-EM sample preparation and data acquisition
To assemble the 6-, 8-, 10-, 12-, 14- and 16-nt match complexes, dCas9 protein was mixed with the sgRNA in a 1:1.5 molar ratio, and incubated at room temperature for 10 min in buffer 20 mM HEPES-KOH pH 7.5, 250 mM KCl, 1 mM DTT. The respective partially complementary dsDNA substrate (Supplementary Table 1) was then added in a 1:3 Cas9:DNA molar ratio and incubated another 20 min at room temperature. The complexes were then purified using a Superdex 200 Increase 10/300 GL gel filtration column (GE Healthcare) and eluted in 20 mM HEPES-KOH pH 7.5, 250 mM KCl, 1 mM DTT. Concentration of the monomeric peak was determined using the Qubit 4 Fluorometer Protein Assay, and then diluted to 0.275 mg ml−1 in 20 mM HEPES-KOH pH 7.5, 250 mM KCl cold buffer. 3 µl of diluted complexes were applied to a glow discharged 200-mesh holey carbon grid (Au 1.2/1.3 Quantifoil Micro Tools), blotted for 1.5–2.5 s at 90% humidity, 20 °C, plunge frozen in liquid propane/ethane mix (Vitrobot, FEI) and stored in liquid nitrogen. To prepare the 18-nt match (checkpoint), wild-type Cas9–sgRNA complex was reconstituted with substrate DNA in 20 mM HEPES-KOH pH 7.5, 150 mM KCl, 1 mM DTT buffer, and incubated with 1 mM MgCl2 for 1 min at 37 °C prior to vitrification. The 18-nt match catalytic complex was reconstituted in 20 mM HEPES-KOH pH 7.5, 100 mM KCl, 1 mM DTT buffer, and incubated with 10 mM MgCl2 for 1 min at 37 °C prior to vitrification. Data collection was performed on a 300 kV FEI Titan Krios G3i microscope equipped with a Gatan Quantum Energy Filter and a K3 direct detection camera in super-resolution mode. Micrographs were recorded at a calibrated magnification of 130,000× with a pixel size of 0.325 Å and subsequently binned to 0.65 Å. Data acquisition was performed automatically using EPU with three shots per hole at −0.8 μm to −2.2 μm defocus. 
Data for the 18-nt match (checkpoint) complex was collected using a Titan Krios G4 equipped with a SelectrisX energy filter and a FalconIV detector at a magnification of 270,000×, pixel size of 0.45 Å, defocus −0.8 μm to −1.5 μm.
Cryo-EM data processing
Acquired super-resolution cryo-EM data was processed using cryoSPARC49. Gain-corrected micrographs were imported and binned to a pixel size of 0.65 Å during patch motion correction. After patch CTF estimation, micrographs with a resolution estimation worse than 5 Å and full-frame motion distance larger than 100 Å were discarded. Initial particles were picked using blob picker with 100–140 Å particle size. Particle picks were inspected and particles with NCC scores below 0.4 were discarded. Remaining particles were extracted with a box size of 384 × 384 pixels, down-sampled to 192 × 192 pixels. After 2D classification, templates were generated using good classes and particle picking was repeated using the template picker. Duplicate particles were removed, and 2D classified Cas9 particles were used for ab initio 3D reconstruction. All partially bound complexes displayed several conformational states. After several rounds of 3D classification, classes with most detailed features were reextracted using full 384 × 384 pixel box size and subjected to non-uniform refinement to generate high-resolution reconstructions50. The 18-nt match (checkpoint) complex was extracted with a box size of 504 × 504 pixels. Each map was sharpened using the appropriate B-factor value to enhance structural features, and local resolution was calculated and visualized using ChimeraX51.
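The pixel and box sizes quoted above are related by simple arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes twofold binning of the 0.325 Å super-resolution pixel (giving the stated 0.65 Å) and Fourier cropping of the 384-pixel box to 192 pixels; it is not part of the cryoSPARC workflow itself.

```python
def binned_pixel_size(pixel_size_A: float, bin_factor: int) -> float:
    """Pixel size after binning by an integer factor."""
    return pixel_size_A * bin_factor


def downsampled_pixel_size(pixel_size_A: float, box_px: int, new_box_px: int) -> float:
    """Pixel size after down-sampling a box from box_px to new_box_px pixels."""
    return pixel_size_A * box_px / new_box_px


def box_edge_A(box_px: int, pixel_size_A: float) -> float:
    """Physical edge length of the extraction box in angstroms."""
    return box_px * pixel_size_A


def nyquist_limit_A(pixel_size_A: float) -> float:
    """Finest resolvable spacing at a given sampling (twice the pixel size)."""
    return 2.0 * pixel_size_A


pixel = binned_pixel_size(0.325, 2)               # super-resolution pixel binned twofold
coarse = downsampled_pixel_size(pixel, 384, 192)  # pixel size during early classification

print(f"binned pixel size: {pixel:.3f} A, box edge: {box_edge_A(384, pixel):.1f} A")
print(f"down-sampled pixel size: {coarse:.2f} A, Nyquist limit: {nyquist_limit_A(coarse):.2f} A")
print(f"full-size Nyquist limit: {nyquist_limit_A(pixel):.2f} A")
```

Under these assumptions the box spans about 250 Å, comfortably larger than the 100–140 Å particle size used for picking, and the Nyquist limit of the down-sampled particles (about 2.6 Å) remains finer than the 3.0–4.1 Å resolutions reported for the final maps.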
Structural model building, refinement and analysis
Manual Cas9 domain placement based on PDB model 5FQ5, model adjustment and nucleic acid building was completed using COOT52. Atomic model refinement was performed using Phenix.refine for X-ray data and Phenix.real_space_refine for cryo-EM48. The quality of refined models was assessed using MolProbity53. Protein-nucleic acid interactions were analysed using the PISA web server54. Characterization of the guide–protospacer duplex was performed using the 3DNA 2.0 web server55. Structural figures were generated using ChimeraX51.
Protein sequence alignment
Protein sequences of Cas9 orthologues harbouring the REC2 domain were obtained from UniProt. Sequence alignment was performed using MUSCLE with default parameters56. Alignment was visualized using Jalview with highlighting only the conservation of charged residues57.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41586-022-05114-0.
Acknowledgements
This work was supported by the Swiss National Science Foundation Grant 31003A_182567 (to M.J.). M.J. is an International Research Scholar of the Howard Hughes Medical Institute and Vallee Scholar of the Bert L. and N. Kuggie Vallee Foundation. We thank S. Sorrentino and A. Myasnikov for their assistance with cryogenic electron microscopy data collection; F. Boneberg and C. Chanez for their help with preparing reagents; members of the Jinek laboratory for discussion and critical reading of the manuscript; and J. Cofsky, K. Soczek and J. Doudna for sharing unpublished data and helpful comments.
Author contributions
M.P. and M.J. conceived the study and designed experiments. M.P. purified Cas9, performed in vitro cleavage assays, crystallized the 10-bp heteroduplex complex, prepared cryo-EM samples and solved the structures. I.Q., L.L. and L.M.M. expressed and purified Cas9 mutant proteins, assisted with figure preparation, and collected cryo-EM data. M.S. collected cryo-EM data. M.P. and M.J. performed structural analysis and wrote the manuscript with input from the remaining authors.
Peer review
Peer review information
Nature thanks Rick Russell, John van der Oost and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.
Data availability
Atomic coordinates, maps and structure factors of the reported X-ray and cryo-EM structures have been deposited in the Protein Data Bank under accession numbers 7Z4D (10-nt match complex, X-ray), 7Z4C (6-nt match complex, cryo-EM), 7Z4E (8-nt match complex, cryo-EM), 7Z4K (10-nt match complex, cryo-EM), 7Z4G (12-nt match complex, cryo-EM), 7Z4H (14-nt match complex, cryo-EM), 7Z4I (16-nt match complex, cryo-EM), 7Z4L (18-nt match checkpoint complex, cryo-EM) and 7Z4J (18-nt match catalytic complex, cryo-EM) and in the Electron Microscopy Data Bank under accession codes EMD-14493 (6-nt match complex, cryo-EM), 14494 (8-nt match complex, cryo-EM), 14500 (10-nt match complex, cryo-EM), 14496 (12-nt match complex, cryo-EM), 14497 (14-nt match complex, cryo-EM), 14498 (16-nt match complex, cryo-EM), 14501 (18-nt match checkpoint complex, cryo-EM) and 14499 (18-nt match catalytic complex, cryo-EM). Source data are provided with this paper.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Luuk Loeff, Irma Querques, Lena M. Muckenfuss
Change history
10/30/2023
A Correction to this paper has been published: 10.1038/s41586-023-06779-x
Extended data
Extended data is available for this paper at 10.1038/s41586-022-05114-0.
Supplementary information
The online version contains supplementary material available at 10.1038/s41586-022-05114-0.
References
- 1. Garneau JE, et al. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature. 2010;468:67–71. doi: 10.1038/nature09523.
- 2. Jinek M, et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–821. doi: 10.1126/science.1225829.
- 3. Sapranauskas R, et al. The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli. Nucleic Acids Res. 2011;39:9275–9282. doi: 10.1093/nar/gkr606.
- 4. Cong L, et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–823. doi: 10.1126/science.1231143.
- 5. Jinek M, et al. RNA-programmed genome editing in human cells. eLife. 2013;2:e00471. doi: 10.7554/eLife.00471.
- 6. Mali P, et al. RNA-guided human genome engineering via Cas9. Science. 2013;339:823–826. doi: 10.1126/science.1232033.
- 7. Mekler V, Minakhin L, Severinov K. Mechanism of duplex DNA destabilization by RNA-guided Cas9 nuclease during target interrogation. Proc. Natl Acad. Sci. USA. 2017;114:5443–5448. doi: 10.1073/pnas.1619926114.
- 8. Sternberg SH, Redding S, Jinek M, Greene EC, Doudna JA. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature. 2014;507:62–67. doi: 10.1038/nature13011.
- 9. Szczelkun MD, et al. Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proc. Natl Acad. Sci. USA. 2014;111:9798–9803. doi: 10.1073/pnas.1402597111.
- 10. Anders C, Niewoehner O, Duerst A, Jinek M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature. 2014;513:569–573. doi: 10.1038/nature13579.
- 11. Ivanov IE, et al. Cas9 interrogates DNA in discrete steps modulated by mismatches and supercoiling. Proc. Natl Acad. Sci. USA. 2020;117:5853–5860. doi: 10.1073/pnas.1913445117.
- 12. Jiang F, Zhou K, Ma L, Gressel S, Doudna JA. A Cas9–guide RNA complex preorganized for target DNA recognition. Science. 2015;348:1477–1481. doi: 10.1126/science.aab1452.
- 13. Sternberg SH, LaFrance B, Kaplan M, Doudna JA. Conformational control of DNA target cleavage by CRISPR–Cas9. Nature. 2015;527:110–113. doi: 10.1038/nature15544.
- 14. Cameron P, et al. Mapping the genomic landscape of CRISPR–Cas9 cleavage. Nat. Methods. 2017;14:600–606. doi: 10.1038/nmeth.4284.
- 15. Doench JG, et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR–Cas9. Nat. Biotechnol. 2016;34:184–191. doi: 10.1038/nbt.3437.
- 16. Hsu PD, et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 2013;31:827–832. doi: 10.1038/nbt.2647.
- 17. Lazzarotto CR, et al. CHANGE-seq reveals genetic and epigenetic effects on CRISPR–Cas9 genome-wide activity. Nat. Biotechnol. 2020;38:1317–1327. doi: 10.1038/s41587-020-0555-7.
- 18. Tsai SQ, et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR–Cas nucleases. Nat. Biotechnol. 2015;33:187–197. doi: 10.1038/nbt.3117.
- 19. Boyle EA, et al. Quantification of Cas9 binding and cleavage across diverse guide sequences maps landscapes of target engagement. Sci. Adv. 2021;7:eabe5496. doi: 10.1126/sciadv.abe5496.
- 20. Jones SK, Jr, et al. Massively parallel kinetic profiling of natural and engineered CRISPR nucleases. Nat. Biotechnol. 2021;39:84–93. doi: 10.1038/s41587-020-0646-5.
- 21. Zhang L, et al. Systematic in vitro profiling of off-target affinity, cleavage and efficiency for CRISPR enzymes. Nucleic Acids Res. 2020;48:5037–5053. doi: 10.1093/nar/gkaa231.
- 22. Singh D, Sternberg SH, Fei J, Doudna JA, Ha T. Real-time observation of DNA recognition and rejection by the RNA-guided endonuclease Cas9. Nat. Commun. 2016;7:12778. doi: 10.1038/ncomms12778.
- 23. Chen JS, et al. Enhanced proofreading governs CRISPR–Cas9 targeting accuracy. Nature. 2017;550:407–410. doi: 10.1038/nature24268.
- 24. Dagdas YS, Chen JS, Sternberg SH, Doudna JA, Yildiz A. A conformational checkpoint between DNA binding and cleavage by CRISPR–Cas9. Sci. Adv. 2017;3:eaao0027. doi: 10.1126/sciadv.aao0027.
- 25. Anders C, Bargsten K, Jinek M. Structural plasticity of PAM recognition by engineered variants of the RNA-guided endonuclease Cas9. Mol. Cell. 2016;61:895–902. doi: 10.1016/j.molcel.2016.02.020.
- 26. Jiang F, et al. Structures of a CRISPR–Cas9 R-loop complex primed for DNA cleavage. Science. 2016;351:867–871. doi: 10.1126/science.aad8282.
- 27. Nishimasu H, et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell. 2014;156:935–949. doi: 10.1016/j.cell.2014.02.001.
- 28. Zhu X, et al. Cryo-EM structures reveal coordinated domain motions that govern DNA cleavage by Cas9. Nat. Struct. Mol. Biol. 2019;26:679–685. doi: 10.1038/s41594-019-0258-2.
- 29. Punjani A, Fleet DJ. 3D variability analysis: resolving continuous flexibility and discrete heterogeneity from single particle cryo-EM. J. Struct. Biol. 2021;213:107702. doi: 10.1016/j.jsb.2021.107702.
- 30. Cofsky JC, Soczek KM, Knott GJ, Nogales E, Doudna JA. CRISPR–Cas9 bends and twists DNA to read its sequence. Nat. Struct. Mol. Biol. 2022;29:395–402. doi: 10.1038/s41594-022-00756-0.
- 31. Sung K, Park J, Kim Y, Lee NK, Kim SK. Target specificity of Cas9 nuclease via DNA rearrangement regulated by the REC2 domain. J. Am. Chem. Soc. 2018;140:7778–7781. doi: 10.1021/jacs.8b03102.
- 32. Kleinstiver BP, et al. High-fidelity CRISPR–Cas9 nucleases with no detectable genome-wide off-target effects. Nature. 2016;529:490–495. doi: 10.1038/nature16526.
- 33. Yang M, et al. The conformational dynamics of Cas9 governing DNA cleavage are revealed by single-molecule FRET. Cell Rep. 2018;22:372–382. doi: 10.1016/j.celrep.2017.12.048.
- 34. Sun W, et al. Structures of Neisseria meningitidis Cas9 complexes in catalytically poised and anti-CRISPR-inhibited states. Mol. Cell. 2019;76:938–952 e935. doi: 10.1016/j.molcel.2019.09.025.
- 35. Zhang Y, et al. Catalytic-state structure and engineering of Streptococcus thermophilus Cas9. Nat. Catal. 2020;3:813–823. doi: 10.1038/s41929-020-00506-9.
- 36. Casalino L, Nierzwicki L, Jinek M, Palermo G. Catalytic mechanism of non-target DNA cleavage in CRISPR–Cas9 revealed by ab initio molecular dynamics. ACS Catal. 2020;10:13596–13605. doi: 10.1021/acscatal.0c03566.
- 37. Bravo JPK, et al. Structural basis for mismatch surveillance by CRISPR–Cas9. Nature. 2022;603:343–347. doi: 10.1038/s41586-022-04470-1.
- 38. Klum SM, Chandradoss SD, Schirle NT, Joo C, MacRae IJ. Helix-7 in Argonaute2 shapes the microRNA seed region for rapid target recognition. EMBO J. 2018;37:75–88. doi: 10.15252/embj.201796474.
- 39. Mulepati S, Heroux A, Bailey S. Crystal structure of a CRISPR RNA-guided surveillance complex bound to a ssDNA target. Science. 2014;345:1479–1484. doi: 10.1126/science.1256996.
- 40. Blosser TR, et al. Two distinct DNA binding modes guide dual roles of a CRISPR–Cas protein complex. Mol. Cell. 2015;58:60–70. doi: 10.1016/j.molcel.2015.01.028.
- 41. Xiao Y, et al. Structure basis for directional R-loop formation and substrate handover mechanisms in type I CRISPR–Cas system. Cell. 2017;170:48–60 e11. doi: 10.1016/j.cell.2017.06.012.
- 42. Kuscu C, Arslan S, Singh R, Thorpe J, Adli M. Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease. Nat. Biotechnol. 2014;32:677–683. doi: 10.1038/nbt.2916.
- 43. Anzalone AV, Koblan LW, Liu DR. Genome editing with CRISPR–Cas nucleases, base editors, transposases and prime editors. Nat. Biotechnol. 2020;38:824–844. doi: 10.1038/s41587-020-0561-9.
- 44. Pacesa M, et al. Structural basis for Cas9 off-target activity. Preprint at bioRxiv. 2021. doi: 10.1101/2021.11.18.469088.
- 45. Donohoue PD, et al. Conformational control of Cas9 by CRISPR hybrid RNA–DNA guides mitigates off-target activity in T cells. Mol. Cell. 2021;81:3637–3649.e3635. doi: 10.1016/j.molcel.2021.07.035.
- 46. Newton MD, et al. DNA stretching induces Cas9 off-target activity. Nat. Struct. Mol. Biol. 2019;26:185–192. doi: 10.1038/s41594-019-0188-z.
- 47. Vonrhein C, et al. Data processing and analysis with the autoPROC toolbox. Acta Crystallogr. D. 2011;67:293–302. doi: 10.1107/S0907444911007773.
- 48. Liebschner D, et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D. 2019;75:861–877. doi: 10.1107/S2059798319011471.
- 49. Punjani A, Rubinstein JL, Fleet DJ, Brubaker MA. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods. 2017;14:290–296. doi: 10.1038/nmeth.4169.
- 50. Punjani A, Zhang H, Fleet DJ. Non-uniform refinement: adaptive regularization improves single-particle cryo-EM reconstruction. Nat. Methods. 2020;17:1214–1221. doi: 10.1038/s41592-020-00990-8.
- 51. Pettersen EF, et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 2021;30:70–82. doi: 10.1002/pro.3943.
- 52. Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of Coot. Acta Crystallogr. D. 2010;66:486–501. doi: 10.1107/S0907444910007493.
- 53. Williams CJ, et al. MolProbity: more and better reference data for improved all-atom structure validation. Protein Sci. 2018;27:293–315. doi: 10.1002/pro.3330.
- 54. Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 2007;372:774–797. doi: 10.1016/j.jmb.2007.05.022.
- 55. Li S, Olson WK, Lu XJ. Web 3DNA 2.0 for the analysis, visualization, and modeling of 3D nucleic acid structures. Nucleic Acids Res. 2019;47:W26–W34. doi: 10.1093/nar/gkz394.
- 56. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797. doi: 10.1093/nar/gkh340.
- 57. Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. Jalview Version 2-a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009;25:1189–1191. doi: 10.1093/bioinformatics/btp033.