Nature Communications. 2026 Apr 16;17:3584. doi: 10.1038/s41467-026-71626-2

Engineering a compact high-fidelity Staphylococcus aureus Cas9 variant with broader targeting range and mechanistic insights into its activation

Satoshi N Omura 1,#, Ryoya Nakagawa 1,#, Shohei Kajimoto 1,#, Sae Okazaki 2, Soh Ishiguro 3, Hideto Mori 4,5,6, Kosuke Onishi 1, Yuji Kashiwakura 7,8, Takafumi Hiramoto 7, Kio Horinaka 1, Mamoru Tanaka 2, Hisato Hirano 1, Kasey Jividen 9, Keitaro Yamashita 2, Shengdar Q Tsai 9, Nozomu Yachie 3,4,10, Tsukasa Ohmori 7,8, Hiroshi Nishimasu 2,11,12, Osamu Nureki 1
PMCID: PMC13090337  PMID: 41991526

Abstract

Staphylococcus aureus Cas9 (SaCas9) is smaller than the widely used Streptococcus pyogenes Cas9 (SpCas9) and has been harnessed for gene therapy using an adeno-associated virus vector. However, SaCas9 requires a longer NNGRRT (where N is any nucleotide and R is A or G) protospacer adjacent motif (PAM) for target DNA recognition, thereby restricting the targeting range. Although PAM-relaxed Cas9 variants have been developed, expanded targeting is often accompanied by compromised target specificity. Here, we report the rational engineering of eSaCas9-NNG, a SaCas9 variant that recognizes relaxed NNG PAMs while maintaining high target fidelity, thereby overcoming a fundamental trade-off in Cas9-based genome editing. eSaCas9-NNG efficiently induces indels and base conversions at endogenous sites bearing NNG PAMs in human cells and mice, with editing efficiencies comparable to those of other PAM-relaxed nucleases, including SpRY, SpG, and iGeoCas9, but with reduced off-target activity. We further determine the cryo-electron microscopy structures of eSaCas9-NNG in five distinct functional states, revealing the structural basis for its relaxed PAM recognition, improved target specificity, and nuclease activation. Overall, our findings demonstrate that eSaCas9-NNG could be used as a versatile genome editing tool for in vivo gene therapy, and improve our mechanistic understanding of the diverse CRISPR-Cas9 nucleases.

Subject terms: Cryoelectron microscopy, Gene therapy, CRISPR-Cas9 genome editing


CRISPR-Cas9-based genome editing is powerful but limited by target range, specificity, and delivery constraints. Here, authors engineer a compact SaCas9 that recognizes NNG PAMs for efficient genome and base editing in cells and mice, and reveal its activation mechanism via cryo-EM structural analysis.

Introduction

The RNA-guided DNA endonuclease Cas9 associates with a single-guide RNA (sgRNA) to cleave double-stranded DNA (dsDNA) targets complementary to the sgRNA guide. Since Streptococcus pyogenes Cas9 (SpCas9) exhibits high nuclease activity, it has been widely used for genome editing in eukaryotic cells [1–3]. Besides the guide RNA–target DNA complementarity, SpCas9 requires an NGG (where N is any nucleobase) sequence as the protospacer adjacent motif (PAM), restricting the targetable genomic sites [4]. To relax this constraint, we and others have engineered SpCas9 variants with altered PAM specificities, such as SpCas9-NG [5] and SpG/SpRY [6], enabling genome editing at expanded target sites in various cell lines. However, broadened PAM compatibility is frequently accompanied by reduced target specificity relative to wild-type SpCas9 [6–9]. Moreover, the large size of SpCas9 and its engineered variants (1368 residues and 4.1 kb) poses a challenge in packaging them into an adeno-associated virus (AAV) vector for delivery into the target tissue, hampering their therapeutic applications [10].

Staphylococcus aureus Cas9 (SaCas9) consists of 1053 residues (3.2 kb), approximately 0.95 kb shorter than SpCas9, and exhibits genome editing activities comparable to those of SpCas9 in human cells [11]. Notably, SaCas9, along with its sgRNA, can be packaged into a single AAV vector, enabling genome editing in mouse liver [11]. Furthermore, the catalytically inactive version of SaCas9 (dSaCas9) fused to a transcriptional regulator or the nickase version of SaCas9 (nSaCas9) fused to a cytosine or adenosine deaminase can be utilized as compact tools for transcriptional regulation or base editing, respectively [12–15]. However, SaCas9 requires relatively long NNGRRT (where R is A or G) PAMs, limiting its utility in genome editing applications [11].
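To give a rough sense of how strongly PAM length restricts the targeting range, the sketch below computes the probability that a random site carries a given PAM. It assumes independent, equally frequent bases, which is a simplification that real genomes do not satisfy, and the helper name pam_probability is hypothetical.

```python
from fractions import Fraction

# IUPAC codes used in the text: N = any base, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "R": "AG"}


def pam_probability(pam: str) -> Fraction:
    """Chance that a random site matches `pam`.

    Assumption (not from the paper): bases are independent and
    each of A, C, G, T occurs with probability 1/4.
    """
    probability = Fraction(1)
    for code in pam:
        probability *= Fraction(len(IUPAC[code]), 4)
    return probability


for pam in ("NGG", "NNGRRT", "NNG"):
    print(pam, pam_probability(pam))
```

Under this simplified model, an NNGRRT PAM is expected at 1 in 64 positions and an NNG PAM at 1 in 4, which illustrates why relaxing the fourth to sixth PAM positions broadens the target range.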

Here, we rationally engineered a SaCas9 variant, eSaCas9-NNG, that recognizes relaxed NNG PAMs while exhibiting reduced off-target cleavage, thereby addressing the inherent trade-off between target range and specificity in Cas9-based genome editing. eSaCas9-NNG efficiently induces both indels and single-base conversions at endogenous target sites with NNG PAMs in human cells and mice. We then determined the cryo-electron microscopy (cryo-EM) structures of eSaCas9-NNG in complex with its cognate sgRNA and dsDNA target in five distinct functional states, explaining how the introduced mutations alter the PAM specificity and enhance the cleavage fidelity. Our structures in multiple states reveal the stepwise domain rearrangements coupled to guide RNA–target DNA heteroduplex formation, highlighting the differences in the activation mechanisms between the small eSaCas9-NNG and the large SpCas9 variant SpRY [8]. Overall, this study demonstrates that the engineered SaCas9 variant can be harnessed as a compact and precise AAV-deliverable genome editing tool, and advances our understanding of the RNA-guided DNA cleavage mechanisms of the diverse Cas9 enzymes.

Results

Structure-guided engineering of SaCas9 for relaxed PAM recognition

To determine the optimal guide length for SaCas9, we performed in vitro cleavage experiments, using the purified SaCas9, sgRNAs with 20- to 23-nt guide segments (sgRNA20–23), and linearized plasmid DNA containing a target sequence and the canonical TTGAAT PAM (Supplementary Table 1). SaCas9 with all sgRNAs efficiently cleaved the DNA target, and sgRNA21 showed superior activity (Fig. 1a and Supplementary Fig. 1a), consistent with a previous study showing that 21-nt guide sgRNAs are optimal for SaCas9-mediated genome editing in human cells [11]. Therefore, we employed sgRNAs with 21-nt guides for the following experiments.

Fig. 1. Engineering of the SaCas9-NNG variant.

a In vitro DNA cleavage activities of SaCas9 with the 20–23-nt guide sgRNAs. The linearized plasmid target bearing the TTGAAT PAM was incubated with the SaCas9–sgRNA complex at 37 °C for 0.5 and 2 min. The cleavage products were then analyzed by a MultiNA microchip electrophoresis system. Data are mean ± s.d. from n = 3 biologically independent experiments. b In vitro DNA cleavage activities of the N985A/R991A, N985A/R991A/E782K, N985A/R991A/L800R, N985A/R991A/T927K, N985A/R991A/N968R, N985A/R991A/A1021S, and N985A/R991A/E782K/L800R/T927K/N968R/A1021S (AAKRKRS) mutants. A linearized plasmid target bearing the TTGAAT PAM was incubated with the SaCas9–sgRNA complex (50 nM) at 37 °C for 2 and 5 min, and the reaction products were then analyzed using a MultiNA microchip electrophoresis system. Data are mean ± s.d. from n = 3 biologically independent experiments. c, d In vitro DNA cleavage activities of SaCas9 (c) and SaCas9-NNG (d) toward DNA targets with different PAMs. The linearized plasmid targets were incubated with the SaCas9–sgRNA complex at 37 °C for 0.5 and 2 min. Data are mean ± s.d. from n = 3 biologically independent experiments. e Sequence logos of the PAMs for SaCas9 (left) and SaCas9-NNG (right), obtained from the PAM identification assay. Data are mean ± s.d. from n = 3 biologically independent experiments. f In vitro DNA cleavage activities of SaCas9, SaCas9-NNG, and SaCas9-KKH toward DNA targets with eight different PAMs. The linearized plasmid targets were incubated with the SaCas9–sgRNA complex at 37 °C for 0.5 and 2 min. Data are mean ± s.d. from n = 3 biologically independent experiments.

To expand the targeting range of SaCas9, we sought to engineer a SaCas9 variant with relaxed recognition for the fourth to sixth positions in the NNGRRT PAM. Our previous structural analysis revealed that the third G in the PAM is recognized by Arg1015 in SaCas9, while the fourth and fifth Rs and the sixth T are recognized by Asn985 and Arg991, respectively [12] (Supplementary Fig. 1b, c). We and others previously reported that PAM recognition can be relaxed by combining the elimination of base-specific interactions with PAM nucleotides and the introduction of non-base-specific backbone interactions with the PAM duplex [5,6,8,16,17]. Thus, we first purified the SaCas9 N985A/R991A variant, and measured its in vitro cleavage activity toward a target DNA bearing the TTGAAT PAM. As expected, the N985A/R991A variant showed almost no activity (Fig. 1b). We then examined whether the N985A/R991A activity could be restored by replacing the residues surrounding the PAM duplex with basic or hydrophilic residues. Through an extensive in vitro evaluation of individual variants, we found that the E782K, L800R, T927K, N968R, and A1021S mutations partially restored the DNA cleavage activity toward the TTGAAT target (Fig. 1b). The combination of all these mutations (N985A/R991A/E782K/L800R/T927K/N968R/A1021S; referred to as AAKRKRS) further enhanced the DNA cleavage activity (Fig. 1b). However, the cleavage rate of AAKRKRS was still slightly lower than that of wild-type SaCas9 (referred to as SaCas9 for simplicity) (Supplementary Fig. 1d). Molecular modeling suggested that the K929N and I1017F mutations form hydrogen-bonding and van der Waals interactions with Lys927 (T927K) and Arg1015, respectively, thereby stabilizing the interactions with the PAM duplex (Supplementary Fig. 1e, f). Indeed, the inclusion of these two mutations into the AAKRKRS variant enhanced the cleavage activity toward the TTGAAT target to a level comparable to that of SaCas9 (Supplementary Fig. 1d).

To investigate whether the AAKRKNRFS (N985A/R991A/E782K/L800R/T927K/K929N/N968R/I1017F/A1021S) variant exhibits relaxed PAM recognition, we assessed the cleavage activities of SaCas9 and AAKRKNRFS toward target DNAs with 19 different PAMs, including TTGNNT and TTGAAN PAMs. SaCas9 efficiently cleaved the target DNAs with TTGRRN PAMs and showed a slight preference for T at the sixth position (Fig. 1c), consistent with the previous study [11]. In contrast, AAKRKNRFS efficiently cleaved all target DNAs except for the TTGGTT target (Supplementary Fig. 1g). The SaCas9 structure suggested a steric clash between the side chain of Asn986 and the methyl group of the fifth T, which could reduce the activity of AAKRKNRFS toward the TTGGTT target (Supplementary Fig. 1h). Indeed, the addition of the N986S mutation (N985A/R991A/E782K/L800R/T927K/K929N/N968R/N986S/I1017F/A1021S; referred to as AAKRKNRSFS) enhanced the cleavage activity toward the TTGGTT target, although this variant still showed relatively lower activities for TTGGNN PAMs (Fig. 1d). To comprehensively explore the PAM specificities of AAKRKNRSFS, we performed in vitro PAM discovery assays, using a DNA library containing the target sequence adjacent to a randomized 8-bp sequence. We confirmed that, whereas SaCas9 is specific to NNGRR PAMs, AAKRKNRSFS recognizes simple NNG sequences as the PAMs, albeit with slightly reduced activity toward NNGG sequences (Fig. 1e and Supplementary Fig. 2). These results demonstrated that the engineered AAKRKNRSFS variant exhibits a relaxed PAM constraint, and thus we refer to it as SaCas9-NNG.

A previous study reported the SaCas9-KKH variant (E782K/N968K/R1015H), which was engineered via directed evolution and recognizes relaxed NNNRRT PAMs [18]. To compare the PAM preferences between SaCas9-NNG and SaCas9-KKH, we evaluated their in vitro cleavage activities toward target DNAs bearing TTNAAT PAMs. SaCas9-NNG exhibited no activity except for the TTGAAT target, whereas SaCas9-KKH showed activity against all the PAMs (Fig. 1f), consistent with the reported relaxed preference of SaCas9-KKH for the third G in the PAM. Nonetheless, under our assay conditions (100 nM Cas9), SaCas9-NNG was much more active than SaCas9-KKH toward the TTGAAT target (Fig. 1f), demonstrating the superiority of SaCas9-NNG for targeting the NNG PAM targets. Furthermore, in vitro cleavage assays using target DNAs containing TTGYYT (where Y is C or T) PAMs confirmed that only SaCas9-NNG efficiently catalyzed DNA cleavage at these sites (Fig. 1f), establishing SaCas9-NNG as a particularly effective nuclease for targeting DNA regions harboring NNGYY PAMs, which are inaccessible to both SaCas9 and SaCas9-KKH.
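The PAM preferences compared above can be sketched as a simple pattern match against the nominal consensus motifs named in the text (NNGRRT, NNNRRT, and NNG). This is only an illustration: the function names are hypothetical, and a yes/no match ignores the graded activities seen in the assays, such as the reduced activity of SaCas9-NNG at NNGG PAMs.

```python
import re

# IUPAC codes used in the text, written as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}

# Nominal consensus motifs as stated in the text.
MOTIFS = {"SaCas9": "NNGRRT", "SaCas9-KKH": "NNNRRT", "SaCas9-NNG": "NNG"}


def matches(motif: str, pam: str) -> bool:
    """True if `pam` satisfies `motif`, aligned at the first PAM position.

    Bases of `pam` beyond the motif length are ignored.
    """
    pattern = "".join(IUPAC[code] for code in motif)
    return re.match(pattern, pam) is not None


# PAMs of the kinds tested in the text: TTNAAT and TTGYYT examples.
for pam in ("TTGAAT", "TTAAAT", "TTGCCT", "TTGTTT"):
    compatible = [name for name, motif in MOTIFS.items() if matches(motif, pam)]
    print(pam, compatible)
```

In this simplified view, only SaCas9-NNG is compatible with the TTGYYT examples, while only SaCas9-KKH accepts a TTNAAT PAM lacking the third G.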

Genome editing by SaCas9-NNG in mammalian cells

To assess the genome editing activity of SaCas9-NNG, we measured indel (insertion or deletion) formation induced by SaCas9 and SaCas9-NNG at 35 endogenous target sites with NNG PAMs in human embryonic kidney 293T (HEK293T) cells (Supplementary Table 2). As expected, SaCas9 induced indels at the NNGRR, but not NNGYY, target sites (Fig. 2a). In contrast, SaCas9-NNG efficiently induced indels at the NNGRR sites (except for the NNGGG sites) and the NNGYY sites (Fig. 2a), consistent with our in vitro cleavage data. We also measured the genome-editing efficiencies of SaCas9 and SaCas9-NNG in murine immortalized liver (TLR3) cells. SaCas9 efficiently edited only NNGRR sites, whereas SaCas9-NNG modified all NNG sites, albeit with lower activities at the NNGG sites (Supplementary Fig. 3a). These results demonstrated that SaCas9-NNG can edit target sites with NNG PAMs in mammalian cells, although with relatively reduced activities at NNGG PAM targets.

Fig. 2. Genome- and base-editing activities in human cells.

a, b Efficiencies of indel formation (a) and C-to-T conversions (b) by SaCas9 (gray) and SaCas9-NNG (orange) at endogenous target sites in HEK293T cells. Data are mean ± s.d. from n = 3 biologically independent experiments.

Base editing by SaCas9-NNG in human cells

Next, we examined whether SaCas9-NNG can be harnessed for base-editing technology. We designed the D10A nickase versions of SaCas9 and SaCas9-NNG fused to the activation-induced cytidine deaminase (referred to as SaCas9-AID and SaCas9-NNG-AID, respectively), as in the SpCas9-based cytidine base editor [19], and measured C-to-T conversion efficiencies at 35 endogenous target sites with NNG PAMs (identical to those tested for indel formation) in HEK293T cells (Supplementary Table 2). SaCas9-AID efficiently mediated C-to-T conversions at the NNGRR, but not NNGYY, target sites (Fig. 2b). In contrast, SaCas9-NNG-AID showed C-to-T base conversions at all the target sites, albeit with lower efficiencies at the NNGG sites (Fig. 2b). In both SaCas9-AID and SaCas9-NNG-AID, C-to-T base conversions predominantly occurred between positions 6 and 19 relative to the PAM within the protospacer (Supplementary Fig. 3b), indicating that the editing window is preserved between SaCas9-AID and SaCas9-NNG-AID.

Furthermore, we designed the nickase versions of SaCas9 and SaCas9-NNG fused to TadA-8e (referred to as SaCas9-ABE8e and SaCas9-NNG-ABE8e, respectively), as in the SpCas9-based adenine base editor [20–22], and measured A-to-G conversions toward various pathogenic point mutations known to cause hemophilia B [23] in HEK293 cells (Supplementary Tables 2 and 3). SaCas9-ABE8e exhibited A-to-G conversions only at NNGRR targets with relatively low efficiencies, whereas SaCas9-NNG-ABE8e induced A-to-G conversions at all targets (Supplementary Fig. 3c, d). A-to-G editing predominantly occurred between positions 7 and 20 relative to the PAM within the protospacer, which is largely consistent with the previously reported target window of SaCas9-ABEmax [14,15] (Supplementary Fig. 3c, d). These results demonstrated that the catalytically inactive version of SaCas9-NNG can serve as a useful RNA-guided DNA-targeting platform.
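The editing windows reported above (positions 6–19 for C-to-T and 7–20 for A-to-G) can be turned into a small helper that lists the editable bases in a protospacer. This sketch assumes that position 1 is the protospacer base adjacent to the PAM and that the protospacer is written 5'→3' with the PAM immediately 3' of it; that numbering is our reading of the text. The 21-nt sequence and the function name are hypothetical.

```python
def bases_in_window(protospacer: str, base: str, first: int, last: int) -> list:
    """PAM-relative positions of `base` that fall inside [first, last].

    Assumed convention (not confirmed by the paper): the protospacer is
    written 5'->3', the PAM lies immediately 3' of it, and position 1 is
    the base next to the PAM.
    """
    length = len(protospacer)
    positions = []
    for index, nucleotide in enumerate(protospacer.upper()):
        position = length - index  # the 3'-most base is position 1
        if nucleotide == base and first <= position <= last:
            positions.append(position)
    return sorted(positions)


protospacer = "GACCTGAACGTTCAGGCATCA"  # hypothetical 21-nt protospacer

# Windows taken from the text: 6-19 for C-to-T, 7-20 for A-to-G.
print("C in 6-19:", bases_in_window(protospacer, "C", 6, 19))
print("A in 7-20:", bases_in_window(protospacer, "A", 7, 20))
```

The windows describe where editing predominantly occurred, so a base inside the window is a candidate for conversion rather than a guaranteed edit.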

Genome and base editing by SaCas9-NNG in mice

Since SaCas9 (1,053 residues) is smaller than SpCas9 (1,368 residues), the SaCas9-NNG gene, along with its sgRNA and/or accessory components, can be packaged into an all-in-one AAV vector, enabling its genome-editing applications in living organisms. We designed a single AAV vector, encoding SaCas9 or SaCas9-NNG under the HCRhAAT promoter and a U6 promoter-driven sgRNA targeting the human coagulation factor 9 (hF9) gene for hemophilia B (Fig. 3a). We injected 7-week-old male mice with 1 × 10^12 vector genomes (vg) of the single AAV serotype 8 (AAV8) vector, and measured the indel formation at 12 weeks post-injection by amplicon sequencing of genomic DNA extracted from mouse liver. SaCas9 induced indels at the NNGGAT and NNGAAA, but not NNGCAA and NNGTCA, target sites (Fig. 3b). In contrast, SaCas9-NNG efficiently induced indels at all the targets (Fig. 3b). Consistently, SaCas9-mediated editing of the hF9 target with the NNGGAT and NNGAAA PAMs reduced the plasma coagulation factor IX (FIX) activity, whereas SaCas9-NNG-mediated editing at all target sites attenuated the FIX activity (Fig. 3c).

Fig. 3. Genome- and base-editing activities in mice.

a A schematic of the AAV vector used for mouse liver genome editing. Created in BioRender. Ohmori, T. (2026) https://BioRender.com/seh620r. b Indel frequencies in mouse livers. Data are mean ± s.d. from n = 4 biologically independent experiments. Statistical significance between SaCas9 and SaCas9-NNG was analyzed by two-tailed Student’s t-test. **p = 0.0017 (NNGA); ns, p = 0.8996 (NNGG); ****p < 0.0001 (NNGT and NNGC). c Time course of plasma factor IX activities (FIX:C) in mouse livers. Data are mean ± s.d. from n = 4 biologically independent experiments. d Efficiencies of A-to-G base editing mediated by SaCas9-ABE8e (gray) and SaCas9-NNG-ABE8e (orange) at pathogenic hF9 variants associated with hemophilia B (c.280 G > A and c.364 G > A) in mouse liver. Total A-to-G editing across the spacer region (left) and targeted single A-to-G editing at the pathogenic nucleotide (right) are quantified. Data are presented as mean ± s.d. from n = 3 biologically independent experiments. Statistical significance between SaCas9 and SaCas9-NNG was assessed using a two-tailed Student’s t-test. ****p < 0.0001 (NNGA); ns, p = 0.0481 (c.364 G > A, total); ns, p = 0.0205 (c.364 G > A, targeted). e Time course of plasma factor IX activities (FIX:C) in mouse livers. Data are mean ± s.d. from n = 3 biologically independent experiments.

We next evaluated the in vivo A-to-G base editing efficiencies of SaCas9-NNG using knock-in mouse models carrying pathogenic hF9 variants associated with hemophilia B (c.280 G > A and c.364 G > A). Male mice aged 6–10 weeks were administered AAV8 vectors encoding nickase versions of SaCas9-ABE8e or SaCas9-NNG-ABE8e, and genomic DNA extracted from mouse liver at 12 weeks after injection was subjected to deep sequencing. Whereas SaCas9-ABE8e mediated A-to-G base conversions only in mice carrying the c.364 G > A variant with an AAGAAC PAM, SaCas9-NNG-ABE8e efficiently induced A-to-G base conversions in both c.280 G > A mice bearing a GAGTCC PAM and c.364 G > A mice (Fig. 3d). Consistently, in mice carrying the c.280 G > A variant, treatment with SaCas9-NNG-ABE8e, but not SaCas9-ABE8e, resulted in a significant increase in coagulation FIX activity (FIX:C) (Fig. 3e). FIX:C levels were also elevated in c.364 G > A mice treated with SaCas9-NNG-ABE8e relative to those treated with SaCas9-ABE8e, although this difference did not reach statistical significance (Fig. 3e). Together, these results indicated that SaCas9-NNG exhibits an expanded target scope in mice, and could be used as a therapeutic genome-editing tool deliverable via a single AAV vector.

Structure-guided engineering of SaCas9-NNG for improved fidelity

In addition to the limited target ranges due to the PAM requirement, off-target effects pose an obstacle to therapeutic applications of CRISPR-based technologies [24–26], particularly for PAM-relaxed Cas9 variants, where expanded targeting is often accompanied by reduced specificity [6–8]. To reduce off-target cleavage by SaCas9 and SaCas9-NNG, we sought to engineer a high-fidelity SaCas9 variant. Previous studies revealed that reducing non-specific interactions between SpCas9 and the DNA backbone improves cleavage fidelity [27–30]. In the SaCas9 structure, Asn413 interacts with the ribose moiety of dC19 in the target DNA [12] (Supplementary Fig. 4a). Indeed, the N413A mutation did not alter the on-target activity in vitro, but reduced the cleavage activity of SaCas9 against an off-target DNA containing a single mismatch at the PAM distal end (Fig. 4a). A previous study also revealed that disrupting the salt bridges within the REC domain enhances the fidelity of SpCas9 [30]. As observed in SpCas9, the Ala substitution of Arg420, which forms salt bridges with Glu406 and Asp412, reduced off-target cleavage by SaCas9, while maintaining its on-target activity (Fig. 4a and Supplementary Fig. 4a). Notably, the N413A/R420A double mutations further enhanced the fidelity of SaCas9, and the inclusion of these mutations also substantially reduced the off-target activity of SaCas9-NNG (Fig. 4a). The N413A/R420A mutations reduced the cleavage kinetics of both SaCas9 and SaCas9-NNG toward the on-target DNA substrate, as also observed in high-fidelity SpCas9 variants [31] (Supplementary Fig. 4b).

Fig. 4. Engineering of the eSaCas9 variant.

a In vitro DNA cleavage activities of SaCas9, N413A, R420A, eSaCas9 (N413A/R420A), SaCas9-NNG, eSaCas9-NNG, and SaCas9-HF [32] toward a fully matched on-target DNA and an off-target DNA with a mismatch at position 21 from the PAM. The linearized plasmid targets were incubated with the SaCas9–sgRNA complex at 37 °C for 0.5 and 2 min. Data are mean ± s.d. from n = 3 biologically independent experiments. b In vitro DNA cleavage activities of SaCas9, eSaCas9, SaCas9-NNG, eSaCas9-NNG, and SaCas9-HF toward a fully matched on-target DNA and off-target DNAs containing a mismatch at positions 1–21. The linearized plasmid targets were incubated with the SaCas9–sgRNA complex at 37 °C for 3 min. Data are mean ± s.d. from n = 3 biologically independent experiments. c Indel formation efficiencies of SaCas9 (white), eSaCas9 (gray), SaCas9-NNG (orange), and eSaCas9-NNG (red) at endogenous target sites in HEK293T cells. For each PAM sequence, five independent guide RNAs were analyzed, each tested in n = 3 biologically independent experiments. Data are shown as pooled values (total n = 15). d Schematic representation of the endogenous target loci selected for comparing the indel efficiencies of SpCas9, SpRY, SpG, iGeoCas9, and eSaCas9-NNG. Three target sites with NGGNC PAMs (above) were chosen, which are accessible by eSaCas9-NNG, SpG (NG PAM), SpRY (virtually no PAM restrictions), and iGeoCas9 (NNNNC PAM). Additionally, three target sites with NTGNW PAMs were selected (below), which can only be targeted by eSaCas9-NNG and SpRY. While 21-nt guide sgRNAs were used for eSaCas9-NNG and iGeoCas9, 20-nt guide sgRNAs were used for SpRY and SpG for their optimal activities. e Indel formation efficiencies of SpCas9 (yellow), SpRY (purple), SpG (blue), iGeoCas9 (red), and eSaCas9-NNG (pink) at endogenous target sites in HEK293T cells. Data are mean ± s.d. from n = 3 biologically independent experiments.

To comprehensively assess the effect of the N413A/R420A double mutations on the target specificity, we examined the in vitro DNA cleavage activities of SaCas9 and SaCas9-NNG toward DNA substrates containing a mismatch at positions 1–21. While SaCas9 and SaCas9-NNG were tolerant to most single mismatches, the N413A/R420A mutations reduced the activities of SaCas9 and SaCas9-NNG toward mismatch-containing substrates, especially those with PAM-distal mismatches (positions 16, 17, 20, and 21) (Fig. 4b). Thus, we refer to these high-fidelity variants as enhanced-specificity SaCas9 (eSaCas9) and eSaCas9-NNG, respectively. Furthermore, we compared the on- and off-target activities of eSaCas9 and eSaCas9-NNG with those of SaCas9-HF (R245A/N413A/N419A/R654A), a rationally engineered high-fidelity SaCas9 variant [32]. eSaCas9 and eSaCas9-NNG exhibited on- and off-target activities comparable to those of SaCas9-HF (Fig. 4b), demonstrating that they achieve cleavage specificities similar to those of established high-fidelity SaCas9 variants.
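The single-mismatch panel described above (one mismatch at each of positions 1–21) can be sketched as follows. The substitution rule used here, swapping each base for its complement, is an assumption made for illustration, since the text does not state which substitutions were used. The on-target sequence and function name are hypothetical, and positions are counted from the PAM with position 1 adjacent to it.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def single_mismatch_targets(protospacer: str) -> dict:
    """Map each PAM-relative position to a target with one substituted base.

    Assumptions (not from the paper): the protospacer is written 5'->3'
    with the PAM at its 3' side, position 1 is PAM-adjacent, and each
    mismatch is made by replacing the base with its complement.
    """
    length = len(protospacer)
    variants = {}
    for position in range(1, length + 1):
        index = length - position
        swapped = COMPLEMENT[protospacer[index]]
        variants[position] = protospacer[:index] + swapped + protospacer[index + 1:]
    return variants


on_target = "GACCTGAACGTTCAGGCATCA"  # hypothetical 21-nt protospacer
variants = single_mismatch_targets(on_target)
print(len(variants), "single-mismatch targets")
print("position 21 (PAM-distal):", variants[21])
print("position 1 (PAM-proximal):", variants[1])
```

Each variant differs from the on-target sequence at exactly one base, so any change in cleavage can be attributed to the mismatch at that position.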

Genome editing by eSaCas9 and eSaCas9-NNG in human cells

We compared the genome-editing efficiencies of eSaCas9 and eSaCas9-NNG with those of SaCas9 and SaCas9-NNG at 35 endogenous target sites with NNG PAMs in HEK293T cells. Consistent with our in vitro data, eSaCas9 and eSaCas9-NNG exhibited comparable genome-editing efficiencies to those of SaCas9 and SaCas9-NNG, respectively (Fig. 4c). Using GUIDE-seq (genome-wide, unbiased identification of double-stranded breaks enabled by sequencing), we analyzed the genome-wide specificities of SaCas9, eSaCas9, SaCas9-NNG, and eSaCas9-NNG at the VEGFA site in human cells. As expected, eSaCas9 and eSaCas9-NNG exhibited lower off-target activities than SaCas9 and SaCas9-NNG, respectively (Supplementary Fig. 4c, d). These results demonstrated that the N413A/R420A mutations substantially reduce off-target cleavage by SaCas9 and SaCas9-NNG in mammalian cells while maintaining robust on-target activity.

To evaluate the advantages of eSaCas9-NNG over other Cas9 variants, particularly those with relaxed PAM constraints, we compared the genome-editing efficiencies of eSaCas9-NNG with those of SpCas9, SpRY [6], SpG [6], and iGeoCas9 [16,17], which recognize NGG, NN, NG, and NNNNC as the PAMs, respectively, at six different target sites with NNG PAMs in HEK293T cells (Fig. 4d and Supplementary Table 2). eSaCas9-NNG and SpRY induced indels at the six target sites with 15.4% and 12.9% efficiencies on average, respectively, indicating that eSaCas9-NNG outperforms SpRY at target sites with NNG PAMs (Fig. 4d, e). As expected, eSaCas9-NNG efficiently induced indels at target sites with NTGNW PAMs (where W is A or T), whereas SpCas9, SpG, and iGeoCas9 showed little to no activity at these sites (Fig. 4d, e). Moreover, even for targets with NGGNC PAMs, which are equally compatible with the five Cas9 nucleases, eSaCas9-NNG exhibited genome-editing efficiencies comparable to those of SpCas9, SpG, and iGeoCas9 (Fig. 4d, e). Collectively, these findings establish eSaCas9-NNG as a versatile genome-editing tool that can be used for precise gene therapy with a broad target range and robust genome-editing activities.
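The choice of the two site classes can be checked with a short sketch that asks, for each nuclease, whether every sequence in a degenerate site class satisfies its nominal PAM. The PAM strings are those stated in the paragraph above, while the helper names and the all-or-nothing compatibility rule are illustrative assumptions that say nothing about editing efficiency.

```python
from itertools import product

# IUPAC codes used in the text: N = any base, W = A or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "W": "AT"}

# Nominal PAMs as listed in the text.
NOMINAL_PAMS = {"SpCas9": "NGG", "SpRY": "NN", "SpG": "NG",
                "iGeoCas9": "NNNNC", "eSaCas9-NNG": "NNG"}


def expand(motif: str) -> set:
    """All concrete sequences described by a degenerate motif."""
    return {"".join(bases) for bases in product(*(IUPAC[code] for code in motif))}


def covers(nuclease_pam: str, site_class: str) -> bool:
    """True if every site in `site_class` satisfies `nuclease_pam`.

    Both motifs are aligned at the first PAM position; this sketch
    assumes the nuclease PAM is no longer than the site class.
    """
    allowed = expand(nuclease_pam)
    width = len(nuclease_pam)
    return all(site[:width] in allowed for site in expand(site_class))


for site_class in ("NGGNC", "NTGNW"):
    compatible = [name for name, pam in NOMINAL_PAMS.items() if covers(pam, site_class)]
    print(site_class, compatible)
```

By this check, NGGNC sites satisfy all five nominal PAMs, whereas NTGNW sites satisfy only those of SpRY and eSaCas9-NNG, in line with the rationale given for the two groups of target sites.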

Cryo-EM structure of the eSaCas9-NNG–guide RNA–target DNA complex

To elucidate the molecular mechanism underlying the relaxed PAM recognition and improved fidelity of eSaCas9-NNG, we determined the cryo-EM structure of eSaCas9-NNG in complex with a 98-nt sgRNA (containing a 21-nt guide) and its 43-bp target dsDNA containing the TTGCCT PAM at 3.1-Å resolution (Supplementary Figs. 5 and 6, and Supplementary Tables 4 and 5). eSaCas9-NNG adopts a bilobed architecture consisting of recognition (REC) and nuclease (NUC) lobes, which are connected by a bridge helix (BH) and a linker loop (Fig. 5a and Supplementary Fig. 6a–c). The NUC lobe consists of the RuvC, HNH, WED, and PI domains, while the REC lobe consists of the REC1 and REC2 domains. Within the NUC lobe, the HNH and RuvC domains are connected by the L1 and L2 linkers. The sgRNA guide segment base-pairs with the target strand (TS) to form a 21-bp guide RNA–target DNA heteroduplex, which is accommodated between the REC and NUC lobes (Fig. 5a and Supplementary Fig. 6b–e). The sgRNA scaffold comprises a repeat:anti-repeat duplex, stem-loop 1, and stem-loop 2 (Supplementary Fig. 6d, e). The repeat:anti-repeat duplex is sandwiched between the REC1 and WED domains, while stem-loop 1 and stem-loop 2 are recognized by the BH/REC1 and RuvC/PI domains, respectively (Supplementary Fig. 6c–e). Nucleotides dG(−21*)–dA(−17*) and dG(−3*)–dC(−1*) in the single-stranded non-target strand (NTS) are bound to the positively charged surfaces of the RuvC and RuvC/L2/PI domains, respectively, while nucleotides dA(−16*)–dT(−4*) are disordered (Supplementary Fig. 6c–e). The PAM-containing DNA duplex binds to the surface formed by the WED and PI domains, while nucleotides dA(−25*)–dC(−22*) in the NTS re-hybridize with nucleotides dG22–dT25 in the TS to form the PAM-distal DNA duplex (Fig. 5a and Supplementary Fig. 6c–e).

Fig. 5. Structural basis for relaxed PAM recognition and improved specificity.

a Recognition of the relaxed NNG PAM. Mutated residues are highlighted in red. Hydrogen bonds are shown as dashed lines. b, c Structures of SaCas9 (b) and eSaCas9-NNG (c) bound to the sgRNAs and target DNA containing a G:dG mismatch at position 21 from the PAM. d Structural comparison between SaCas9 (light blue) and eSaCas9-NNG (colored as in c). e, f Close-up views around the REC2 domain and the PAM-distal region of the guide–target heteroduplex in SaCas9 (e) and eSaCas9-NNG (f). The G1:dG21 mismatch at the PAM-distal end forms a Hoogsteen base pair in both SaCas9 and eSaCas9-NNG. Hydrogen bonds are shown as dashed lines. g Structural comparison of the PAM-distal region between SaCas9 and eSaCas9-NNG. In eSaCas9-NNG, the ESH undergoes an outward displacement from the PAM-distal guide–target heteroduplex due to the N413A/R420A mutations, and Asn419 adopts a flipped-out conformation. All molecular graphics figures were prepared with UCSF ChimeraX 1.7.1 [66].

Except for the HNH domain, eSaCas9-NNG is structurally similar to the previously reported crystal structure of SaCas9 [12], suggesting that the introduced mutations do not substantially affect the complex structure (Supplementary Fig. 7a, b). In the SaCas9 crystal structure, the HNH domain is distant from the cleavage site of the TS and interacts with the RuvC domain, indicating that it represents the catalytically inactive state. In contrast, the HNH domain in the present structure docks onto the TS cleavage site and interacts with the REC1 domain and the L1 linker (Supplementary Fig. 7a, b), and thus represents the catalytically activated state. Indeed, the TS was cleaved between dC3 and dA4 (Supplementary Fig. 7c, d), although it contains phosphorothioate modifications, possibly due to contamination with hydrolysable stereoisomers. The 3′-hydroxy group of dC3 and the phosphate group of dA4 are stabilized by a Mg2+ ion, which is coordinated by Asp556 and Asn580 in the HNH domain (Supplementary Fig. 7c, d). In addition, the dA4 phosphate group is recognized by the catalytic residue His557, corresponding to His840 of SpCas9 (Supplementary Fig. 7c, d). As in the SaCas9 structure [12], Asp10, Glu477, and His701 in the RuvC domain coordinate two Mg2+ ions and form the active site responsible for the NTS cleavage (Supplementary Fig. 7c, e). These structural observations demonstrated that eSaCas9-NNG cleaves the TS and NTS in a Mg2+-dependent manner, using the HNH and RuvC nuclease domains, respectively, as observed in other Cas9 orthologs [33–35].

Structural basis for the relaxed PAM recognition by eSaCas9-NNG

The TTGCCT PAM is recognized by the WED and PI domains in the eSaCas9-NNG structure (Fig. 5a and Supplementary Fig. 7f). As in the SaCas9 structure [12], the third G in the PAM (dG3*) forms bidentate hydrogen bonds with the side chain of Arg1015, which is stabilized by interactions with Glu993 and Phe1017 (I1017F) (Fig. 5a). By contrast, the fourth to sixth nucleobases in the PAM (dC4*–dT6*) lack base-specific interactions with the protein, due to the Ala985 (N985A), Ser986 (N986S), and Ala991 (R991A) replacements (Fig. 5a). Notably, the newly incorporated Lys782 (E782K), Lys927 (T927K), and Arg968 (N968R) residues directly interact with the backbone phosphates of the PAM duplex, while Arg800 (L800R) and Ser1021 (A1021S) likely form water-mediated hydrogen bonds with the PAM duplex (Fig. 5a). In addition, Asn929 (K929N) stabilizes the conformation of Lys927, which interacts with the backbone phosphate of dG3* (Fig. 5a). These substitutions collectively increase the positive electrostatic potential around the WED and PI domains, similar to that observed in the SpRY structure [8], thereby enhancing the electrostatic interactions with the PAM duplex (Supplementary Fig. 7g, h). These newly formed non-base-specific interactions compensate for the loss of base-specific interactions with the fourth to sixth PAM nucleobases, achieving the relaxed NNG PAM recognition by eSaCas9-NNG. Although eSaCas9-NNG exhibits modestly reduced activity toward substrates bearing NNGG PAMs, the present structure does not provide a definitive structural explanation for this preference. Therefore, further structural characterization of the eSaCas9-NNG–guide RNA–target DNA complex bound to an NNGG PAM would help elucidate the molecular basis underlying the reduced activity toward these DNA substrates.

Structural basis for the improved specificity of eSaCas9-NNG

To understand the structural basis for the improved specificity of eSaCas9-NNG, we determined the cryo-EM structures of SaCas9 and eSaCas9-NNG in complex with an sgRNA and its 43-bp target dsDNA containing a single-nucleotide mismatch at position 21 from the PAM (Fig. 5b, c, and Supplementary Fig. 8). SaCas9 and eSaCas9-NNG adopt almost identical overall structures, indicating that the N413A/R420A mutations do not alter the overall conformation of the protein when bound to a mismatched target (Fig. 5d). The PAM-distal G1:dG21 mismatch likely forms a Hoogsteen base pair in both SaCas9 and eSaCas9-NNG, resulting in the slight distortion of the TS backbone (Fig. 5e, f and Supplementary Fig. 9a, b). This distortion contrasts with the undistorted TS backbone observed in the eSaCas9-NNG–sgRNA–target DNA complex with a fully matched target (Fig. 5a and Supplementary Fig. 6).

While the overall structures of SaCas9 and eSaCas9-NNG are similar, we observed a conformational difference in a helix of the REC2 domain (residues 414–421, referred to as the PAM-distal End Stabilizing Helix (ESH)) (Fig. 5e–g and Supplementary Fig. 9c, d). In the SaCas9 structure, the ESH interacts with the PAM-distal TS backbone, with Asn413 and Asn419 forming hydrogen-bonding interactions with dC19 and dC20, respectively. Notably, Arg420 forms salt bridges with Glu406 and Asp412, stabilizing the relative position of the ESH. These interactions facilitate the stable binding of the mismatch-containing distorted TS to SaCas9, thereby enabling the efficient cleavage of a target DNA with a PAM-distal mismatch. In contrast, in the eSaCas9-NNG structure, the Asn413–dC19 hydrogen bond and the Glu406–Arg420–Asp412 salt bridge network are eliminated due to the N413A and R420A mutations, leading to an outward displacement of the ESH from the TS backbone. Furthermore, in the eSaCas9-NNG structure, the side chain of Asn419 adopts a flipped-out conformation and disrupts the Asn419–dC20 hydrogen bond (Fig. 5e–g and Supplementary Fig. 9c, d). These local structural rearrangements likely reduce the protein–DNA interactions in the eSaCas9-NNG complex, decreasing the efficiency of REC2 docking and subsequent HNH domain activation. Consequently, eSaCas9-NNG requires a longer time to achieve a catalytically competent conformation and is more prone to dissociate from mismatched targets with a distorted conformation prior to cleavage, as the distorted TS is energetically unfavorable for stable accommodation within the complex.

Cryo-EM structures of eSaCas9-NNG in distinct functional states

In addition to the catalytically active state (State I), our cryo-EM analysis revealed three distinct classes (States II–IV), which are primarily distinguished by the guide RNA–target DNA heteroduplex lengths (Supplementary Fig. 5). We therefore determined the three additional structures at overall resolutions of 3.2 Å (State II), 2.9 Å (State III), and 2.8 Å (State IV) (Fig. 6a–c and Supplementary Fig. 5).

Fig. 6. Cryo-EM structures of eSaCas9-NNG in distinct functional states.

a–c Cryo-EM density map (top) and structural models of the entire complex (middle) and the nucleic acids (bottom) of the eSaCas9-NNG–guide RNA–target DNA complexes in the interrogation (a), intermediate (b), and translocation (c) states. d Recognition of the PAM in the interrogation state. e Recognition of the PAM-proximal region of the guide RNA–target DNA heteroduplex in the intermediate state. f Close-up view of the RuvC active site in the translocation state. The magnesium ion coordinated with Asp10 and His701 is shown as a gray sphere. In (d–f), cryo-EM density maps are shown as blue meshes. All molecular graphics figures were prepared with UCSF ChimeraX-1.7.166.

In State II, the sgRNA guide does not hybridize with the double-stranded target DNA, which instead binds to the groove between the WED and PI domains, with the third G PAM nucleotide recognized by Arg1015 in the PI domain, as in the catalytically active state (Fig. 6a, d). Thus, this structure likely represents the “interrogation state”, in which the eSaCas9-NNG engages a potential NNG PAM site to interrogate for target but fails to unwind the DNA duplex due to the absence of an adjacent complementary sequence. In the interrogation state, the BH, REC1, and REC2 domains are disordered, suggesting that they are flexible before the heteroduplex formation (Fig. 6a). Notably, nucleotides G15–C21 in the sgRNA are pre-ordered in an A-form geometry for base-pairing with the TS, even without interactions with the BH (Fig. 6a).

In State III, nucleotides A4–C21 in the sgRNA base-pair with nucleotides dG1–dT18 in the TS to form an 18-bp heteroduplex, while the three PAM-distal nucleotides (G1–G3) in the sgRNA are disordered (Fig. 6b). This structure represents the “intermediate state”, in which the Cas9–sgRNA complex has partially formed the guide–target heteroduplex. In this state, the BH and the REC1 domain become ordered and extensively interact with the PAM-proximal region of the heteroduplex (Fig. 6e). In contrast, the REC2 domain remains disordered, with the PAM-distal region of the heteroduplex exposed to the solvent (Fig. 6b). The HNH domain is located far from the TS cleavage site, while the RuvC active site is occluded by the L1 linker, as in the interrogation state (Fig. 6a, b), indicating that the formation of the 18-bp heteroduplex is insufficient to activate eSaCas9-NNG. The introduced N413A/R420A mutations might delay REC2-domain docking, potentially stabilizing this intermediate state. This intermediate conformation may therefore serve as a “checkpoint” in eSaCas9-NNG, enabling the enzyme to assess target DNA complementarity before activation.

In State IV, nucleotides G1–C21 in the sgRNA base-pair with nucleotides dG1–dC21 in the TS to form the complete 21-bp heteroduplex, as in the catalytically active state (Fig. 6c). In this state, the REC2 domain becomes ordered and interacts with the PAM-distal region of the heteroduplex, while the HNH domain, along with the L1 and L2 linkers, becomes disordered (Fig. 6c). Thus, this structure represents a “translocation state”, in which the HNH domain is moving toward the TS cleavage site. Although the dissociation of the L1 linker allows the NTS to access the RuvC domain, unlike the catalytically active state, only one Mg2+ ion is bound to the RuvC active site (Fig. 6f), indicating that the RuvC domain does not adopt a cleavage-competent active conformation in the translocation state.

Nuclease activation mechanism

These four structures, captured in different functional states, likely represent the local energetic minima encountered during eSaCas9-NNG activation and thereby provide key mechanistic insights into its nuclease activation pathway. A structural comparison of the interrogation and intermediate states reveals that the ordering of the BH and the REC1 domain is coupled with the shift of the pre-ordered guide region (G15–C21) toward the interior of the protein, resulting in the formation of the 18-bp guide–target heteroduplex (Fig. 6a, b). A structural comparison between the intermediate and translocation states demonstrates that the docking of the REC2 domain is coupled with the formation of the 21-bp guide–target heteroduplex (Fig. 6b, c). The heteroduplex elongation facilitates a conformational change in the L2 linker (residues 629–649) (Supplementary Fig. 10a–f). In the intermediate state, residues 635–645 in the L2 linker form an α helix and interact with both the RuvC and HNH domains, stabilizing the inactive conformation (Supplementary Fig. 10a, d). In contrast, in the translocation state, this α helix is structurally melted and the HNH domain becomes disordered (Supplementary Fig. 10b, e). A structural comparison of the two states suggests a steric clash between the L2 α helix in the intermediate state and the heteroduplex in the translocation state (Supplementary Fig. 10c, f), indicating that the formation of the 21-bp heteroduplex induces the structural rearrangement of the L2 linker, resulting in the dissociation of the HNH domain from the RuvC domain (Supplementary Fig. 10a–f). In the catalytically active state, the HNH domain undergoes an approximately 160° rotation from its inactive position and docks onto the TS cleavage site in the heteroduplex (Supplementary Fig. 10g, h). This HNH rearrangement is accompanied by structural changes in the REC1 domain and the L1/L2 linkers. 
Whereas residues 126–146 in the REC1 domain are located near the TS cleavage site in the intermediate and translocation states, these residues become disordered in the active state, due to the docking of the HNH domain (Supplementary Fig. 10h). The L1 linker undergoes an approximately 180° rotation from its position in the intermediate state and binds to the minor groove of the PAM-distal heteroduplex (Supplementary Fig. 10i, j). This L1 rearrangement is accompanied by a structural change in the β4 strand (Glu477) of the RuvC domain, thus forming the RuvC active site (Supplementary Fig. 10k). In the active state, the L2 linker adopts a loop conformation to form an NTS-binding pathway toward the RuvC active site (Supplementary Fig. 10j). In particular, Phe635 in the L2 linker stacks with the dA(−1)-dT1* base pair in the PAM duplex, while Arg1002 in the PI domain interacts with the flipped-out dC(−1*) in the NTS (Supplementary Fig. 10l). Collectively, these structural observations revealed the coordinated domain rearrangements coupled with the formation of the guide–target heteroduplex to achieve target DNA cleavage by eSaCas9-NNG (Supplementary movies 1 and 2).

Discussion

SaCas9 was identified in 2015 as the first compact Cas9, and has since been used as a versatile genome editing tool in human cells and various organisms11,36,37. However, the development of useful SaCas9 variants has been limited as compared to SpCas9. In this study, we rationally engineered the eSaCas9-NNG variant with an expanded targeting range and reduced off-target activity. Although genome editing with Cas9 inherently involves a trade-off between targeting breadth and specificity, eSaCas9-NNG effectively addresses both constraints while retaining a size compatible with packaging into a single AAV vector, thereby substantially expanding the CRISPR-Cas genome-editing toolbox.

We also determined the cryo-EM structures of eSaCas9-NNG in four distinct states, illuminating the activation mechanism of the small eSaCas9-NNG enzyme (Fig. 7). In the interrogation state, where the guide–target heteroduplex has not yet formed, the entire REC lobe is structurally disordered probably due to its flexibility and the HNH domain is located far from the TS cleavage site. Upon the guide–target heteroduplex formation, the REC1 domain becomes ordered and engages the PAM-proximal region of the guide–target heteroduplex. Further heteroduplex elongation facilitates docking of the REC2 domain, which ensures sequence complementarity at the PAM-distal region, and is accompanied by an approximately 160° rotation of the HNH domain to dock onto the TS cleavage site. The L1 and L2 linkers participate in recognizing the PAM-distal region of the heteroduplex and guiding the NTS toward the RuvC active site, respectively. These structural observations underscore the essential domain movements and conformational rearrangements in the activation of eSaCas9-NNG.

Fig. 7. Activation mechanisms of eSaCas9-NNG and SpRY.

Schematics showing conformational changes in eSaCas9-NNG (top) and SpRY (bottom) during DNA cleavage. First, both eSaCas9-NNG and SpRY bind to a linear target dsDNA to interrogate their cognate PAM sequences. In SpRY, the entire REC lobe is ordered and adopts a closed conformation (PDB: 8T6O), while in eSaCas9-NNG, the entire REC lobe is disordered. The dsDNA is then unwound and the guide–target heteroduplex is formed. In SpRY, partial disordering of the REC3 domain is observed when a 10-bp heteroduplex is formed, reflecting the absence of PAM-distal base pairing (PDB: 8T6S). Similarly, in eSaCas9-NNG, the REC2 domain remains disordered even upon the formation of an 18-bp heteroduplex, owing to incomplete PAM-distal heteroduplex formation. Subsequently, in SpRY, the REC3 domain becomes fully ordered and the HNH domain moves to the TS cleavage position upon the 18-bp heteroduplex formation (PDB: 8T6X), whereas the formation of the 18-bp heteroduplex in eSaCas9-NNG does not induce such conformational changes. Finally, the HNH domains of SpRY and eSaCas9-NNG commonly undergo significant conformational rearrangements to dock onto the TS cleavage site, with the L2 linker forming loading paths for the NTS toward the RuvC active sites. Upon HNH domain docking, the REC2 domain of SpRY moves outward (PDB: 8SRS), while residues 126–146 in the REC1 domain become disordered in eSaCas9-NNG. The active sites of RuvC and HNH domains are indicated with yellow stars. Molecular graphics figures were prepared with UCSF ChimeraX-1.7.166.

Recent studies have elucidated the nuclease activation pathway of the PAM-relaxed SpCas9 variant SpRY, capturing interrogation, intermediate, and product states8. Structural comparisons of these SpRY structures with those of eSaCas9-NNG reveal the conserved mechanistic features underlying the activation of PAM-relaxed Cas9 variants derived from both SpCas9 and SaCas9 (Fig. 7). During an intermediate stage of R-loop formation, the REC3 domain of SpRY and the REC2 domain of eSaCas9-NNG are disordered, reflecting the absence of the PAM-distal guide–target heteroduplex. In these inactive states, the HNH domain is positioned far from the TS cleavage site, which is instead occupied by the REC2 domain in SpRY or by the REC1 domain in eSaCas9-NNG. Moreover, in the active states of both enzymes, the L1 and L2 linkers participate in recognizing the PAM-distal region of the heteroduplex and guiding the NTS toward the RuvC active site, respectively (Fig. 7). These structural observations suggest that the REC3/REC2 domain docking coupled with the complete heteroduplex formation and the inactive-to-active HNH domain rearrangements via the L1/L2 linkers are conserved features of the activation mechanisms of SpRY and eSaCas9-NNG.

We also found notable differences in the activation mechanisms between SpRY and eSaCas9-NNG (Fig. 7). In SpRY, the entire REC lobe is ordered in the interrogation state, creating a positively charged cleft that accommodates the PAM-distal DNA duplex. By contrast, in the interrogation state of eSaCas9-NNG, the entire REC lobe is disordered. Furthermore, although the REC3 domain becomes ordered upon the formation of the 18-bp heteroduplex in SpRY, the REC2 domain in eSaCas9-NNG remains disordered even after the 18-bp heteroduplex has formed. These differences are likely attributable to the presence of the larger WED domain and the absence of REC1–REC2 interactions in eSaCas9-NNG, as compared to SpRY. In addition, whereas the formation of an 18-bp heteroduplex induces a structural change in the HNH domain in SpRY, the formation of an 18-bp heteroduplex in eSaCas9-NNG does not induce the HNH rearrangement. This variation is likely due to the structural differences in the HNH domains and the L1/L2 linkers between SpRY and eSaCas9-NNG. Collectively, the structurally divergent REC lobes and L1/L2 linkers contribute to the distinct activation mechanisms of the Cas9 orthologs, which may result in their different cleavage efficiencies and fidelities.

While SpCas9 efficiently cleaves its target DNA using a 20-nt guide sgRNA, most small Cas9 orthologs require 1–2-nt longer guides for efficient DNA cleavage (e.g., 21- and 22-nt guides are optimal for SaCas9 and CjCas9, respectively)1,11,38,39. Although the choice of optimal guide lengths is an important factor in Cas9-mediated genome engineering, it remains unclear why the Cas9 orthologs require different guide lengths for efficient DNA cleavage. A structural comparison between the active states of SpCas9 and eSaCas9-NNG provides an explanation for their different optimal guide lengths. In the active state of SpCas9, Arg765 in the L1 linker interacts with the backbone phosphate of the 5′ end (G1) of the 20-nt guide40 (Supplementary Fig. 11a). In the eSaCas9-NNG structure, Arg480 and Lys482 in the L1 linker interact with the backbone phosphate of G1 in the 21-nt guide (Supplementary Fig. 11b). These interactions between Cas9s and the 5′ ends of their guides are consistent with their optimal guide lengths. Although further elongation of the guide–target heteroduplex seems possible without steric clashes, the PAM-proximal end of the heteroduplex is fixed within the central groove of Cas9, likely resulting in an energetically unfavorable supercoiling effect with extended base pairing outside the Cas9 complex.

In summary, we developed the eSaCas9-NNG variant, which will expand the CRISPR toolbox for in vivo therapeutic genome editing. eSaCas9-NNG efficiently induces both indels and base conversions in mice, underscoring its potential utility for gene therapy applications. With further adaptations for prime editing41 or click editing42, eSaCas9-NNG may enable even more precise and versatile therapeutic genome manipulations. In addition, our cryo-EM analysis provides structural snapshots of the small eSaCas9-NNG during target DNA cleavage, improving our mechanistic understanding of diverse CRISPR-Cas9 enzymes. To date, structure-guided Cas9 engineering has primarily relied on crystal structures captured in the inactive state. By leveraging active- and intermediate-state structures, such as those resolved in this study, future Cas9 engineering efforts can be guided by a more rational and precise structural framework, enabling the development of next-generation genome editors with enhanced specificity and efficiency.

Methods

Ethical statements

All animal experimental procedures were approved by The Institutional Animal Care and Concern Committee of Jichi Medical University (permission number: 20051-10).

Protein and RNA preparation for structural analysis

The gene encoding full-length SaCas9 (residues 1–1,053) was codon optimized, synthesized (Genscript), and cloned between the NdeI and XhoI sites of the modified pE-SUMO vector (LifeSensors). The mutations were introduced by a PCR-based method, using the vector encoding full-length SaCas9 as the template, and the sequences were confirmed by DNA sequencing (Supplementary Table S2). The SaCas9 protein was expressed and purified using the protocol reported previously43. Briefly, the N-terminally His6-tagged SaCas9 proteins were expressed in Escherichia coli Rosetta2 (DE3). The SaCas9-expressing E. coli cells were cultured at 37 °C in LB medium (containing 20 mg/L kanamycin) until the OD600 reached 0.8, and protein expression was then induced by the addition of 0.1 mM isopropyl-β-D-thiogalactopyranoside (Nacalai Tesque). The E. coli cells were further cultured at 20 °C for 18 h, and harvested by centrifugation at 5000 g for 10 min. The E. coli cells were resuspended in buffer A (20 mM Tris-HCl, pH 8.0, 20 mM imidazole, and 1 M NaCl), lysed by sonication, and then centrifuged at 10,000 g for 20 min. The supernatant was mixed with 0.3 mL Ni-NTA Superflow resin (QIAGEN) equilibrated with buffer A, and the mixture was loaded into a Poly-Prep Column (Bio-Rad). The protein was eluted with buffer B (20 mM Tris-HCl, pH 8.0, 300 mM imidazole, and 300 mM NaCl). To remove the His6-SUMO-tag, the eluted protein was mixed with SUMO protease, and then dialyzed at 4 °C overnight against buffer C (20 mM Tris-HCl, pH 8.0, and 300 mM NaCl). The protein was loaded onto a HiTrap SP HP column (GE Healthcare) equilibrated with buffer C, and eluted with a linear gradient of 0.3–2 M NaCl. The protein was further purified by chromatography on a HiLoad 16/600 Superdex 200 column (GE Healthcare) equilibrated with buffer D (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 2 mM MgCl2, and 1 mM DTT). The purified proteins were stored at −80 °C until use. 
The sgRNA was transcribed in vitro with T7 RNA polymerase, using a partially double-stranded DNA template. The transcribed sgRNA was purified by 8% denaturing urea PAGE, extracted from gel slices with Tris-Borate-EDTA Buffer (Takara), and then ethanol precipitated. The sgRNA pellet was dissolved in nuclease-free water and stored at −20 °C.

In vitro cleavage assay

The linearized pUC119 plasmid (100 ng, 4.7 nM), containing the 23-nt target sequence and the PAMs (Supplementary Table 1), was incubated at 37 °C for 0.5–5 min with the SaCas9–sgRNA complex (50 nM or 100 nM) in 10 μL of reaction buffer (20 mM HEPES, pH 7.5, 100 mM KCl, 2 mM MgCl2, 1 mM DTT, and 5% glycerol). The reactions were stopped by the addition of quench buffer, containing EDTA (20 mM final concentration) and Proteinase K (40 ng). The reaction products were resolved, visualized, and quantified with a MultiNA microchip electrophoresis device (SHIMADZU).
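As a quick consistency check on the stated substrate concentration, the sketch below converts a mass of double-stranded DNA into a molar concentration. The average mass of 650 g/mol per base pair is a common approximation, and the plasmid length of about 3,200 bp is an assumption made for illustration (the exact length of the construct is not given here).

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one base pair (assumption)


def dsdna_nanomolar(mass_ng, length_bp, volume_ul):
    """Molar concentration (nM) of a dsDNA of given length in a given volume."""
    moles = (mass_ng * 1e-9) / (length_bp * AVG_BP_MASS_G_PER_MOL)
    litres = volume_ul * 1e-6
    return moles / litres * 1e9


# 100 ng of an assumed ~3,200-bp linearized plasmid in a 10-uL reaction.
substrate_nm = dsdna_nanomolar(mass_ng=100, length_bp=3200, volume_ul=10)

# Molar excess of the 50 nM SaCas9-sgRNA complex over the substrate.
enzyme_excess = 50 / substrate_nm
```

Under these assumptions the substrate is close to 5 nM, in line with the 4.7 nM stated above, so the 50 nM and 100 nM complexes are in roughly 10- and 20-fold molar excess over the DNA.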

PAM identification assay

The PAM identification assay was performed as described previously5. The PAM library (100 ng), containing eight randomized nucleotides downstream of a 21-nt target sequence, was incubated at 37 °C for 5 min with the purified SaCas9 (SaCas9 and SaCas9-NNG) (50 nM) and the sgRNA (21-nt guide) in 10 µL of reaction buffer (20 mM HEPES, pH 7.5, 100 mM KCl, 2 mM MgCl2, 1 mM DTT, and 5% glycerol). The reactions were stopped by the addition of quench buffer, containing EDTA (20 mM final concentration) and Proteinase K (40 ng), and then purified using a Wizard DNA Clean-Up System (Promega). The purified DNA samples were amplified for 25 cycles, using primers containing common adapter sequences. After column purification, each PCR product (~5 ng) was subjected to a second round of PCR for 15 cycles, to add custom Illumina TruSeq adapters and sample indices. The sequencing libraries were quantified by qPCR (KAPA Biosystems), and then subjected to paired-end sequencing on a MiSeq sequencer (Illumina) with 20% PhiX spike-in (Illumina). The sequencing reads were demultiplexed by primer sequences and sample indices, using NCBI BLAST+ (version 2.8.1) with the blastn-short option. For each sequencing sample, the number of reads for every possible 8-nt PAM sequence pattern (4⁸ = 65,536 patterns in total) was counted and normalized by the total number of reads in each sample. For a given PAM sequence, the enrichment score was calculated as log2-fold enrichment as compared to the untreated sample. PAM sequences with enrichment scores of –2.0 or less were used to generate the sequence logo representation, using WebLogo (version 3.7.1)44. The cumulative distribution and histogram of the read count of each PAM in the unedited sample confirmed that the plasmid library has sufficient coverage for the individual PAM sequences.
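The enrichment-score calculation described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the input is assumed to be a list of already extracted 8-nt PAM strings per sample, the function names are hypothetical, and the pseudocount (added so that PAMs with zero reads do not produce a division by zero) is an assumption that the text does not mention.

```python
import math
from collections import Counter
from itertools import product

PAM_LEN = 8  # eight randomized nucleotides, i.e. 4**8 = 65,536 possible PAMs


def pam_frequencies(pams, pseudocount=1.0):
    """Read fraction for every possible 8-nt PAM, normalized by the sample total."""
    counts = Counter(p for p in pams if len(p) == PAM_LEN and set(p) <= set("ACGT"))
    total = sum(counts.values()) + pseudocount * 4 ** PAM_LEN
    all_pams = ("".join(t) for t in product("ACGT", repeat=PAM_LEN))
    return {p: (counts[p] + pseudocount) / total for p in all_pams}


def enrichment_scores(treated, untreated, pseudocount=1.0):
    """log2 fold enrichment of each PAM in the treated versus untreated sample."""
    f_treated = pam_frequencies(treated, pseudocount)
    f_untreated = pam_frequencies(untreated, pseudocount)
    return {p: math.log2(f_treated[p] / f_untreated[p]) for p in f_untreated}


def depleted_pams(scores, threshold=-2.0):
    """PAMs whose score is at or below the threshold (used for the sequence logo)."""
    return sorted(p for p, s in scores.items() if s <= threshold)


# Toy data: one PAM is strongly depleted after cleavage, another is unchanged.
untreated_reads = ["AAGAAAAA"] * 100 + ["AATAAAAA"] * 100
treated_reads = ["AAGAAAAA"] * 1 + ["AATAAAAA"] * 100
scores = enrichment_scores(treated_reads, untreated_reads)
hits = depleted_pams(scores)
```

Because cleaved library members are lost from the amplified pool, PAMs that support cleavage receive negative scores, which is why the threshold of –2.0 or less selects the recognized PAMs.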

Eukaryotic cell lines

HEK293 and TLR3 cell lines were obtained from the JCRB Cell Bank (HEK293: JCRB9068; TLR3: IFO50680). HEK293T cells were obtained from ATCC (CRL-3216). The AAVpro293T cell line was obtained from Takara Bio (632273). All cell lines were routinely tested and confirmed to be free of mycoplasma contamination.

Genome- and base-editing analyses in human cells

Genome- and base-editing analyses were performed according to the protocol described previously45. Briefly, HEK293T cells were maintained in DMEM (Sigma) supplemented with 10% (v/v) fetal bovine serum (FBS) (Thermo Fisher Scientific) and 1% penicillin-streptomycin (Sigma), at 37 °C in a 5% CO2 atmosphere. HEK293T cells were seeded at 5 × 10³ cells per well in collagen I-coated 96-well plates, 24 h prior to transfection. HEK293T cells were transfected with a SaCas9 plasmid or a SaCas9-derived base-editor plasmid (120 ng) and an sgRNA plasmid (40 ng), using Polyethylenimine Max (Polysciences) (1 mg/mL, 0.5 µL) in PBS (50 µL) (Supplementary Table S2). The cells were harvested 3 days after transfection, treated with 50 mM NaOH (100 µL), incubated at 95 °C for 10 min, and then neutralized with 1 M Tris-HCl, pH 8.0 (10 µL). The obtained genomic DNA was subjected to two rounds of PCR, to prepare the library for high-throughput amplicon sequencing. Genomic regions targeted by sgRNAs were PCR-amplified to add custom primer-landing sequences. The PCR products were purified by AMPure XP magnetic beads (Agencourt), and then subjected to a second round of PCR to attach the custom Illumina TruSeq adapters with sample indices. After size-selection by agarose gel electrophoresis and column purification, the sequencing libraries were quantified using a KAPA Library Quantification Kit Illumina (KAPA Biosystems), multiplexed, and subjected to paired-end sequencing (600 cycles), using a MiSeq sequencer (Illumina) with 20% PhiX spike-in (Illumina). The sequencing reads were demultiplexed, based on sample indices and primer sequences. Using NCBI BLAST+ (version 2.6.0) with the blastn-short option, the sequencing reads were mapped to the reference sequences to identify indels and substitutions in the target regions. 
To remove common PCR errors and somatic mutations, we deleted sequencing reads containing mutations (> 1% frequency) commonly observed in the control samples from the edited samples, and then normalized the editing frequencies for the target sites by subtracting the mutation frequencies of the control samples from those of the edited samples.
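The background correction described above can be illustrated with a short sketch. It assumes that each read has already been reduced to the set of mutations it carries relative to the reference (an empty set for an unedited read); the data representation, the function names, and the choice to apply the same read filter to the control sample before subtraction are assumptions made for illustration, not details taken from the protocol.

```python
from collections import Counter


def mutation_frequencies(reads):
    """Fraction of reads carrying each mutation; reads are sets of mutation labels."""
    counts = Counter(m for read in reads for m in read)
    return {m: c / len(reads) for m, c in counts.items()}


def background_mutations(control_reads, cutoff=0.01):
    """Mutations seen in more than `cutoff` of control reads (PCR errors, somatic variants)."""
    return {m for m, f in mutation_frequencies(control_reads).items() if f > cutoff}


def mutated_fraction(reads, background):
    """Fraction of mutated reads after discarding reads that carry a background mutation."""
    kept = [read for read in reads if not (read & background)]
    if not kept:
        raise ValueError("no reads left after background filtering")
    return sum(1 for read in kept if read) / len(kept)


def normalized_editing_frequency(edited_reads, control_reads, cutoff=0.01):
    """Editing frequency of the edited sample minus that of the control sample."""
    background = background_mutations(control_reads, cutoff)
    return mutated_fraction(edited_reads, background) - mutated_fraction(control_reads, background)


# Toy example with hypothetical mutation labels.
control = [frozenset({"A10G"})] * 5 + [frozenset({"del5"})] * 1 + [frozenset()] * 94
edited = [frozenset({"A10G"})] * 5 + [frozenset({"del3"})] * 40 + [frozenset()] * 55
freq = normalized_editing_frequency(edited, control)
```

Here the recurrent A10G change (5% of control reads) is treated as background and its reads are dropped, whereas the rare del5 event (1%, not above the cutoff) contributes only to the subtracted control frequency.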

Comparison of genome-editing efficiencies among various Cas9 variants in human cells

HEK293T cells were maintained in DMEM (Sigma) supplemented with 10% (v/v) fetal bovine serum (FBS) (Nichirei), 1% GlutaMAX (Thermo Fisher Scientific) and 1% penicillin-streptomycin (Sigma), at 37 °C in a 5% CO2 atmosphere. Cells (1 × 10⁴ cells/well) were seeded in 96-well plates coated with collagen type I (Cellmatrix type I-C, Nitta Gelatin) the day before transfection. The plasmids (100 ng) were incubated together with Lipofectamine 3000 (Thermo Fisher Scientific), and then directly added to the cell culture according to the manufacturer’s recommendations. At 48 h after the transfection, the cells were lysed with the SimplePrep reagent for DNA (Takara). The supernatants were directly used for PCR. DNA fragments were amplified with Phusion DNA polymerase (New England Biolabs). PCR amplicons were subjected to 150-bp paired-end read sequencing using a MiSeq sequencer (Illumina). The frequencies of the mutations were assessed by CRISPResso246.

GUIDE-Seq analysis

HEK293 and U2OS cells were maintained and cultured as described previously47. Cells were nucleofected following the manufacturer’s instructions (Lonza) in 20 μL Solution SE, using programs CM-104 (293) and DN-100 (U2OS) on a Lonza Nucleofector 4-D. Cells were transfected with SaCas9 plasmids (500 ng), sgRNA plasmids (250 ng), and dsODN [100 pmol; complementary oligonucleotide sequences derived from Malinin et al. 202148].

GUIDE-seq library preparation and analysis were performed as previously described47,48. Briefly, genomic DNA was purified via an Agencourt DNAdvance kit (Beckman Coulter). A Covaris E220 ultrasonicator was used to shear purified genomic DNA to an average fragment size of 500 bp. After sonication, 400 ng was used for library preparation. Genomic DNA was treated with end-repair mix (Qiagen), A-tailed with Taq polymerase (Thermo Fisher Scientific), ligated to single-tailed sequencing adapters, and purified using SPRI magnetic beads. Two rounds of nested PCR with the dsODN sense- and antisense-specific primers in separate reactions were performed on the adapter-ligated library. After purification with SPRI magnetic beads, libraries were quantified using a Kapa qPCR Library Quantification Kit (Kapa). Equimolar amounts of samples were pooled and sequenced with 150 bp paired end reads on the Illumina NextSeq 550 sequencer.

For GUIDE-seq-2 library preparation, Tn5 transposase was prepared by combining hyperactive Tn5 with annealed i5 adapter oligos containing an 8-nucleotide barcode and a 10-nucleotide unique molecular index, in 2x Tn5 dialysis buffer (100 mM HEPES-KOH, pH 7.2, 200 mM NaCl, 0.2 mM EDTA, 2 mM DTT, 0.2% Triton X-100, and 20% glycerol), for 1 h at 24 °C. Tagmentation was performed in 40 µL reactions for 7 min at 55 °C, using 250 ng of genomic DNA, 4 µL of assembled Tn5/i5-transposome, and 8 µL of fresh 5x TAPS-DMF buffer (50 mM TAPS-NaOH, 25 mM MgCl2, and 50% dimethylformamide (DMF)). To stop the reaction, 5 µL of a 50% proteinase K (NEB) solution was added, and the solution was incubated for 15 min at 55 °C. Samples were purified using SPRI-guanidine magnetic beads, and separate PCR reactions were performed using dsODN sense- and antisense-specific primers. Reactions were conducted with Platinum Taq (Thermo Fisher) using the following thermocycler settings: 95 °C for 5 min, 15 cycles of temperature cycling (95 °C for 30 s, 70 °C (−1 °C per cycle) for 120 s, and 72 °C for 30 s), 20 constant cycles (95 °C for 30 s, 55 °C for 60 s, and 72 °C for 30 s), and 72 °C for 5 min. PCR products were purified using SPRI beads and quantified using a Kapa qPCR Library Quantification Kit (Kapa). Libraries were purified using Lightbench (Yourgene Health) selection, and sequenced using a NextSeq 1000/2000 (Illumina) sequencer with cycle settings of 146, 8, 18, and 146. Data analysis was performed using the updated open-source GUIDE-seq2 analysis software (https://github.com/tsailabSJ/guideseq/tree/V2).
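The touchdown PCR program above can be written out cycle by cycle. The sketch below assumes that the first touchdown cycle anneals at 70 °C and that the temperature drops by 1 °C in each subsequent cycle; the function name and this reading of the "−1 °C per cycle" setting are assumptions.

```python
def annealing_temperatures(start_c=70.0, step_c=-1.0, touchdown_cycles=15,
                           constant_c=55.0, constant_cycles=20):
    """Annealing temperature (deg C) for each cycle of a touchdown PCR program."""
    touchdown = [start_c + step_c * i for i in range(touchdown_cycles)]
    return touchdown + [constant_c] * constant_cycles


temps = annealing_temperatures()
total_cycles = len(temps)
```

Under this reading the touchdown phase ends at 56 °C, one step above the 55 °C used for the 20 constant cycles, for 35 amplification cycles in total.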

Plasmid construction, AAV production, and assessment of genome editing

AAVpro293T cells (Takara) and HEK293 cells were cultured in DMEM (Sigma) supplemented with 10% FBS (Thermo Fisher Scientific) and GlutaMAX (Thermo Fisher Scientific). Murine immortalized liver cells (TLR3 cells, JCRB Cell Bank) were maintained in DMEM containing 2% FBS, 5 ng/mL of human epidermal growth factor (EGF), and ITS-X Supplement (Thermo Fisher Scientific). The SaCas9 cDNA was codon-optimized by GenScript. A DNA fragment comprising a promoter, the SaCas9 cDNA (SaCas9 or SaCas9-NNG), the SV40 polyadenylation signal, and the sgRNA sequence driven by the U6 promoter was introduced into the p1.1c plasmid. The HCRhAAT liver-tropic promoter (an enhancer element of the hepatic control region of the ApoE/C1 gene and the human α1-antitrypsin promoter) was employed. SaCas9-ABE8e and the human coagulation factor IX (FIX) cDNA were introduced into the pcDNA3 (Thermo Fisher Scientific) and pBApo-EF1α Neo (Takara) vectors, respectively. The sgRNA driven by the U6 promoter was incorporated into pUC57. The DNA fragment for the SaCas9 or SaCas9-ABE8e expression cassette and the sgRNA driven by the U6 promoter were introduced between the inverted terminal repeats of the pAAV plasmid. The AAV genes were packaged by triple plasmid transfection of AAVpro293T cells to produce the AAV8 vector (helper-free system), as described previously49. The titers of recombinant AAV vectors were determined by quantitative PCR, as previously described50.

Cells (5 × 10⁴ cells/well) were seeded in 48-well plates coated with collagen type I (Cellmatrix type I-C, Nitta Gelatin) the day before transfection. The plasmids (200 ng) were incubated together with Lipofectamine 3000 (Thermo Fisher Scientific), and then directly added to the cell culture according to the manufacturer’s recommendations. To obtain stably expressing clones, 400 µg/mL G418 (Nacalai Tesque) was added to the culture medium after the transfection with pBApo-EF1α Neo. The DNAs were isolated at 72 h after the transfection.

DNA fragments at the target site were amplified with ExTaq DNA polymerase (Takara). Purified PCR products were denatured and re-annealed using a thermal cycler, and then treated with T7 endonuclease (Nippon Gene). DNA fragments were analyzed by a MultiNA microchip electrophoresis system. When indicated, PCR amplicons were subjected to 300-bp paired-end read sequencing at the NGS core facility at the Research Institute for Microbial Diseases of The University of Osaka (Osaka, Japan). The mutation frequencies were assessed by CRISPResso246.

Animal experiments

All animal experimental procedures were approved by The Institutional Animal Care and Concern Committee of Jichi Medical University (permission number: 20051-10), and animal care was conducted in accordance with the committee’s guidelines and the ARRIVE guidelines (refs. 51, 52). All mice were housed in isolators in the specific pathogen-free facility of Jichi Medical University under controlled environmental conditions (23 °C ± 3 °C, 50% ± 10% relative humidity) with a 12:12 h light/dark cycle. C57BL/6 mice were purchased from SLC Japan (Shizuoka, Japan), and all mice used in this study were male. Knock-in mice expressing human hF9 with the hemophilia B variant (c.280G>A and c.364G>A) were developed through gene targeting as previously described (ref. 53). For indel analysis, four mice were used per construct (eight constructs in total; 32 mice), whereas for base-editing analysis, three mice were used per construct (four constructs in total; 12 mice). The AAV8 vector was administered intravenously through the jugular vein (100–150 µL) of mice anesthetized with isoflurane (1–3%). To obtain plasma samples, blood samples were drawn from the jugular vein using a 29 G micro-syringe (TERUMO) containing 1/10 (volume/volume) sodium citrate. Platelet-poor plasma was obtained by centrifugation and then frozen and stored at −80 °C until analysis. Plasma FIX activity (FIX:C) was measured by a one-stage clotting-time assay with an automated coagulation analyzer (Sysmex CS-1600).

Cryo-EM sample preparation and data collection

The eSaCas9-NNG–sgRNA–target DNA ternary complex was reconstituted by mixing the purified eSaCas9-NNG, the 98-nt sgRNA, the 43-nt target DNA, and the 43-nt non-target DNA at a molar ratio of 1:1.2:1.25:1.25 at room temperature for 10 min. Each DNA strand contained phosphorothioate modifications within the phosphate backbone around the cleavage site to prevent DNA cleavage (Supplementary Table 4). The SaCas9 and eSaCas9-NNG in complex with the sgRNA and target dsDNA containing the single-nucleotide mismatch at position 21 were reconstituted in the same way, except that the incubation was performed at room temperature for 10 min. The ternary complexes were purified by size-exclusion chromatography on a Superdex 200 Increase 10/300 column (GE Healthcare), equilibrated with buffer E (20 mM HEPES-NaOH, pH 7.6, 50 mM NaCl, 2 mM MgCl2, 10 μM ZnCl2, and 1 mM DTT). The purified complex solution (A260 nm = 26) was mixed with 0.005% Tween 20, and then applied to freshly glow-discharged Au 300-mesh R1.2/1.3 grids (Quantifoil) in a Vitrobot Mark IV (FEI) at 4 °C, with a waiting time of 10 s and a blotting time of 4 s under 100% humidity conditions. The grids were plunge-frozen in liquid ethane cooled to liquid-nitrogen temperature.

Micrographs for all datasets were collected with a Titan Krios G3i microscope (Thermo Fisher Scientific) running at 300 kV and equipped with a Gatan Quantum-LS Energy Filter (GIF) and a Gatan K3 Summit direct electron detector in the electron counting mode (The University of Tokyo, Japan). Datasets were collected in the standard mode with a total dose of approximately 50 electrons per Å² fractionated over 48 frames, using the EPU software (Thermo Fisher Scientific). The dose-fractionated movies were subjected to beam-induced motion correction and dose weighting using Patch Motion Correction, and the contrast transfer function (CTF) parameters were estimated using Patch-based CTF estimation in cryoSPARC v4.4.0 (refs. 54, 55).
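As a short worked example of the exposure settings above, the per-frame dose follows from dividing the total dose by the number of frames. The sketch below takes the approximate values from the text (about 50 electrons per Å² over 48 frames); the function name is illustrative only.

```python
def dose_per_frame(total_dose: float, n_frames: int) -> float:
    """Return the electron dose per movie frame (e-/Å^2 per frame)."""
    if n_frames <= 0:
        raise ValueError("n_frames must be a positive integer")
    return total_dose / n_frames


# Approximate values quoted in the text: ~50 e-/Å^2 over 48 frames.
per_frame = dose_per_frame(total_dose=50.0, n_frames=48)
print(f"{per_frame:.2f} e-/Å^2 per frame")
```

With these inputs the dose is roughly 1 electron per Å² per frame; because the total dose is quoted only approximately, the per-frame value is an estimate as well.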

Cryo-EM data processing

Data were processed with the cryoSPARC v4.4.0 software platform (ref. 54). For the eSaCas9-NNG–sgRNA–target DNA ternary complex, 4,900,536 particles were automatically picked using Template Picker from the 8,625 motion-corrected and dose-weighted micrographs, followed by several rounds of reference-free 2D classification to curate particle sets. Using maps derived from ab-initio reconstruction as templates, 1,659,239 selected particles were subjected to heterogeneous refinement, resulting in the reconstructions of six distinct conformational states. Four of these classes, corresponding to State I (catalytically active state), State II (interrogation state), State III (intermediate state), and State IV (translocation state), were subjected to further processing. For States I, III, and IV, the particles were subjected to CTF refinement, Reference-Based Motion Correction (RBMC), and 3D classification without alignment. Non-uniform refinement after subsequent postprocessing yielded maps at overall resolutions of 3.14 Å (catalytically active state), 2.90 Å (intermediate state), and 2.76 Å (translocation state), according to the Fourier shell correlation (FSC) criterion of 0.143 (refs. 56, 57). For the interrogation state, the selected particles were subjected to 3D variability analysis (ref. 58). The resulting maps with different conformations were used for subsequent heterogeneous refinement. The particles with the most detailed features after heterogeneous refinement were refined using non-uniform refinement after CTF refinement and RBMC, and yielded a map at an overall resolution of 3.17 Å according to the FSC criterion of 0.143. The local resolution was estimated by BlocRes in cryoSPARC.
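The resolutions above are reported at the FSC = 0.143 criterion, which cryoSPARC evaluates internally. To illustrate what that criterion means, the following minimal sketch finds where a half-map FSC curve first falls below 0.143, using linear interpolation between shells, and converts that spatial frequency to a resolution in Å. The function name and the toy curve are hypothetical; they are not the study's data or cryoSPARC's implementation.

```python
from typing import Optional, Sequence


def resolution_at_threshold(
    freqs: Sequence[float],
    fsc: Sequence[float],
    threshold: float = 0.143,
) -> Optional[float]:
    """Return the resolution (Å) at which the FSC first drops below threshold.

    freqs: shell spatial frequencies in 1/Å, in ascending order.
    fsc:   FSC value for each shell.
    Returns None if the curve never crosses the threshold.
    """
    if len(freqs) != len(fsc):
        raise ValueError("freqs and fsc must have the same length")
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linearly interpolate the crossing between adjacent shells.
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


# Toy FSC curve (illustrative values only).
toy_freqs = [0.10, 0.20, 0.30, 0.40]
toy_fsc = [1.000, 0.900, 0.243, 0.043]
res = resolution_at_threshold(toy_freqs, toy_fsc)
print(f"resolution at FSC = 0.143: {res:.2f} Å")
```

In this toy curve the crossing lies halfway between the 0.30 and 0.40 Å⁻¹ shells, so the reported resolution is 1/0.35 Å.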

The datasets for SaCas9 and eSaCas9-NNG in complex with the sgRNA and the mismatched target dsDNA were processed using cryoSPARC in a similar manner as described above. For data processing details, see Supplementary Figs. 5 and 8.

Model building and validation

The models were built using the crystal structure of SaCas9 (PDB: 5CZZ) as the reference (ref. 12), followed by manual model building with Coot (refs. 59, 60). The models were refined using Servalcat against unsharpened half maps (ref. 61). Reference-structure restraints, generated with ProSMART from the AlphaFold2-predicted models and from the intermediate and translocation state models, were used for the refinement of the catalytically active, interrogation, and intermediate states (refs. 62, 63). The stereochemical restraints for phosphorothioate-modified DNA links were generated using AceDRG (ref. 64). The models were validated using MolProbity (ref. 65). Molecular graphics figures were prepared with UCSF ChimeraX 1.7.1 (ref. 66).

Statistics & Reproducibility

No statistical methods were used to predetermine sample size. Sample size was based on experimental feasibility and sample availability. Samples were processed in random order. Statistical analyses were performed using GraphPad Prism 10 (GraphPad Software, San Diego, CA). All data are presented as the mean ± standard deviation (s.d.).
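For readers reproducing the summary statistics outside GraphPad Prism, the mean ± s.d. presentation can be sketched as below. This assumes the sample standard deviation (n − 1 denominator), which the text does not specify; the replicate values are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def mean_sd(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) for replicate measurements.

    Assumption: sample s.d. with an n - 1 denominator.
    """
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an s.d.")
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical editing efficiencies (%) from three replicates.
replicates = [10.0, 12.0, 14.0]
m, sd = mean_sd(replicates)
print(f"{m:.1f} ± {sd:.1f}")
```

If the population s.d. (n denominator) were intended instead, `statistics.pstdev` would be the corresponding call.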

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Supplementary information

41467_2026_71626_MOESM2_ESM.pdf (5.4KB, pdf)

Description of Additional Supplementary Files

Supplementary Movie 1 (66.6MB, mp4)
Supplementary Movie 2 (28.3MB, mp4)
Reporting Summary (2.4MB, pdf)

Source data

Source Data (5MB, xlsx)

Acknowledgements

We thank Sachiyo Kamimura and Yuiko Ogihara of Jichi Medical University and Keiko Ogomori of The University of Tokyo for their technical assistance. Molecular graphics and analyses were performed with UCSF ChimeraX, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from National Institutes of Health R01-GM129325 and the Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases. T.O. was supported by AMED Grant Numbers JP23fk0410037, JP24fk0410061, JP24bm1323001, JP23ae0201007, JP25fk0410061, 25bm1123046h0003. H.N. is supported by JSPS KAKENHI Grant Numbers 21H05281 and 22H00403, the Takeda Medical Research Foundation, the Inamori Research Institute for Science, and JST, CREST Grant Number JPMJCR23B6. O.N. was supported by AMED Grant Numbers JP23fa627001 and JP19am0401005, the Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS)) from AMED, under grant numbers JP23ama121002 (support number 3272, M.K.) and JP23ama121012 (support no. 4894, O.N.), JP25ama121012, and the Cabinet Office, Government of Japan, Public/Private R&D Investment Strategic Expansion Program (PRISM) Grant Number JPJ008000.

Author contributions

S.N.O., R.N., S.K., and H.N. performed biochemical experiments with assistance from S.O., K.O., and K.H.; S.N.O., S.I., H.M., and N.Y. performed cell biological experiments with assistance from M.T. and K.H.; Y.K., T.H., and T.O. conducted AAV preparation and mouse experiments; K.J. and S.Q.T. conducted GUIDE-seq analysis; S.N.O. and R.N. performed structural analyses with assistance from H.H., K.Y., and H.N.; S.N.O., R.N., H.N., and O.N. wrote the manuscript with help from all authors; H.N. and O.N. supervised the research.

Peer review

Peer review information

Nature Communications thanks Weizhong Chen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Data availability

Plasmids used in this study, pcDNA3_SaCas9-NNG and pcDNA3_eSaCas9-NNG, are available from Addgene under accession numbers 252406 and 252407, respectively. The atomic models of eSaCas9-NNG–guide RNA–target DNA complexes have been deposited in the Protein Data Bank under the accession codes 8ZCY (interrogation state) [https://www.rcsb.org/structure/8ZCY], 8ZCZ (intermediate state) [https://www.rcsb.org/structure/8ZCZ], 8ZD0 (translocation state) [https://www.rcsb.org/structure/8ZD0], and 8ZDA (active state) [https://www.rcsb.org/structure/8ZDA]. The cryo-EM density maps have been deposited in the Electron Microscopy Data Bank under the accession codes EMD-39941 (interrogation state) [https://www.ebi.ac.uk/emdb/EMD-39941], EMD-39942 (intermediate state) [https://www.ebi.ac.uk/emdb/EMD-39942], EMD-39944 (translocation state) [https://www.ebi.ac.uk/emdb/EMD-39944], and EMD-39954 (active state) [https://www.ebi.ac.uk/emdb/EMD-39954]. The atomic models of SaCas9 and eSaCas9-NNG bound to the sgRNAs and target DNA containing a mismatch at position 21 have been deposited in the Protein Data Bank under the accession codes 9MB6 (SaCas9) [https://www.rcsb.org/structure/9MB6] and 9MB7 (eSaCas9-NNG) [https://www.rcsb.org/structure/9MB7]. The corresponding cryo-EM density maps have been deposited in the Electron Microscopy Data Bank under the accession codes EMD-63767 (SaCas9) and EMD-63768 (eSaCas9-NNG). The NGS data have been deposited in the NCBI under accession code PRJNA1088532. Source data are provided with this paper as a Source Data file.

Code availability

This study did not generate new code.

Competing interests

The University of Tokyo and MODALIS Corporation have filed a patent application related to this work. The inventors are Osamu Nureki, Hiroshi Nishimasu, Shohei Kajimoto, and Hisato Hirano. The patent application pertains to modified Cas9 proteins and their use for genome editing (application number JP2018/032948, published as WO2019/049913). The remaining authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Satoshi N. Omura, Ryoya Nakagawa, Shohei Kajimoto.

Contributor Information

Hiroshi Nishimasu, Email: nisimasu@g.ecc.u-tokyo.ac.jp.

Osamu Nureki, Email: nureki@bs.s.u-tokyo.ac.jp.

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-026-71626-2.

References

  • 1. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).
  • 2. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).
  • 3. Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).
  • 4. Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C. & Doudna, J. A. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67 (2014).
  • 5. Nishimasu, H. et al. Engineered CRISPR-Cas9 nuclease with expanded targeting space. Science 361, 1259–1262 (2018).
  • 6. Walton, R. T., Christie, K. A., Whittaker, M. N. & Kleinstiver, B. P. Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants. Science 368, 290–296 (2020).
  • 7. Zhang, W. et al. In-depth assessment of the PAM compatibility and editing activities of Cas9 variants. Nucleic Acids Res. 49, 8785–8795 (2021).
  • 8. Hibshman, G. N. et al. Unraveling the mechanisms of PAMless DNA interrogation by SpRY-Cas9. Nat. Commun. 15, 3663 (2024).
  • 9. Silverstein, R. A. et al. Custom CRISPR-Cas9 PAM variants via scalable engineering and machine learning. Nature 643, 539–550 (2025).
  • 10. Swiech, L. et al. In vivo interrogation of gene function in the mammalian brain using CRISPR-Cas9. Nat. Biotechnol. 33, 102–106 (2015).
  • 11. Ran, F. A. et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186–191 (2015).
  • 12. Nishimasu, H. et al. Crystal Structure of Staphylococcus aureus Cas9. Cell 162, 1113–1126 (2015).
  • 13. Kim, Y. B. et al. Increasing the genome-targeting scope and precision of base editing with engineered Cas9-cytidine deaminase fusions. Nat. Biotechnol. 35, 371–376 (2017).
  • 14. Huang, T. P. et al. Circularly permuted and PAM-modified Cas9 variants broaden the targeting scope of base editors. Nat. Biotechnol. 37, 626–631 (2019).
  • 15. Nguyen Tran, M. T. et al. Engineering domain-inlaid SaCas9 adenine base editors with reduced RNA off-targets and increased on-target DNA editing. Nat. Commun. 11, 4871 (2020).
  • 16. Eggers, A. R. et al. Rapid DNA unwinding accelerates genome editing by engineered CRISPR-Cas9. Cell 187, 3249–3261.e14 (2024).
  • 17. Chen, K. et al. Lung and liver editing by lipid nanoparticle delivery of a stable CRISPR-Cas9 ribonucleoprotein. Nat. Biotechnol. 10.1038/s41587-024-02437-3 (2024).
  • 18. Kleinstiver, B. P. et al. Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition. Nat. Biotechnol. 33, 1293–1298 (2015).
  • 19. Nishida, K. et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science 353, aaf8729 (2016).
  • 20. Gaudelli, N. M. et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature 551, 464–471 (2017).
  • 21. Richter, M. F. et al. Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity. Nat. Biotechnol. 38, 883–891 (2020).
  • 22. Jeong, Y. K. et al. Adenine base editor engineering reduces editing of bystander cytosines. Nat. Biotechnol. 39, 1426–1433 (2021).
  • 23. Li, T., Miller, C. H., Payne, A. B. & Craig Hooper, W. The CDC Hemophilia B mutation project mutation list: a new online resource. Mol. Genet. Genom. Med. 1, 238–245 (2013).
  • 24. Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat. Biotechnol. 31, 822–826 (2013).
  • 25. Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 31, 827–832 (2013).
  • 26. Pattanayak, V. et al. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nat. Biotechnol. 31, 839–843 (2013).
  • 27. Slaymaker, I. M. et al. Rationally engineered Cas9 nucleases with improved specificity. Science 351, 84–88 (2016).
  • 28. Kleinstiver, B. P. et al. High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects. Nature 529, 490–495 (2016).
  • 29. Chen, J. S. et al. Enhanced proofreading governs CRISPR-Cas9 targeting accuracy. Nature 550, 407–410 (2017).
  • 30. Vakulskas, C. A. et al. A high-fidelity Cas9 mutant delivered as a ribonucleoprotein complex enables efficient gene editing in human hematopoietic stem and progenitor cells. Nat. Med. 24, 1216–1224 (2018).
  • 31. Liu, M.-S. et al. Engineered CRISPR/Cas9 enzymes improve discrimination by slowing DNA cleavage to allow release of off-target DNA. Nat. Commun. 11, 3576 (2020).
  • 32. Tan, Y. et al. Rationally engineered Staphylococcus aureus Cas9 nucleases with high genome-wide specificity. Proc. Natl. Acad. Sci. USA 116, 20969–20976 (2019).
  • 33. Bravo, J. P. K. et al. Structural basis for mismatch surveillance by CRISPR-Cas9. Nature 603, 343–347 (2022).
  • 34. Das, A. et al. Coupled catalytic states and the role of metal coordination in Cas9. Nat. Catal. 6, 969–977 (2023).
  • 35. Pacesa, M. et al. R-loop formation and conformational activation mechanisms of Cas9. Nature 609, 191–196 (2022).
  • 36. Maeder, M. L. et al. Development of a gene-editing approach to restore vision loss in Leber congenital amaurosis type 10. Nat. Med. 25, 229–233 (2019).
  • 37. Li, Q. et al. In vivo PCSK9 gene editing using an all-in-one self-cleavage AAV-CRISPR system. Mol. Ther. Methods Clin. Dev. 20, 652–659 (2021).
  • 38. Kim, E. et al. In vivo genome editing with a small Cas9 orthologue derived from Campylobacter jejuni. Nat. Commun. 8, 14500 (2017).
  • 39. Nakagawa, R. et al. Engineered Campylobacter jejuni Cas9 variant with enhanced activity and broader targeting range. Commun. Biol. 5, 211 (2022).
  • 40. Zhu, X. et al. Cryo-EM structures reveal coordinated domain motions that govern DNA cleavage by Cas9. Nat. Struct. Mol. Biol. 26, 679–685 (2019).
  • 41. Anzalone, A. V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149–157 (2019).
  • 42. Ferreira da Silva, J. et al. Click editing enables programmable genome writing using DNA polymerases and HUH endonucleases. Nat. Biotechnol. 43, 923–935 (2025).
  • 43. Omura, S. N. & Nureki, O. General and robust sample preparation strategies for cryo-EM studies of CRISPR-Cas9 and Cas12 enzymes. in Methods Enzymol. 712, 23–39 (Elsevier, 2025).
  • 44. Crooks, G. E., Hon, G., Chandonia, J.-M. & Brenner, S. E. WebLogo: a sequence logo generator. Genome Res. 14, 1188–1190 (2004).
  • 45. Ishiguro, S. & Yachie, N. Highly Multiplexed Analysis of CRISPR Genome Editing Outcomes in Mammalian Cells. Methods Mol. Biol. 2312, 193–223 (2021).
  • 46. Clement, K. et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 37, 224–226 (2019).
  • 47. Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nat. Biotechnol. 33, 187–197 (2015).
  • 48. Malinin, N. L. et al. Defining genome-wide CRISPR-Cas genome-editing nuclease activity with GUIDE-seq. Nat. Protoc. 16, 5592–5615 (2021).
  • 49. Kashiwakura, Y. et al. Efficient gene transduction in pigs and macaques with the engineered AAV vector AAV.GT5 for hemophilia B gene therapy. Mol. Ther. Methods Clin. Dev. 30, 502–514 (2023).
  • 50. Kashiwakura, Y. & Ohmori, T. Genome Editing of Murine Liver Hepatocytes by AAV Vector-Mediated Expression of Cas9 In Vivo. Methods Mol. Biol. 2637, 195–211 (2023).
  • 51. Kilkenny, C., Browne, W. J., Cuthill, I. C., Emerson, M. & Altman, D. G. Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. Osteoarthr. Cartil. 20, 256–260 (2012).
  • 52. Percie du Sert, N. et al. The ARRIVE guidelines 2.0: updated guidelines for reporting animal research. BMJ Open Sci. 4, e100115 (2020).
  • 53. Baatartsogt, N. et al. Therapeutic base editing to generate a gain-of-function F9 variant for hemophilia B. Blood blood.2024027870 (2025).
  • 54. Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
  • 55. Zheng, S. Q. et al. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat. Methods 14, 331–332 (2017).
  • 56. Punjani, A., Zhang, H. & Fleet, D. J. Non-uniform refinement: adaptive regularization improves single-particle cryo-EM reconstruction. Nat. Methods 17, 1214–1221 (2020).
  • 57. Rosenthal, P. B. & Henderson, R. Optimal determination of particle orientation, absolute hand, and contrast loss in single-particle electron cryomicroscopy. J. Mol. Biol. 333, 721–745 (2003).
  • 58. Punjani, A. & Fleet, D. J. 3D variability analysis: Resolving continuous flexibility and discrete heterogeneity from single particle cryo-EM. J. Struct. Biol. 213, 107702 (2021).
  • 59. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D. Biol. Crystallogr. 60, 2126–2132 (2004).
  • 60. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D. Biol. Crystallogr. 66, 486–501 (2010).
  • 61. Yamashita, K., Palmer, C. M., Burnley, T. & Murshudov, G. N. Cryo-EM single-particle structure refinement and map calculation using Servalcat. Acta Crystallogr. D. Struct. Biol. 77, 1282–1291 (2021).
  • 62. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
  • 63. Nicholls, R. A., Fischer, M., McNicholas, S. & Murshudov, G. N. Conformation-independent structural comparison of macromolecules with ProSMART. Acta Crystallogr. D. Biol. Crystallogr. 70, 2487–2499 (2014).
  • 64. Long, F. et al. AceDRG: a stereochemical description generator for ligands. Acta Crystallogr. D. Struct. Biol. 73, 112–122 (2017).
  • 65. Williams, C. J. et al. MolProbity: More and better reference data for improved all-atom structure validation. Protein Sci. 27, 293–315 (2018).
  • 66. Pettersen, E. F. et al. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82 (2021).



Articles from Nature Communications are provided here courtesy of Nature Publishing Group
