Nucleic Acids Research. 2025 Aug 13;53(15):gkaf782. doi: 10.1093/nar/gkaf782

PAM-interacting domain turn-helix 51 motifs can improve Cas9–SpRY activity

Reto Eggenschwiler 1,2, Thomas Hoffmann 3, Oleg Dmytrenko 4, Mika Opitz 5,6, Marlene Ackel-Zakour 7,8, Pascal Wang 9,10, Shannon A McCallan 11,12, Mariane Fráguas-Eggenschwiler 13, Heiner Niemann 14, Atanas Patronov 15, Chase L Beisel 16,17, Tobias Cantz 18,19
PMCID: PMC12350091  PMID: 40808297

Abstract

Cas9–SpRY is an engineered variant of the Streptococcus pyogenes Cas9 with relaxed PAM recognition, which can technically be utilized at any target in the genome, although some targets are addressed with low efficiency. Here, we show that a previously unexplored motif at the turn and beginning of α-helix 51 (TH51) can be engineered to improve both nuclease and prime-editing activity of Cas9–SpRY. Interaction of the lysine-rich PID loop 2 (PL2) with the target DNA downstream of the PAM (post-PAM) mediates initiation of R-loop formation and subsequent cleavage, yet it was unclear whether other regions of the PID engage with post-PAM as well. To this end, the NAAN–PAM-targeting iSpyMac hybrid nuclease, which lacks all lysine residues in PL2, was compared with Cas9–SpRY at identical targets using molecular dynamics simulation and in cell culture models, uncovering four crucial post-PAM-interacting lysines in TH51 and TH53 of iSpyMac. Ectopic insertion of a lysine-rich PL2 into iSpyMac boosted its nuclease and prime-editing activities and, in turn, Cas9–SpRY benefited from certain lysine-rich TH51 motifs. Specifically, TH51 from an uncultured Abiotrophia Cas9 species boosted overall Cas9–SpRY activity. Together, this study demonstrates that engineering of post-PAM interacting motifs opens new avenues for the design of advanced CRISPR enzymes.

Graphical Abstract


Introduction

Target cleavage by Streptococcus pyogenes (Spy) Cas9 strongly depends on post-PAM (downstream of PAM, protospacer adjacent motif) DNA interaction with its PID’s lysine-rich loop (here termed PID loop 2, PL2). The positive charge conveyed by the lysine residues aids in initiation of R-loop formation by nonspecific interaction with the negatively charged target DNA backbone [1, 2]. This mechanism mediates proper Spy–Cas9 action, and it was previously shown that lysine residues are replaceable by arginines [2]. Strikingly, the Streptococcus macacae (Smac) Cas9 and certain Streptococcus mutans PIDs harbor a PL2 with eleven fewer residues and without any lysines or arginines (Fig. 1A and Supplementary Fig. S1A) [3–6]. Presently, it is not known if and how Cas9 enzymes without a lysine-rich PL2 can engage in post-PAM target DNA interaction.

Figure 1.


Ectopic swapping of PL2 loops between Cas9–SpRY and iSpyMac (iSM). (A) PIDs from AF2 models of SpRY and iSM. SpRY PL2, TH51, and TH53 motifs are colored in green and iSM motifs in red. Important lysine residues are shown in licorice. (B) Activation of eGFP by indel formation in human reporter cells using SpRY and iSM nucleases as well as their PL2-swapped counterparts and three NAAN–PAM targeting sgRNAs. (C) Analysis of iSM_PPL2 with 15 sgRNAs targeting NAAN–PAMs (in brackets), relative to iSM. (D) Schematic of mutated mKO2 sequence in the TLR_AA reporter and pegRNAs used for mKO2 reconstitution. The last two amino acids in the FCYG fluorophore are mutated by one point mutation each (orange/pink). PAM and silent PAM-disrupting mutation in the R26P13_dP reverse transcriptase template are colored in green and primer binding sites in light blue. (E) Activation of mKO2 fluorescence by iSM and iSM_PPL2 prime editors in human reporter cells with two different pegRNAs. (F) Log2-fold depletion of a 5 nt PAM library after TXTL-assay of iSM and iSM_PPL2 chimeric enzymes using g + 10b as target. Data are represented as ± SD from n = 3 (1B) or n = 6 (1E) biological replicates and significance was calculated using two-way ANOVA with Tukey’s (1B) and Šídák’s (1E) multiple comparisons tests. Data in Fig. 1C was analyzed using two-tailed unpaired Mann–Whitney test, with dashed line showing median and fine dotted lines at quartiles. Relative guide efficiencies in Fig. 1C were determined using n = 3 biological replicates, as represented by corresponding colored dots (****P ≤ 0.0001; ***P ≤ 0.001; **P ≤ 0.01).

Here, an established NAAN–PAM targeting hybrid enzyme of Spy–Cas9 with a Smac PID (iSpyMac, iSM) and the near PAM-less SpRY variant of Spy–Cas9 were utilized [3, 7–9]. After ectopic swapping of their respective PL2 motifs, the enzymes were compared using identical single guide RNAs (sgRNAs) at identical DNA targets. While iSM was active both with and without a lysine-rich PL2, SpRY activity dropped strongly when equipped with the lysine-free PL2 from iSM. This finding raised the question of whether iSM was overall less dependent on post-PAM interaction than SpRY or whether it mediated such interaction by alternative mechanisms. We therefore investigated the Smac PID for structural features that can also interact with the post-PAM target DNA and are thus pivotal for the enzyme’s activity.

Molecular dynamics (MD) simulation is a powerful tool to predict the behavior of macromolecules in silico and has been employed for prediction of protein–DNA interactions, including those of Cas9 nuclease and adenine base editors [10–12]. The PDB:5Y36 cryo-EM structure, a SpCas9–sgRNA–DNA ternary complex including 15 post-PAM nucleotides, models the lysine-rich PID loop in proximity of the post-PAM sequence and was used for MD simulation [10]. In homology models [13, 14] of 5Y36, we replaced Spy–Cas9 by Cas9–SpRY or iSM and the sgRNA and target DNA with experimentally tested NAAN–PAM targeting sequences. Following interaction analysis [15], several regions with post-PAM interaction were uncovered in the PIDs of both enzymes. In particular, the two regions at the turn and beginning of α-helix 51 (TH51) and TH53 harbor additional lysine residues in iSM compared to SpRY and were therefore systematically evaluated in human reporter cell assays. Analysis of iSM lysine mutants on reporter vectors with swapped post-PAM sequences hinted at a more complex mechanism of post-PAM DNA interaction.

Finally, we evaluated the TH51 motif from iSM (MTH51) in a novel hybrid enzyme of SpRY and uncovered that it benefits cleavage at some targets while it can compromise activity at others. Additional lysine-rich TH51 motifs derived from cluster 1 group 13 Cas9 enzymes (C1 G13, including Spy– and Smac–Cas9) [5] were evaluated and particularly the TH51 motif from an uncultured Abiotrophia species (ATH51) [16] boosted overall SpRY nuclease and prime editor activity, and the same held true for the NGNN–PAM targeting Cas9–SpG enzyme [7]. Together, this report presents evidence that modification of post-PAM interacting PID motifs can be explored for promoting activity of Cas9 enzymes.

Materials and methods

Molecular cloning

A detailed description of molecular cloning procedures, including original plasmids and oligo sequences, is provided in Supplementary Methods.

Cell lines

HEK-293 cells were obtained from DSMZ Braunschweig, Germany (ACC 305). Cells were cultivated in Dulbecco's Modified Eagle Medium (DMEM) 4.5 g/l glucose GlutaMAX (Thermo Fisher Scientific #31966021) supplemented with 100× Pen/Strep and 10% FBS.

Generation of HEK-293 traffic light reporter cell populations

Lentiviral vectors harboring TLR_AA and TLR_pPAM reporter sequences were generated as described previously [17]. HEK-293 cells were transduced at a calculated multiplicity of infection (MOI) of 0.02 with TLR_AA or TLR_pPAM lentiviral vectors, followed by selection with 500 μg/ml hygromycin B (Thermo Fisher Scientific #10687010), as reported previously [17]. Surviving reporter bulk cell populations were subsequently sorted using flow cytometry to remove trace amounts of enhanced Green Fluorescent Protein (eGFP) and monomeric Kusabira Orange 2 (mKO2) positive cells harboring randomly activated reporter sequences introduced during the lentiviral production process. Generation of 293 TLR_A2G cells was described in a previous study [17]. Briefly, all three cell populations harboring the different traffic light reporter (TLR) vectors (AA, pPAM, and A2G) can be activated by Cas9 nuclease to display eGFP fluorescence, which depends on random indel-based frame-shifting of a corresponding eGFP gene [17]. However, TLR_AA and TLR_A2G can also report (PE-based) repair of the mKO2 gene, whereas in TLR_pPAM the mKO2 coding sequence was disrupted at three separate locations. Thus, TLR_pPAM cannot be considered a “true” TLR.

Transfection and antibiotic selection of TLR cells

For Cas9 nuclease assays, 293 TLR_AA or TLR_pPAM cells were seeded onto 48-well adherent tissue culture test plates (TPP #92448) at a density of 23 000 cells/cm² one day before transfection and then transfected using Mirus TransIT®-LT1 Transfection Reagent (Mirusbio #MIR2300) according to the manufacturer’s instructions. For example, for a transfection of three wells, separate plasmids encoding Cas9 nuclease and pU6 vectors harboring sgRNAs were diluted to 100 ng/μl in OptiMEM (Thermo Fisher Scientific #31985070) before mixing 4.15 μl nuclease and 4.15 μl sgRNA with 74.2 μl OptiMEM and adding 3.3 μl of transfection agent. During the 15 min of incubation, medium on overnight attached cells was exchanged with 250 μl fresh medium without Pen/Strep before adding 26 μl of transfection mix per well. Medium was exchanged the following day with medium containing 0.5 μg/ml puromycin (Merck #P8833) and incubated for 48 h to enrich for successfully transfected cells (“puro-pulse”).

For piggyBac transposon-based prime editor (PB-PE) assays, HEK-293 TLR_AA or TLR_A2G cells were seeded onto 24-well adherent tissue culture test plates (TPP #92424) at a density of 23 000 cells/cm² one day before transfection and then transfected using Lipofectamine LTX (Thermo Fisher Scientific #15338100) according to the manufacturer’s instructions. Briefly, for a transfection of three wells, PB-PE transposon and piggyBac transposase encoding pCAG-hyPBase [17] plasmids were diluted to 50 ng/μl in OptiMEM and 28.5 μl of diluted transposon plasmid was mixed with 1.5 μl of diluted transposase plasmid before adding 270 μl of OptiMEM on top. Then, 4.5 μl of Lipofectamine LTX was added and the transfection mix was incubated for 30 min. Medium on the reporter cells was exchanged with 500 μl fresh medium without Pen/Strep and 100 μl of transfection mix was added. Seventy-two hours post transfection, medium was exchanged with medium containing 3 μg/ml puromycin and cells were kept on puromycin until analysis.
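The volumes given for the two three-well transfection mixes can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the published protocol: the function name `summarize` and the derived quantities (surplus volume, DNA mass per well) are illustrative assumptions calculated only from the volumes and concentrations stated above.

```python
def summarize(dna_ul, dna_ng_per_ul, diluent_ul, reagent_ul, ul_per_well, wells=3):
    """Summarize a transfection master mix (all volumes in microlitres)."""
    dna_volume = sum(dna_ul.values())
    total_ul = dna_volume + diluent_ul + reagent_ul
    dna_ng = dna_volume * dna_ng_per_ul
    dispensed_ul = ul_per_well * wells
    return {
        "total_ul": round(total_ul, 2),
        "dispensed_ul": dispensed_ul,
        "surplus_ul": round(total_ul - dispensed_ul, 2),
        "dna_ng_total": round(dna_ng, 1),
        # DNA mass reaching one well, assuming a homogeneous mix
        "dna_ng_per_well": round(dna_ng * ul_per_well / total_ul, 1),
    }

# Nuclease assay: plasmids at 100 ng/ul, 26 ul of mix per well (48-well plate)
nuclease_summary = summarize(
    {"Cas9 nuclease plasmid": 4.15, "pU6 sgRNA plasmid": 4.15},
    dna_ng_per_ul=100, diluent_ul=74.2, reagent_ul=3.3, ul_per_well=26,
)

# PB-PE assay: plasmids at 50 ng/ul, 100 ul of mix per well (24-well plate)
pe_summary = summarize(
    {"PB-PE transposon": 28.5, "pCAG-hyPBase transposase": 1.5},
    dna_ng_per_ul=50, diluent_ul=270, reagent_ul=4.5, ul_per_well=100,
)

print(nuclease_summary)
print(pe_summary)
```

Under these assumptions, both mixes contain a small surplus over the dispensed volume, and the PB-PE mix corresponds to a 19:1 volume ratio of transposon to transposase plasmid.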

Flow cytometry

Transfected and selected 293 TLR cells were harvested for analysis, typically at day 6 post transfection for nuclease assays and at day 9 of selection for PB-PE based prime editing assays. Flow cytometric analysis was performed using the Beckman Coulter CytoFLEX S device.

Cell-free transcription-translation (TXTL)-based PAM analysis

A detailed description of protospacer adjacent motif (PAM) library preparation, depletion, sequencing and analysis is provided in Supplementary Methods. Briefly, the protospacer from the g + 10b sgRNA was introduced next to a 5 nt randomized PAM library. The plasmid library was subsequently digested in two replicates, with the corresponding nuclease and g + 10b targeting sgRNA plasmids using the myTXTL cell-free expression system (Arbor Biosciences #540300) whereas a nontargeting sgRNA plasmid served as control. The digested library was purified and sequenced using the Illumina NovaSeq 6000 platform (paired-end, 150 bp reads), with each library yielding at least 2 million reads.
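As an illustration of what a log2-fold depletion score expresses, the sketch below compares PAM frequencies between a nontargeting control library and a digested library. It is not the authors' pipeline, which is described in Supplementary Methods: the function `log2_depletion`, the pseudocount, the sign convention (positive means depleted), and the toy read counts are all assumptions made for this example.

```python
import math

def log2_depletion(control_counts, target_counts, pseudocount=1.0):
    """Return log2(control frequency / targeting frequency) for each PAM.

    Positive values mean the PAM was lost from the library digested with the
    targeting sgRNA, i.e. it was recognized and cleaved (assumed convention).
    """
    pams = sorted(set(control_counts) | set(target_counts))
    ctrl_total = sum(control_counts.values()) + pseudocount * len(pams)
    targ_total = sum(target_counts.values()) + pseudocount * len(pams)
    scores = {}
    for pam in pams:
        ctrl_freq = (control_counts.get(pam, 0) + pseudocount) / ctrl_total
        targ_freq = (target_counts.get(pam, 0) + pseudocount) / targ_total
        scores[pam] = math.log2(ctrl_freq / targ_freq)
    return scores

# A 5 nt randomized library contains 4**5 distinct PAM sequences.
library_size = 4 ** 5

# Invented read counts for two PAMs, for demonstration only.
control = {"GAAGC": 1000, "GCCGC": 1000}
targeting = {"GAAGC": 100, "GCCGC": 1900}
scores = log2_depletion(control, targeting)
print(library_size, scores)
```

With at least 2 million reads per library and 1024 possible 5 nt PAMs, each PAM is covered by roughly two thousand reads on average, which is why a small pseudocount has little influence on well-represented sequences.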

Homology modeling

Homology models of the iSM and the SpRY constructs were computed with the software MODELLER v10.2 [13] using the cryo-EM structure of the SpCas9–sgRNA–DNA ternary complex (PDB ID: 5Y36) [10] as template. Atoms of the DNA and RNA as well as three Mg2+ active site ions were renamed to HETATM records in the template structure and were included in the resulting 1000 models of which the respective best one was selected according to the DOPE score. For each nuclease, RNA and DNA bases were mutated according to the target sequences of g + 10b with g + 10b post-PAM and g + 57b with g + 57b post-PAM (“TLR_AA” models) as well as g + 10b with g + 57b post-PAM and g + 57b with g + 10b post-PAM (“TLR_pPAM” models) using the Chimera v1.16 software [14], resulting in four different DNA-bound 3D models per nuclease (see also Supplementary Table S2).
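The combination of protospacers and post-PAM sequences described above can be enumerated explicitly. This is a small bookkeeping sketch rather than part of the modeling workflow; the tuple layout and variable names are assumptions, and the replicate count of three is taken from the MD simulations performed on these models.

```python
from itertools import product

nucleases = ["SpRY", "iSM"]
protospacers = ["g+10b", "g+57b"]
post_pams = ["g+10b", "g+57b"]
md_replicates = 3  # triplicate MD runs per model

models = []
for nuclease, spacer, post_pam in product(nucleases, protospacers, post_pams):
    # Matching protospacer and post-PAM corresponds to the TLR_AA reporter;
    # swapped post-PAM sequences correspond to TLR_pPAM.
    reporter = "TLR_AA" if spacer == post_pam else "TLR_pPAM"
    models.append((nuclease, spacer, post_pam, reporter))

runs = [(model, rep) for model in models for rep in range(1, md_replicates + 1)]

for model in models:
    print(model)
print(len(models), "models,", len(runs), "MD runs in total")
```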

Molecular dynamics simulations

For each of the four Cas9–SpRY and iSpyMac homology models, triplicates of unrestrained MD simulations in explicit solvent for 300 ns were carried out using GROMACS v 2022.05 software [18] and amber99SB-ILDN force field [19] with the following protocol:

A model was placed in a triclinic simulation box with a minimum distance to the box border of 0.125 nm and solvated in randomly placed TIP3P [20] explicit water. Charges were neutralized with NaCl and further ions were added to achieve a concentration of 150 mM and 50 mM of NaCl and MgCl2, respectively (three active site Mg ion positions inherited from the HM template structure 5Y36 were kept). Bond lengths were constrained with the LINCS algorithm [21] for all atoms or only for hydrogen atoms at temperatures >20 K or ≤20 K, respectively. A nonbonded cutoff was set to 1.0 nm and long-range interactions were computed using the particle mesh Ewald method [22]. Steepest descent energy minimization of 50 000 steps was followed by restrained heat-up simulations, gradually from 0 K to 200 K for 0.4 ns under NVT conditions, while initially the positions of all heavy atoms and afterwards of all backbone atoms were restrained with a force constant of 1000 kJ·mol⁻¹·nm⁻² and 100 kJ·mol⁻¹·nm⁻², respectively. The system was further gradually heated without position restraints for 0.1 ns from 200 K to 298 K under isobaric–isothermal conditions using the Parrinello–Rahman barostat [23] (1.0 bar, τp = 2.0) and velocity rescaling (τt = 0.1) for temperature coupling [24]. Finally, the system was equilibrated for 0.25 ns using the conditions of the 300 ns production run, switching to stochastic dynamics (τt = 2.0) [25]. For the heat-up and production simulations, a time step of 2 fs was used. For the production runs, trajectory frames were stored in intervals of 0.1 ns.
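The phase durations above translate into integration step counts as follows. This is a back-of-the-envelope sketch: the helper `n_steps` is hypothetical, and applying the 2 fs time step to the equilibration phase is an assumption based on the statement that equilibration used the production conditions.

```python
from fractions import Fraction

FS_PER_NS = 1_000_000  # femtoseconds per nanosecond

def n_steps(duration_ns, dt_fs=2):
    """Number of integration steps for a phase of the given length.

    Durations are passed as strings so that Fraction parses them exactly.
    """
    return int(Fraction(duration_ns) * FS_PER_NS / dt_fs)

phases = [
    ("restrained NVT heat-up, 0 K to 200 K", "0.4"),
    ("NPT heat-up, 200 K to 298 K", "0.1"),
    ("equilibration (2 fs step assumed)", "0.25"),
    ("production", "300"),
]

step_counts = {name: n_steps(duration) for name, duration in phases}
steps_between_frames = n_steps("0.1")  # frames stored every 0.1 ns

for name, steps in step_counts.items():
    print(f"{name}: {steps:,} steps")
print(f"one stored frame every {steps_between_frames:,} steps")
```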

Interaction analysis by FoldX

On each production trajectory frame the FoldX v5.0 software [15] was applied for pairwise accounting of hydrogen bonds, electrostatic interactions, and distance based interactions between the protein residues and residues of DNA or RNA, respectively (command = PrintNetworks). The first 100 ns were discarded and interaction counts were averaged over the time window of 100–300 ns (see also Supplementary Table S3).
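The averaging step can be pictured as counting, for each protein–nucleotide pair, the fraction of retained frames in which an interaction was detected. The sketch below assumes the per-frame FoldX output has already been parsed into sets of pairs; the parsing itself, the function name `interaction_frequency`, and the example pair are hypothetical.

```python
def interaction_frequency(frames, interval_ns=0.1, discard_ns=100.0):
    """Average per-pair interaction counts over the retained time window.

    frames: list in which element i holds the set of interacting
            (protein residue, nucleotide) pairs detected in stored frame i.
    Returns (frequencies, number_of_frames_used).
    """
    skip = round(discard_ns / interval_ns)  # frames covering the discarded part
    kept = frames[skip:]
    if not kept:
        return {}, 0
    counts = {}
    for pairs in kept:
        for pair in pairs:
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: c / len(kept) for pair, c in counts.items()}, len(kept)

# Toy trajectory: 300 ns stored every 0.1 ns gives 3000 frames. The invented
# pair below is present in every second frame.
toy_pair = ("K1268", "NTS+3")
toy_frames = [{toy_pair} if i % 2 == 0 else set() for i in range(3000)]

freqs, n_used = interaction_frequency(toy_frames)
print(n_used, freqs)
```

Discarding the first 100 ns of a 300 ns run stored at 0.1 ns intervals leaves 2000 frames per trajectory, so a frequency of 1.0 corresponds to an interaction seen in all 2000 frames.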

MAFFT alignments

Alignments of C1 G13 Cas9 amino acid sequences (Supplementary Table S1) were performed with the multiple alignment using fast Fourier transform (MAFFT) online server (https://mafft.cbrc.jp/alignment/server/index.html) using the automatic strategy setting [26].

Neural network-based predictions of Cas9 enzymes

Structural models of various Cas9 enzymes in absence of bound DNA and RNA were predicted using AlphaFold v2.3.2 [27]. For this purpose, the monomer preset parameter and the full database parameter were used, the maximum template date was set to the future, and all resulting models were subjected to Amber relaxation. The different models were subsequently superimposed to the cryo-EM structure of the SpCas9–sgRNA–DNA ternary complex (PDB ID: 5Y36) using the software PyMOL v2.5.0 OS [28].

Results

The Smac–Cas9 PID activity depends less on a lysine-rich loop than the Cas9–SpRY PID

A peculiarity was identified in the PIDs of S. macacae Cas9 and a subgroup of the S. mutans Cas9 enzymes, namely a shorter PL2 without any lysine residues (Fig. 1A and Supplementary Fig. S1A) [3–6]. In order to assess the SmacCas9 and SpyCas9 PIDs side-by-side at identical targets, iSpyMac (iSM) and Cas9–SpRY (SpRY) enzymes were employed, which both use Spy–Cas9 sgRNAs and act on NAAN–PAMs [3, 7]. The PID loops of both enzymes were ectopically swapped, resulting in iSM_PPL2 (S. pyogenes PID loop 2) and SpRY_MPL2 (S. macacae PID loop 2; Fig. 1A and Supplementary Table S1). For a standardized comparison of different enzymes, Cas9 nuclease encoding sequences were coupled via an internal ribosomal entry site (IRES) to a puromycin N-acetyltransferase (PAC) resistance gene and a puromycin pulse protocol was established (Supplementary Fig. S1B). Activities of all four enzymes were then examined on NAAN–PAM targets in a slightly modified version of a previously described TLR system, termed TLR_AA, which can report ∼1/3 of indel formation via eGFP and gene correction via mKO2 fluorescence (Fig. 1D) [17]. As expected, SpRY_MPL2 underperformed at investigated targets when compared to SpRY (Fig. 1B). However, activities of iSM and iSM_PPL2 were in the range of SpRY at the same targets (Fig. 1B). Most importantly, iSM_PPL2 performed as well as or better than iSM at 15 analyzed NAAN–PAM targets, with a mean improvement factor of 1.43 and up to 2.5 for some targets (Fig. 1C and Supplementary Fig. S1C). iSM and iSM_PPL2 were then inserted into a PB-PE vector [17] and compared for their capability to reconstitute the mKO2 reporter gene using g + 10b as pegRNA spacer (Fig. 1D). iSM_PPL2-PE outperformed iSM-PE by 10-fold, indicating that the presence of a lysine-rich PL2 can be crucial for proper prime editing activity (Fig. 1E and Supplementary Fig. S1D).
PAM-specificity of iSM and iSM_PPL2 nucleases was assessed using a TXTL-based PAM-library depletion assay [29, 30], revealing a comparable preference for NAAN–PAMs by both enzymes (Fig. 1F and Supplementary Fig. S1E). Together, despite its basic functionality without a lysine-rich PL2, the iSpyMac enzyme can be further improved when a lysine-rich PL2 is introduced at an ectopic site.

MD simulation interaction analysis predicts additional post-PAM interactions in the Cas9 PID

Based on these initial findings, it appeared likely that iSM harbors elements in its PID that enable nuclease function despite the absence of a lysine-rich PL2. In order to narrow down regions of interest, 3D homology models (HM) of target-bound iSM and SpRY were created based on the PDB:5Y36 cryo-EM structure [10]. Those models were subjected to 300 ns of MD simulation. The first 100 ns were discarded, whereas 0.1 ns frames of the last 200 ns were subjected to interaction analysis using FoldX [15]. Residue distance, electrostatic interactions (ion bonds) and hydrogen (H-) bonds between DNA target strand (TS) or nontarget strand (NTS) and the PID were extracted and plotted as heatmaps (Fig. 2A and Supplementary Table S2). Nomenclature of α-helices and β-sheets was adapted from a previous study [31]. As expected, SpRY interaction analysis confirmed post-PAM interaction of the PPL2 motif. Remarkably, additional regions with both TS and NTS post-PAM interactions were also identified in the two enzymes. One of them was located at the loop of β-strands 22 and 23 (β22-23), one at turn-helix 51 (TH51) and another one at TH53 (Fig. 2A). MD simulation of iSM predicted post-PAM interaction with the same motifs. However, MPL2 interaction was observed less frequently and with fewer ion- or H-bonds, whereas iSM TH51 showed stronger interaction with additional ion- and H-bonds when compared to SpRY (Fig. 2A and B). Alignment and comparison in 3D models created by AlphaFold2 (Supplementary Fig. S2 and Supplementary Table S1) [27] revealed that TH51 and TH53 contain two and one additional lysine residues in iSM compared to SpRY, respectively, whereas β22-23 does not (Fig. 2B). Of note, iSM helix α53 is extended by four amino acids, forming a proper 2.2-turn helix, whereas the same helix in Spy–Cas9 comprises only four amino acids and was thus not counted as a full α-helix by a former study [31].
In AF2 3D models, however, helix 53 was observed in different Cas9 variants (Supplementary Fig. S2). Thus, TH51 and TH53 were designated as potential candidate features performing post-PAM interaction in the absence of PL2 lysines in iSM.

Figure 2.


MD simulation of SpRY and iSM 5Y36 homology models bound to TLR targets. (A) Heatmaps of SpRY and iSM PID : post-PAM target DNA interaction analysis by FoldX. Averages of four 3D models with three MD runs each model are shown (details in “Materials and methods” section). Distance is shown in red (upper sections each strand), electrostatic interactions (ion bonds) in blue (middle sections each strand), and H-bonds in green (lower sections each strand). A value of one corresponds to an interaction detected in 2000 out of 2000 0.1 ns frames. Y-axes show 15 post-PAM (+1 to + 15), 4 PAM, and 3 protospacer nucleotides (−1 to −3) of TS or NTS, respectively. PID regions with post-PAM interactions are framed in rounded boxes. (B) Sums of all TS and NTS post-PAM interactions for each amino acid position in PIDs of SpRY and iSM. Ectopically swapped motifs of PL2 as well as TH51 and TH53 motifs subsequently inserted from iSM to SpRY are framed in pink. PID amino acids mutated in SpRY compared to wild-type Spy–Cas9 are indicated in bold.

The immediate post-PAM sequence can influence iSM TH51 and TH53 lysine activity

The two iSM TH51 (K1268 and K1271) and two TH53 (K1327 and K1328) lysines were substituted by alanine, creating single (AK-KK, KA-KK, KK-AK, and KK-KA), double (AA-KK and KK-AA), triple (KA-AA, AK-AA, AA-KA, and AA-AK), and quadruple mutants (AA-AA). On three tested targets (g + 10b, g + 57b, and g + 182b), single and double mutants showed reduced activities relative to iSM (Fig. 3A and Supplementary Fig. S3A). Strikingly, the triple mutants with one lysine remaining were on average 27% (±8%) as active as iSM at the g + 10b target, whereas this value dropped to 2% (±0.7%) and 9% (±5%) for the g + 57b and g + 182b targets, respectively (Fig. 3B and Supplementary Fig. S3B). Thus, at the g + 10b target, the activities of the single lysines approximately added up to full iSM activity, here designated as “cumulative mode” of action. At the g + 57b and g + 182b targets, however, sums of activities of individual lysines were lower than the activity of unmodified iSM, herein designated “cooperative mode.”

Figure 3.


Evaluation of iSM TH51 and TH53 lysine mutants on two reporter vectors with swapped post-PAM regions. (A) Activity of iSM single, double, and quadruple lysine-to-alanine mutants in TLR_AA human reporter cells, relative to iSM. TH51 residues 1268 and 1271 as well as TH53 residues 1327 and 1328 were mutated. (B) Activity of iSM triple lysine-to-alanine mutants on TLR_AA reporter, relative to iSM. (C) Analysis of iSM TH51 and TH53 lysine to arginine mutant, relative to iSM. (D) Schematic of swapped post-PAM nucleotides in TLR_pPAM reporter. Purple: g + 10b post-PAM (CCCGGTGTT), sunset-orange: g + 57b post-PAM (TGCTTGTTT), turquoise: g + 182b post-PAM (TTCACCCCA). (E) Efficiency of iSM with g + 10b, g + 57b, and g + 182b sgRNAs on TLR_AA and TLR_pPAM reporters, relative to g + 680t (a distant target without post-PAM modification). (F) Activity of iSM triple lysine-to-alanine mutants on TLR_pPAM reporter, relative to iSM. (G) Activity of iSM double lysine-to-alanine mutants on TLR_AA and TLR_pPAM, relative to iSM. Data are represented as ± SD from n = 3 (3A–C, 3F–G) or n = 6 (3E) biological replicates and significance was calculated using two-way ANOVA with Dunnett’s (3A and B, 3F) and Šídák’s (3C, 3E, 3G) multiple comparisons tests (****P ≤ 0.0001, ***P ≤ 0.001, **P ≤ 0.01, ns: P > 0.05).

PL2 lysines in SpCas9 can be functionally replaced by arginines [2]. Therefore, TH51 and TH53 lysines were replaced by arginines in iSM and its triple alanine mutants, creating quadruple (RR-RR) and single (RA-AA, AR-AA, AA-RA, and AA-AR) arginine mutants. The quadruple arginine mutant underperformed at investigated targets when compared to iSM (Fig. 3C and Supplementary Fig. S3C). Remarkably, however, comparison of single arginine mutants to iSM_RR-RR indicated a cumulative mode at the g + 10b target, while g + 57b and g + 182b showed patterns of a cooperative mode (Supplementary Fig. S3C and D).

To investigate whether cooperative and cumulative modes of TH51 and TH53 lysines depend on the immediate post-PAM region, nine PAM-adjacent post-PAM nucleotides of g + 10b and g + 57b showing TH51 and TH53 interaction in the MD simulation of iSM were swapped on the TLR_AA reporter vector (Fig. 2A and Supplementary Table S2). Notably, post-PAM positions + 8 and + 9 were identical at those two targets (Fig. 3D). In addition, the g + 10b post-PAM sequence was introduced next to the g + 182b target (TLR_pPAM, Fig. 3D). The sgRNA g + 680t binds further downstream from introduced modifications and was therefore employed for comparison and normalization of the two reporter cell populations. Strikingly, activities of iSM with g + 10b and g + 57b sgRNAs were both higher when their targets abutted the g + 10b post-PAM compared to the g + 57b post-PAM sequence, by factors of 1.23 (±0.08) and 2.26 (±0.05), respectively (Fig. 3E and Supplementary Fig. S3E). However, no significant difference was observed with the g + 182b sgRNA between the two reporters (Fig. 3E). The guides g + 9b and g + 56b targeting adjacent NAAN-PAMs one nucleotide upstream of g + 10b or g + 57b were also more effective when targets encompassed a g + 10b post-PAM sequence, by factors of 4.55 (± 0.16) and 3.15 (± 0.51), respectively (Supplementary Fig. S3F). Remarkably, cleavage with iSM nuclease and g + 183b sgRNA, targeting one nucleotide downstream of g + 182b, improved by a factor of 1.63 (±0.1) on the TLR_pPAM reporter harboring a g + 10b post-PAM compared to TLR_AA with a g + 182b post-PAM sequence (Supplementary Fig. S3F). The g + 350t sgRNA targets downstream of the modified region on the two reporters and, relative to activity of g + 680t, no significant difference was observed in the two reporter cell populations (Supplementary Fig. S3F).
Analysis of the iSM triple alanine mutants on TLR_pPAM revealed 8% (±2%) of iSM activity for g + 10b and 30% (±5%) for g + 57b, indicating that the cumulative and cooperative modes had inverted at those targets (Fig. 3F and Supplementary Fig. S3G). Again, no change was observed at the g + 182b target (Fig. 3F). iSM double alanine mutants with either only TH51 (iSM_KK-AA) or only TH53 (iSM_AA-KK) lysines showed a similar pattern, with 51% (±5%) or 50% (±4%) remaining activity when g + 10b or g + 57b targets abutted the g + 10b post-PAM and 27% (±3%) or 14% (±2%) when neighboring the g + 57b post-PAM, respectively (Fig. 3G and Supplementary Fig. S3H).
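The distinction between the cumulative and cooperative modes can be made concrete by summing the contributions of the four single lysines. The sketch below uses the mean triple-mutant activities reported in the text and multiplies them by four; the cutoff of 75% used to label a sum as approximately "full" activity is a hypothetical value chosen only for illustration, not a threshold defined by the study.

```python
# Mean activity (% of iSM) of triple alanine mutants, each retaining one lysine.
mean_single_lysine_activity = {
    ("TLR_AA", "g+10b"): 27,
    ("TLR_AA", "g+57b"): 2,
    ("TLR_AA", "g+182b"): 9,
    ("TLR_pPAM", "g+10b"): 8,
    ("TLR_pPAM", "g+57b"): 30,
}

N_LYSINES = 4        # K1268, K1271, K1327, K1328
FULL_CUTOFF = 75     # hypothetical cutoff (% of iSM) for "adds up to full activity"

def classify(mean_pct):
    """Sum the four single-lysine activities and label the mode."""
    summed = N_LYSINES * mean_pct
    mode = "cumulative" if summed >= FULL_CUTOFF else "cooperative"
    return summed, mode

modes = {key: classify(value) for key, value in mean_single_lysine_activity.items()}
for (reporter, guide), (summed, mode) in modes.items():
    print(f"{reporter:9s} {guide:7s} sum = {summed:3d}% of iSM -> {mode}")
```

Under this reading, g + 10b sums to about 108% on TLR_AA but only 32% on TLR_pPAM, while g + 57b rises from 8% to 120%, which is the inversion described above.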

To assess whether and how the presence of a lysine-rich PL2 might affect post-PAM behavior of the iSM TH51 and TH53 lysines, PPL2 was introduced into iSM double alanine mutants and the resulting constructs were compared with iSM_PPL2. Analysis with g + 10b and g + 57b sgRNAs revealed 95% (±5%) and 87% (±3%) activity relative to iSM_PPL2, respectively, when targets were adjacent to the g + 10b post-PAM sequence. Notably, those values declined to 68% (±5%) for g + 10b and to 33% (±6%) for g + 57b when the targets abutted the g + 57b post-PAM sequence (Supplementary Fig. S3I). These data confirm that iSM TH51 and TH53 lysines convey specific post-PAM interaction and mediate proper enzyme activity, in particular when a lysine-rich PL2 is absent.

The S. macacae TH51 motif can improve or diminish Cas9–SpRY activity, depending on the target

Introduction of PPL2 into iSM improved its overall activity (Fig. 1B and E). In turn, it was possible that post-PAM interacting TH51 and TH53 motifs from iSM might modulate Cas9–SpRY activity. Consequently, seven amino acids of the S. macacae TH51 (MTH51) motif were ectopically introduced into Cas9–SpRY (SpRY_MTH51, Supplementary Table S1). In vivo analysis at N(R/Y)NN-PAM targets in reporter cells revealed that SpRY_MTH51 activity relative to SpRY either improved or remained unchanged at approximately two-thirds of targets, by a mean factor of 1.15, but was compromised at the other third, by a mean factor of 0.79 (Fig. 4A and Supplementary Fig. S4A–C). Furthermore, eight or twelve amino acids of the S. macacae TH53 motif (MTH53_8 and MTH53_12) were introduced to replace four or eight amino acids at an ectopic site in Cas9–SpRY, respectively (Fig. 2B and Supplementary Table S1). Both TH53 engineered enzymes showed impaired activity at analyzed targets (Fig. 4B and C, and Supplementary Fig. S4D and E). Together, ectopic replacement of the TH51 motif was successfully employed to improve activity at some targets, while engineering at TH53 proved more challenging. Consequently, further efforts were focused on TH51.

Figure 4.


Ectopic insertion of iSM-derived MTH51 and MTH53 motifs into Cas9–SpRY. (A) Analysis of SpRY_MTH51 with 25 sgRNAs targeting N(R/Y)NN-PAMs (in brackets), relative to SpRY (left) and separation into groups of targets with comparable or higher efficiency and lower efficiency (right). (B) Activity of SpRY_MTH53_8 with ectopic insertion of eight amino acids and (C) SpRY_MTH53_12 with twelve amino acids from iSM TH53, relative to SpRY. Data were analyzed using two-tailed unpaired Mann–Whitney test, with dashed lines showing median and fine dotted lines at quartiles. Relative guide efficiencies were determined using n = 3 biological replicates, as represented by corresponding colored dots (****P ≤ 0.0001 and **P ≤ 0.01).

The Abiotrophia TH51 motif improves overall Cas9–SpRY and Cas9–SpG nuclease and prime editor activity

Absence of the PL2 lysines in the S. macacae PID likely impacts its nuclease efficiency and its lysine-rich TH51 and TH53 motifs might provide post-PAM interaction by alternative means (Fig. 2A and B). Therefore, other Cas9 variants belonging to the same group as Spy–Cas9 and Smac–Cas9 could harbor activity-modifying TH51 motifs as well, provided that their PL2 motifs comprised fewer amino acids with positive charges. In order to examine a set of comparable enzymes, cluster 1 group 13 (C1 G13) [5] Cas9 variants were filtered using the following five criteria: (i) they have a total length of at least 1200 amino acids, (ii) they harbor D10 (Spy–Cas9 numbering) in the RuvC domain as well as (iii) H840 in the HNH domain and (iv) contain a highly conserved YGG motif in PID loop 1 together with (v) a QSITGLYETR motif between β26 and β27 at the end of the PID (Supplementary Fig. S5A and B, and Supplementary Table S1). Applying those criteria, 73.5% of C1 G13 PID sequences were MAFFT aligned. For better prediction of 3-dimensional positioning of the amino acids, 21 Cas9 variants were chosen as representatives and subjected to neural network-based structure prediction using AF2 (Supplementary Fig. S2). MAFFT alignments of PL2 and TH51 motifs were adjusted accordingly (Supplementary Table S1).
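The five filtering criteria can be expressed as a simple predicate. The sketch below is illustrative only: the record type `Cas9Record` and its fields are hypothetical, and it assumes that the residues corresponding to Spy–Cas9 positions 10 and 840, as well as the relevant PID regions, have already been extracted from an alignment. The toy records are invented placeholders, not real Cas9 sequences.

```python
from dataclasses import dataclass

@dataclass
class Cas9Record:
    name: str
    sequence: str            # full-length amino acid sequence
    residue_at_spy_10: str   # residue aligned to Spy-Cas9 position 10 (RuvC)
    residue_at_spy_840: str  # residue aligned to Spy-Cas9 position 840 (HNH)
    pid_loop1: str           # sequence of PID loop 1
    pid_end: str             # region between beta26 and beta27

def passes_filter(rec: Cas9Record) -> bool:
    """Apply criteria (i) to (v) from the text to one sequence record."""
    return (
        len(rec.sequence) >= 1200             # (i) minimum length
        and rec.residue_at_spy_10 == "D"      # (ii) D10 present
        and rec.residue_at_spy_840 == "H"     # (iii) H840 present
        and "YGG" in rec.pid_loop1            # (iv) conserved YGG motif
        and "QSITGLYETR" in rec.pid_end       # (v) conserved C-terminal motif
    )

# Invented placeholder records for demonstration.
records = [
    Cas9Record("toy_full", "M" * 1368, "D", "H", "AYGGF", "IDQSITGLYETRID"),
    Cas9Record("toy_short", "M" * 1100, "D", "H", "AYGGF", "IDQSITGLYETRID"),
    Cas9Record("toy_no_hnh", "M" * 1368, "D", "A", "AYGGF", "IDQSITGLYETRID"),
]
selected = [rec.name for rec in records if passes_filter(rec)]
print(selected)
```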

Eight TH51 motifs from Cas9 variants with four or fewer lysines and arginines (R + K) in PL2 and with at least two R + K in TH51 were chosen for ectopic introduction into Cas9–SpRY followed by analysis in reporter cells (Supplementary Table S1). Overall activity was first assessed at g + 10b, g + 57b, and g + 182b targets and the six best-performing motifs were then analyzed on additional targets (Supplementary Fig. S5C and D). Of those, SpRY_ATH51 (from Abiotrophia) and SpRY_SlTH51 (from Streptococcus lutetiensis) showed the highest mean gains of activity relative to Cas9–SpRY and were designated for in-depth analysis on 34 targets (Fig. 5A and B, and Supplementary Fig. S5E–H). Strikingly, introduction of SlTH51 led to a decrease of Cas9–SpRY activity at all of the targets which were negatively affected by MTH51 (Fig. 5B and Supplementary Fig. S5E–H). In contrast, ATH51 either had no effect or improved Cas9–SpRY activity at analyzed targets, by a mean factor of 1.17 and up to 1.5 (Fig. 5A and B, and Supplementary Fig. S5E–H). Analyzing guide efficiencies with Cas9–SpRY relative to g + 10b suggested that less active guides were significantly overrepresented in the group where introduction of MTH51 (and SlTH51) decreased nuclease efficiency (Fig. 5C). MTH51, SlTH51, and ATH51 motifs were then inserted into all-in-one piggyBac transposon vectors harboring a Cas9–SpRY prime editor and pegRNAs for correction of the mKO2 mutant gene in the TLR_AA vector. While SpRY_MTH51-PE showed decreased activity at two targets, the effect of SlTH51 was neutral and ATH51 improved PE efficiency (Fig. 5D and Supplementary Fig. S5I). In further evaluation of SpRY_ATH51-PE on another reporter vector with previously published NGGN–PAM targeting pegRNAs (TLR_A2G) [17], up to 1.3-fold improvement of PE-efficiency was measured (Fig. 5E and Supplementary Fig. S5H).
TXTL-based PAM analysis [29, 30] of SpRY nuclease and its TH51 hybrids showed that all enzymes address similar PAMs without strong nucleotide preference (Fig. 5F and Supplementary Fig. S5K). Finally, the ATH51 motif was introduced into the NGNN–PAM targeting Cas9–SpG enzyme, which is the parental version of Cas9–SpRY [7]. SpG_ATH51 nuclease activity was analyzed using NGNN- as well as previously published NGGN-targeting guides [17], revealing a mean improvement of 1.5-fold and up to 2.5-fold, whereas SpG_ATH51-PE activity improved by a mean factor of 2.9 and up to 4.6-fold compared to SpG-PE (Supplementary Fig. S5L–O). In summary, engineering of Cas9–SpRY and SpG with lysine-rich TH51 motifs from other Cas9 variants can improve their activity and, specifically, TH51 from Abiotrophia boosts overall nuclease and prime-editing efficiency.
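The rule used to pick donor TH51 motifs (four or fewer R + K in PL2 and at least two R + K in TH51) amounts to counting positively charged residues. The Python sketch below illustrates that rule under stated assumptions: the function names are hypothetical, the TH51 strings are the motifs quoted in the Discussion with the flanking residue and slashes omitted (an assumed reading of that notation), and the PL2 string is an invented placeholder, not a sequence from this study.

```python
def count_rk(motif: str) -> int:
    """Count arginine (R) and lysine (K) residues in a motif."""
    return sum(1 for residue in motif.upper() if residue in "RK")


def is_th51_donor(pl2: str, th51: str,
                  max_pl2_rk: int = 4, min_th51_rk: int = 2) -> bool:
    """Apply the selection rule: few R + K in PL2, at least two in TH51."""
    return count_rk(pl2) <= max_pl2_rk and count_rk(th51) >= min_th51_rk


# TH51 motifs as written in the Discussion, without flanking residue/slashes.
th51_motifs = {"MTH51": "KLGKEH", "SlTH51": "KLIDKKL", "ATH51": "IKMPKV"}
placeholder_pl2 = "GSTNE"  # hypothetical lysine-free PL2, for illustration only

for name, motif in th51_motifs.items():
    print(name, count_rk(motif), is_th51_donor(placeholder_pl2, motif))
```

The thresholds are exposed as parameters so that the same check can be repeated with stricter or looser cut-offs.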

Figure 5.

Ectopic insertion of ATH51 and SlTH51 motifs into Cas9–SpRY. (A) Analysis of SpRY_MTH51, SpRY_ATH51, and SpRY_SlTH51 at 24 N(R/Y)NN-PAM targets where incorporation of MTH51 results in comparable or higher efficiency and (B) at 10 targets with lower efficiency, relative to SpRY. (C) SpRY guide efficiencies relative to g + 10b. (D) Efficiency of NAAN–PAM targeting SpRY prime editors (PE) with MTH51, ATH51, and SlTH51 motifs on TLR_AA and (E) of NGGN–PAM targeting PE with ATH51 on TLR_A2G, relative to SpRY-PE. (F) Log2-fold depletion of a 5 nt PAM library after TXTL assay of SpRY and TH51 chimeric enzymes using g + 10b as target. Data in Fig. 5A and B were analyzed using Kruskal–Wallis multiple comparisons test, while data in Fig. 5C were analyzed using two-tailed unpaired Mann–Whitney test, with dashed lines showing the median and fine dotted lines the quartiles. Relative guide efficiencies were determined using n = 3 biological replicates, as represented by corresponding colored dots. Data in Fig. 5D and E are represented as mean ± SD from n = 3 biological replicates and significance was calculated using two-way ANOVA with Dunnett’s (5D) and Šídák’s (5E) multiple comparisons tests (****P ≤ 0.0001; ***P ≤ 0.001; **P ≤ 0.01; *P ≤ 0.05; ns: P > 0.05).

Discussion

Several modifications that impair or improve the activity or specificity of Spy–Cas9 have been described, many residing in the enzyme’s PID [2, 3, 32–34]. Here, we describe for the first time that lysine-rich motifs at the turn and beginning of α-helix 51 (TH51) can improve the activity of Cas9–SpRY, a near-PAMless variant of Spy–Cas9 [7]. Moreover, this report presents evidence that the lysine-rich TH51 motif from S. macacae mediates previously uncharted post-PAM interaction. Remarkably, we found that TH51 motifs from S. macacae (MTH51; C/KLG/KEH) and S. lutetiensis (SlTH51; T/KLID/KKL) Cas9, which appear rather diverse at first glance, modified SpRY activity in a similar manner, having a neutral or boosting effect at around two thirds of analyzed targets and decreasing activity at the other third. Among the 34 analyzed targets, only g + 20t revealed opposite behavior of the two motifs, with MTH51 boosting activity and SlTH51 compromising it (Supplementary Fig. S5E). This was in stark contrast to the Abiotrophia TH51 motif (ATH51; N/IKM/PKV), which showed either neutral effects or boosted activity at the investigated targets, with the exception of g + 106b, where a slight but nonsignificant activity drop was observed (Supplementary Fig. S5F). Notably, MTH51 and SlTH51 motifs share three amino acids: a lysine followed by a leucine at the beginning of the turn and a lysine at the first position of α-helix 51. In comparison, the two lysines of ATH51 are shifted by one position each, one residing at position two of the turn and the other at position two of α-helix 51 (Supplementary Fig. S2 and Supplementary Table S1). Possibly, MTH51 and SlTH51 motifs acted selectively against certain targets when employed in SpRY hybrid enzymes. However, alignment and comparison of the sgRNA target sequences tested in this study did not reveal a clear pattern regarding target selection by those TH51 motifs (Supplementary Table S3).
Currently, our data indicate that activity of guides with lower efficiency can be further compromised when employed with SpRY_MTH51 and SlTH51 (Fig. 5C).

All three tested TH51 motifs mediated improved SpRY nuclease activity at the g + 10b target, and ATH51 also boosted prime editor efficiency when employing a g + 10b targeting pegRNA, while MTH51 and SlTH51 did not (Fig. 5D and Supplementary Fig. S5E). The PE2 prime editors utilized here [17, 35] are based on the H840A nickase and thus cleave the NTS. MD simulation data suggested that the MTH51 motif in iSM interacts with the first five post-PAM TS nucleotides, while the majority of ion- and H-bonds with the NTS were observed at post-PAM positions four to seven (Fig. 2A). In contrast, overall weaker TS interaction was found with TH51 from SpRY and no ion- or H-bonds were formed with the NTS. Therefore, the presence of MTH51, ATH51, and SlTH51 motifs might also affect TS and NTS interaction in SpRY hybrids, which could in turn have consequences for nuclease and nickase activity. However, additional studies will be needed to clarify whether and how TH51 motifs modify TS and NTS cleavage.

Interaction analysis of iSM using MD simulation, together with data generated in reporter cells with iSM and its mutants, hints at other PID motifs interacting with the post-PAM sequence when a lysine-rich loop is absent. Notably, MD simulation suggested minor interaction of the first three NTS post-PAM nucleotides with residues in PID loop 1, as well as TS and NTS interaction around the β22-β23 loop. Here, we did not follow up on those regions but instead focused on motifs where iSM harbors additional lysines when compared to SpRY. By that criterion, the TH51 and TH53 regions stand out, harboring two lysine residues each, and all four were found to be essential for proper iSM functionality. A previous study demonstrated that Spy–Cas9 PL2 lysines can be replaced with arginines [2] and described a mechanism of nonspecific interaction of the positively charged amino acids with the negatively charged DNA backbone in the post-PAM region. A different study showed that nonspecific interactions between a positive-charge-enriched α-helix in the REC2 domain of Cas12a and post-PAM DNA mediate target search and assist DNA cleavage [36]. The present report demonstrates that replacing the TH51 and TH53 lysines in iSM with arginines compromises the enzyme’s overall function (Fig. 3C). Moreover, swapping of the immediate post-PAM sequence between targets affected the activity of iSM as well as the behavior of its separate lysine residues. Specifically, evidence is presented for two different modes of action of iSM lysines, depending on both the target and its post-PAM sequence. In cumulative mode, the activities measured for the separate lysines approximately added up to the activity of unaltered iSM.
In contrast, a cooperative mode was observed for other targets, where individual lysine activities did not add up to full activity, indicating that individual TH51 and TH53 lysines performed better when other lysines were present as well. Here, five out of six analyzed NAAN–PAM targets were sensitive to modifications in the post-PAM sequence, whereas the g + 182b target was not (Fig. 3E and Supplementary Fig. S3F). Together, our data hint at the possibility that the post-PAM interaction mechanism(s) of TH51 and TH53 lysines might differ from the nonspecific interactions described for the Spy–Cas9 PL2 and the Cas12a REC2 motif [2, 36]. Notably, sequence-specific interaction between TH51 and TH53 lysines and the post-PAM DNA does not exclude a contribution of nonspecific interactions at the DNA backbone. We also investigated whether the post-PAM-dependent modes of TH51 and TH53 lysine action were influenced by the presence of a lysine-rich loop. Strikingly, iSM_PPL2 was overall less sensitive to TH51 and TH53 lysine mutation than iSM, and cumulative and cooperative modes were no longer strictly enforced in the presence of PPL2 (Supplementary Fig. S3I). However, iSM_PPL2-AA-AA, which carries mutations at all four TH51 and TH53 lysines, retained activities of 67% (± 5%) for g + 10b and 43% (± 4%) for g + 57b relative to iSM_PPL2 when their targets abutted the g + 10b post-PAM, compared to 16% (± 1%) and 6% (± 0.04%) next to a g + 57b post-PAM sequence, respectively (Supplementary Fig. S3I). In contrast, iSM-AA-AA without the lysine-rich loop showed 9% (± 2%) remaining activity when the g + 10b and g + 57b targets were adjacent to the g + 10b post-PAM (Fig. 1A, B, and F, and Supplementary Fig. S1C). Thus, cumulative and cooperative modes might be affected by a different base activity (i.e. without TH51 and TH53 lysines) of iSM_PPL2-AA-AA compared to iSM-AA-AA at those targets (Supplementary Fig. S3I).
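The distinction between the cumulative and cooperative modes can be made concrete by comparing the sum of single-lysine activities with the activity of unaltered iSM. The Python sketch below is a minimal illustration, not the analysis performed in this study: the function name, the 15% tolerance, and all numbers in the usage lines are hypothetical, and activities are assumed to be expressed relative to unaltered iSM on the same target.

```python
def classify_mode(single_lysine_activities, full_activity, tolerance=0.15):
    """Classify lysine action on one target.

    'cumulative'   : single-lysine activities sum to about the full activity
    'cooperative'  : the sum falls clearly short of the full activity
    'unclassified' : the sum clearly exceeds the full activity

    The tolerance is an arbitrary illustrative cut-off, not a study value.
    """
    if full_activity <= 0:
        raise ValueError("full_activity must be positive")
    ratio = sum(single_lysine_activities) / full_activity
    if abs(ratio - 1.0) <= tolerance:
        return "cumulative"
    if ratio < 1.0 - tolerance:
        return "cooperative"
    return "unclassified"


# Hypothetical activities of four single-lysine variants (not measured data).
print(classify_mode([0.30, 0.25, 0.20, 0.25], full_activity=1.0))
print(classify_mode([0.05, 0.10, 0.05, 0.10], full_activity=1.0))
```

In practice, replicate variability would have to be taken into account before assigning a target to either mode.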

Ectopic insertion of eight or twelve amino acids of MTH53 into SpRY resulted in lowered activity at six tested NAAN–PAM targets. Importantly, TH53 is located directly adjacent to crucial PAM-recognizing residues between β-sheet 25 and α-helix 53, three of which had been modified in the engineering of Cas9–SpRY for PAM relaxation [7]. MD simulation of SpRY suggested that the K1340 lysine at the turn towards α-helix 53 conveys PAM and post-PAM interaction as well (Fig. 2A and Supplementary Table S2). The finding of a previous study that a K1340T mutation compromised the function of wild-type Spy–Cas9 [33] agrees with the notion that modifications at TH53 can affect enzyme activity.

It is conceivable that higher overall activity of ATH51-bearing SpRY and SpG hybrid enzymes could elicit not only higher on-target activity but also elevated off-target effects, an issue that is beyond the scope of the present study. Both SpRY and SpG also have high-fidelity (HF) variants, which deliver higher genome editing accuracy [7]. Interestingly, these variants carry four modifications outside of the PID, while our study explores motifs within the PID. Additional investigation is needed to assess whether SpRY–HF and SpG–HF can also benefit from ectopic insertion of ATH51 and what the consequences for their accuracy would be. State-of-the-art methods such as DISCOVER-Seq [37] or CIRCLE-seq [38] could be used for in-depth investigation of this issue.

Moreover, this study was conducted using lentiviral reporter vectors and, thus, nonendogenous DNA elements were examined. While overall beneficial effects of ATH51 were demonstrated, some targets were also nonpermissive to TH51 modification. Thus, novel hybrid enzymes will require careful testing at intended endogenous genomic target sites.

In summary, the Smac–Cas9 PID of iSM interacts with the post-PAM target DNA in a different manner than the Spy–Cas9 derivative SpRY. In particular, evidence is presented for specific interaction of iSM with the immediate post-PAM sequence. Deciphering the exact mechanism(s) behind this observation will likely have consequences for the prediction of both on- and off-target activities of the iSM enzyme. Furthermore, this report highlights that different TH51 motifs can modify the activity of Cas9–SpRY, and a total of nine naturally occurring TH51 motifs were examined. We envision that it should also be possible to design artificial TH51 motifs for Cas9–SpRY and other related nucleases, potentially with desired target-specific activities.

Statistical information

All error bars represent positive and negative standard deviation calculated from at least n = 3 biological replicates. A biological replicate was defined as a spatially separated transfection. Statistical evaluation of data presented in columns was performed using one-way ANOVA with Tukey’s multiple comparisons test, or Kruskal–Wallis test with multiple comparisons for data where Gaussian distribution was not assumed, whereas data shown in groups were analyzed using two-way ANOVA with the multiple comparisons tests recommended for the respective datasets by GraphPad Prism software. All multiple comparisons tests were performed with a family-wise α threshold of 0.05 (95% confidence level). Data presented as two single groups were analyzed by two-tailed unpaired Mann–Whitney test when Gaussian distribution was not assumed. For all main and supplementary figures, F values and degrees of freedom for one-way and two-way ANOVA, sums of ranks and Mann–Whitney U values, as well as Kruskal–Wallis statistics are provided in Supplementary Table S4.
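For readers who wish to reproduce the descriptive statistics outside of GraphPad Prism, the Python sketch below shows how a mean ± SD and a Mann–Whitney U value can be computed with the standard library alone. It is a minimal illustration under stated assumptions: the function names are hypothetical, the SD is assumed to be the sample standard deviation (n − 1 denominator), U is obtained by direct pair counting with ties scored as 0.5, and no P value is calculated.

```python
from statistics import mean, stdev


def mean_sd(replicates):
    """Return (mean, sample SD) for at least two biological replicates."""
    if len(replicates) < 2:
        raise ValueError("at least two replicates are required")
    return mean(replicates), stdev(replicates)


def mann_whitney_u(group_a, group_b):
    """Return the Mann-Whitney U statistic (the smaller of U_a and U_b).

    U_a counts the pairs (a, b) in which a > b; ties contribute 0.5.
    """
    u_a = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical relative efficiencies from n = 3 replicates (not study data).
print(mean_sd([1.0, 2.0, 3.0]))
print(mann_whitney_u([1.0, 3.0, 5.0], [2.0, 4.0, 6.0]))
```

The U values reported in Supplementary Table S4 were obtained with GraphPad Prism and remain the reference.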

Supplementary Material

gkaf782_Supplemental_Files

Acknowledgements

We thank Jurij Pečar from EMBL IT Services, Heidelberg, Germany, for the maintenance of the EMBL Heidelberg HPC cluster used to carry out HM, MD, and AF2 computations. Dr. Jörn Stitz from the Faculty of Applied Sciences at TH Cologne, Leverkusen, Germany, proofread the manuscript. Dr. Kerstin Haase from the UCL Cancer Institute, London, UK, assisted with alignment of PIDs from Cas9 variants MGYP000843596116 and WP_315271523.1. Dr. Matthias Ballmaier from the MHH Zentrale Forschungseinrichtung Zellsortierung, Hannover, Germany, provided his expertise for flow cytometry-based purification of the 293-TLR reporter cell populations. Finally, we thank Susanne Alfken, MTA at the MHH-based Research Group Translational Hepatology and Stem Cell Biology, Hannover, Germany, for her assistance with FACS-based analysis of transfected 293-TLR cells.

Author contributions: R.E.: Conceptualization of study, design of experiments, data analysis and interpretation, and writing of manuscript. T.H.: Conceptualization of study, 3D homology modeling, neural network-based structure prediction using AlphaFold2, MD simulation and interaction analysis, data analysis and interpretation, and writing of manuscript. O.D. and C.L.B.: TXTL-based PAM analysis, and writing and proofreading of manuscript. M.O., M.A.Z., P.W., and S.A.M.: Conduct of molecular cloning work and TLR assays. M.F.E., H.N., and A.P.: Data analysis and interpretation. T.C.: Data analysis and interpretation, design and proofreading of manuscript, and final approval of manuscript. All authors reviewed the manuscript and have agreed to the published version.

Contributor Information

Reto Eggenschwiler, Research Group Translational Hepatology and Stem Cell Biology, Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30625 Hannover, Germany; REBIRTH-Research Center for Translational Regenerative Medicine, Hannover Medical School, 30625 Hannover, Germany.

Thomas Hoffmann, Molecular Systems Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany.

Oleg Dmytrenko, Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, 97080 Würzburg, Germany.

Mika Opitz, Research Group Translational Hepatology and Stem Cell Biology, Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30625 Hannover, Germany; REBIRTH-Research Center for Translational Regenerative Medicine, Hannover Medical School, 30625 Hannover, Germany.

Marlene Ackel-Zakour, Research Group Translational Hepatology and Stem Cell Biology, Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30625 Hannover, Germany; REBIRTH-Research Center for Translational Regenerative Medicine, Hannover Medical School, 30625 Hannover, Germany.

Pascal Wang, Research Group Translational Hepatology and Stem Cell Biology, Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30625 Hannover, Germany; REBIRTH-Research Center for Translational Regenerative Medicine, Hannover Medical School, 30625 Hannover, Germany.

Shannon A McCallan, Research Group Translational Hepatology and Stem Cell Biology, Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30625 Hannover, Germany; REBIRTH-Research Center for Translational Regenerative Medicine, Hannover Medical School, 30625 Hannover, Germany.

Mariane Fráguas-Eggenschwiler, Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30625 Hannover, Germany.

Heiner Niemann, Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30625 Hannover, Germany.

Atanas Patronov, Discovery Sciences, R&D, AstraZeneca Gothenburg, 43183 Mölndal, Sweden.

Chase L Beisel, Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, 97080 Würzburg, Germany; Medical Faculty, University of Würzburg, 97080 Würzburg, Germany.

Tobias Cantz, Research Group Translational Hepatology and Stem Cell Biology, Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, 30625 Hannover, Germany; REBIRTH-Research Center for Translational Regenerative Medicine, Hannover Medical School, 30625 Hannover, Germany.

Supplementary data

Supplementary Data is available at NAR online.

Conflict of interest

C.L.B. is a co-founder and officer of Leopard Biosciences, co-founder and Scientific Advisor to Locus Biosciences, and Scientific Advisor to Benson Hill.

Funding

Parts of this study were supported by grants from the German Federal Ministry for Education and Research (HiChol, grant 01GM2204C; NANoSoGT, grant 01GP2205C) and the Ministry of Science and Culture of Lower Saxony (REBIRTH, grant ZN3440) to T.C. and through a European Research Council Consolidator grant (865973) awarded to C.L.B. Funding to pay the Open Access publication charges for this article was provided by Institute LOM.

Data availability

Raw FCS-files of FACS datasets generated and/or analyzed during the current study are available from the corresponding authors on reasonable request. Full trajectories of MD simulations are available from the corresponding authors on reasonable request with provision of sufficient data storage space at the requester’s discretion. PAM library raw sequencing data are available at https://syncandshare.desy.de/index.php/s/TamfYwkG3WA9LwT. AlphaFold2 models are available in ModelArchive (www.modelarchive.org) with the accession codes ma-5anej, ma-5cudq, ma-b4tf9, ma-rtuba, ma-e85zs, ma-l9x96, ma-4xd2v, ma-n9q3t, ma-eoc8k, ma-2wcxv, ma-16d4c, ma-emq7p, ma-8v2fs, ma-yikje, ma-kn1v5, ma-yr1l4, ma-bfqn3, ma-9sg4q, ma-n5okj, ma-j6be1, ma-jnqyu, ma-c7ngg, ma-ni52n, ma-i1afh, ma-xa6mp, and homology models are available in ModelArchive with the accession codes ma-g8dr2, ma-ww498, ma-ck81q, ma-ex6um, ma-8xk9d, ma-y71uv, ma-o6d2j, ma-dfc41. PB-PE SpG, PB-PE SpRY, iSM_PPL2-IP, iSM-IP, SpRY_ATH51-IP, SpRY_MTH51-IP, SpRY_SlTH51-IP, SpRY-IP, PB-PE iSM, PB-PE iSM_PPL2, PB-PE SpRY_ATH51, PB-PE SpG_ATH51, SpG-IP, and SpG_ATH51-IP have been deposited at Addgene (#179081; #179082; #235989-235996; #235998; #242072-242074), whereas all other plasmids and cell lines used in this study are available from the corresponding authors on reasonable request.

References

  • 1. Yang M, Sun R, Deng P et al. Nonspecific interactions between SpCas9 and dsDNA sites located downstream of the PAM mediate facilitated diffusion to accelerate target search. Chem Sci. 2021; 12:12776–84. 10.1039/D1SC02633J.
  • 2. Zhang Q, Chen Z, Wang F et al. Efficient DNA interrogation of SpCas9 governed by its electrostatic interaction with DNA beyond the PAM and protospacer. Nucleic Acids Res. 2021; 49:12433–44. 10.1093/nar/gkab1139.
  • 3. Chatterjee P, Lee J, Nip L et al. A Cas9 with PAM recognition for adenine dinucleotides. Nat Commun. 2020; 11:2474. 10.1038/s41467-020-16117-8.
  • 4. Mosterd C, Moineau S. Characterization of a Type II-A CRISPR–Cas system in Streptococcus mutans. mSphere. 2020; 5:e00235-20. 10.1128/mSphere.00235-20.
  • 5. Gasiunas G, Young JK, Karvelis T et al. A catalogue of biochemically diverse CRISPR–Cas9 orthologs. Nat Commun. 2020; 11:5512. 10.1038/s41467-020-19344-1.
  • 6. Shields RC, Walker AR, Maricic N et al. Repurposing the Streptococcus mutans CRISPR–Cas9 system to understand essential gene function. PLoS Pathog. 2020; 16:e1008344. 10.1371/journal.ppat.1008344.
  • 7. Walton RT, Christie KA, Whittaker MN et al. Unconstrained genome targeting with near-PAMless engineered CRISPR–Cas9 variants. Science. 2020; 368:290–6. 10.1126/science.aba8853.
  • 8. Zhao L, Koseki SRT, Silverstein RA et al. PAM-flexible genome editing with an engineered chimeric Cas9. Nat Commun. 2023; 14:6175. 10.1038/s41467-023-41829-y.
  • 9. Collias D, Beisel CL. CRISPR technologies and the search for the PAM-free nuclease. Nat Commun. 2021; 12:555. 10.1038/s41467-020-20633-y.
  • 10. Huai C, Li G, Yao R et al. Structural insights into DNA cleavage activation of CRISPR–Cas9 system. Nat Commun. 2017; 8:1375. 10.1038/s41467-017-01496-2.
  • 11. Zhu H, Wang L, Wang Y et al. Directed-evolution mutations enhance DNA-binding affinity and protein stability of the adenine base editor ABE8e. Cell Mol Life Sci. 2024; 81:257. 10.1007/s00018-024-05263-7.
  • 12. Chen Y, Li Y, Li P et al. Catching CRISPR–Cas9 in action. J Chem Theory Comput. 2025; 21:5023–36. 10.1021/acs.jctc.5c00165.
  • 13. Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol. 1993; 234:779–815. 10.1006/jmbi.1993.1626.
  • 14. Pettersen EF, Goddard TD, Huang CC et al. UCSF Chimera—a visualization system for exploratory research and analysis. J Comput Chem. 2004; 25:1605–12. 10.1002/jcc.20084.
  • 15. Delgado J, Radusky LG, Cianferoni D et al. FoldX 5.0: working with RNA, small molecules and a new graphical interface. Bioinformatics. 2019; 35:4168–9. 10.1093/bioinformatics/btz184.
  • 16. Haft DH, Selengut J, Mongodin EF et al. A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput Biol. 2005; 1:e60. 10.1371/journal.pcbi.0010060.
  • 17. Eggenschwiler R, Gschwendtberger T, Felski C et al. A selectable all-in-one CRISPR prime editing piggyBac transposon allows for highly efficient gene editing in human cell lines. Sci Rep. 2021; 11:22154. 10.1038/s41598-021-01689-2.
  • 18. Van Der Spoel D, Lindahl E, Hess B et al. GROMACS: fast, flexible, and free. J Comput Chem. 2005; 26:1701–18. 10.1002/jcc.20291.
  • 19. Lindorff-Larsen K, Piana S, Palmo K et al. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins. 2010; 78:1950–8. 10.1002/prot.22711.
  • 20. Mark P, Nilsson L. Structure and dynamics of liquid water with different long-range interaction truncation and temperature control methods in molecular dynamics simulations. J Comput Chem. 2002; 23:1211–9. 10.1002/jcc.10117.
  • 21. Hess B. P-LINCS: a parallel linear constraint solver for molecular simulation. J Chem Theory Comput. 2008; 4:116–22. 10.1021/ct700200b.
  • 22. York DM, Wlodawer A, Pedersen LG et al. Atomic-level accuracy in simulations of large protein crystals. Proc Natl Acad Sci USA. 1994; 91:8715–8. 10.1073/pnas.91.18.8715.
  • 23. Martonak R, Laio A, Parrinello M. Predicting crystal structures: the Parrinello–Rahman method revisited. Phys Rev Lett. 2003; 90:075503. 10.1103/PhysRevLett.90.075503.
  • 24. Bussi G, Donadio D, Parrinello M. Canonical sampling through velocity rescaling. J Chem Phys. 2007; 126:014101. 10.1063/1.2408420.
  • 25. Goga N, Rzepiela AJ, de Vries AH et al. Efficient algorithms for langevin and DPD dynamics. J Chem Theory Comput. 2012; 8:3637–49. 10.1021/ct3000876.
  • 26. Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019; 20:1160–6. 10.1093/bib/bbx108.
  • 27. Jumper J, Evans R, Pritzel A et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021; 596:583–9. 10.1038/s41586-021-03819-2.
  • 28. Yuan Z, Zhang M, Chang L et al. Discovery of a novel SHP2 allosteric inhibitor using virtual screening, FMO calculation, and molecular dynamic simulation. J Mol Model. 2024; 30:131. 10.1007/s00894-024-05935-y.
  • 29. Marshall R, Maxwell CS, Collins SP et al. Rapid and scalable characterization of CRISPR technologies using an E. coli cell-free transcription-translation system. Mol Cell. 2018; 69:146–57. 10.1016/j.molcel.2017.12.007.
  • 30. Dmytrenko O, Neumann GC, Hallmark T et al. Cas12a2 elicits abortive infection through RNA-triggered destruction of dsDNA. Nature. 2023; 613:588–94. 10.1038/s41586-022-05559-3.
  • 31. Nishimasu H, Ran FA, Hsu PD et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell. 2014; 156:935–49. 10.1016/j.cell.2014.02.001.
  • 32. Kleinstiver BP, Prew MS, Tsai SQ et al. Engineered CRISPR–Cas9 nucleases with altered PAM specificities. Nature. 2015; 523:481–5. 10.1038/nature14592.
  • 33. Spencer JM, Zhang X. Deep mutational scanning of S. pyogenes Cas9 reveals important functional domains. Sci Rep. 2017; 7:16836. 10.1038/s41598-017-17081-y.
  • 34. Malbranke C, Rostain W, Depardieu F et al. Computational design of novel Cas9 PAM-interacting domains using evolution-based modelling and structural quality assessment. PLoS Comput Biol. 2023; 19:e1011621. 10.1371/journal.pcbi.1011621.
  • 35. Anzalone AV, Randolph PB, Davis JR et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature. 2019; 576:149–57. 10.1038/s41586-019-1711-4.
  • 36. Sun R, Zhao Y, Wang W et al. Nonspecific interactions between Cas12a and dsDNA located downstream of the PAM mediate target search and assist AsCas12a for DNA cleavage. Chem Sci. 2023; 14:3839–51. 10.1039/D2SC05463A.
  • 37. Zou RS, Liu Y, Gaido OER et al. Improving the sensitivity of in vivo CRISPR off-target detection with DISCOVER-Seq. Nat Methods. 2023; 20:706–13. 10.1038/s41592-023-01840-z.
  • 38. Lazzarotto CR, Nguyen NT, Tang X et al. Defining CRISPR–Cas9 genome-wide nuclease activities with CIRCLE-seq. Nat Protoc. 2018; 13:2615–42. 10.1038/s41596-018-0055-0.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
