Skip to main content
Nature Portfolio logoLink to Nature Portfolio
. 2022 Feb 10;24(2):268–278. doi: 10.1038/s41556-021-00836-1

dCas9-based gene editing for cleavage-free genomic knock-in of long sequences

Chengkun Wang 1,2, Yuanhao Qu 1,2,#, Jason K W Cheng 1,2,#, Nicholas W Hughes 1,2, Qianhe Zhang 1,2, Mengdi Wang 3, Le Cong 1,2,
PMCID: PMC8843813  NIHMSID: NIHMS1766496  PMID: 35145221

Abstract

Gene editing is a powerful tool for genome and cell engineering. Exemplified by CRISPR–Cas, gene editing could cause DNA damage and trigger DNA repair processes that are often error-prone. Such unwanted mutations and safety concerns can be exacerbated when altering long sequences. Here we couple microbial single-strand annealing proteins (SSAPs) with catalytically inactive dCas9 for gene editing. This cleavage-free gene editor, dCas9–SSAP, promotes the knock-in of long sequences in mammalian cells. The dCas9–SSAP editor has low on-target errors and minimal off-target effects, showing higher accuracy than canonical Cas9 methods. It is effective for inserting kilobase-scale sequences, with an efficiency of up to approximately 20% and robust performance across donor designs and cell types, including human stem cells. We show that dCas9–SSAP is less sensitive to inhibition of DNA repair enzymes than Cas9 references. We further performed truncation and aptamer engineering to minimize its size to fit into a single adeno-associated-virus vector for future application. Together, this tool opens opportunities towards safer long-sequence genome engineering.

Subject terms: Genetic engineering, CRISPR-Cas9 genome editing, DNA damage and repair


Wang, Qu et al. developed a genome-editing system, utilizing catalytically inactive Cas9 fused to microbial single-strand annealing proteins, for kilobase-scale insertion in human cells without introducing DNA nicks or breaks.

Main

Since the initial demonstration of clustered regularly interspaced short palindromic repeats (CRISPR)–CRISPR associated protein 9 (Cas9) genome engineering, gene-editing technologies have gained broad applications in basic and translational settings110. New generations of tools have substantially improved the efficiency and fidelity of gene editing and are powerful for altering relatively short sequences11. Most gene-editing tools work by cleaving genomic DNA to induce single-strand nicks or double-stranded breaks (DSBs) that facilitate targeted editing12,13. These DNA modifications can be repaired by error-prone endogenous pathways such as non-homologous end-joining (NHEJ)12. This often leads to unwanted mutations and off-target effects, which could result in toxicity and raise safety concerns1418. Although recent advances in using messenger RNA/protein with optimized donors enhanced levels of efficiency12, the editing errors and off-target effects would become increasingly severe when engineering long sequences (≥100 base pairs (bp)). These unwanted effects limit the application of gene editing for genomic knock-in or in vivo gene therapy1921.

Available methods for long-sequence editing, such as homology-directed repair (HDR) and microhomology-mediated end-joining (MMEJ), rely on DNA cutting and often trigger random indel formation12,13,20. Recent efforts have enhanced long-sequence editing precision, using chemical enhancers, fusion of enhancement domains or modified donors2224. Nicking-based HDR has been shown to reduce errors but could lead to a lower efficiency13. Thus, there remains a need for efficient, safer CRISPR editing tools for long-sequence alterations19,21.

Bacteriophages have evolved enzymes that perform precise recombination2530. We reasoned that the key enzyme for microbial recombination, the single-strand annealing protein (SSAP), could be useful for cleavage-free gene editing in mammalian cells. Notably, it does not have DNA-cleavage activity26,27; thus, it may not trigger the error-prone pathways needed by Cas9 editing. Motivated by this hypothesis and our previous work showing its ability to stimulate genomic recombination, we developed a gene-editing tool using deactivated Cas9 (dCas9, or catalytically inactive Cas9) and microbial SSAPs3138. This dCas9 editor uses the SSAP for knock-in editing with a donor DNA without the need for DNA cleavage. We termed it dCas9–SSAP editor.

Our data show that dCas9–SSAP has comparable efficiencies to Cas9 references, achieving a knock-in efficiency of up to 20%, and is effective across genomic targets and cell lines for kilobase-scale editing. We also demonstrate dCas9–SSAP knock-in of different transgenes using functional assays. More importantly, our data show that dCas9–SSAP generates near zero on- and off-target errors. When inserting a 1 kb sequence, dCas9–SSAP resulted in less than 0.3% editing errors across the cells sampled, whereas Cas9 editors had similar yields but as much as 10–16% incorrectly edited cells. Across loci, dCas9–SSAP demonstrated an editing accuracy of 90–99.6%, in contrast to editing accuracy in the range of 10–38% for the Cas9 editors. Furthermore, we probed the mechanism of dCas9–SSAP editing by inhibiting DNA repair enzymes and cell-cycle blocking. The results of these assays supported our hypothetical model for a dCas9 editor mediated by SSAP activity when dCas9-guide RNA (gRNA) opens genomic DNAs via the R-loop and are consistent with the known biophysical, biochemical properties of dCas9 (refs. 3941).

Finally, we leveraged structural-guided truncation and aptamer engineering to obtain a minimized dSaCas9–mSSAP editor, achieving a reduction in size of more than 50% and retaining similar levels of efficiency. This minimal dCas9 editor would allow convenient delivery using adeno-associated virus (AAV) vector, which is useful for hard-to-transfect cells or in vivo applications20,21. Overall, the dCas9–SSAP editor is capable of efficient, accurate knock-in genome editing. With space for further improvement, it is a valuable cleavage-free gene-editing tool for mammalian cells.

Results

Use of phage SSAPs for dCas9 knock-in gene editing

Most CRISPR-based editors capable of long-sequence knock-in require single-strand nicks or DSBs, which can trigger the error-prone NHEJ pathways, resulting in variable efficiency and accuracy11,12. In contrast, bacteriophages integrate themselves into host bacteria via recombination systems—for example, lambda Red42,43. Such precise phage integration30,44,45 relies on a homology-directed step: recombination between genomic and donor DNA stimulated by the SSAPs—that is, lambda Bet or its homologue, RecT26,46,47. From previous studies48,49, we reasoned that phage SSAPs may not rely on DNA cleavage due to its unusual ATP-independent activity, in contrast to the ATP-dependent RAD51 in mammalian cells50. The high affinity of SSAPs for single- and double-stranded DNA may allow attachment to donors when multiple SSAPs are recruited to genomic targets via RNA-guided dCas9 (ref. 49). It could then promote DNA exchange without cleavage, as DNA strands become transiently accessible during dCas9-mediated DNA unwinding and R-loop formation3941.

Based on this hypothesis, we designed a system to recruit SSAPs to the catalytically inactive dCas9 (Fig. 1a). The dCas9 protein cannot cut DNA but retains the ability to unwind target sites and form an R-loop, rendering the non-target strand putatively accessible for SSAP-stimulated homologous recombination39,40. To test this, we engineered and evaluated three major microbial SSAPs: lambda phage Bet, Escherichia coli Rac prophage RecT and phage T7 gp2.5 (ref. 27). We recruited SSAPs to the deactivated version of Streptococcus pyogenes Cas9 (dSpCas9, simplified as dCas9 hereafter) via an RNA aptamer MS2 stem-loop (Fig. 1a,c)31. This MS2 aptamer was inserted into a single-guide-RNA (sgRNA) scaffold, and the candidate SSAPs were fused to a carboxyl (C)-terminal MS2 coat protein (MCP) that binds specifically to the MS2 aptamer, thus allowing multiple SSAPs to form a complex with dCas9-gRNA. To measure their gene-editing activity in human cells, we generated knock-in donors with an 800-bp transgene encoding a fluorescent protein cassette flanked by homology arms (HAs), which allow in-frame insertion of the fluorescent protein into housekeeping genes, for example, DYNLT1, HSP90AA1 and ACTB (Fig. 1b). Following precise knock-in, we measured the percentage of fluorescent protein-expressing cells to quantify the gene-editing efficiency (Fig. 1d). Our test identified that RecT has higher editing activities relative to other SSAPs in human cells, whereas no editing above background was observed with the dCas9-only or non-target controls (Fig. 1d). We validated this knock-in editing using imaging, gel electrophoresis and sequencing (Fig. 1e and Extended Data Fig. 1). This provided evidence that coupling SSAP to dCas9 enables knock-in gene editing.

Fig. 1. Development of a cleavage-free dCas9-based gene editor using microbial SSAPs.

Fig. 1

a, Schematic model of the dCas9–SSAP editor. b, Design of the genomic knock-in assay to measure the level of gene-editing efficiency. FL, fluorescent; PAM, protospacer adjacent motif. c, Construct designs for screening the gene-editing efficiency of SSAPs using a genomic knock-in assay with an 800-bp 2A–mKate transgene. NLS, nuclear localization sequence. d, Knock-in efficiency of the initial screen of three SSAPs: Bet protein from lambda phage (LBet), RecT protein from Rac prophage (RecT) and gp2.5 from T7 phage (gp2.5). NTC, non-target control. Donor templates with HA lengths of approximately 200 bp (DYNLT1) and 300 bp (HSP90AA1 and ACTB) were added in all groups, except the no-donor controls. The error bars represent the s.e.m. of n = 3 biologically independent experiments. e, Imaging to verify mKate knock-in at endogenous genome loci using the dCas9–SSAP editor. Data represent n = 3 biologically independent experiments. dsDNA, double-stranded DNA.

Source data

Extended Data Fig. 1. Gel electrophoresis and sequencing verification of knock-in-specific PCR products using dCas9–SSAP.

Extended Data Fig. 1

(a) Agarose gel results of knock-in-specific junction PCR at DYNLT1 locus. (be) Sanger sequencing chromatogram of genomic junctions from knock-in experiments at DYNLT1 locus. For all samples, we amplified the 5′ (b, c) and 3′ (d, e) end of genomic DNA using junction-spanning primers outside of the donor DNAs to confirm knock-in. The assay has been performed 3 times with similar results.

Development of dCas9–SSAP as a mammalian gene-editing tool

We then conducted metagenomic mining to identify the best SSAP for mammalian gene editing. We focused on RecT homologues and sought to maximize evolutionary diversity via a phylogenetic analysis27. We searched the NCBI non-redundant sequence database for RecT homologues and identified 2,071 initial candidates. Next, we built phylogenetic trees, subsampled the evolutionary branches and obtained 16 SSAP candidates (Supplementary Notes and Extended Data Fig. 2). We evaluated the SSAP candidates by measuring the level of editing efficiency across three genomic loci. Among all candidates, EcRecT demonstrated the highest efficiency for dCas9 editing, with an efficiency of approximately 6% in human cells. This was notably higher than the dCas9 controls without SSAP, which were comparable to the no-donor controls, suggesting that dCas9 alone cannot perform efficient knock-in (Extended Data Fig. 2c). We also tested SSAP with a non-target control with gRNA that does not recognize the genomic targets, confirming that expression of SSAP alone is not sufficient for knock-in (Fig. 1d). Together, the proposed dCas9–SSAP editor could enable efficient knock-in in human cells. In what follows, we focus on this top design.

Extended Data Fig. 2. RecT-like SSAP candidate screening.

Extended Data Fig. 2

(a) Representative FACS plots of gating strategy for the current study. (b) Phylogenetic tree and amino acid alignment of representative RecT homologues along with the protein conserved domain annotated. (c) Screening RecT like SSAP candidates via metagenomics homologue mining and knock-in assay. The most active candidate is labelled as dCas9–SSAP. Data from 2 biologically independent experiments are shown.

Source data

Characterization of the accuracy of dCas9–SSAP gene editing

The motivation for developing dCas9–SSAP was to perform safer, cleavage-free knock-in editing with the help of SSAP. Thus, we experimentally evaluated the accuracy of dCas9–SSAP for knock-in editing targeting a sequence of approximately 1 kb in length. We measured the on-target errors, off-target insertions, cell-fitness effects and editing yields of dCas9–SSAP in comparison with Cas9 references.

On-target errors

There are two types of on-target errors: (1) indel only, where undesired indels are inserted but no template; and (2) imperfect knock-in, where complete or partial template is inserted but indels occur at the knock-in junctions.

To evaluate type (1), we used deep sequencing to measure the on-target indel formation of the dCas9 editor. We used a nested PCR design with an initial primer binding site outside the donor DNA to avoid template contamination (Fig. 2a and Extended Data Fig. 3). Deep sequencing showed that the level of on-target errors of the dCas9 editor were as low as those observed for the negative controls, in contrast to high levels of indels observed for the Cas9 editors (Fig. 2a).

Fig. 2. Measurement of the on- and off-target editing errors of dCas9–SSAP.

Fig. 2

a, On-target indel errors (800-kp knock-in). Deep sequencing was used to measure the levels of indel formation when using dCas9–SSAP and Cas9 references at the endogenous targets DYNLT1 (left) and HSP90AA1 (right). HDR templates with a 200-bp HA were used as the donor template. Details of the assay are provided in Methods; n = 3 biologically independent experiments. b, On-target knock-in errors (800 kp knock-in). Clonal Sanger sequencing analysis of the accuracy of knock-in editing using dCas9–SSAP and Cas9 references with different MMEJ and HDR templates. The MMEJ and HDR donor templates had HAs of 25 bp and approximately 200 bp, respectively (Methods). ce, Genome-wide detection of insertion sites of the knock-in cassette using unbiased sequencing. The workflow (c), representative reads aligned at the knock-in genomic site (d) and summary of the detected on-target and off-target insertion sites (e) are presented. f,g, Workflow (f) and results (g) of the measurement of the cell-fitness effect, defined by the percentage of live cells after editing (normalized to the mock controls). Statistical analyses and comparisons were performed using a Student’s t-test; *P < 0.05; ***P < 0.001; n = 5 biologically independent experiments. MESL, maximum edit site likelihood. Asterisks next to gene names indicate that the insertion site is within the transcription unit of the gene. a,g, The error bars represent s.e.m. h, Summary of the knock-in accuracy of the dCas9–SSAP editor in comparison with the Cas9 HDR and Cas9 MMEJ methods. Accuracy is defined as the overall yield (%) of correct knock-in in all edited outcomes (correct knock-in, knock-in with indels and NHEJ indels). NGS, next-generation sequencing.

Source data

Extended Data Fig. 3. Confirmation of knock-in using sanger sequencing.

Extended Data Fig. 3

Schematic showing the workflows used in Sanger sequencing of knock-in products (a) and the sequencing method used in deep on-target indel assay (b). Assays described here correspond to Fig. 2. gPCR, genomic PCR. Seq-F/seq-R are primers for Sanger sequencing binding upstream/downstream of the knock-in donors. Sanger sequencing chromatogram of genomic junctions from dCas9–SSAP experiments at DYNLT1 (c and d) and HSP90AA1 (e and f) locus. For all samples, we amplified the 5′ (c and e) and 3′ (d and f) end of genomic DNA using junction-spanning primers to confirm knock-in precision. The sequences in the red boxes were not precisely repaired. The genomic-binding primers used are completely outside of the donor DNAs to avoid contamination. The assay has been performed 3 times with similar results.

To evaluate type (2), we benchmarked the knock-in errors of dCas9–SSAP and measured the junction indels. We clonally isolated the edited cells, amplified the knock-in genomic loci using a similar two-step nested PCR design to avoid contamination (Fig. 2b and Extended Data Fig. 3) and assessed the edited genomic alleles via Sanger sequencing. The long-read Sanger sequencing allowed us to examine the entire knock-in junctions. Our results indicated that, although MMEJ donors were more efficient than HDR donors when using Cas9, they also led to a higher percentage of editing errors (Fig. 2b). More importantly, dCas9–SSAP outperformed Cas9 HDR and Cas9 MMEJ in terms of the percentage of edited clones with no knock-in errors (Fig. 2b and Extended Data Fig. 3). At one locus, dCas9–SSAP achieved 100% knock-in success (within the limit of the assay sensitivity).

Off-target errors

We also evaluated the off-target knock-in error rates of dCas9–SSAP editing via a genome-wide transgene insertion assay (Fig. 2c–e and Extended Data Fig. 4a)31. Briefly, we isolated high-molecular-weight genomic DNA, followed by fragmentation and unique molecular identifier (UMI)-adaptor ligation, and used transgene-specific primers for the unbiased identification of genomic insertion sites (Fig. 2c). Through a previously validated analysis pipeline modified from a Cas9 genome-wide off-target assay (Methods), we identified enriched peaks of reads that represent high-abundance transgene insertion sites (Fig. 2d). Considering insertion sites with >1% of total aligned reads, our results confirmed that dCas9–SSAP showed no detectable off-target insertions, whereas the Cas9 references led to a substantial number of off-target insertion events (Fig. 2e). Notably, there were fewer off-target sites when we considered all sites with at least one UMI aligned in the dCas9–SSAP samples compared with the Cas9 editor (Extended Data Fig. 4b). This result suggests that dCas9–SSAP could help to address the off-target issues that are prominent in long-sequence knock-in.

Extended Data Fig. 4. Unbiased genome-wide insertion quantification.

Extended Data Fig. 4

(a) Overall workflow for unbiased genome-wide insertion site mapping process. On-target and off-target insertions sites are recovered from reads that align to the reference genome (hg38). Full protocol and data analysis pipeline are detailed in Methods. (b) Quantification of genome-wide insertion sites counting all aligned reads (with valid UMI) showed decreased insertion site abundance using Cas9-SSAP compared with Cas9 HDR, across two genomic loci (DYNLT1 and HSP90AA1). The abundance of insertion sites is measured as RPKU, or Reads Per Thousand UMIs. n = 3 biologically independent experiments. All results in this figure are from replicate experiments with error bars representing standard error of the mean (S.E.M.). The statistical analysis and comparison were performed using t-test.

Source data

Cell-fitness effect and editing-yield analysis

We also compared the fitness of cells that went through Cas9/dCas9-based editing. We experimented with two target sites; our data suggested that dCas9 editing leads to higher cell fitness than Cas9 (Fig. 2f,g; defined as the normalized percentage of cells alive after editing).

For the full picture, we summarized the editing yields of dCas9–SSAP in comparison to the Cas9 references. We tabulated the percentages of accurate knock-ins, knock-ins with errors and on-target indels without knock-ins, where the sum of the latter two is the on-target-error total (Fig. 2h). We also measured the overall accuracy of the editing, defined as the ratio of successful knock-in cells to total edited cells (Fig. 2h). We observed that Cas9 editors suffered from frequent errors in long-sequence editing, where the percentage of erroneous edits were notably higher than the yields and the accuracy ranged between 10% and 38%. Although the knock-in yields for dCas9–SSAP were similar to the best Cas9 references, dCas9–SSAP generated minimal errors and achieved an accuracy rate of 90–99% across genomic loci.

Benchmark dCas9–SSAP across donor designs and cell types

Having established that dCas9–SSAP has higher accuracy in knock-in editing, we further validated its level of efficiency and usages across donor designs and cell types. As benchmarks, we used both wild-type and nicking-based Cas9 (nCas9) editors, including three HDR-enhancing tools5153. We examined their 1-kb knock-in activities across three genomic targets. The comparison demonstrated that dCas9–SSAP achieved higher efficiencies than the Cas9, nCas9 and nCas9-hRAD51 nickase editors, with similar levels of efficiency to Cas9-HE51 and Cas9-GEM52, two published HDR-enhancing editors (Fig. 3a). We also compared dCas9–SSAP with our previous SSAP-enhanced wild-type Cas9 tools31 and found that the dCas9-based editor had robust but reduced activity in comparison to when DNA cleavage was introduced (Extended Data Fig. 5a,b). In addition, our data showed that a single-guide dCas9–SSAP editor was sufficient for effective knock-in, with minor improvements when using two gRNAs (Extended Data Fig. 5c).

Fig. 3. Validation and benchmarking of dCas9–SSAP across donor designs and cell types.

Fig. 3

a, Comparison of the knock-in efficiencies of dCas9–SSAP and other alternative Cas9, nCas9 and HDR-enhancing tools. Cas9-HE, CtIP-fusion Cas9; Cas9-Gem, Geminin-fusion Cas9; nCas9, Cas9-D10A nickase reference; and nCas9-hRAD51, an improved Cas9 nickase editor. The same donor templates as those used in Fig. 1 were used. Statistical analyses and comparisons were performed using a Student’s t-test; *P < 0.05 and **P < 0.01. b, Design of knock-in donors with different transgene lengths. c, Gene-editing efficiencies at the DYNLT1 (left) and HSP90AA1 (right) loci in HEK293T cells for three types of donor designs with different HA lengths. a,c, The error bars represent the s.e.m. of n = 3 biologically independent experiments. d, Knock-in efficiencies for different transgene lengths using the dCas9–SSAP editors. Donor-HA lengths of approximately 200 bp (DYNLT1) and 300 bp (HSP90AA1) were used; n = 2 biologically independent experiments. e,f, Knock-in gene editing in hESC (H9) cells using the dCas9–SSAP editor. The knock-in efficiencies of the Cas9, Cas9 HDR and dCas9–SSAP editors (e; n = 2 biologically independent experiments), and flow cytometry analyses of the Cas9 HDR and dCas9–SSAP editors (f) are shown. Donor-HA lengths of approximately 200 bp (HSP90AA1 and ACTB) and 212 bp + 253 bp for OCT4 were used. Data were collected in duplicate.

Source data

Extended Data Fig. 5. Comparation and optimization of dCas9–SSAP editor.

Extended Data Fig. 5

Quantified knock-in efficiency for the comparation of dCas9–SSAP with Cas9/nCas9 based SSAP editor at DYNLT1 (a) and HSP90AA1 (b) locus. The donor DNAs used are the same as shown in Fig. 3a with 800-bp knock-in design. n = 3 biologically independent experiments. All results in this figure are from replicate experiments with error bars representing standard error of the mean (S.E.M.). Testing dCas9–SSAP editor tool using single-guide (c) and dual-guide (d) designs across three genomic targets (shown on the top). The distance between guides is 19 bp for HSP90AA1, 21 bp for DYNLT1, and 31 bp for ACTB. The donor DNAs used are the same as shown in Fig. 3a with 800-bp knock-in design. Data from 2 biologically independent experiments are shown.

Source data

Next, we tested the dCas9–SSAP editor with different donor DNA designs (Fig. 3b). We first tested the effect of the length of the HA on the efficiency of dCas9–SSAP (Fig. 3c). Our results suggested that SSAP-mediated editing is more efficient when using HDR than MMEJ donors and longer HAs generally result in a higher editing efficiency (Fig. 1f and Supplementary Notes). This is consistent with previous reports that MMEJ relies on DNA breaks, which are missing in dCas9 editing12,13,54. We then evaluated the editing efficiency of dCas9–SSAP when the sequence for knock-in has variable length, up to 2 kb for dual-fluorescent protein knock-in (Fig. 3d). Our data showed that dCas9–SSAP performed consistently, with a comparable, and often higher, efficiency to the Cas9 references irrespective of the transgene length (Fig. 3d). In addition, when using a donor that knocked in a 16-bp sequence, we observed dCas9–SSAP supported short-replacement gene editing (Extended Data Fig. 6).

Extended Data Fig. 6. Deep sequencing of short-sequence editing comparing dCas9–SSAP and Cas9 editors.

Extended Data Fig. 6

(a) donor design of 16-bp replacement at EMX1. (b) Analyse the precision HDR and indel editing outcomes using deep sequencing at EMX1 genomic locus. The first round of PCR used sequencing primers completely outside of the donor to ensure the sequencing results will be free from the donor template contamination, validated by the non-target control (where the donor DNAs are delivered into the cells). n = 2 biologically independent experiments.

Furthermore, we checked whether the dCas9–SSAP editor can be applied in other cell types beyond the model human embryonic kidney 293T (HEK293T) cell line. We applied dCas9–SSAP to three cell lines with distinctive tissue origins (cervix-derived HeLa cells, liver-derived HepG2 cells and bone-derived U2OS cells). We observed knock-in efficiencies comparable to the Cas9 references in all three lines (Extended Data Fig. 7a–c). Next, we used the dCas9–SSAP editor in human embryonic stem cells (hESCs) to engineer sequences in a more therapeutically relevant setting19,55. We observed robust knock-in editing across all three targets (Fig. 3e). To avoid background from donor DNA, the stem-cell editing was performed with short HAs (approximately 200 bp) and an efficiency of about 3% for kilobase-scale editing without selection was achieved. The dCas9–SSAP efficiencies were comparable and often higher than the Cas9 references (Fig. 3f and Extended Data Fig. 7d–f). Thus, we concluded that dCas9–SSAP has similar levels of efficiencies to the Cas9-based editors.

Extended Data Fig. 7. Validation of dCas9–SSAP knock-in efficiencies in different cell lines.

Extended Data Fig. 7

(ac) Results in HepG2, HeLa, and U2OS cell lines. The knock-in experiments used similar donor DNA with ~800-bp cassettes encoding 2A-mKate transgene for all cell lines tested. n = 2 biologically independent experiments. (df) Flow cytometry analysis of knock-in gene editing at HSP90AA1, ACTB, OCT4 endogenous loci in human embryonic stem cells (hESC, H9) using dCas9–SSAP compared with non-target controls and Cas9 (Cas9 HDR) references.

Source data

Optimization of the dCas9–SSAP efficiency for robust knock-in editing

We further optimized the dCas9–SSAP editor and tested its activities across a larger panel of genomic targets. We first examined whether adjustments to dosage could improve the level of editing efficiency (Fig. 4a and Extended Data Fig. 8a). When we increased the amount of SSAP-encoding plasmid, we observed higher editing efficiencies across all targets (Fig. 4a). This correlation further supported that the knock-in editing was driven by the SSAP. In contrast, increases in the amount of donor had negligible effects on the knock-in efficiency (Extended Data Fig. 8a), suggesting that the donor dosage was not a bottleneck in this setting. In addition to dosage optimization, we extended the donor-HA lengths and observed that further extension of the HAs helped to improve the knock-in efficiency, consistent with earlier results (Extended Data Fig. 8b,c).

Fig. 4. Optimization of dCas9–SSAP for efficient and durable gene editing.

Fig. 4

a, Knock-in efficiencies for SSAP-dosage optimization. Donor-HA lengths of approximately 200 bp (DYNLT1), 300 bp (HSP90AA1) and 200 bp (ACTB) were used; n = 3 biologically independent experiments. b, Performance (knock-in efficiency) of the dCas9–SSAP editor compared with Cas9 references across seven endogenous loci in HEK293T cells after SSAP-dosage optimization and donor-HA extension. Donor-HA lengths of 673 bp + 750 bp for HSP90AA1, 500 bp + 800 bp for ACTB, 608 bp + 740 bp for BCAP31, 212 bp + 413 bp for HIST1H2BK, 705 bp + 602 bp for CLTA, 464 bp + 440 bp for RAB11A and approximately 200 bp for DYNLT1 were used. All knock-in donors targeted the carboxy termini of the endogenous proteins, except for the CLTA and RAB11A donors which targeted the N termini. The error bars represent the s.e.m. of n = 3 biologically independent experiments. c, Long-term stability of transgene expression at HSP90AA1 (left) and ACTB (right) post sorting on Day 3 after dCas9–SSAP knock-in. Variable sorting efficiencies led to different starting mKate+ rates (full time course in Extended Data Fig. 9).

Source data

Extended Data Fig. 8. Optimization of donor dosages and homology arms of donor DNA.

Extended Data Fig. 8

(a) Quantify genomic mKate knock-in efficiency at DYNLT1, HSP90AA1, ACTB loci for donor dosage optimization when using dCas9–SSAP editor. non target, non-target controls. Donor-HA lengths are 200 bp+200 bp for DYNLT1, 200 bp + 400 bp for HSP90AA1, 200 bp + 400 bp for ACTB. Quantify mKate knock-in efficiency at HSP90AA1 (b) and ACTB (c) locus for donor homology arm (HA) optimization when using dCas9–SSAP editor. non target, non-target controls. Donor-HA lengths are 200 bp + 200 bp or 673 bp + 750 bp for HSP90AA1, 200 bp + 400 bp or 500 bp + 800 bp for ACTB. n = 3 biologically independent experiments. All results in this figure are from replicate experiments with error bars representing standard error of the mean (S.E.M.).

Source data

Using these optimized parameters, we measured the level knock-in efficiency of dCas9–SSAP at seven endogenous loci (DYNLT1, HSP90AA1, ACTB, BCAP31, HIST1H2BK, CLTA and RAB11A; Fig. 4b). We included two loci (CLTA and RAB11A) where the knock-in tag was inserted as a direct fusion at the N termini of the endogenous proteins, complementing the 2A-peptide designs. Across all targets, dCas9–SSAP demonstrated efficiencies of up to about 20% without selection, which was comparable and sometimes moderately higher than the Cas9 references (Fig. 4b).

To ensure the stability of editing mediated by dCas9–SSAP over an extended time span, we next examined the durability of knock-in-transgene expression. We sorted mKate+ cells on Day 3 post transfection of dCas9–SSAP and donor DNA, and then checked whether the transgene maintained its expression beyond the three-day window at different genomic loci (Fig. 4c). Consistent with our sequencing results showing accurate on-target editing (Fig. 2), we observed that expression of the knock-in cassette was stable on Days 5, 7 and 10 post the delivery of dCas9–SSAP (Fig. 4c and Extended Data Fig. 9). The knock-in cell populations had distinct, steady transgene expression compared with the controls (Extended Data Fig. 9b). Thus, these data provided support for the utility of dCas9–SSAP for stable knock-in editing in mammalian cells.

Extended Data Fig. 9. Validation the stability of on-target editing.

Extended Data Fig. 9

(a) Workflow of the long-term time-course experiments to evaluate the editing outcome stability using dCas9–SSAP editor. (b) Flow cytometry analysis of knock-in gene editing at HSP90AA1, ACTB endogenous loci at different time points post delivery of dCas9–SSAP and donor DNA. (c, d) Representative crystal violet staining images for the on-target puromycin knock-in at HSP90AA1 and ACTB locus. Scale bar = 500 um. The assay has been performed 3 times with similar results. (e) Quantify HSP90AA1 and ACTB gene expression levels in HEK293T cells by bulk RNA-seq analysis, demonstrating notable higher levels of HSP90AA1 expression. This led to the better cell survival in the HSP90AA1 group compared with ACTB group. Data from 2 biologically independent experiments are presented.

Source data

Finally, we sought to functionally validate the ability of the dCas9–SSAP editor to insert diverse payloads at endogenous loci (Fig. 5a). Briefly, we constructed knock-in donors with selectable payloads (puromycin- and blasticidin-resistance cassettes) as fusion protein with endogenous genes (Fig. 5b, left). We examined the knock-in results from the dCas9–SSAP and Cas9-reference editors using western blotting. Immunoblotting confirmed the presence and correct sizes of the expected knock-in fusion proteins using dCas9–SSAP across targets (HSP90AA1 and ACTB) and payloads (Fig. 5b). Furthermore, we quantified the relative knock-in efficiencies of the dCas9–SSAP and Cas9 methods using a functional assay (Fig. 5c and Extended Data Fig. 9c–e). We employed short-HA donors to insert a resistance cassette into endogenous loci and applied puromycin to select the knock-in cells. Colony formation assays validated that the dCas9–SSAP editor performed reliably using this protein-function readout (Fig. 5c).

Fig. 5. Validation of the dCas9–SSAP editor using protein functional assays.

Fig. 5

a, Design of the genomic puromycin- and blasticidin-resistance-cassette knock-in assay to validate functional on-target editing by dCas9–SSAP. b, Immunoblotting confirmation of the presence and sizes of on-target dCas9–SSAP knock-in products at the HSP90AA1 and ACTB loci, performed with antibody to V5, which recognizes in-frame fusion with endogenous protein. Data are representative of n = 3 biologically independent experiments. Schematics (not to scale) of the knock-in proteins are shown (left). c,d, Validation and quantification of on-target knock-in using dCas9–SSAP via colony formation assays. Cells were selected by the knock-in resistance cassettes, stained with crystal violet (c) and quantified (d). Scale bar, 500 µm. The error bars represent the s.e.m. of n = 4 biologically independent experiments.

Source data

Dependence of dCas9–SSAP on endogenous pathways

Recall our model that dCas9–SSAP performs gene editing without DNA cleavage. To better understand the nature of dCas9–SSAP editing, we used three orthogonal chemical perturbations to examine its dependency on endogenous pathways (Fig. 6).

Fig. 6. Chemical perturbation to probe the editing mechanism of dCas9–SSAP.

Fig. 6

ad, Schematics showing experiment designs to measure the influence of endogenous DNA repair enzymes and cell cycle progression on dCas9-SSAP editing (a,c). Gene-editing efficiency (b,d) of the dCas9–SSAP editor when treated with DNA repair-pathway inhibitors (mirin, RI-1 and B02) without (a,b) or with (c,d) cell-cycle synchronization (DTB). The donors were the same as those used in Fig. 1. b,d, Statistical analyses were from the t-test results with a false-detection rate of 1% from the two-stage step-up method of Benjamini, Krieger and Yekutieli. The error bars represent the s.e.m. of n = 4 biologically independent experiments. Statistical analyses and comparisons were performed using a Student’s t-test; ***P < 0.001 and NS, not significant.

Source data

First, we perturbed enzymes in DSB-repair pathways during dCas9–SSAP and canonical Cas9 editing and compared the effects (Fig. 6a). In Cas9-mediated knock-in, the recognition of DSBs by the Mre11–Rad50–Nbs1 (MRN) complex is a necessary step for downstream HDR12. We leveraged mirin, a potent inhibitor of DSB repair that has been shown to prevent MRN complex formation, ATM activation and Mre11 exonuclease activity56. We treated cells with mirin and determined the level of editing efficiency of the dCas9–SSAP and Cas9-reference editors on these cells. Across all targets, we observed that the dCas9–SSAP efficiencies were nearly unaffected by the mirin treatment and essentially the same as the vehicle-treated groups (Fig. 6b). However, as expected, the Cas9 methods demonstrated substantially reduced levels of editing efficiency under the mirin treatment (Fig. 6b).

Second, we investigated the dependence of dCas9–SSAP editing on core enzymes of the HDR pathway. We used two small-molecule inhibitors of the RAD51 protein, RI-1 and B02, to block this rate-limiting step in HDR57,58. Our data showed that RAD51 inhibition significantly reduced the efficiency of Cas9 editing at all genomic targets but did not have a significant effect on dCas9–SSAP editing (Fig. 6b, RI-1 and B02). These two repair-modulating experiments generated consistent results: dCas9–SSAP showed significantly less sensitivity to the perturbations of several endogenous repair enzymes than Cas9 references. They suggest that the mechanism of the dCas9–SSAP editor differs from Cas9 editing.

Third, we investigated how cell cycling affects the dCas9–SSAP editor. Cell cycling has been shown to facilitate the accessibility of mammalian genomes59. More specifically, genome replication (during the S phase) may provide a favourable environment for dCas9 to unwind DNA and allow SSAP-mediated recombination (Fig. 6c). To test this, we synchronized cells at the G1–S boundary using double thymidine blockage (DTB)60,61. The DTB treatment indeed reduced the efficiency of dCas9–SSAP editing (Fig. 6d). Nonetheless, when we combined mirin, RI-1 or B02 with DTB treatment, dCas9–SSAP maintained higher levels of editing efficiency than the Cas9 references (Fig. 6d).

Together, our data supported the hypothetical mechanism of dCas9–SSAP editing: RNA-guided dCas9 binds to genomic targets and makes them accessible to the SSAP, and SSAP promotes homology-directed insertion without the requirement for a DNA break (Fig. 1a). Deeper understanding of this process will require further investigation—for example, biophysical analysis of the dCas9–SSAP editor or additional assays to modulate genome accessibility and repair pathways. Such insights could help to further develop dCas9 editing approaches.

Minimization of dCas9–SSAP for convenient delivery

Finally, to optimize the dCas9–SSAP editor for future applications, we sought to develop a minimal version compatible with the size limitations of viral vectors such as AAV20,21. We designed 14 different truncated RecT SSAPs based on secondary-structure predictions (Fig. 7a and Extended Data Fig. 10) and tested their gene-editing activities alongside the full-length controls. We identified a short RecT variant (around 200 amino acids in length) that had comparable efficiencies to the original full-length RecT-based design (Fig. 7b).

Fig. 7. Minimization of dCas9–SSAP as a compact editing tool for convenient delivery.

Fig. 7

a, Predicted secondary structure and priming sites for constructing truncated EcRecT protein. b, Relative knock-in efficiencies of the truncation designs. All groups were normalized to the Cas9 references (individually for each target). aa, amino-acid residue. c,d, Schematic of the dSaCas9–mSSAP system in AAV construct using the compact SaCas9 (c; not shown to scale) and its knock-in (800 bp) efficiencies at AAVS1 and HSP90AA1 using in vitro delivery of AAV2 vectors carrying the original and minimized dSaCas9–SSAP editors in HEK293T cells (d). b,d, Data are for n = 2 biologically independent experiments.

Source data

Extended Data Fig. 10. Schematic showing the RecT protein secondary structure predicted using online tool (CFSSP, see Methods).

Extended Data Fig. 10

The prediction results (secondary structure visualized at top, alignment at bottom) formed the basis for developing a truncated functional RecT variant.

We then integrated this short SSAP with the more compact SaCas9 system62 and the smaller N22-BoxB aptamer63 to build a minimal-functional dSaCas9–mSSAP editor (Fig. 7c). This allowed us to fit the dSaCas9–mSSAP into a single AAV and employ a ≤4 kb donor AAV for long-sequence editing (Fig. 7c). We tested the dSaCas9–mSSAP editor via delivery of AAV2 particles and confirmed that it had comparable efficiencies to the full-length version in HEK293T cells (Fig. 7d). This design, while needing further in vivo validation, could provide a convenient option for delivering the dCas9–SSAP editor.

Discussion

Here we report the development of a dCas9–SSAP editor, which harmonizes the RNA-guided programmability of CRISPR with the SSAP activity of phage RecT. This dCas9–SSAP editor enables long-sequence editing with minimal DNA damage and errors. It provides research and therapeutic possibilities for addressing some of the currently intractable diseases involving large disease-causing variants, delivering therapeutic genes in vivo or minimizing undesirable modifications during gene editing19,21. Compared with other editing methods that depend on single-strand nicks or DSBs, dCas9–SSAP facilitates homology-mediated transgene insertion via non-cutting dCas9s. There are several remaining questions and development directions for this editing tool. First, it will be exciting to further understand the mechanism of dCas9–SSAP editing in mammalian cells. Based on our model and perturbation experiments, one possibility is that the strand-invasion activity of SSAP could help initiate the pairing of homologous sequences between the donor and accessible genomic DNA, followed by endogenous DNA synthesis and then resolution and integration of the knock-in sequences during DNA replication (which help explain the cell-cycle effects). Although dCas9–SSAP may be less dependent on certain endogenous repair enzymes, this process will still involve DNA repair or synthesis machinery. Thus, additional work—for example, systematic knock-out of repair enzymes—could help understand such involvement. Mining additional SSAPs from nature could also enhance the editing rates. Other delivery options, such as using mRNA or ribonucleoprotein, could help boost the dCas9–SSAP editor for broader applications, including primary-cell engineering using electroporation. Overall, this efficient low-error technology offers a complementary approach to existing CRISPR editing tools for long-sequence engineering.

Methods

Plasmid construction

Human codon-optimized DNA fragments were ordered from Genescript, Genewiz and IDT DNA. The fragments encoding the recombination enzymes were Gibson assembled into backbones (Addgene, plasmid 61423) using the NEBuilder HiFi DNA assembly master mix (New England BioLabs). The amino-acid sequence for these SSAP can be found in Supplementary Table 1. All sgRNAs were inserted into backbones (p-dCas9–SSAP-MS2-BB_BbsI and p-dsaCas9-SSAP-BoxB-BB_BsaI) using Golden Gate cloning. The dCas9–SSAP plasmids bearing sequences recognized by the restriction enzymes BbsI (dSpCas9) and BsaI (dSaCas9) as gRNA backbones were sequence-verified (Eton and Genewiz). The sgRNA sequences used in this research can be found in Supplementary Table 2. The list of all dCas9–SSAP plasmids are in Supplementary Table 3 and will be deposited to Addgene for open access.

Cell culture

HEK293T, HeLa, HepG2 and U2OS cells were obtained from the American Type Culture Collection and maintained in Dulbecco’s Modified Eagle’s Medium (DMEM; Life Technologies) with 10% fetal bovine serum (FBS; BenchMark), 100 U ml−1 penicillin and 100 µg ml−1 streptomycin (Life Technologies) at 37 °C with 5% CO2. The hESC (H9) cells were maintained in mTeSR1 medium (StemCell Technologies) at 37 °C with 5% CO2. Culture plates were pre-coated with Matrigel (Corning) 12 h before use. The plates were washed three times with PBS before seeding with cells. The Rho kinase inhibitor Y27632 (10 µM; Sigma) was added for the first 24 h after each passage. The culture medium was changed every 24 h.

Transfection

HEK293T, HeLa, HepG2 and U2OS cells were seeded into 96-well plates (Corning) at a density of 3 × 104 cells per well 12 h before transfection with 250 ng total DNA per well. The cells were transfected using Lipofectamine 3000 (Life Technologies) following the manufacturer’s instructions when the cells were at approximately 70% confluency. Briefly, we used 250 ng total DNA and 0.4 µl Lipofectamine 3000 reagent, mixed with 10 µl Opti-MEM, per well. For 250 ng DNA, we used 160 ng of dCas9-gRNA plasmids (for the double sgRNA design, we used equal amounts of the two gRNA plasmids; that is, 80 ng each), 60 ng pMCP-RecT or GFP control plasmid (Addgene, 64539) and 30 ng of PCR template DNA (the primer sequences are listed in Supplementary Table 4 and the template sequences are listed in Supplementary Notes). The cells were analysed after 3 d using FACS. The step-by-step dCas9–SSAP gene-editing protocol can be found at Protocol Exchange64.

Electroporation

For the transfection of hESC (H9) cells, a P3 primary cell 4D-NucleofectorTM X kit S (Lonza) was used following the manufacturer’s protocol. Briefly, the hESC (H9) cells were resuspended in Accutase (Innovative Cell Technology) and washed twice with PBS before electroporation. For each reaction, 3 × 105 cells were nucleofected with 4 µg total DNA mixed in 20 µl electroporation buffer using the DC100 Nucleofector program. For 4 µg DNA, we used 2.6 µg of the dCas9–SSAP gRNA plasmids, 1 µg of pMCP-RecT or GFP control plasmid and 0.4 µg of PCR template DNA (the primer sequences are listed in Supplementary Table 4 and the template sequence are listed in Supplementary Notes). After electroporation, the cells were seeded into 12-well plates with 1 ml mTeSR1 medium containing 10 µM Y27632. The culture medium was changed every 24 h. After 4 d, the cells were analysed using a CytoFLEX flow cytometer (Beckman Coulter; Stanford Stem Cell FACS Core).

FACS

The efficiency of the mKate knock-in was analysed on a CytoFLEX flow cytometer. The cells were washed twice with PBS 72 h after transfection or 96 h after electroporation and dissociated with TrypLE express enzyme (Thermo Fisher Scientific) or Accutase. The cell suspension was then transferred to a 96-well U-bottom plate (Thermo Fisher Scientific) and centrifuged at 300g for 5 min. The supernatant was aspirated, the pelleted cells were resuspended in 50 µl of 4% FBS in PBS and the cells were analysed on a CytoFLEX flow cytometer within 30 min following preparation.

Long time-point mKate fluorescence monitoring

To monitor the editing stability over time, we sorted the mKate+ cells 48 h after transfection using an Aria II SORP system and maintained these cells in DMEM medium with 10% FBS, 100 U ml−1 penicillin and 100 µg ml−1 streptomycin. The mKate ratio was analysed at different time points as mentioned earlier.

Western blotting

On-target knock-in of the GS-puromycin and blasticidin-V5 tag was verified by western blotting. The samples were collected 72 h after transfection and the proteins were extracted. Monoclonal antibody to the V5 tag (1:2,000; Thermo Scientific, R960-25) was used to detect the on-target editing product.

Crystal violet assay

The efficiency of the GS-puromycin-V5 tag knock-in was analysed using a crystal violet assay. Cells in a 24-well plate were dissociated 72 h after transfection with TrypLE express enzyme and transferred to a six-well plate. The cells were maintained in DMEM medium with 10% FBS, 100 U ml−1 penicillin, 100 µg ml−1 streptomycin and 0.5 µg ml−1 puromycin (InvivoGen, ant-pr-1) at 37 °C with 5% CO2 for another 3–5 d. The crystal violet assay was performed once the puromycin selection had completed. The medium was removed and the plates were washed with PBS. The PBS was then removed, 2 ml of a mixture of 4% paraformaldehyde and 0.5% crystal violet was added (Sigma, C6158-50G) and the plates were left at room temperature for 30 min. The crystal violet mixture was carefully removed and the samples were washed with PBS. The plates were left to dry at room temperature and imaged using a Keyence microscope. The clones were quantified using the ImageJ software.

Sanger sequencing analysis of knock-in junctions

HEK293T cells were harvested 72 h after transfection. The genomic DNA was extracted using QuickExtract DNA extraction solution (Biosearch Technologies) following the manufacturer’s instructions. The target genomic region was amplified using specific primers that bound outside of the HAs of the donor template. The primers used for the Sanger and NGS analyses are listed in Supplementary Table 4. The PCR products were purified using a Monarch PCR & DNA cleanup kit (New England BioLabs). The purified product (80–100 ng) was sent for Sanger sequencing with target-specific primers (EtonBio or Genewiz). The Sanger trace was analysed using the SnapGene software.

Treatment with HR and cell-cycle inhibitor

For different inhibitor assays, the cells were pre-treated with mirin (Sigma, M9948-5MG; 25 µM), B02 (Sigma, SML0364; 10 µM) or RI-1 (Sigma, 553514-10MG-M; 1 µM) for 16 h. For the cell-cycle arrest experiment, the cells were pre-treated with thymidine (Sigma, T9250-1G, 2 mM) for 18 h, the thymidine was removed, the cells were cultured in normal DMEM media with 10% FBS without thymidine for 9 h and thymidine was added to the cells (final concentration of 2 mM) for a second round of 18 h. For the DTB–mirin/RI-1/B02 groups, mirin (25 µM), B02 (10 µM) or RI-1 (1 µM) were added to the cells with the second treatment round with thymidine (2 mM). After the inhibitor and thymidine treatment, the cells were transfected using Lipofectamine 3000 following the manufacturer’s instructions. The cells were analysed on a CytoFLEX flow cytometer 3 d later.

NGS library preparation

Genomic DNA was extracted from cells 72 h after transfection using QuickExtract DNA extraction solution following the manufacturer’s instructions; 200 ng of genomic DNA was used for the NGS library preparation. Genes of interest were amplified using specific primers (primers are listed in Supplementary Table 4) for the first-round PCR reaction. Illumina adaptors and index barcodes were added with a second round of PCR using the primers listed in Supplementary Table 4. The PCR products were purified by gel electrophoresis on a 2% E-gel using a Monarch DNA gel extraction kit (New England BioLabs). The purified products were quantified using a Qubit dsDNA HS assay kit (Thermo Fisher) and sequenced on an Illumina MiSeq system using paired-end PE300 kits. All sequencing data were deposited to the NCBI Sequence Read Archive database under the accession code PRJNA683925.

TOPO cloning experiment

A total of 250 ng genomic DNA was used for the TOPO cloning experiment. The knock-in events were amplified using specific TA colony primers targeting the DYNLT1 or HSP90AA1 locus (the primers are listed in Supplementary Table 4) using Phusion flash high-fidelity PCR master mix (Thermo Scientific, F-548L). The PCR products were purified using a gel extraction kit (New England BioLabs, T1020L) following the manufacturer’s instructions. A poly(A) tail was added to the purified products using Taq polymerase (New England BioLabs, M0273S) with incubation at 72 °C for 30 min. The TOPO cloning reaction was set up and the transformation was performed following the manufacturer’s instructions (Thermo Scientific, K457501). The plates were sent for rolling-circle amplification/colony sequencing using the M13F (5′-GTAAAACGACGGCCAG-3′) and M13R (5′-CAGGAAACAGCTATGAC-3′) universal Sanger sequencing primers. The sequence results were analysed using the SnapGene software.

High-throughput sequencing data analysis

Processed (demultiplexed, trimmed and merged) sequencing reads were analysed to determine the editing outcomes using CRISPPResso2 by aligning the sequenced amplicons to the reference and expected HDR amplicons. The quantification window was increased to 10 bp surrounding the expected cut site to better capture diverse editing outcomes but substitutions were ignored to avoid the inclusion of sequencing errors. Only reads containing no mismatches to the expected amplicon were considered for HDR quantification; reads containing indels that partially matched the expected amplicons were included in the overall reported indel frequency.

Insertion-site mapping and analysis

We used a process that was previously developed (GIS-seq) and adapted for the genome-wide, unbiased off-target analysis of mKate knock-in following the similar protocol in our previous study31,65,66. Briefly, we harvested the HEK293T cells 3 d after transfection. The genomic DNA was size-selected using a DNAdvance genomic DNA kit (A48705, Beckman Coulter) to avoid template contamination in the following step. The purified genomic DNA (400 ng) was fragmented to an average of 500 bp using NEB fragmentase, ligated with adaptors and size-selected using a NEBNext ultra II FS DNA library prep kit following the manufacturer’s instructions. Following two rounds of nested anchored PCR to amplify the target DNA (from the end of the knock-in sequence to the ligated adaptor sequence), a size-selection purification following the NEBNext Ultra II FS DNA library Prep kit protocol was performed. The libraries were sequenced using Illumina Miseq V3 PE600 kits. The sequencing data were analysed to determine off-target insertion events with custom analysis code.

Statistics and reproducibility

Unless otherwise stated, all statistical analyses and comparisons were performed using a Student’s t-test, with a 1% false-discovery rate using the two-stage step-up method of Benjamini, Krieger and Yekutieli. The reproducibility, sample sizes and, where appropriate, statistical analyses are described in the figure legends.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Online content

Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41556-021-00836-1.

Supplementary information

Supplementary Information (425.4KB, pdf)

Supplementary notes and references.

Reporting Summary (1.8MB, pdf)
Supplementary Table 1 (26.8KB, xlsx)

Supplementary Table 1. Sequences of all SSAPs tested in this study. Supplementary Table 2. Sequences of the gRNAs used in this study. Guides starting with sp indicate SpCas9 gRNA targets and guides starting with dsp indicate dSpCas9 gRNA targets. Supplementary Table 3. List of plasmids (all of which will be available at Addgene with a plasmid ID). Supplementary Table 4. Sequences of the primers used for donor DNA (HDR template) generation, targeted sequencing and NGS assays. All NGS adaptor sequences are shown in red.

Acknowledgements

We thank A. Z. Fire and R. K. Dinesh for their insightful comments and suggestions, the all the members of the Cong and Cleary laboratory for their support. This work was supported by the National Institutes of Health (grant nos 1R35HG011316 and 1R01GM141627 to L.C.) and a Donald and Delia Baxter Foundation Faculty Scholar award (L.C.). The computational work was supported by the National Institute of Health (grant no. 1S10OD023452 to the Stanford Genomics Cluster). We thank R. Kuhn for the pU6-(BbsI) CBh-Cas9-T2A-BFP (Addgene, 64323) plasmid, J.-P. Concordet for pCas9-CtIP/pCas9-Geminin (Addgene, 109402 and 109403) and D. Liu for Rad51–Cas9(D10A) (Addgene, 125567).

Extended data

Source data

Source Data Fig. 1 (9.4KB, xlsx)

Statistical source data.

Source Data Fig. 1 (124.9KB, pdf)

Unprocessed images.

Source Data Fig. 2 (11.8KB, xlsx)

Statistical source data.

Source Data Fig. 3 (13.4KB, xlsx)

Statistical source data.

Source Data Fig. 4 (11.3KB, xlsx)

Statistical source data.

Source Data Fig. 5 (9.6KB, xlsx)

Statistical source data.

Source Data Fig. 5 (985.3KB, pdf)

Unprocessed images and unprocessed western blots.

Source Data Fig. 6 (10.6KB, xlsx)

Statistical source data.

Source Data Fig. 7 (11.3KB, xlsx)

Statistical source data.

Source Data Extended Data Fig. 2 (10KB, xlsx)

Statistical source data.

Source Data Extended Data Fig. 4 (9.4KB, xlsx)

Statistical source data.

Source Data Extended Data Fig. 5 (9.5KB, xlsx)

Statistical source data.

Source Data Extended Data Fig. 7 (9.9KB, xlsx)

Statistical source data.

Source Data Extended Data Fig. 8 (11.2KB, xlsx)

Statistical source data.

Source Data Extended Data Fig. 9 (8.9KB, xlsx)

Statistical source data.

Source Data Extended Data Fig. 9 (625.2KB, pdf)

Unprocessed images.

Author contributions

C.W. and L.C. designed the research, performed experiments and analysed data. Y.Q. and J.K.W.C. performed experiments and analysed data. N.W.H. and Q.Z. performed experiments and analysed data. N.W.H. and M.W. analysed data. L.C. designed and supervised the research. C.W., Y.Q., M.W. and L.C. wrote the paper with input from all authors.

Peer review

Peer review information

Nature Cell Biology thanks Fyodor Urnov and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Data availability

All NGS data, including data from the targeted genomic loci sequencing and on/off-target analysis have been deposited to the NCBI Sequence Read Archive database under the accession code PRJNA683925. All other data supporting the findings of this study are available from the corresponding author on reasonable request. Source data are provided with this paper.

Code availability

Customized scripts for data analysis have been deposited and are available at GitHub under the Cong Laboratory repository (https://github.com/cong-lab).

Competing interests

Stanford University has filed patent applications with L.C., C.W. and Y.Q. as inventors based on this work.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Yuanhao Qu, Jason K.W. Cheng.

Extended data

is available for this paper at 10.1038/s41556-021-00836-1.

Supplementary information

The online version contains supplementary material available at 10.1038/s41556-021-00836-1.

References

  • 1.Carroll D. Genome engineering with targetable nucleases. Annu. Rev. Biochem. 2014;83:409–439. doi: 10.1146/annurev-biochem-060713-035418. [DOI] [PubMed] [Google Scholar]
  • 2.Pickar-Oliver A, Gersbach CA. The next generation of CRISPR–Cas technologies and applications. Nat. Rev. Mol. Cell Biol. 2019;20:490–507. doi: 10.1038/s41580-019-0131-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Barrangou R, Doudna JA. Applications of CRISPR technologies in research and beyond. Nat. Biotechnol. 2016;34:933–941. doi: 10.1038/nbt.3659. [DOI] [PubMed] [Google Scholar]
  • 4.Hsu PD, Lander ES, Zhang F. Development and applications of CRISPR–Cas9 for genome engineering. Cell. 2014;157:1262–1278. doi: 10.1016/j.cell.2014.05.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Doudna JA, Charpentier E. The new frontier of genome engineering with CRISPR–Cas9. Science. 2014;346:1258096. doi: 10.1126/science.1258096. [DOI] [PubMed] [Google Scholar]
  • 6.Sander JD, Joung JK. CRISPR–Cas systems for editing, regulating and targeting genomes. Nat. Biotechnol. 2014;32:347–355. doi: 10.1038/nbt.2842. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Gaj T, Gersbach CA, Barbas CF. ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends Biotechnol. 2013;31:397–405. doi: 10.1016/j.tibtech.2013.04.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Urnov FD, Rebar EJ, Holmes MC, Zhang HS, Gregory PD. Genome editing with engineered zinc finger nucleases. Nat. Rev. Genet. 2010;11:636–646. doi: 10.1038/nrg2842. [DOI] [PubMed] [Google Scholar]
  • 9.Kim H, Kim J-S. A guide to genome engineering with programmable nucleases. Nat. Rev. Genet. 2014;15:321–334. doi: 10.1038/nrg3686. [DOI] [PubMed] [Google Scholar]
  • 10.Jiang W, Marraffini LA. CRISPR–Cas: new tools for genetic manipulations from bacterial immunity systems. Annu. Rev. Microbiol. 2015;69:209–228. doi: 10.1146/annurev-micro-091014-104441. [DOI] [PubMed] [Google Scholar]
  • 11.Anzalone AV, Koblan LW, Liu DR. Genome editing with CRISPR–Cas nucleases, base editors, transposases and prime editors. Nat. Biotechnol. 2020;38:824–844. doi: 10.1038/s41587-020-0561-9. [DOI] [PubMed] [Google Scholar]
  • 12.Jasin M, Haber JE. The democratization of gene editing: Insights from site-specific cleavage and double-strand break repair. DNA Repair. 2016;44:6–16. doi: 10.1016/j.dnarep.2016.05.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Maizels N, Davis L. Initiation of homologous recombination at DNA nicks. Nucleic Acids Res. 2018;46:6962–6973. doi: 10.1093/nar/gky588. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Tsai SQ, Joung JK. Defining and improving the genome-wide specificities of CRISPR–Cas9 nucleases. Nat. Rev. Genet. 2016;17:300–312. doi: 10.1038/nrg.2016.28. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Kim D, Luk K, Wolfe SA, Kim J-S. Evaluating and enhancing target specificity of gene-editing nucleases and deaminases. Annu. Rev. Biochem. 2019;88:191–220. doi: 10.1146/annurev-biochem-013118-111730. [DOI] [PubMed] [Google Scholar]
  • 16.Ihry RJ, et al. p53 inhibits CRISPR–Cas9 engineering in human pluripotent stem cells. Nat. Med. 2018;24:939–946. doi: 10.1038/s41591-018-0050-6. [DOI] [PubMed] [Google Scholar]
  • 17.Enache OM, et al. Cas9 activates the p53 pathway and selects for p53-inactivating mutations. Nat. Genet. 2020;52:662–668. doi: 10.1038/s41588-020-0623-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Haapaniemi E, Botla S, Persson J, Schmierer B, Taipale J. CRISPR–Cas9 genome editing induces a p53-mediated DNA damage response. Nat. Med. 2018;24:927–930. doi: 10.1038/s41591-018-0049-z. [DOI] [PubMed] [Google Scholar]
  • 19.Dunbar CE, et al. Gene therapy comes of age. Science. 2018;359:eaan4672. doi: 10.1126/science.aan4672. [DOI] [PubMed] [Google Scholar]
  • 20.Mingozzi F, High KA. Therapeutic in vivo gene transfer for genetic disease using AAV: progress and challenges. Nat. Rev. Genet. 2011;12:341–355. doi: 10.1038/nrg2988. [DOI] [PubMed] [Google Scholar]
  • 21.Wang D, Zhang F, Gao G. CRISPR-based therapeutic genome editing: strategies and in vivo delivery by AAV vectors. Cell. 2020;181:136–150. doi: 10.1016/j.cell.2020.03.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Shivram H, Cress BF, Knott GJ, Doudna JA. Controlling and enhancing CRISPR systems. Nat. Chem. Biol. 2021;17:10–19. doi: 10.1038/s41589-020-00700-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Yeh CD, Richardson CD, Corn JE. Advances in genome editing through control of DNA repair pathways. Nat. Cell Biol. 2019;21:1468–1478. doi: 10.1038/s41556-019-0425-z. [DOI] [PubMed] [Google Scholar]
  • 24.Pawelczak KS, Gavande NS, VanderVere-Carozza PS, Turchi JJ. Modulating DNA repair pathways to improve precision genome engineering. ACS Chem. Biol. 2018;13:389–396. doi: 10.1021/acschembio.7b00777. [DOI] [PubMed] [Google Scholar]
  • 25.Copeland NG, Jenkins NA, Court DL. Recombineering: a powerful new tool for mouse functional genomics. Nat. Rev. Genet. 2001;2:769–779. doi: 10.1038/35093556. [DOI] [PubMed] [Google Scholar]
  • 26.Kolodner R, Hall SD, Luisi-DeLuca C. Homologous pairing proteins encoded by the Escherichia colirecE and recT genes. Mol. Microbiol. 1994;11:23–30. doi: 10.1111/j.1365-2958.1994.tb00286.x. [DOI] [PubMed] [Google Scholar]
  • 27.Iyer LM, Koonin EV, Aravind L. Classification and evolutionary history of the single-strand annealing proteins, RecT, Redbeta, ERF and RAD52. BMC Genomics. 2002;3:8. doi: 10.1186/1471-2164-3-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Court DL, Sawitzke JA, Thomason LC. Genetic engineering using homologous recombination. Annu. Rev. Genet. 2002;36:361–388. doi: 10.1146/annurev.genet.36.061102.093104. [DOI] [PubMed] [Google Scholar]
  • 29.Zhang Y, Buchholz F, Muyrers JPP, Stewart AF. A new logic for DNA engineering using recombination in Escherichia coli. Nat. Genet. 1998;20:123–128. doi: 10.1038/2417. [DOI] [PubMed] [Google Scholar]
  • 30.Datta S, Costantino N, Zhou X, Court DL. Identification and analysis of recombineering functions from Gram-negative and Gram-positive bacteria and their phages. Proc. Natl Acad. Sci. USA. 2008;105:1626–1631. doi: 10.1073/pnas.0709089105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Wang C, et al. Microbial single-strand annealing proteins enable CRISPR gene-editing tools with improved knock-in efficiencies and reduced off-target effects. Nucleic Acids Res. 2021;49:e36. doi: 10.1093/nar/gkaa1264. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Jinek M, et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–821. doi: 10.1126/science.1225829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Qi LS, et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell. 2013;152:1173–1183. doi: 10.1016/j.cell.2013.02.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Bikard D, et al. Programmable repression and activation of bacterial gene expression using an engineered CRISPR–Cas system. Nucleic Acids Res. 2013;41:7429–7437. doi: 10.1093/nar/gkt520. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Maeder ML, et al. CRISPR RNA-guided activation of endogenous human genes. Nat. Methods. 2013;10:977–979. doi: 10.1038/nmeth.2598. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Gilbert LA, et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell. 2013;154:442–451. doi: 10.1016/j.cell.2013.06.044. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Perez-Pinera P, et al. RNA-guided gene activation by CRISPR–Cas9-based transcription factors. Nat. Methods. 2013;10:973–976. doi: 10.1038/nmeth.2600. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Farzadfard F, Perli SD, Lu TK. Tunable and multifunctional eukaryotic transcription factors based on CRISPR/Cas. ACS Synth. Biol. 2013;2:604–613. doi: 10.1021/sb400081r. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Jones DL, et al. Kinetics of dCas9 target search in Escherichia coli. Science. 2017;357:1420–1424. doi: 10.1126/science.aah7084. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Sternberg SH, Redding S, Jinek M, Greene EC, Doudna JA. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature. 2014;507:62–67. doi: 10.1038/nature13011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Knight SC, et al. Dynamics of CRISPR–Cas9 genome interrogation in living cells. Science. 2015;350:823–826. doi: 10.1126/science.aac6572. [DOI] [PubMed] [Google Scholar]
  • 42.Carr PA, Church GM. Genome engineering. Nat. Biotechnol. 2009;27:1151–1162. doi: 10.1038/nbt.1590. [DOI] [PubMed] [Google Scholar]
  • 43.Esvelt KM, Wang HH. Genome-scale engineering for systems and synthetic biology. Mol. Syst. Biol. 2013;9:641. doi: 10.1038/msb.2012.66. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Rybalchenko N, Golub EI, Bi B, Radding CM. Strand invasion promoted by recombination protein β of coliphage λ. Proc. Natl Acad. Sci. USA. 2004;101:17056–17060. doi: 10.1073/pnas.0408046101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Mosberg JA, Lajoie MJ, Church GM. Lambda red recombineering in Escherichia coli occurs through a fully single-stranded intermediate. Genetics. 2010;186:791–799. doi: 10.1534/genetics.110.120782. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Muyrers JP, Zhang Y, Buchholz F, Stewart AF. RecE/RecT and Redα/Redβ initiate double-stranded break repair by specifically interacting with their respective partners. Genes Dev. 2000;14:1971–1982. [PMC free article] [PubMed] [Google Scholar]
  • 47.Wannier TM, et al. Improved bacterial recombineering by parallelized protein discovery. Proc. Natl Acad. Sci. USA. 2020;117:13689–13698. doi: 10.1073/pnas.2001588117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Noirot P, Kolodner RD. DNA strand invasion promoted by Escherichia coli RecT protein. J. Biol. Chem. 1998;273:12274–12280. doi: 10.1074/jbc.273.20.12274. [DOI] [PubMed] [Google Scholar]
  • 49.Thomason, L. C., Costantino, N. & Court, D. L. Examining a DNA replication requirement for bacteriophage λ Red- and Rac prophage RecET-promoted recombination in Escherichia coli. mBio7, e01443-16 (2016). [DOI] [PMC free article] [PubMed]
  • 50.Baumann P, Benson FE, West SC. Human Rad51 protein promotes ATP-dependent homologous pairing and strand transfer reactions in vitro. Cell. 1996;87:757–766. doi: 10.1016/s0092-8674(00)81394-x. [DOI] [PubMed] [Google Scholar]
  • 51.Charpentier M, et al. CtIP fusion to Cas9 enhances transgene integration by homology-dependent repair. Nat. Commun. 2018;9:1133. doi: 10.1038/s41467-018-03475-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Gutschner T, Haemmerle M, Genovese G, Draetta GF, Chin L. Post-translational regulation of Cas9 during G1 enhances homology-directed repair. Cell Rep. 2016;14:1555–1566. doi: 10.1016/j.celrep.2016.01.019. [DOI] [PubMed] [Google Scholar]
  • 53.Rees HA, Yeh W-H, Liu DR. Development of hRad51–Cas9 nickase fusions that mediate HDR without double-stranded breaks. Nat. Commun. 2019;10:2212. doi: 10.1038/s41467-019-09983-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Sakuma T, Nakade S, Sakane Y, Suzuki K-IT, Yamamoto T. MMEJ-assisted gene knock-in using TALENs and CRISPR–Cas9 with the PITCh systems. Nat. Protoc. 2016;11:118–133. doi: 10.1038/nprot.2015.140. [DOI] [PubMed] [Google Scholar]
  • 55.Zhu Z, Verma N, González F, Shi Z-D, Huangfu D. A CRISPR/Cas-mediated selection-free knockin strategy in human embryonic stem cells. Stem Cell Rep. 2015;4:1103–1111. doi: 10.1016/j.stemcr.2015.04.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Dupré A, et al. A forward chemical genetic screen reveals an inhibitor of the Mre11–Rad50–Nbs1 complex. Nat. Chem. Biol. 2008;4:119–125. doi: 10.1038/nchembio.63. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Budke B, et al. RI-1: a chemical inhibitor of RAD51 that disrupts homologous recombination in human cells. Nucleic Acids Res. 2012;40:7347–7357. doi: 10.1093/nar/gks353. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Huang F, et al. Identification of specific inhibitors of human RAD51 recombinase using high-throughput screening. ACS Chem. Biol. 2011;6:628–635. doi: 10.1021/cb100428c. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Hustedt N, Durocher D. The control of DNA repair by the cell cycle. Nat. Cell Biol. 2016;19:1–9. doi: 10.1038/ncb3452. [DOI] [PubMed] [Google Scholar]
  • 60.Bostock CJ, Prescott DM, Kirkpatrick JB. An evaluation of the double thymidine block for synchronizing mammalian cells at the G1–S border. Exp. Cell Res. 1971;68:163–168. doi: 10.1016/0014-4827(71)90599-4. [DOI] [PubMed] [Google Scholar]
  • 61.Shedden K, Cooper S. Analysis of cell-cycle-specific gene expression in human cells as determined by microarrays and double-thymidine block synchronization. Proc. Natl Acad. Sci. USA. 2002;99:4379–4384. doi: 10.1073/pnas.062569899. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Ran FA, et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature. 2015;520:186–191. doi: 10.1038/nature14299. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Austin RJ, Xia T, Ren J, Takahashi TT, Roberts RW. Designed arginine-rich RNA-binding peptides with picomolar affinity. J. Am. Chem. Soc. 2002;124:10966–10967. doi: 10.1021/ja026610b. [DOI] [PubMed] [Google Scholar]
  • 64.Wang, C. et al. A step by step protocol for genomic knock-in of long sequences using dCas9-SSAP editor. Protoc. Exch., 10.21203/rs.3.pex-1711/v1 (2021).
  • 65.Tsai SQ, et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR–Cas nucleases. Nat. Biotechnol. 2015;33:187–197. doi: 10.1038/nbt.3117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Nobles CL, et al. iGUIDE: an improved pipeline for analyzing CRISPR cleavage specificity. Genome Biol. 2019;20:14. doi: 10.1186/s13059-019-1625-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Information (425.4KB, pdf)

Supplementary notes and references.

Reporting Summary (1.8MB, pdf)
Supplementary Table 1 (26.8KB, xlsx)

Supplementary Table 1. Sequences of all SSAPs tested in this study. Supplementary Table 2. Sequences of the gRNAs used in this study. Guides starting with sp indicate SpCas9 gRNA targets and guides starting with dsp indicate dSpCas9 gRNA targets. Supplementary Table 3. List of plasmids (all of which will be available at Addgene with a plasmid ID). Supplementary Table 4. Sequences of the primers used for donor DNA (HDR template) generation, targeted sequencing and NGS assays. All NGS adaptor sequences are shown in red.

Data Availability Statement

All NGS data, including data from the targeted genomic loci sequencing and on/off-target analysis have been deposited to the NCBI Sequence Read Archive database under the accession code PRJNA683925. All other data supporting the findings of this study are available from the corresponding author on reasonable request. Source data are provided with this paper.

Customized scripts for data analysis have been deposited and are available at GitHub under the Cong Laboratory repository (https://github.com/cong-lab).


Articles from Nature Cell Biology are provided here courtesy of Nature Publishing Group

RESOURCES