Abstract
TadA-derived cytosine base editors (TadCBEs) enable programmable C•G-to-T•A editing while retaining the small size, high on-target activity, and low off-target activity of TadA deaminases. Existing TadCBEs, however, exhibit residual A•T-to-G•C editing at certain positions and lower editing efficiencies at some sequence contexts and with non-SpCas9 targeting domains. To address these limitations, we use phage-assisted evolution to evolve CBE6s from a TadA-mediated dual cytosine and adenine base editor, discovering mutations at N46 and Y73 in TadA that prevent A•T-to-G•C editing and improve C•G-to-T•A editing with expanded sequence-context compatibility, respectively. In E. coli, CBE6 variants offer high C•G-to-T•A editing and no detected A•T-to-G•C editing in any sequence context. In human cells, CBE6 variants exhibit broad Cas domain compatibility and retain low off-target editing despite exceeding BE4max and previous TadCBEs in on-target editing efficiency. Finally, we show that the high selectivity of CBE6 variants is well-suited for therapeutically relevant stop codon installation without creating unwanted missense mutations from residual A•T-to-G•C editing.
Subject terms: Genetic engineering, Targeted gene repair, Protein engineering, CRISPR-Cas9 genome editing
Existing TadA-derived CBEs exhibit residual A•T-to-G•C editing activity and suffer from lower activity at several sequence contexts and with non-SpCas9 targeting domains. Here, the authors use phage-assisted evolution to evolve CBE6 variants that address these limitations.
Introduction
Base editors are programmable precision genome editing tools that consist of a base-modification enzyme such as a deaminase fused to a programmable DNA-binding domain such as a CRISPR-Cas9 nickase1,2, a TALE repeat array3,4, or a zinc-finger array5,6. Cytosine base editors (CBEs)1 enable C•G-to-T•A editing, while adenine base editors (ABEs)2 enable A•T-to-G•C editing. In contrast with nucleases that generate uncontrolled mixtures of indels, base editors create specified changes at target DNA sequences and do not require double-strand DNA breaks or donor DNA templates1,2,7,8. CRISPR base editors unwind double-stranded DNA (dsDNA), allowing a single-strand-specific deaminase to access the DNA strand not paired with the guide RNA, resulting in deamination of C or A nucleobases within the editing window. Nicking the non-editing DNA strand stimulates its replacement by cellular DNA repair processes to yield a permanently edited base pair7,8.
Recent efforts have improved the activities9–11, sequence context compatibilities9, control over editing window sizes12, protospacer-adjacent motif (PAM) compatibilities9,12–14, and size15 of base editors. Base editing has been used in vivo and ex vivo in animal models to rescue genetic diseases including Hutchinson-Gilford progeria syndrome16, sickle cell disease17, spinal muscular atrophy18, T-cell acute lymphoblastic leukemia19,20, and others21,22. Recently, base editing strategies have entered clinical trials as therapeutics20,23, with the first positive clinical outcomes20.
The laboratory-evolved deaminase2,10,14 used in ABEs, TadA*, offers favorable properties for precision genome editing including high on-target activity14, low off-target editing10,14,24–26, and small size (166 amino acids) that allows it to be packaged into a single adeno-associated virus (AAV) system27. The naturally occurring cytidine deaminases used in CBEs, in contrast, are larger (227 amino acids for the commonly used rAPOBEC11) and suffer from higher Cas-independent DNA and RNA off-target activity and lower on-target editing efficiency26. To date, no CBE has been shown to match the most active adenine base editors such as ABE8e in peak editing activity. We hypothesize that this lower editing efficiency of CBEs can be attributed to either lower intrinsic deamination activity or the effects of base excision repair following uracil excision by endogenous uracil glycosylase (UNG). To address these limitations, we and others recently described the first TadA-derived CBEs, which were developed through directed evolution (TadCBEs28, CBE-Ts29) or rational protein engineering (Td-CBEs)30. These TadA-derived CBEs exhibit low off-target editing and are ~60 amino acids smaller than BE4max, a canonical CBE that uses a natural cytidine deaminase. While some TadA-derived CBEs such as TadCBEs and CBE-Ts have comparable activity to APOBEC1-derived CBEs such as BE4max and evoAPOBEC-BE4max, they retain residual A•T-to-G•C editing at certain positions in the base editing window. Since A•T-to-G•C edits can remove stop codons21, residual activity limits the utility of TadCBEs for therapeutic stop codon installation. While Td-CBEs offer relatively high product purity, they have substantially lower activity than BE4max and evoAPOBEC-BE4max (see below).
Here, we overcome the limitations of current TadA-derived CBEs through phage-assisted evolution of TadDE, a dual editor that performs both A•T-to-G•C and C•G-to-T•A editing28, into highly selective TadCBEs. The resulting evolved CBE6 variants show virtually no A•T-to-G•C editing and demonstrate superior C•G-to-T•A editing in mammalian cells when compared side-by-side with all three families of previously reported TadA-derived CBEs. CBE6 editors evolved new mutations in the substrate pocket that directly interact with the target base, as well as mutations at the dimerization interface of TadA. The editors enable highly efficient and cytosine-selective on-target editing with minimal sequence context bias and low off-target editing. Due to their enhanced selectivity and high activity, CBE6 base editors represent state-of-the-art cytosine base editors and are especially advantageous for applications that install stop codons to reduce the levels of proteins associated with increased disease risk (Fig. 1a).
Results
Phage-assisted evolution of TadCBEs with improved selectivity
We previously reported the phage-assisted evolution of the cytidine deaminase TadA-CD from TadA-8e, a highly active laboratory-evolved deoxyadenosine deaminase14,28. We hypothesized that the selectivity and activity of TadA-CD might be further improved by using an alternative evolutionary starting point, which could enable access to mutations that are inaccessible to highly evolved TadCBEs due to epistasis31.
We initiated an evolution campaign on TadA-Dual, a dual cytidine and adenosine deaminase used in the dual cytosine and adenine base editor TadDE28, with the goal of improving the selectivity of this deaminase to exclusively perform cytidine deamination (Fig. 1b). We used phage-assisted continuous evolution (PACE), which maps the stages of traditional directed evolution to the lifecycle of bacteriophage M13 propagating on a culture of E. coli host cells32. In phage-assisted evolution, the fitness of a gene variant is linked through a genetic circuit to the expression of M13 gIII, which encodes a protein essential to phage propagation. In our circuit, we coupled cytidine deamination activity to gIII expression by fusing T7 RNA polymerase (RNAP) to a bacterial degron9. C•G-to-T•A editing activity installs a stop codon in the linker between T7 RNAP and its degron, yielding active T7 RNAP that transcribes gIII. The mutagenesis plasmid (MP) introduces mutations in the deaminase, and beneficial mutations facilitate phage propagation in the lagoon (fixed-volume vessel), while the less-fit phage are washed out. Phage-assisted non-continuous evolution (PANCE) is an analogous method that relies on manual, discrete dilution of the phage in the lagoons instead of continuous dilution, thus offering a higher likelihood of allowing even modestly beneficial mutations to propagate at the expense of lower evolutionary speed compared to PACE. Stringency in both methods was tuned by altering the lagoon dilution rate and promoter strength upstream of T7 RNA polymerase.
We used a selection circuit we previously developed that penalizes residual adenine base editing9,28. In this selection, A•T-to-G•C editing disrupts stop codon installation in the linker between T7 RNAP and its degron, leading to T7 RNAP degradation and no phage propagation9,28. This selection thus directs selection pressure to minimize deoxyadenosine deamination activity28, allowing simultaneous evolution of TadDE for increased CBE activity and reduced ABE activity (Fig. 1c)28. Over six passages of PANCE, phage titers continued to increase despite increasing the selection stringency through higher dilution factors and the use of a weaker promoter upstream of T7 RNAP, suggesting that the phage evolved more active or more selective deaminase variants. This first PANCE campaign, in which the phage population underwent a ~10^16-fold total dilution, yielded converged mutations at the N46 (N46I, N46T) and Y73 (Y73P) positions across different lagoons (Supplementary Figs. 1a and 2). During the course of this work, we were further encouraged by an independent study that reported the importance of the N46 position for target base selectivity30.
Mutations during PACE arise predominantly from the MP, which promotes all types of substitutions but is biased towards transition mutations33. To thoroughly access amino acids that may be less likely to arise through MP-mediated mutagenesis, we constructed a phage library encoding all possible amino acids at TadA position N46 and subjected these variants to a second high-stringency PANCE that used weaker promoters for T7 RNAP (Supplementary Fig. 1b). Persisting phage survived a ~10^12-fold overall dilution. To further increase stringency, we performed PACE for 118 hours to subject the variants to greater selection pressure from continuous dilution (Fig. 1d, Supplementary Fig. 1c, d). Interestingly, N46L30, N46V, and N46C eventually converged to N46C at 118 hours, corresponding to an average ~10^33-fold dilution. Overall, the final variants that emerged from all evolution campaigns survived an overall dilution of ~10^61-fold.
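The overall dilution follows from multiplying the fold-dilutions of the individual campaigns, which is equivalent to adding their base-10 exponents. The short Python sketch below illustrates this arithmetic with the approximate exponents reported above (16, 12, and 33). The stage labels are illustrative names chosen for this example, and the exponents are rounded values, so the result is only an order-of-magnitude check.

```python
# Approximate fold-dilution survived in each campaign, as a base-10 exponent.
# Values are the rounded exponents quoted in the text; the labels are
# illustrative names, not terms from the selection protocol.
stage_log10_dilution = {
    "first PANCE (six passages)": 16,
    "second PANCE (N46 library)": 12,
    "PACE (118 hours)": 33,
}

# Multiplying fold-dilutions corresponds to adding their exponents.
total_log10 = sum(stage_log10_dilution.values())

print(f"overall dilution ~ 10^{total_log10}-fold")  # overall dilution ~ 10^61-fold
```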
Based on the cryogenic electron microscopy (cryo-EM) structure of ABE8e (Protein Data Bank (PDB): 6VPC)34, we hypothesize that the N46 position determines base selectivity by interacting with the base in the active site to potentially make it more accessible for editing. Mutations at Y73, which is at the dimerization interface of TadA, could impact enzyme assembly, activity, or stability (Fig. 1e). To predict the impact of mutations at position 73 on TadA*, we estimated the energy difference between the TadA* structure with and without Y73P using Rosetta35. We found that Y73P stabilizes the TadA* monomer by 11.8 REU (Rosetta Energy Units) compared to Y73S (the original mutation in TadDE), and thus Y73P may enhance the activity of the new TadCBEs (Supplementary Fig. 3).
Evaluation of activity and selectivity of evolved CBEs in E. coli
To assess the performance of these new deaminases, we first characterized the corresponding TadCBEs in E. coli. We developed an E. coli plasmid profiling library to interrogate the sequence-context preferences of base editors (Fig. 2a). The new CBEs were tested on a 32-member plasmid library that includes all possible sequence contexts immediately 5’ and 3’ of a target sequence at protospacer position 6 within the editing window (counting the NGG PAM as positions 21-23). The library was constructed with sequences comprising all nucleotide combinations before and after the target nucleotide, resulting in 16 sequences with a target cytosine and 16 sequences with a target adenine. When expressed using the strong ribosome binding site (RBS) SD836, the new CBEs showed very high average C•G-to-T•A editing levels of 88% (TadDE N46I Y73P, hereafter designated CBE6a), 95% (TadDE N46V Y73P, hereafter designated CBE6b), 95% (TadDE N46L Y73P, hereafter designated CBE6c), and 88% (TadDE N46C Y73P, hereafter designated CBE6d), which are comparable or superior to editing by TadCBEd (88%), a previous state-of-the-art CBE28 (Fig. 2b, Supplementary Fig. 4). When expressed using the weaker ribosome binding site sd536, the new TadCBEs showed average C•G-to-T•A editing levels of 82% (CBE6a), 90% (CBE6b), 95% (CBE6c), and 89% (CBE6d), again outperforming TadCBEd (78%) (Supplementary Fig. 5). Next, we assessed the sequence-context preference of the cytosine base editors using the SD8 RBS. While CBE6a and CBE6d showed similar sequence-context preferences as TadCBEd, disfavoring 5’ AC and 5’ GC, two variants (CBE6b and CBE6c) performed C•G-to-T•A editing equally well (over 80% editing) at every possible sequence context (Fig. 2b, Supplementary Fig. 4).
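The composition of the 32-member library can be reproduced by enumerating every base immediately 5’ and 3’ of a target C or A. The following Python sketch is only an illustration of that counting argument; the function name and the trinucleotide string representation are assumptions made for this example and do not describe how the plasmid library was physically designed.

```python
from itertools import product

BASES = "ACGT"

def context_library(targets=("C", "A")):
    """Return every 5'-N / target / 3'-N trinucleotide for the given target bases."""
    return [
        f"{five_prime}{target}{three_prime}"
        for target in targets
        for five_prime, three_prime in product(BASES, repeat=2)
    ]

library = context_library()
cytosine_targets = [site for site in library if site[1] == "C"]
adenine_targets = [site for site in library if site[1] == "A"]

# 4 x 4 flanking-base combinations for each of the two target bases.
print(len(cytosine_targets), len(adenine_targets), len(library))  # 16 16 32
```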
To assess cytosine versus adenine deamination selectivity, we analyzed residual A•T-to-G•C editing in the library. When expressed in E. coli using the strong SD8 RBS, TadCBEd demonstrated residual A•T-to-G•C editing (average of 8%) at protospacer position 6, especially for 5’ C and 5’ T sequence contexts, consistent with our previous report28. In contrast, we identify several CBE6 base editors that show residual A•T-to-G•C editing below the high-throughput sequencing limit of detection (average of < 0.1%), regardless of the sequence context. Thus, these new TadDE-evolved CBEs offer substantially higher product purities than TadCBEd. At position 6 in the protospacer using SD8, the ratio of the new CBEs for C•G-to-T•A editing over A•T-to-G•C editing exceeded 990-fold in all cases, compared to 10.6-fold for TadCBEd, an improvement of at least ~100-fold.
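As a rough illustration of how such selectivity ratios are formed, the sketch below divides an average C•G-to-T•A editing level by an average A•T-to-G•C editing level. It assumes the ratio is taken between library-wide averages, which may differ from the exact procedure used here, and it uses the rounded percentages quoted in the text, so it yields ~11-fold for TadCBEd rather than the reported 10.6-fold. The function name is hypothetical.

```python
def selectivity_ratio(c_to_t_percent, a_to_g_percent):
    """Ratio of desired C-to-T editing to residual A-to-G editing.

    When the A-to-G value is a detection limit rather than a measurement,
    the returned ratio should be read as a lower bound.
    """
    if a_to_g_percent <= 0:
        raise ValueError("A-to-G editing must be positive to form a ratio")
    return c_to_t_percent / a_to_g_percent

# Rounded averages quoted in the text for TadCBEd with the SD8 RBS.
tadcbed_ratio = selectivity_ratio(88, 8)

# Fold improvement implied by the reported ratios (>990-fold vs. 10.6-fold).
fold_improvement = 990 / 10.6

print(f"TadCBEd ratio from rounded averages: {tadcbed_ratio:.1f}-fold")
print(f"Improvement implied by reported ratios: at least ~{fold_improvement:.0f}-fold")
```

The second value, approximately 93-fold, is consistent with the improvement of roughly 100-fold stated above.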
To characterize the base editing window of the new CBEs, all four CBE6s were tested in E. coli on a 448-member target site library that includes all possible 5’ and 3’ sequence contexts of a target C or A ranging from positions 1–14 of the protospacer. To maximize observed differences in activity, a weaker RBS (sd2) was used. The new TadCBEs exhibited an editing window—defined as the range where the average editing is at least 20% of the average peak editing—centered near protospacer position 6 and ranging from positions 4-8 (Fig. 2c, Supplementary Figs. 8 and 9), slightly larger than the editing window of TadCBEd, which ranges from positions 5-7. As editing window size and activity are often correlated, we recommend the CBE6 variants especially for applications that need high editing levels and do not require an especially narrow editing window12. Averaged across positions 4-8 of the editing window, the selectivity ratio of the new TadCBEs for C•G-to-T•A editing over A•T-to-G•C editing ranged from 27- to 86-fold, compared to 16-fold for TadCBEd.
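The editing window definition used above (positions at which average editing is at least 20% of the average peak editing) can be expressed as a simple threshold on a per-position editing profile. The Python sketch below is a minimal illustration of that definition; the function name is hypothetical and the editing profile is invented for demonstration, not taken from the measured data. The 448-member library size is also recovered as 14 protospacer positions times 32 sequence contexts.

```python
def editing_window(avg_editing_by_position, fraction_of_peak=0.20):
    """Return (first, last) protospacer positions whose average editing
    is at least `fraction_of_peak` of the peak average editing."""
    peak = max(avg_editing_by_position.values())
    threshold = fraction_of_peak * peak
    inside = [pos for pos, value in avg_editing_by_position.items() if value >= threshold]
    return min(inside), max(inside)

# Invented per-position averages (% editing) for positions 1-14; illustrative only.
hypothetical_profile = {
    1: 1.0, 2: 3.0, 3: 6.0, 4: 15.0, 5: 40.0, 6: 60.0, 7: 45.0,
    8: 14.0, 9: 5.0, 10: 2.0, 11: 1.0, 12: 0.5, 13: 0.5, 14: 0.2,
}

window = editing_window(hypothetical_profile)
library_size = 14 * 16 * 2  # positions x flanking contexts x target bases (C or A)

print(window)        # (4, 8)
print(library_size)  # 448
```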
Reversion analysis
To determine the contribution of each mutation to achieving CBE selectivity and activity, reversion analysis was performed in which each mutation was added to the starting point (TadDE) in a successive fashion, and each resulting variant was characterized in E. coli. We found that adding any one of N46I, N46V, N46L, or N46C was sufficient to remove A•T-to-G•C editing from TadDE, decreasing the A•T-to-G•C editing efficiency from 75% to an average below 0.1%. Furthermore, the addition of Y73P was necessary to increase C•G-to-T•A editing levels further, leading to an additional 20-53% of sequencing reads edited for the most difficult sequence context tested (Supplementary Fig. 10).
Next, we added the N46 and Y73P mutations to TadCBEd, which was evolved from ABE8e using the same selection circuit28. The addition of N46I removed the residual A•T-to-G•C editing from TadCBEd, but the N46I and Y73P mutations were detrimental to C•G-to-T•A editing, decreasing average editing efficiencies by 1.3-fold when only N46I is added and 1.3-fold when both mutations are added (Supplementary Fig. 11). These data indicate that the evolved mutations in TadDE—but not those in TadCBEd—provide the genetic context that supports the beneficial effects of N46I and Y73P. Thus, these new deaminase variants likely did not emerge during PACE experiments that gave rise to TadCBEd because the N46I and Y73P mutations have an epistatic relationship with mutations in TadCBEd.
Evaluation of activity and selectivity in mammalian cells
Next, we tested the CBE6 variants at a variety of endogenous genomic sites in human cells and compared their performance side-by-side with that of TadCBEd28, Td-CBEmax30, and CBE-T1.5229, the best-performing TadA-derived CBEs recently described by three groups. The new deaminases were fused to SpCas9 or eNme2-C Cas9 nickase domains in the BE4max architecture11 and transfected into HEK293T cells, along with a plasmid encoding an sgRNA. Using SpCas9, the new CBEs showed similar or superior average peak editing frequencies of 55–59% compared to TadCBEd (average peak editing of 54%; P value compared to CBE6a > 0.05, P value compared to CBE6b > 0.05), Td-CBEmax (25%; P value compared to CBE6a < 0.0001, P value compared to CBE6b < 0.0001), and CBE-T1.52 (44%; P value compared to CBE6a < 0.05, P value compared to CBE6b < 0.05) (Fig. 3a, Supplementary Fig. 12, Supplementary Fig. 23). Note that we selected CBE-T1.52 for comparison because of its high activity and selectivity among the CBE-T variants. The CBE6 variants also displayed superior selectivity for cytosine over adenine at the SpCas9 sites. While the new CBEs showed residual A•T-to-G•C peak editing efficiencies of <0.1-0.1% (CBE6a), <0.1-0.6% (CBE6b), <0.1-0.3% (CBE6c), and <0.1-1.2% (CBE6d) at all SpCas9 sites that were screened, the previously described TadA-derived CBEs TadCBEd, CBE-T1.52, and Td-CBEmax showed peak A•T-to-G•C editing efficiencies ranging from 4.7-67% (P value compared to CBE6a < 0.01, P value compared to CBE6b < 0.01), 0.6-11% (P value compared to CBE6a < 0.05, P value compared to CBE6b < 0.05), and 0.2-12% (P value compared to CBE6a < 0.05, P value compared to CBE6b < 0.05), respectively (Supplementary Fig. 23). We speculate that differences between the residual A•T-to-G•C editing observed in mammalian cells and that observed in E. coli, as reported above, could be due to differences in their deoxyinosine repair pathways37.
We constructed CBE6 variants using the eNme2-C Cas9 nickase38 to assess compatibility with an alternative Cas9 domain (PAM = N4CN). The use of eNme2-C Cas9 also impedes the activity of the deaminase domain compared to SpCas9 as we previously showed with TadCBEs28. Across four target sites in HEK293T cells, the new CBEs using eNme2-C Cas9 offered superior average peak C•G-to-T•A editing efficiencies of 28-38%, an improvement over TadCBEd (average peak editing of 25%; P value compared to CBE6a > 0.05, P value compared to CBE6b > 0.05) (Fig. 3b, Supplementary Fig. 13). Encouragingly, these average editing efficiencies are comparable to or higher than that of ABE8e with eNme2-C Cas9 (29%) and exceed by approximately 6-fold the observed average editing efficiency of Td-CBEmax (5%; P value compared to CBE6a < 0.0001, P value compared to CBE6b < 0.0001) and CBE-T1.52 (6%; P value compared to CBE6a < 0.0001, P value compared to CBE6b < 0.0001) using eNme2-C Cas9 domains at the same sites (Fig. 3b).
These findings collectively establish that the new CBE6s offer comparable or higher activity than BE4max, evoAPOBEC-BE4max, and the three previously reported TadA-derived CBEs, but with virtually no detected A•T-to-G•C editing. The benefits of the CBE6 variants are especially pronounced when using a non-SpCas9 targeting domain.
Characterization of Cas-independent and Cas-dependent off-target activity of new CBEs
Highly active gene editing agents are especially prone to off-target editing, resulting in undesired mutations in genomic DNA or in RNA26,39. Off-target base editing can occur through Cas-dependent mechanisms, in which Cas9 engages non-target DNA sequences similar to the target sequence, or through Cas-independent mechanisms, in which the deaminase domain operates on other transiently single-stranded DNA sequences independent of Cas protein engagement26,40.
We performed Cas-independent DNA and RNA off-target analyses on the new CBE6s both with and without V106W. We previously showed that the addition of V106W to TadA variants reduces DNA and RNA off-target activity of ABEs with little or no decrease in on-target editing efficiency14,41. V106W was reported as a mutation that reduces off-target RNA deamination by weakening deaminase binding to RNA through steric occlusion41. V106W also decreases off-target editing of DNA, perhaps by a similar mechanism (Supplementary Figs. 16–22). However, V106W largely preserved on-target DNA editing activity, possibly due to the high effective concentration of the target DNA substrate that is enforced by fusion to Cas9.
Using the orthogonal R-loop Cas-independent DNA off-target assay26, we observed that the new CBE6s have similarly low levels of DNA off-target activity (average 0.2-0.7%) to those of TadCBEd (0.5%; P value compared to CBE6a > 0.05, P value compared to CBE6b > 0.05), which are lower than those of BE4max (average of 1.1%; P value compared to CBE6a < 0.05, P value compared to CBE6b > 0.05) and evoAPOBEC (average of 1.0%; P value compared to CBE6a < 0.01, P value compared to CBE6b > 0.05) (Fig. 4a, Supplementary Fig. 26). With the addition of V106W, on-target editing levels are only slightly decreased (1.3-fold decrease for CBE6a; 1.1-fold decrease for CBE6b; 1.05-fold decrease for CBE6c; 1.01-fold average decrease for CBE6d), but Cas-independent DNA off-target editing levels are greatly decreased (all to ≤0.1%) (Fig. 4a). P values were <0.01 and <0.01 for comparing CBE6a V106W to BE4max and evoAPOBEC, respectively, and <0.05 and <0.001 for comparing CBE6b V106W to BE4max and evoAPOBEC, respectively (Supplementary Fig. 26).
TadCBEd offers lower Cas-independent RNA off-target editing than BE4max and evoAPOBEC28. Here, off-target RNA editing analysis revealed that the new CBE6s edited an average of 0.1% of cytosines across three transcripts prone to off-target ABE editing (CTNNB1, IP90, and RSL1D1), comparable to the average off-target RNA editing of 0.1% for TadCBEd (P value compared to CBE6a < 0.05, P value compared to CBE6b > 0.05) (Fig. 4b). The addition of V106W to the new CBEs slightly decreased the average RNA off-target editing of cytosines to <0.1%. Additionally, the new CBE6 variants showed <0.1% A•T-to-G•C editing across transcripts, below the limit of detection of HTS.
To characterize Cas-dependent off-target editing of the new CBE6s, we investigated 22 previously documented off-target sites for SpCas9 base editors and sgRNAs targeting HEK3, HEK4, EMX1, and BCL11A (Supplementary Figs. 16–19). In general, the new CBE6s showed comparably low levels of Cas-dependent off-target editing as those of TadCBEd across the 22 off-target sites (Supplementary Figs. 27–30). Cas-dependent off-target editing is more easily addressed than Cas-independent off-target editing since the former can be ameliorated by varying guide RNA sequence or length, the PAM sequence targeted, and the Cas domain. As Cas-dependent off-target editing can limit the therapeutic utility of CRISPR gene editing agents, high-fidelity Cas proteins that are known to engage fewer off-target loci may improve therapeutic relevance by reducing Cas-dependent off-target editing38.
Translocations or other chromosomal abnormalities can occur if a DNA single-strand break is converted to a double-strand break during cell replication42. Previous work has shown that potential translocations generated by base editors are correlated with the fraction of indels detected42. CBE6 indel levels are low and comparable to previously reported CBEs and ABEs, suggesting that the CBE6 variants will not generate more translocations than previously reported base editors (Supplementary Figs. 14 and 15). Translocations were virtually undetected via ddPCR in a prior study using base editing with optimized reagents43.
Stop codon installation at therapeutically relevant genomic sites in human cells
To demonstrate the utility and performance of these new CBEs, we used them to install stop codons at several therapeutically relevant sites in the genome. Gene knockout and gene silencing are strategies being applied in clinical trials to suppress the levels of proteins associated with disease or inactivate gain-of-function mutant genes21. Installation of a premature stop codon by cytosine base editing can achieve these goals while avoiding the complex mixtures of uncontrolled indel products that result from nuclease-mediated gene knockout44,45. The lack of residual A•T-to-G•C editing is important for this application because A•T-to-G•C editing of either strand of an installed TAG, TAA, or TGA nonsense codon would convert it to a missense mutation, restoring undesired readthrough of a mutated target.
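The logic of stop codon installation and its loss through residual adenine editing can be illustrated at the codon level. The Python sketch below considers only single-base changes on the coding strand: a C-to-T change that creates a stop codon from CAA, CAG, or CGA, followed by an A-to-G change within an installed TAG, which yields TGG (tryptophan). The helper name is hypothetical, and the sketch ignores editing-window position, strand choice, and bystander bases, all of which matter when designing real guide RNAs.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def single_base_edits(codon, old, new):
    """All codons reachable by changing exactly one `old` base to `new`."""
    return [
        codon[:i] + new + codon[i + 1:]
        for i, base in enumerate(codon)
        if base == old
    ]

# Step 1: C-to-T editing of the coding strand can create a stop codon.
installable = {
    codon: [p for p in single_base_edits(codon, "C", "T") if p in STOP_CODONS]
    for codon in ("CAA", "CAG", "CGA")
}

# Step 2: residual A-to-G editing of an installed TAG removes the stop.
tag_byproducts = single_base_edits("TAG", "A", "G")

print(installable)     # {'CAA': ['TAA'], 'CAG': ['TAG'], 'CGA': ['TGA']}
print(tag_byproducts)  # ['TGG']
```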
To test the ability of the new CBEs to perform clean premature stop codon installation at therapeutically relevant genomic loci, we identified several genomic sites within PCSK9 that were previously studied to lower LDL cholesterol levels21. We designed three sgRNAs that install stop codons at positions in PCSK9 that generate protein-truncating variants with potential therapeutic utility21. We electroporated synthetic guide RNA and mRNA encoding CBEs into patient-derived fibroblasts. We then measured cytosine and residual adenine base editing activity at these PCSK9 target sites in patient-derived fibroblasts, comparing the new CBEs to previously reported TadCBEs, BE4max, evoFERNY, and YE1. Across the three target sites for stop codon installation, the new CBE6s resulted in virtually no detected (average of 0.1%) residual A•T-to-G•C editing and also generally yielded the highest editing levels, averaging 41-53% at the target C (Fig. 5). TadCBEd yielded an average editing level of 27% (P value compared to CBE6a > 0.05, P value compared to CBE6b < 0.001) at the target C with 4% residual A•T-to-G•C editing (P value compared to CBE6a < 0.0001, P value compared to CBE6b < 0.0001), which converts the installed TAG stop codon to a TGG missense codon (Supplementary Fig. 25). As a result, an average of 16% of the stop codons installed by TadCBEd were converted to missense codons. While CBE-T1.52 displayed higher selectivity than TadCBEd, it showed lower editing efficiency (34%; P value compared to CBE6a > 0.05, P value compared to CBE6b < 0.05) at the target C than the CBE6 variants and still caused an average of 0.9% residual A•T-to-G•C editing at these sites (P value compared to CBE6a < 0.01, P value compared to CBE6b < 0.01) (Supplementary Fig. 25). BE4max averaged 32% on-target editing (P value compared to CBE6a > 0.05, P value compared to CBE6b < 0.01) but induces much higher off-target activity, as shown previously26,28 (Supplementary Fig. 25).
Taken together, these findings indicate that the new CBE6 base editors offer enhanced activity, C versus A deamination selectivity, and target specificity compared with previously reported cytosine base editors. When residual A•T-to-G•C editing must be kept to an absolute minimum, we recommend CBE6a (TadDE N46I Y73P), though we note that CBE6a retains sequence context preferences that disfavor 5’ AC and 5’ GC sequences. We recommend CBE6b (TadDE N46V Y73P) for general cytosine base editing applications, especially when Cas domains other than SpCas9 are used. For cytosine base editing applications in which off-target editing must be strictly minimized, we recommend using CBE6a-V106W and CBE6b-V106W.
Discussion
Here we report the evolution and characterization of new CBE6 variants with high C•G-to-T•A editing activity and virtually no residual A•T-to-G•C activity. These variants did not emerge from the evolution of previously reported TadCBEs, but instead were evolved from the dual adenine and cytosine base editor TadDE, suggesting the value of this starting point for CBE evolution trajectories. The new CBE6 variants are >100-fold more selective for C•G-to-T•A editing than TadCBEd when tested at protospacer position 6 in E. coli. We show that the residue at position 46 near the target base confers selectivity for cytidine deamination, and position 73 at the dimerization interface aids in increasing editing efficiency. In both E. coli and mammalian cells, the new CBE6 variants show virtually no residual A•T-to-G•C editing and outperform current CBE variants in on-target editing efficiency. Cas9-independent DNA and RNA off-target editing levels, as well as Cas9-dependent off-target editing levels, are similar to those of TadCBEd and lower than that of BE4max and can be further reduced without substantially lowering on-target editing efficiencies by adding the V106W mutation.
The new CBE6 variants offer substantial benefits when installing stop codons at genomic sites for gene knockout by avoiding the undesired creation of missense codon byproducts, as demonstrated by editing PCSK9 in patient-derived fibroblasts. In addition to enhancing precision gene editing applications, the high editing efficiencies and very high selectivities of CBE6 variants may also benefit genetic screens that use base editors to create libraries of many gene variants to uncover structure-function insights46–49.
Methods
Molecular cloning
All plasmid construction was completed through Gibson assembly or SapI-Golden Gate methods (New England Biolabs). PCR amplification was performed using Phusion U Green Hot Start II DNA polymerase (Thermo Fisher Scientific) and nuclease-free water (Qiagen). Cloning products were transformed into Mach1 chemically competent E. coli cells (Thermo Fisher Scientific). Selection antibiotics were employed at the indicated final concentrations: carbenicillin at 100 μg/ml, spectinomycin at 50 μg/ml, kanamycin at 50 μg/ml, chloramphenicol at 25 μg/ml, and tetracycline at 10 μg/ml.
For Sanger sequencing (Quintara Biosciences), plasmid DNA was amplified using the Illustra Templiphi 100 Amplification Kit (GE Healthcare Life Sciences). Sequence-validated plasmids were purified using the Spin 2.0 Miniprep Kit (Qiagen) for bacterial transformation and the Plasmid Plus Midi Kit (Qiagen) for mammalian transfection. The concentrations of plasmids were determined using NanoDrop technology. Plasmids encoding CBE6 variants are available from Addgene.
Bacteriophage cloning
To perform Gibson assembly of the phage, PCR fragments (1 µl each) were combined in a final volume of 4 µl. Following Gibson assembly, the reaction mixture was introduced into chemically competent S2208 E. coli host cells, defined as S2060 E. coli host cells harboring pJC175e32, with a transformation volume of 100 µl. These cells, capable of activity-independent phage propagation, were cultured for 5 hours at 37 °C with agitation in antibiotic-free 2×YT media then centrifuged at 10,000 g for 10 minutes. Clonal phage populations were isolated by performing plaque assays as described below. Individual plaques were then cultivated in DRM media (prepared from United States Biological CS050H-001/CS050H-003) for a duration of 6–8 hours. To eliminate E. coli contaminants, the bacterial culture was centrifuged at 6000 g for 10 minutes, and the resulting supernatant was removed for use. For subsequent sequencing, the gene of interest within the phage was amplified using primers AB1793 (5’-TAATGGAAACTTCCTCATGAAAAAGTCTTTAG) and AB1396 (5’-ACAGAGAGAATAACATAAAAACAGGGAAGC), followed by Sanger sequencing. These primers (Integrated DNA Technologies) anneal to the phage backbone and flank the gene of interest. Finally, phage samples were stored at 4 °C.
Transformation using chemically-competent cells
For all phage propagation, PANCE, and PACE experiments, strain S2060 was used. Competent cells were prepared by diluting an overnight culture 100-fold into 25 ml of 2×YT media (United States Biological) with tetracycline and streptomycin. The culture was then grown at 37 °C with gentle shaking at 230 r.p.m. until reaching an OD600 of approximately 0.4–0.6 then pelleted by centrifugation at 4000 g for 10 minutes at 4 °C. To create competent cells, the resulting cell pellet was resuspended in 2.5 ml of TSS (LB media supplemented with 5% v/v DMSO, 10% w/v PEG 3350, and 20 mM MgCl2), divided into 100-µl aliquots, flash-frozen in liquid nitrogen, and stored at −80 °C.
For transformation, 100 μl of competent cells were thawed on ice and combined with a mixture of plasmids (1 μl each, with a maximum of three plasmids per transformation) in 20 μl of 5× KCM solution (500 mM KCl, 150 mM CaCl2, and 250 mM MgCl2 in water) and 80 μl of water. The mixture was incubated on ice for 15 minutes and then heat-shocked at 42 °C for 90 seconds, after which 800 μl of SOC media (New England Biolabs) was added to rescue the cells. The cells were allowed to recover at 37 °C with shaking at 230 r.p.m. for 0.5–1.5 hours. The transformed cells were then plated on 2×YT media containing 1.5% agar (United States Biological) and the appropriate antibiotics and incubated at 37 °C overnight.
Plaque assays
To propagate phage independently of their activity, plaque assays were performed on S2208 E. coli host cells32. A culture of host cells, either freshly prepared or stored at 4 °C for a maximum of 3 days, was diluted 50-fold in DRM supplemented with suitable antibiotics and grown at 37 °C to an OD600 of 0.4–0.8.
Phage stocks were serially diluted tenfold in DRM. To prepare plaquing plates, molten 2×YT agar (1.5% agar, held at 55 °C) containing Bluo-gal (Gold Biotechnology) at a final concentration of 0.08% was dispensed into the wells of a 24-well plate at 1 ml per well and left undisturbed at room temperature until solidified.
To prepare the top agar, 2×YT medium and molten 2×YT agar (1.5% agar) were combined in a 3:2 ratio, and the mixture was held at 55 °C until use. For plaquing, 100 µl of cells were combined with 10 µl of phage in 2-ml library tubes (VWR International), and 300 µl of warm top agar was added. After brief mixing, the mixture was immediately pipetted onto the solid agar medium in one of the wells of the 24-well plate. The top agar was left undisturbed to solidify at 25 °C, and the plates were then incubated at 37 °C overnight without being inverted. Phage titers were quantified by counting the blue plaques and applying the following formula: titer = (number of plaques in quadrant) × (dilution factor of quadrant) × 100.
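The titer formula above can be sketched in Python as follows. This is a minimal illustration: the function name and the example plaque count are hypothetical, and only the factor of 100 is taken from the formula in the text.

```python
def phage_titer(plaque_count: int, dilution_factor: float, scale: float = 100.0) -> float:
    """Return the titer as plaques x dilution factor x 100 (the formula in the text)."""
    if plaque_count < 0:
        raise ValueError("plaque_count must be non-negative")
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    return plaque_count * dilution_factor * scale


# Hypothetical example: 23 blue plaques counted for a 10^4-fold dilution.
example_titer = phage_titer(23, 1e4)
print(f"{example_titer:.2e}")
```

In practice, one would probably count the dilution whose plaques are well separated and numerous enough to count reliably; that choice is a general plaquing convention, not something specified here.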
S2060 cells harboring the AP and CP plasmids of interest were prepared following the aforementioned procedure. To determine phage fold enrichment, these cells were inoculated into DRM and grown overnight, then diluted 50-fold into fresh DRM the following day and cultivated at 37 °C until reaching an OD600 of 0.4–0.8. The cells were dispensed into the wells of a 96-well plate (Axygen), with each well receiving 1 ml of culture. Phage of known titer were then added to achieve an input concentration of 10^5 plaque-forming units per milliliter (PFU ml−1). The cultures were incubated overnight at 37 °C with continuous shaking at 230 r.p.m.
Following the incubation period, the plates were centrifuged at 4000 g for 10 minutes to separate the cells from the phage, resulting in the phage being present in the supernatant. The supernatants were then subjected to titering using the plaquing method described earlier. To determine the fold enrichment, the titer of propagated phage in the output was divided by the titer of input phage.
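As a small worked example of the fold-enrichment calculation described above, the sketch below divides an output titer by the input titer. The output titer is a made-up value for illustration, and reporting the result on a log10 scale is an assumption about presentation rather than something stated in the text.

```python
import math


def fold_enrichment(output_titer: float, input_titer: float) -> float:
    """Fold enrichment = titer of propagated (output) phage / titer of input phage."""
    if input_titer <= 0:
        raise ValueError("input_titer must be positive")
    if output_titer < 0:
        raise ValueError("output_titer must be non-negative")
    return output_titer / input_titer


input_titer = 1e5     # PFU per ml, the input concentration stated in the text
output_titer = 3.2e8  # PFU per ml, hypothetical overnight output

fold = fold_enrichment(output_titer, input_titer)
print(f"fold enrichment: {fold:.0f} (log10 = {math.log10(fold):.2f})")
```

A value below 1 would indicate that the phage were diluted out rather than propagated under the selection being tested.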
PANCE
PANCE experiments were conducted following established protocols50. Chemically competent S2060 host cells, transformed with AP and CP, were prepared as described above. These competent host cells were then transformed with a mutagenesis plasmid (MP6)33 and plated on 2×YT agar supplemented with 100 mM glucose and the suitable antibiotics. Strains containing MP6 were grown in glucose-containing media to repress the arabinose promoter; storing them for more than 1 week is not recommended. Subsequently, three colonies were selected and transferred to individual wells of a 96-well plate containing 1 ml of DRM and suitable antibiotics. The colonies were resuspended and serially diluted ten-fold eight times in DRM. The plate was sealed with a porous film and incubated at 37 °C with shaking at 230 r.p.m. for 16–18 hours. Dilutions that reached an OD600 of approximately 0.4 were treated with 20 mM arabinose to induce mutagenesis and pipetted into the necessary number of 1-ml lagoons in a 96-well plate. Selection phage at the specified dilution were added to the cultures, which were incubated overnight at 37 °C and then harvested by centrifugation at 4000 g for 10 minutes. The resulting supernatant containing the evolved phage (150 µl) was transferred to a 96-well PCR plate, sealed with foil, and stored at 4 °C. These phage were used for subsequent passages. Phage titers were determined either by qPCR using a previously reported protocol or by plaquing.
PACE
PACE experiments were conducted in accordance with previously published protocols50. Host cells harboring the mutagenesis plasmid were prepared, and twelve colonies were then transferred into individual wells of a 96-well plate containing 1 ml of DRM and suitable antibiotics. The colonies were resuspended and serially diluted ten-fold eight times in DRM. The plate was sealed with a porous film and incubated at 37 °C with shaking at 230 r.p.m. for 16–18 hours. Dilutions with an OD600 of approximately 0.4 were pooled and added to a chemostat containing 80–100 ml of DRM in a warm room. The chemostat was incubated until reaching an OD600 of approximately 0.4–0.8, with continuous dilution using fresh DRM at a rate of 1–1.5 chemostat volumes per hour to maintain constant cell density.
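To make the chemostat dilution rate concrete, the sketch below converts "chemostat volumes per hour" into a medium flow rate and a mean residence time. The function names are hypothetical, and the relationship residence time = 1 / dilution rate is the standard definition for a well-mixed vessel rather than something stated in the text.

```python
def medium_flow_ml_per_h(volume_ml: float, volumes_per_hour: float) -> float:
    """Fresh-medium flow rate needed to replace the given number of vessel volumes per hour."""
    if volume_ml <= 0 or volumes_per_hour <= 0:
        raise ValueError("volume and dilution rate must be positive")
    return volume_ml * volumes_per_hour


def mean_residence_time_min(volumes_per_hour: float) -> float:
    """Mean time a cell spends in a well-mixed vessel, in minutes."""
    if volumes_per_hour <= 0:
        raise ValueError("dilution rate must be positive")
    return 60.0 / volumes_per_hour


# Extremes of the ranges given in the text: 80-100 ml at 1-1.5 volumes per hour.
for volume_ml, rate in [(80.0, 1.0), (100.0, 1.5)]:
    flow = medium_flow_ml_per_h(volume_ml, rate)
    residence = mean_residence_time_min(rate)
    print(f"{volume_ml:.0f} ml at {rate} vol/h: {flow:.0f} ml/h, residence {residence:.0f} min")
```

Under these assumptions, the stated ranges correspond to roughly 80–150 ml of fresh DRM per hour.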
Prior to infection, 15 ml of culture from the chemostat was added to each lagoon, and the lagoons were pre-induced with 10 mM arabinose for a minimum of 1 hour. Arabinose (250 mM) was continuously added to the lagoons at a rate of 0.6 ml per hour. Selection phage infection was initiated in the lagoons at an initial titer of 10^7 PFU ml−1. The lagoon dilution rates were gradually increased over time to raise selection pressure. Samples (1 ml) were collected from the lagoon waste lines at specified time intervals and centrifuged at 6,000 g for 8 minutes, and the resulting supernatant containing the evolved phage was stored at 4 °C. Phage titers were calculated after plaquing with S2208 E. coli host cells32. PCR amplification using the AB1793/AB1396 primer pair followed by Sanger sequencing was used to confirm the sequences of plaques.
E. coli profiling assay
To generate the library, a 448-member single-stranded DNA library (IDT oligopools) was designed to contain the target base (A or C) at protospacer positions 1–14 with the 5′ and 3′ bases each varied as A, T, C, or G. Each library member contains a unique molecular identifier (UMI) barcode (Supplementary Data 2). The single-stranded oligos were amplified for three cycles with the primer pair MN1591/MN1592 and KAPA polymerase using 1.5 nM template in a reaction volume of 200 μl, with an annealing temperature of 68 °C and an extension time of 3 min. The PCR product was purified (Qiagen) and assembled into BamHI/EcoRI-digested plasmid MNp553 using Gibson assembly (NEB). Following purification with Glyco-blue (Thermo Fisher), the library was transformed into NEB 10-beta electrocompetent cells. Dilutions of cells were plated immediately to calculate library size, and the remaining cells were grown overnight in carbenicillin to select for transformants. The following day, the library plasmid was purified by Midiprep (Qiagen).
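The library size follows from the design described above: 14 protospacer positions × 2 target bases × 4 possible 5′ neighbors × 4 possible 3′ neighbors = 448 members. The sketch below enumerates that design space. The dictionary layout is an illustrative assumption; the actual oligo sequences and UMI barcodes are given in Supplementary Data 2 and are not reproduced here.

```python
from itertools import product

BASES = "ATCG"            # possible 5' and 3' neighboring bases
TARGET_BASES = "AC"       # the target base is either A or C
POSITIONS = range(1, 15)  # protospacer positions 1-14

# One entry per (position, target base, 5' neighbor, 3' neighbor) combination.
library_design = [
    {
        "position": position,
        "target": target,
        "five_prime": five_prime,
        "three_prime": three_prime,
        "context": five_prime + target + three_prime,  # e.g. "TCA"
    }
    for position, target, five_prime, three_prime in product(
        POSITIONS, TARGET_BASES, BASES, BASES
    )
]

print(len(library_design))  # 448
```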
In parallel, electrocompetent NEB 10-beta cells containing the indicated editor plasmid were prepared following growth in DRM to suppress expression. Electrocompetent cells containing the editor (40 μl) were then electroporated with 100 ng of library plasmid, rescued in 1 ml of SOC media for 5 min, diluted in 35 ml of DRM, and grown overnight with spectinomycin, carbenicillin, and 30 mM arabinose to induce editor expression. After 16 h of growth at 37 °C with shaking at 200 r.p.m., the plasmids were isolated by Midiprep. Plasmid (1 μl) was used as a template for PCR 1 and HTS analysis as indicated below.
To analyze editing results for the library, sequencing reads were demultiplexed using MiSeq Reporter (Illumina) and then sorted into target amplicons using SeqKit. The sorted reads were then analyzed using CRISPResso2, and the CRISPResso2 output was processed using a Python script adapted from Doman et al.26 and Zhang et al.51. The results were then plotted and analyzed in Prism 10 (GraphPad).
To determine selectivity for cytosine over adenine deamination for each editor, we calculated the average cytosine editing efficiency and the average adenine editing efficiency at positions 4–8 in the editing window. We then computed the geometric mean of the ratio of average cytosine editing to average adenine editing at each position.
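One way to read the selectivity calculation above is sketched below: take the C-to-A editing ratio at each of positions 4–8, then take the geometric mean of those ratios. This is an illustrative interpretation, not the authors' script. The input dictionaries hold made-up values, and the optional `a_floor` argument is a hypothetical way to handle positions with no detected adenine editing, which the text does not address.

```python
import math


def cytosine_selectivity(c_editing, a_editing, positions=range(4, 9), a_floor=None):
    """Geometric mean over positions of (average C editing / average A editing).

    c_editing, a_editing: dicts mapping protospacer position -> average editing (%).
    a_floor: optional lower bound applied to A editing (hypothetical; avoids division by zero).
    """
    log_ratios = []
    for pos in positions:
        c_eff = c_editing[pos]
        a_eff = a_editing[pos]
        if a_floor is not None:
            a_eff = max(a_eff, a_floor)
        if c_eff <= 0 or a_eff <= 0:
            raise ValueError(f"ratio is zero or undefined at position {pos}")
        log_ratios.append(math.log(c_eff / a_eff))
    # The geometric mean is the exponential of the mean log-ratio.
    return math.exp(sum(log_ratios) / len(log_ratios))


# Hypothetical averages (%), chosen so that every per-position ratio is about 100.
c_avg = {4: 40.0, 5: 60.0, 6: 50.0, 7: 30.0, 8: 20.0}
a_avg = {4: 0.4, 5: 0.6, 6: 0.5, 7: 0.3, 8: 0.2}
print(round(cytosine_selectivity(c_avg, a_avg), 2))
```

Averaging on the log scale keeps a single position with very low adenine editing from dominating the summary, which is presumably why a geometric rather than arithmetic mean is used.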
Sequence logos were generated for each editor to quantify the relative editing efficiency at each sequence context around the edited C or A. To calculate the relative editing for each context, editing efficiency was summed over edits with a particular base 5’ or 3’ of the edited base for all positions in the window, then normalized by dividing by the total editing over all contexts. This process yielded a frequency value for each possible base on either side of the edited base. Information content was calculated by scaling each frequency by the log-ratio of the calculated base frequency to the background frequency (0.25). Information content was plotted as the height for each 5’ and 3’ context base to generate a Kullback–Leibler sequence logo52. Plots were created using Logomaker in Python53.
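The logo heights described above can be sketched in plain Python as follows. Two details are assumptions: the logarithm is taken base 2 (so heights are in bits), and the input is a simple list of (5′ base, 3′ base, editing efficiency) records, which is a hypothetical data layout rather than the format used in the authors' script.

```python
import math
from collections import defaultdict

BACKGROUND = 0.25  # background frequency of each base, as stated in the text


def context_heights(records, side):
    """Return the logo height for each base on one side of the edited base.

    records: iterable of (five_prime_base, three_prime_base, editing_efficiency),
             pooled over all positions in the window.
    side: "5" for the 5' neighbor or "3" for the 3' neighbor.
    """
    if side not in ("5", "3"):
        raise ValueError("side must be '5' or '3'")
    totals = defaultdict(float)
    for five_prime, three_prime, efficiency in records:
        base = five_prime if side == "5" else three_prime
        totals[base] += efficiency          # sum editing per context base
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total editing must be positive")
    heights = {}
    for base in "ACGT":
        freq = totals[base] / grand_total   # normalize to a frequency
        # Scale the frequency by the log-ratio to background; 0 when freq is 0.
        heights[base] = freq * math.log2(freq / BACKGROUND) if freq > 0 else 0.0
    return heights


# Hypothetical records: editing is strongest when T precedes the edited base.
example = [("T", "A", 60.0), ("C", "A", 20.0), ("A", "G", 10.0), ("G", "G", 10.0)]
print(context_heights(example, "5"))
```

Bases that are enriched relative to background get positive heights and depleted bases get negative heights; a table of such heights per context position is the kind of input a logo-drawing library such as Logomaker can render.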
Energy modeling of CBE6 mutations
Mutations that arose during evolution were substituted into the ABE8e cryo-EM structure (PDB: 6VPC), and folding energies were computed to compare the stabilization of the monomeric and dimeric states of TadA*34. Structures with or without the substituted mutations were energy minimized in PyRosetta with the FastRelax protocol using the ref2015 energy function54. The difference in folding energy between each mutant and the original TadA* was calculated to estimate stabilization effects.
HEK293T transfection and lysis
HEK293T cells (ATCC, CRL-3216) were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with GlutaMAX (Thermo Fisher Scientific) and 10% (v/v) fetal bovine serum (FBS) (Gibco, qualified). The cells were maintained at 37 °C with 5% CO2.
Prior to transfection, cells were seeded at a density of 1.6 × 10^4 cells per well in 96-well plates (Corning) and allowed to adhere for 16–24 hours. Cells were transfected when they reached approximately 60–80% confluency. For the transfection, 0.5 µl of Lipofectamine 2000 (Thermo Fisher Scientific) was combined with editor plasmid (100 ng) and guide RNA plasmid (40 ng), and the mixture was diluted into Opti-MEM reduced serum media (Thermo Fisher Scientific) to a final volume of 12.5 µl. Transfection was carried out according to the manufacturer’s instructions.
After 72 hours, the culture media was removed, cells were washed with 100 µl of 1× PBS solution, and genomic DNA was extracted by adding 50 µl of lysis buffer per well. The lysis buffer contained 10 mM Tris-HCl (pH 8.0), 0.05% SDS, and 20 µg/ml of Proteinase K (New England Biolabs). The cell lysate was incubated at 37 °C for 1 hour, transferred to 96-well PCR plates, and heat-inactivated at 80 °C for 30 minutes. The genomic DNA was then stored at −20 °C.
High-throughput sequencing
The genomic DNA from mammalian cell lines was subjected to high-throughput sequencing using a previously described method2. The primer pairs used in PCR 1 for all genomic sites are listed in Supplementary Data 1. A 25 μl PCR 1 reaction consisted of 0.125 μl of both forward and reverse primers, 1 μl of genomic DNA extract, 0.75 μl of DMSO, 5 μl of Phusion Green HF Buffer (Thermo Fisher Scientific), 0.5 μl of dNTPs, and 0.25 μl of Phusion Hot Start II DNA polymerase (Thermo Fisher Scientific). PCR 1 reactions were conducted with the following parameters: an initial denaturation at 95 °C for 2 min, followed by 30 cycles of (95 °C for 10 s, 61 °C for 20 s, and 72 °C for 30 s), concluding with a final 72 °C extension for 5 min. In PCR 2, unique Illumina barcoding primer pairs were introduced. A 25 μl PCR 2 reaction included 0.5 μM of each unique forward and reverse Illumina barcoding primer, 1 μl of unpurified PCR 1 reaction mixture, 5 μl of Phusion Green HF Buffer (Thermo Fisher Scientific), 0.5 μl of dNTPs, and 0.25 μl of Phusion Hot Start II DNA polymerase (Thermo Fisher Scientific). PCR 2 reactions were conducted with the following parameters: an initial denaturation at 95 °C for 2 min, 10 cycles of (95 °C for 10 s, 61 °C for 20 s, and 72 °C for 30 s), and a final 72 °C extension for 5 min. The PCR products were then purified by electrophoresis on a 1% agarose gel, using a QIAquick Gel Extraction Kit and eluting with 20 μl of H2O. DNA concentrations were determined using a Qubit dsDNA High Sensitivity Assay Kit (Thermo Fisher Scientific). The samples were then sequenced on an Illumina MiSeq instrument, and demultiplexing was carried out using the MiSeq Reporter software (Illumina). The resulting demultiplexed sequencing reads were analyzed using CRISPResso2 and Microsoft Excel (version 16.75).
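As a worked check of the PCR 1 recipe above, the sketch below totals the listed component volumes and computes the remainder of the 25 μl reaction. Two points are assumptions: the primer volume is read as 0.125 μl for each primer, and the remaining volume is taken to be water, which the text does not state. The master-mix helper and its 10% overage default are likewise illustrative.

```python
REACTION_VOLUME_UL = 25.0

# Per-reaction volumes (in microliters) as listed in the text for PCR 1.
PCR1_COMPONENTS_UL = {
    "forward primer": 0.125,   # assumes 0.125 ul of each primer
    "reverse primer": 0.125,
    "genomic DNA extract": 1.0,
    "DMSO": 0.75,
    "Phusion Green HF Buffer": 5.0,
    "dNTPs": 0.5,
    "Phusion Hot Start II DNA polymerase": 0.25,
}


def water_per_reaction(components=PCR1_COMPONENTS_UL, total_ul=REACTION_VOLUME_UL):
    """Volume left over after the listed components (assumed here to be water)."""
    water = total_ul - sum(components.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume")
    return water


def master_mix(n_reactions, overage=0.10, added_per_well=("genomic DNA extract",)):
    """Scale shared components for n reactions of one genomic site.

    Template is excluded because it differs per sample and is added to each well.
    """
    factor = n_reactions * (1.0 + overage)
    mix = {
        name: volume * factor
        for name, volume in PCR1_COMPONENTS_UL.items()
        if name not in added_per_well
    }
    mix["water (assumed)"] = water_per_reaction() * factor
    return mix


print(water_per_reaction())  # 17.25
for name, volume in master_mix(12).items():
    print(f"{name}: {volume:.2f} ul")
```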
DNA off-target editing analysis
Along with the editor plasmid (100 ng) and an SpCas9 guide RNA plasmid (40 ng), a plasmid encoding catalytically dead SaCas9 and an SaCas9 guide RNA plasmid were transfected into HEK293T cells, and editing was analyzed using high-throughput sequencing following the aforementioned procedure26.
RNA off-target editing analysis
Off-target RNA editing was analyzed following established methods26,41. HEK293T cells were grown in two 96-well plates and subjected to parallel transfections with 250 ng of editor-encoding plasmids and 83 ng of EMX1 guide RNA per well. After 48 hours, one plate was used to assess on-target DNA editing at the EMX1 locus. For the second plate, RNA was isolated using the RNeasy kit (Qiagen). After removing the medium, cells were washed with 1× PBS and lysed in RLT Plus Buffer (Qiagen). The lysate was transferred to a DNA eliminator column, and the flowthrough was mixed with ethanol and then transferred to an RNeasy spin column. Samples underwent RW1 washing, followed by on-column DNA digestion using RNase-Free DNase in RDD buffer (Qiagen). Subsequent washes used RW1 and RPE buffers. RNA was eluted with 45 µl of nuclease-free water, and each sample was supplemented with 2 µl of RNaseOUT (Thermo Fisher Scientific).
For cDNA synthesis, the SuperScript IV First-Strand Synthesis Kit (Thermo Fisher Scientific) was used. RNA was annealed to the oligo(dT) primer by heating at 65 °C, followed by cooling on ice for 1 minute. The resulting mixture underwent a reverse transcription reaction. Controls without reverse transcriptase were included to monitor genomic DNA contamination. Incubation was done at 50 °C for 10 minutes and 80 °C for 10 minutes, followed by cooling on ice for 1 minute. Optional RNA degradation with RNase H was performed to enhance cDNA amplification efficiency. The first round of targeted amplicon sequencing PCR used 1 µl of each cDNA sample; subsequent sequencing steps were the same as the high-throughput, targeted genomic DNA sequencing method described above.
Base editor mRNA synthesis from IVT
Base editor mRNA was produced from PCR products amplified from a template plasmid harboring the expression construct for the desired base editor, as outlined in prior work10. The PCR was performed in a total reaction volume of 200 µl using the IVT-F forward primer and IVT-R reverse primer. The resulting PCR product was purified using the QIAquick PCR Purification Kit (Qiagen) and eluted in 50 µl of nuclease-free water. In vitro transcription (IVT) reactions were performed using the HiScribe T7 High-Yield RNA Synthesis Kit (New England Biolabs) with the following modifications: N1-methyl-pseudouridine was substituted for uridine, and transcripts were capped co-transcriptionally with CleanCap AG. The mRNA was isolated by lithium chloride precipitation. For each 160 µl IVT reaction, 0.5 volumes of 7.5 M lithium chloride were added and thoroughly mixed. The mixture was incubated for 30 minutes at −20 °C and then centrifuged at 15,000 g for 20 minutes to pellet the mRNA. After the supernatant was discarded, the pellet was resuspended in 400 µl of ice-cold 70% ethanol. A second centrifugation was performed at 4 °C for 15 minutes, and the supernatant was discarded. The resulting pellet was air-dried at room temperature for 5 minutes and reconstituted in 100–200 µl of nuclease-free water. The samples were adjusted to a uniform concentration of 2 µg µl−1 and stored at −80 °C.
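The volumes in the precipitation and normalization steps above follow from simple arithmetic, sketched below. The function names and the example mRNA concentration are hypothetical; only the 160 µl reaction volume, the 0.5 volumes of 7.5 M lithium chloride, and the 2 µg µl−1 target come from the text.

```python
def licl_addition(ivt_volume_ul, volumes=0.5, stock_molar=7.5):
    """Return (LiCl volume to add in ul, final LiCl concentration in M)."""
    if ivt_volume_ul <= 0:
        raise ValueError("IVT volume must be positive")
    licl_ul = ivt_volume_ul * volumes
    final_molar = stock_molar * licl_ul / (ivt_volume_ul + licl_ul)
    return licl_ul, final_molar


def water_to_reach_target(conc_ug_per_ul, volume_ul, target_ug_per_ul=2.0):
    """Water (ul) to add so a sample at conc_ug_per_ul reaches the target concentration."""
    if conc_ug_per_ul < target_ug_per_ul:
        raise ValueError("sample is already below the target concentration")
    return volume_ul * (conc_ug_per_ul / target_ug_per_ul - 1.0)


licl_ul, final_molar = licl_addition(160.0)
print(licl_ul, final_molar)  # 80.0 2.5

# Hypothetical sample: 100 ul of mRNA measured at 3.0 ug/ul.
print(water_to_reach_target(3.0, 100.0))  # 50.0
```

A sample that reconstitutes below 2 µg µl−1 cannot be adjusted by dilution, which is why the helper raises an error in that case; how such samples were handled is not described.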
Nucleofection of patient-derived fibroblasts
Patient-derived fibroblasts were obtained from the Coriell Institute (GM03348) and cultured in DMEM with GlutaMAX supplemented with 15% FBS at 37 °C with 5% CO2. A total of 2.0 × 10^5 fibroblasts were nucleofected with 50 pmol of sgRNA (Synthego) and 1 μg of in vitro-transcribed SpCas9 mRNA in 20 μl of P2 primary cell solution using program DS-150 on a Lonza Nucleofector 4-D. Cells were then plated on 24-well plates, and the medium was changed after 24 h. After 72 h, the medium was removed, cells were washed with 1× PBS, and genomic DNA was extracted with 150 μl of lysis buffer (10 mM Tris-HCl, pH 7.0, 0.05% SDS, 25 μg ml−1 proteinase K).
Statistics & reproducibility
Experiments were independently repeated three times unless otherwise stated. No data were excluded from analyses. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment. P values were calculated using Student’s two-tailed, unpaired t-tests. P values of < 0.05 were considered statistically significant.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
This work was supported by US National Institutes of Health (NIH) R35GM118062, U01AI142756, R01EB027793, R01EB031172, R01HL156647, U19NS132304, U19NS132315, and Howard Hughes Medical Institute (HHMI). We thank A. Sousa, S. Erwood, P. Randolph, S. DeCarlo, and S. Pandey for materials, discussion, and technical advice. M.E.N. was supported by a Ruth L. Kirschstein National Research Service Awards Postdoctoral Fellowship (GM143776-02). N.A.K. is a National Science Foundation (NSF) Graduate Research Fellow.
Author contributions
E.Z. designed and cloned plasmids and phage, executed the evolution experiments, validated CBE editor activity in E. coli and mammalian cells, performed DNA off-target experiments, produced base editor mRNA and performed nucleofections in fibroblasts, and analyzed data. M.E.N. designed and cloned plasmids and phage, designed and advised the evolution experiments, developed and validated CBE editor activity for profiling in E. coli, and analyzed data. N.A.K. performed Cas-independent RNA off-target experiments, performed energy modeling, and analyzed E. coli library profiling data. E.Z., M.E.N., and D.R.L. designed the research. E.Z., M.E.N., and D.R.L. drafted the manuscript, with input from all authors.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Data availability
High-throughput DNA sequencing FASTQ files generated in this study have been deposited in the National Center for Biotechnology Information Sequence Read Archive under BioProject “PRJNA1028129”. Amino acid sequences of deaminases recommended in this work are listed in the Supplementary Information. The published structure of ABE8e is available in the Protein Data Bank (6VPC). Source data are provided with this paper.
Code availability
All code used for processing library data is available on GitHub at https://github.com/MLE-zhang/BE_Lib.
Competing interests
The authors declare competing financial interests: The Broad Institute has filed a patent application on behalf of E.Z., M.E.N., and D.R.L. on the base editors developed in this study. D.R.L. is a consultant for Prime Medicine, Beam Therapeutics, Pairwise Plants, Chroma Medicine, and Nvelop Therapeutics, companies that use or deliver agents for genome editing, epigenome engineering, or PACE, and owns equity in these companies. The remaining authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-024-45969-7.
References
- 1. Komor AC, Kim YB, Packer MS, Zuris JA, Liu DR. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature. 2016;533:420–424. doi: 10.1038/nature17946.
- 2. Gaudelli NM, et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature. 2017;551:464–471. doi: 10.1038/nature24644.
- 3. Mok BY, et al. A bacterial cytidine deaminase toxin enables CRISPR-free mitochondrial base editing. Nature. 2020;583:631–637. doi: 10.1038/s41586-020-2477-4.
- 4. Cho S-I, et al. Targeted A-to-G base editing in human mitochondrial DNA with programmable deaminases. Cell. 2022;185:1764–1776.e12. doi: 10.1016/j.cell.2022.03.039.
- 5. Willis JCW, Silva-Pinheiro P, Widdup L, Minczuk M, Liu DR. Compact zinc finger base editors that edit mitochondrial or nuclear DNA in vitro and in vivo. Nat. Commun. 2022;13:7204. doi: 10.1038/s41467-022-34784-7.
- 6. Lim K, Cho S-I, Kim J-S. Nuclear and mitochondrial DNA editing in human cells with zinc finger deaminases. Nat. Commun. 2022;13:366. doi: 10.1038/s41467-022-27962-0.
- 7. Rees HA, Liu DR. Base editing: precision chemistry on the genome and transcriptome of living cells. Nat. Rev. Genet. 2018;19:770–788. doi: 10.1038/s41576-018-0059-1.
- 8. Anzalone AV, Koblan LW, Liu DR. Genome editing with CRISPR–Cas nucleases, base editors, transposases and prime editors. Nat. Biotechnol. 2020;38:824–844. doi: 10.1038/s41587-020-0561-9.
- 9. Thuronyi BW, et al. Continuous evolution of base editors with expanded target compatibility and improved activity. Nat. Biotechnol. 2019;37:1070–1079. doi: 10.1038/s41587-019-0193-0.
- 10. Gaudelli NM, et al. Directed evolution of adenine base editors with increased activity and therapeutic application. Nat. Biotechnol. 2020;38:892–900. doi: 10.1038/s41587-020-0491-6.
- 11. Koblan LW, et al. Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nat. Biotechnol. 2018;36:843–846. doi: 10.1038/nbt.4172.
- 12. Kim YB, et al. Increasing the genome-targeting scope and precision of base editing with engineered Cas9-cytidine deaminase fusions. Nat. Biotechnol. 2017;35:371–376. doi: 10.1038/nbt.3803.
- 13. Miller SM, et al. Continuous evolution of SpCas9 variants compatible with non-G PAMs. Nat. Biotechnol. 2020;38:471–481. doi: 10.1038/s41587-020-0412-8.
- 14. Richter MF, et al. Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity. Nat. Biotechnol. 2020;38:883–891. doi: 10.1038/s41587-020-0453-z.
- 15. Davis JR, et al. Efficient in vivo base editing via single adeno-associated viruses with size-optimized genomes encoding compact adenine base editors. Nat. Biomed. Eng. 2022. doi: 10.1038/s41551-022-00911-4.
- 16. Koblan LW, et al. In vivo base editing rescues Hutchinson–Gilford progeria syndrome in mice. Nature. 2021;589:608–614. doi: 10.1038/s41586-020-03086-7.
- 17. Newby GA, et al. Base editing of haematopoietic stem cells rescues sickle cell disease in mice. Nature. 2021;595:295–302. doi: 10.1038/s41586-021-03609-w.
- 18. Arbab M, et al. Base editing rescue of spinal muscular atrophy in cells and in mice. Science. 2023;380:eadg6518. doi: 10.1126/science.adg6518.
- 19. ISRCTN - ISRCTN15323014: CAR T cells to fight T cell leukaemia. https://www.isrctn.com/ISRCTN15323014. doi: 10.1186/ISRCTN15323014.
- 20. Chiesa R, et al. Base-Edited CAR7 T Cells for Relapsed T-Cell Acute Lymphoblastic Leukemia. N. Engl. J. Med. 2023;389:899–910. doi: 10.1056/NEJMoa2300709.
- 21. Chadwick AC, Wang X, Musunuru K. In Vivo Base Editing of PCSK9 (Proprotein Convertase Subtilisin/Kexin Type 9) as a Therapeutic Alternative to Genome Editing. Arterioscler. Thromb. Vasc. Biol. 2017;37:1741–1747. doi: 10.1161/ATVBAHA.117.309881.
- 22. Yeh W-H, et al. In vivo base editing restores sensory transduction and transiently improves auditory function in a mouse model of recessive deafness. Sci. Transl. Med. 2020;12:eaay9101.
- 23. Eisenstein M. Base editing marches on the clinic. Nat. Biotechnol. 2022;40:623–625. doi: 10.1038/s41587-022-01326-x.
- 24. Jin S, et al. Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice. Science. 2019;364:292–295. doi: 10.1126/science.aaw7166.
- 25. Zuo E, et al. Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos. Science. 2019;364:289–292. doi: 10.1126/science.aav9973.
- 26. Doman JL, Raguram A, Newby GA, Liu DR. Evaluation and minimization of Cas9-independent off-target DNA editing by cytosine base editors. Nat. Biotechnol. 2020;38:620–628. doi: 10.1038/s41587-020-0414-6.
- 27. Davis JR, et al. Efficient in vivo delivery of adenine base editors in a single adeno-associated virus. Nat. Biotechnol. 2022;6:1272–1283. doi: 10.1038/s41551-022-00911-4.
- 28. Neugebauer ME, et al. Evolution of an adenine base editor into a small, efficient cytosine base editor with low off-target activity. Nat. Biotechnol. 2023;41:673–685. doi: 10.1038/s41587-022-01533-6.
- 29. Lam DK, et al. Improved cytosine base editors generated from TadA variants. Nat. Biotechnol. 2023;41:686–697. doi: 10.1038/s41587-022-01611-9.
- 30. Chen L, et al. Re-engineering the adenine deaminase TadA-8e for efficient and specific CRISPR-based cytosine base editing. Nat. Biotechnol. 2023;41:663–672. doi: 10.1038/s41587-022-01532-7.
- 31. Starr TN, Thornton JW. Epistasis in protein evolution. Protein Sci. 2016;25:1204–1218. doi: 10.1002/pro.2897.
- 32. Esvelt KM, Carlson JC, Liu DR. A system for the continuous directed evolution of biomolecules. Nature. 2011;472:499–503. doi: 10.1038/nature09929.
- 33. Badran AH, Liu DR. Development of potent in vivo mutagenesis plasmids with broad mutational spectra. Nat. Commun. 2015;6:8425. doi: 10.1038/ncomms9425.
- 34. Lapinaite A, et al. DNA capture by a CRISPR-Cas9-guided adenine base editor. Science. 2020;369:566–571. doi: 10.1126/science.abb1390.
- 35. Leman JK, et al. Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nat. Methods. 2020;17:665–680. doi: 10.1038/s41592-020-0848-2.
- 36. Ringquist S, et al. Translation initiation in Escherichia coli: sequences within the ribosome-binding site. Mol. Microbiol. 1992;6:1219–1229. doi: 10.1111/j.1365-2958.1992.tb01561.x.
- 37. Kuraoka I. Diversity of Endonuclease V: From DNA Repair to RNA Editing. Biomolecules. 2015;5:2194–2206. doi: 10.3390/biom5042194.
- 38. Huang TP, et al. High-throughput continuous evolution of compact Cas9 variants targeting single-nucleotide-pyrimidine PAMs. Nat. Biotechnol. 2023;41:96–107. doi: 10.1038/s41587-022-01410-2.
- 39. Grünewald J, et al. Transcriptome-wide off-target RNA editing induced by CRISPR-guided DNA base editors. Nature. 2019;569:433–437. doi: 10.1038/s41586-019-1161-z.
- 40. Zhang C, et al. Prediction of base editor off-targets by deep learning. Nat. Commun. 2023;14:5358. doi: 10.1038/s41467-023-41004-3.
- 41. Rees HA, Wilson C, Doman JL, Liu DR. Analysis and minimization of cellular RNA editing by DNA adenine base editors. Sci. Adv. 2019;5:eaax5717. doi: 10.1126/sciadv.aax5717.
- 42. Fiumara M, et al. Genotoxic effects of base and prime editing in human hematopoietic stem cells. Nat. Biotechnol. 2023:1–15. doi: 10.1038/s41587-023-01915-4.
- 43. Webber BR, et al. Highly efficient multiplex human T cell engineering without double-strand breaks using Cas9 base editors. Nat. Commun. 2019;10:5222. doi: 10.1038/s41467-019-13007-6.
- 44. Billon P, et al. CRISPR-Mediated Base Editing Enables Efficient Disruption of Eukaryotic Genes through Induction of STOP Codons. Mol. Cell. 2017;67:1068–1079.e4. doi: 10.1016/j.molcel.2017.08.008.
- 45. Dang L, et al. Comparison of gene disruption induced by cytosine base editing-mediated iSTOP with CRISPR/Cas9-mediated frameshift. Cell Prolif. 2020;53:e12820. doi: 10.1111/cpr.12820.
- 46. Lue NZ, et al. Base editor scanning charts the DNMT3A activity landscape. Nat. Chem. Biol. 2023;19:176–186. doi: 10.1038/s41589-022-01167-4.
- 47. Garcia EM, et al. Base Editor Scanning Reveals Activating Mutations of DNMT3A. ACS Chem. Biol. 2023;18:2030–2038. doi: 10.1021/acschembio.3c00257.
- 48. Hanna RE, et al. Massively parallel assessment of human variants with base editor screens. Cell. 2021;184:1064–1080.e20. doi: 10.1016/j.cell.2021.01.012.
- 49. Xu P, et al. Genome-wide interrogation of gene functions through base editor screens empowered by barcoded sgRNAs. Nat. Biotechnol. 2021;39:1403–1413. doi: 10.1038/s41587-021-00944-1.
- 50. Miller SM, Wang T, Liu DR. Phage-assisted continuous and non-continuous evolution. Nat. Protoc. 2020;15:4101–4127. doi: 10.1038/s41596-020-00410-3.
- 51. Zhang E, et al. Base Editor Library Profiling in E. coli. GitHub. https://github.com/MLE-zhang/BE_Lib. 2024. doi: 10.5072/zenodo.24653.
- 52. Kullback S, Leibler RA. On Information and Sufficiency. Ann. Math. Stat. 1951;22:79–86. doi: 10.1214/aoms/1177729694.
- 53. Tareen A, Kinney JB. Logomaker: beautiful sequence logos in Python. Bioinformatics. 2020;36:2272–2274. doi: 10.1093/bioinformatics/btz921.
- 54. Chaudhury S, Lyskov S, Gray JJ. PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta. Bioinformatics. 2010;26:689–691. doi: 10.1093/bioinformatics/btq007.
- 55. Davis JH, Rubin AJ, Sauer RT. Design, construction and characterization of a set of insulated bacterial promoters. Nucleic Acids Res. 2011;39:1131–1141. doi: 10.1093/nar/gkq810.