Abstract
Cas9 and a guide RNA (gRNA) function to target specific genomic loci for generation of a double-stranded break. Catalytically dead versions of Cas9 (dCas9) no longer cause double-stranded breaks and instead can serve as molecular scaffolds to target additional enzymatic proteins to specific genomic loci. To generate mutations in selected genomic residues, dCas9 can be used for genomic base editing by fusing a cytidine deaminase (CD) to induce C > T (or G > A) mutations at targeted sites. In this study, we test base editing in Drosophila by expressing a transgenic Drosophila base editor (based on the mammalian BE2) that consists of a fusion protein of CD, dCas9, and uracil glycosylase inhibitor. We utilized transgenic lines expressing gRNAs along with pan-tissue expression of the Drosophila base editor (Actin5C-BE2) and found high rates of base editing at multiple targeted loci in the 20 bp target sequence. Highest rates of conversion of C > T were found in positions 3–9 of the gRNA-targeted site, with conversion reaching ∼100% of targeted DNA in somatic tissues. Surprisingly, the simultaneous use of two gRNAs targeting a genomic region spaced ∼50 bp apart led to mutations between the two gRNA targets, suggesting a method to broaden the range of sites accessible to targeting. These results indicate that base editing is efficient in Drosophila and could be used to induce point mutations at select loci.
Introduction
The development of clustered regularly interspaced short palindromic repeat (CRISPR)–Cas9 as a method to target specific genomic loci has revolutionized genetic engineering.1 Cas9 is a homing endonuclease that uses a small RNA molecule to target and then cut DNA. Endogenous Cas9 binds to a complexed pair of small RNAs (crRNA and tracrRNA) that can be combined to form a single RNA commonly referred to as a single guide RNA (sgRNA) or guide RNA (gRNA).2–5 The typical two-component system for CRISPR–Cas9 thus minimally requires the Cas9 endonuclease to be complexed with a gRNA.
The Cas9/gRNA complex can be reconstituted in vitro or expressed from transgenic constructs and formed in vivo. gRNAs typically include 19–20 bp homology to the targeted sequence whose only restriction is that it must lie directly next to a protospacer-adjacent motif (PAM) DNA sequence.3–6 The PAM for Streptococcus pyogenes Cas9 is NGG.3
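The PAM constraint described above can be illustrated with a minimal Python sketch that scans the top strand of a sequence for 20-nt targets immediately followed by NGG. The function name and the 2-nt flanks are hypothetical; the 23-nt site is borrowed from the ry site listed in Materials and Methods, on the assumption that it is written as 20 nt of target followed by its PAM. This is not the design algorithm used for the TRiP gRNAs, and it ignores the bottom strand and any efficiency scoring.

```python
def find_spcas9_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each top-strand site followed by NGG.

    `start` is the 0-indexed position of the first protospacer base.
    """
    seq = seq.upper()
    site_len = protospacer_len + 3  # protospacer plus the 3-nt PAM
    hits = []
    for start in range(len(seq) - site_len + 1):
        protospacer = seq[start:start + protospacer_len]
        pam = seq[start + protospacer_len:start + site_len]
        if pam[1:] == "GG":  # the first PAM base (N) can be any nucleotide
            hits.append((start, protospacer, pam))
    return hits


# Toy input: hypothetical 2-nt flanks around a 23-nt site (20 nt + PAM).
toy = "AT" + "GACGAGCGACCGACGAGTGCAGG" + "CA"
for start, protospacer, pam in find_spcas9_targets(toy):
    print(start, protospacer, pam)
```

Scanning the reverse complement in the same way would recover bottom-strand targets.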
The Cas9/gRNA system has been used effectively in Drosophila for the generation of targeted double-strand breaks in DNA.7–11 Two repair pathways are induced by double-strand breaks: nonhomologous end joining or homology-directed repair (HDR). If the double-strand break is repaired by nonhomologous end joining, this often results in insertions or deletions (indels) at the targeted cut site, which can lead to the generation of null mutations in the targeted gene.
HDR attempts to correct the double-strand break by using homologous DNA surrounding the cut to copy in DNA that might have been lost at the break point. The homologous template is usually the sister chromosome, but an exogenous homologous template can also be provided containing a genetic cargo flanked by DNA arms homologous to those generated by the double-strand break. This can be used to generate knock-ins at the genetic locus.
The adoption of CRISPR–Cas9 into the Drosophila toolbox has made it easy to generate null mutations for a gene of interest.7,8,10,12 However, as demonstrated by genetic screens that utilized chemical mutagenesis,13 point mutations in a gene can lead to a variety of alterations of gene function that reveal insights into how a gene product functions in a cell. They can, for example, reveal important residues in a functional domain, or identify interaction domains.14 A reliance on CRISPR–Cas9 as the predominant method for generating mainly null mutations could lead to a lack of point mutations that might help elucidate the function of a gene.
Mutated versions of Cas9 have been developed that deactivate its endonuclease function (dead versions of Cas9 [dCas9]), yet still retain its ability to target specific genetic loci as directed by a gRNA.15–17 dCas9 thus serves as a homing scaffold to direct the targeting of other proteins to specific DNA sites, including proteins that can modify DNA. This approach has been used to convert dCas9 into a base editor to generate point mutations at a target site.18
In this approach, dCas9 is tethered to cytidine deaminase (CD), which directs the deamination of cytosine (C) residues to uracil (U).18 Most CDs use RNA as a substrate, but some CDs can also utilize single-stranded DNA as a substrate.18 The Cas9/gRNA complex can lead to single-stranded DNA at the targeted site, which can serve as a substrate for a tethered CD. Uracil has the base pairing properties of thymine and pairs with adenine. During repair or DNA synthesis, the uracil can lead to incorporation of an adenine in the opposite strand. This effectively converts cytosines in DNA to thymine.
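The net outcome of this chemistry can be sketched as a simple string operation: every cytosine inside an editing window of the 20-nt target is read as thymine. The sketch below is a deliberately idealized model that assumes complete conversion inside the window and none outside it, which real editing does not guarantee. The function name is hypothetical, positions are counted from the 5′ end of the target starting at 1, and the example sequence is the Cyp1 gRNA listed in Materials and Methods.

```python
def deaminate(protospacer, window=(4, 8)):
    """Convert every C inside the 1-indexed, inclusive window to T.

    Idealized model: 100% conversion in the window, 0% elsewhere.
    """
    first, last = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if base == "C" and first <= position <= last:
            edited.append("T")
        else:
            edited.append(base)
    return "".join(edited)


cyp1 = "TGCCCATTCCCCCATCAAAA"
print(deaminate(cyp1))          # window 4-8
print(deaminate(cyp1, (3, 9)))  # window 3-9
```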
A number of base editors have been developed based on this strategy. Base editor 2 (BE2) tethers the rat CD APOBEC1 to the N-terminus of dCas9 and uracil DNA glycosylase inhibitor (UGI) to its C-terminus (Fig. 1A). UGI acts to inhibit the natural DNA repair response to remove uracil from DNA through uracil DNA glycosylase, and was found to increase base editing efficiency threefold in cultured cells.18 In cultured cells, BE2 primarily converted cytosine residues in a base editing window corresponding to positions 4–8 in the gRNA-targeted region.
Here we tested in Drosophila a base editor similar in design to BE2 (Drosophila BE2, shortened to BE2 hereafter). Expression of BE2 in all tissues in combination with different transgenic gRNAs revealed robust and efficient conversion of cytosine residues to thymine across a base editing window of ∼11 bases (positions 1–11 of the gRNA target sequence). Conversions occurred most frequently at positions corresponding to residues 3–9 in the gRNA target. Interestingly, the simultaneous use of two gRNAs targeting the same genomic locus but spaced ∼50 bp apart broadened base editing to the region between the gRNA targets. The data presented here suggest that base editing in Drosophila might be an effective method to induce point mutations at target genetic loci.
Materials and Methods
Cloning of pattB-Actin5C-BE2(pActin5c-CD-dCas9m4-UDI-attB), Addgene# 104879
pWallium-dCas9-VPR (containing Homo sapiens codon-optimized dCas9m4; Addgene# 78897)19 was digested with NdeI/EcoRI, and a 3.3 kb fragment was isolated. This fragment was used to replace the NdeI/EcoRI fragment from pCMV-BE2 (Addgene# 73020)18 digested with NdeI and EcoRI using a three-way ligation reaction (Rapid DNA ligation kit; Roche). This generated the intermediate plasmid pCMV-BE2+dCas9m4. The BE2+dCas9m4 insert was polymerase chain reaction (PCR) amplified and cloned using In-Fusion (Clontech) into the pActin5c-attB-RFP vector isolated by digestion of pActin5c-Cas9 (Addgene# 62209)20 with EcoRI/KpnI. The final plasmid was sequence verified.
Fly stocks
Wild-type flies were IsoD1 (w1118). The GAL4 lines used were ppk-GAL4 on III (BS# 32079) and NP2222-GAL4 on II (BS# 112839). The two gRNAs targeting GAL4 were from the GAL4>QF2 HACK donor lines QF2G4H 25C1 on II (as found in BS# 66488) and QF2G4H 65E3 on III (as found in BS# 66495). The Transgenic RNAi Project (TRiP) stocks, along with the gRNA target sequences, are summarized in the gRNA section.
The GAL4 stocks to test Actin5C-BE2 were generated as follows. QF2G4H 25C1 (DsRed+)/CyO; Dh/TM6B males were crossed to Pin/CyO; ppk-GAL4 (w+) virgins. Male and female progeny of genotype QF2G4H 25C1/CyO; ppk-GAL4(w+)/TM6B were used to establish a stable stock. NP2222-GAL4/CyO; Dh/TM6B males were crossed to Pin/CyO; QF2G4H 65E3/TM6B virgins, and progeny of genotype NP2222-GAL4/CyO; QF2G4H 65E3/TM6B were used to establish a stable stock. To test the Actin5C-BE2, the mentioned stocks were crossed to Actin5C-BE2 on X virgin females, and F1 male progeny were analyzed for base editing.
Generation of Actin5C-BE2 transgenic flies
Transgenic insertion of pattB-Actin5C-BE2 onto the X chromosome was performed by PhiC31 integrase-mediated transgenesis into docking strain M{3xP3-RFP.attP}ZH-2A located at cytological location 2A3.21 Injections were conducted by Rainbow Transgenic Flies, Inc. (Camarillo, CA). The Actin5C-BE2ZH-2A transgenic line (BS# 92586) is healthy, and can be maintained as a homozygous stock.
Guide RNAs
A collection of transgenic gRNAs inserted at attP40 (located at cytological location 25C6) from the Harvard Drosophila RNAi Screening Center and TRiP collection (https://fgr.hms.harvard.edu/fly-in-vivo-crispr-cas) was used. gRNAs were expressed from the pCFD3 vector (single gRNA) or the pCFD4 vector (two gRNAs) as listed below. Genomic targets were selected for those least likely to cause adverse effects if mutated, such as those in 5′ UTR sites targeted for gene activation using dCas9-VPR.19 The sgRNAs generated by TRiP were designed and selected for use based on algorithms striving for optimal efficiency.22 This optimization might increase the rates of editing reported here in comparison with sites in which optimization is not possible (such as at a specific catalytic site).
The fly stocks and gRNAs leading to base editing were:
Bloomington stock # | gRNA vector | Gene target | gRNA sequence |
---|---|---|---|
BS# 68045 | pCFD3 | Gclc | AGCCCGCTCGCTCTACTTTC |
BS# 68060 | pCFD3 | pps | CGCCTCCGTAAACACCGAAC |
BS# 68048 | pCFD3 | ATP_syn1 | GGTAATACCCAGATCTCTCC |
BS# 76099 | pCFD3 | CG3321 | GTTTCCCCGCTGATCAAGTT |
BS# 76095 | pCFD3 | CG17734 | ATCCTCCTCGCTGTCGAAAA |
BS# 68064 | pCFD3 | CG4101 | CCTCGTCGCCGTACTTGTCC |
BS# 68052 | pCFD3 | CG12384 | GAACAACCAAATCTTGTGGC |
BS# 67552 | pCFD4 | CG4998 | TACCGCCTTAGTTTGCGAAT |
BS# 76085 | pCFD4 | ApepP | AGGCCAATACAAGCTTCCGA |
BS# 76085 | pCFD4 | ApepP | TCCGCCAGAACGTGAACAGG |
BS# 68125 | pCFD4 | Cyp1 | TGCCCATTCCCCCATCAAAA |
BS# 68067 | pCFD4 | ry | TGCCTCGTAACCCCTAGACA |
BS# 76106 | pCFD4 | v | GTTCGCTGGGGTCTCATCTC |
BS# 68131 | pCFD4 | FK506-bp2 | TCTTAAAGTCCGAGACACAT |
BS# 68139 | pCFD4 | whd | ACAAATCCACCATATGATCG |
The fly stocks and gRNAs not leading to base editing were:
Bloomington stock # | gRNA vector | Gene target | gRNA sequence |
---|---|---|---|
BS# 67605 | pCFD3 | CG17807 | CGCTGAACCCACCCCTTACC |
BS# 68006 | pCFD4 | ppk4 | GGTGTGGCTCTTAATATCGG |
BS# 68010 | pCFD4 | ppk16 | ATGACTGCGTCTGTGAATGG |
BS# 68038 | pCFD4 | ppk28 | ATGAGAATACCATCTGTAGC |
BS# 68044 | pCFD3 | snf | TTGATGTAAATCGTTTGGTT |
BS# 68059 | pCFD3 | GluRIIA | CAATCGCACCGACGTAATGT |
BS# 68062 | pCFD4 | ZnT63 | CTTGTCATCGTGGGAGCCTT |
BS# 68009 | pCFD4 | ppk6 | GAAAGGATCCAAGTTTGATT |
BS# 67552 | pCFD4 | CG4998 | ACGTTGGAGCTTGACGTCGA |
BS# 68125 | pCFD4 | Cyp1 | TGTGATTTCGCCGTGATTTT |
BS# 76114 | pCFD4 | chico | TGTAATTAGCTCCTGAAATC |
BS# 76114 | pCFD4 | chico | ACGATTTAACTTAAGAAAGC |
BS# 76106 | pCFD4 | v | AATGTGGGGTCCAATCTTAT |
BS# 68065 | pCFD4 | pr | CATGAGCAACAAGAGCACTC |
BS# 68131 | pCFD4 | FK506-bp2 | GGTCATTGGTCAACTTGTAT |
Note: CG17807 might have minor base editing at positions C9 and C10, but chromatograms were not of sufficient quality to be included in this study. Also note that the ability of these gRNAs to induce mutations using other Cas9 sources was not verified. It could be possible that mutation of these target sites might lead to cell death or reduced viability, and so might have been selected against during adult development.
The gRNAs for GAL4 were from the GAL4>QF2 HACK donor line:23
gRNA | gRNA sequence |
---|---|
GAL4-gRNA1 | GATGTGCAGCGTACCACAAC |
GAL4-gRNA2 | TGTATTCTGAGAAAGCTGGA |
Sequence analyses of gRNA targets
Genomic DNA surrounding the gRNA targets was PCR amplified and sequenced. Ten F1 male flies were used as the source of genomic DNA as prepared by the QIAGEN DNeasy Blood & Tissue Kit (Catalog # 69504; Qiagen). To control for genomic polymorphisms that might be misinterpreted as base editing or that mutated the target site, the target site was also PCR amplified and sequenced from control progeny containing a gRNA targeting a different genomic location.
To select against potential false positives (e.g., incorrectly reporting base editing of a residue), a conservative approach was used. The chromatograms for each experimental PCR product were directly compared with wild-type chromatograms from the matched genetic background. Only peaks that were clearly different in this comparison were chosen as positive hits. If there was any uncertainty (such as small peaks in an experimental chromatogram), the PCRs and Sanger sequencing were repeated for both the control and experimental genotypes. If the chromatograms showed peaks at these locations in both the control and experimental samples, the peaks were treated as background noise in the Sanger sequencing traces and were not considered positive hits.
The following oligos were used for PCR amplification and sequencing:
Oligo | Sequence |
---|---|
Glcl-FOR2 | CATCTTTTTGCAGCAATGGGTCTAC |
Gclc-REV2 | GACTGAATGTTCTCGAAATGATCCG |
Glcl-SEQ | CTTCAATCTGGTTGAGGCCAATATG |
ATPsyn_l-FOR2 | CGTGTGCGAAATCTTGAAAAGAAAC |
ATPsyn_l-REV2 | TTCGGATTCACGTTCTACCTTCTTG |
ATPsyn_1-SeqR | AGGCTTTCCGTTCACCCACG |
GluRIIA-FOR2 | GAAAAGCACACACACACACACACAC |
GluRIIA-REV2 | ATCAGTTTCCGTATAATGCCATCCC |
GluRIIA-SEQ | TGGCAAACAGGACGACAGCG |
ppk4-FOR2 | AGCAAGTGGATCTGAAAGTAACCCC |
ppk4-REV2 | GCAGAAGAGCAAAACGTTTCTTGAG |
ppk4-SEQ | TTACAGAGTGTTCAGAAAGGCCTCG |
ppk6-FOR | ATTCCAGGCATTAAGGACACTAGCC |
ppk6-REV | GGCGTTCCTGGGATTATAGTTGAAG |
ppk6-SEQ | ATTGGAGATGGTGTTTAGAATGGGG |
ppk16-FOR2 | AAGCAGCTTTTCAGCTCCTATTTGG |
ppk16-REV2 | GGTAGACAACCACAAAGGTCCACAG |
ppk16-Seq2 | AAAGTGCGAGCTTTCGATAACTGTG |
ppk28-FOR | GAGATTCAAAGAAGTGCGGACGTAG |
ppk28-REV | ATAGTACAGCACACAGCTGCACTCG |
ppk28-SEQR | GATTCGGCTACCTTGCTTACGTACC |
pps-FOR | AATGAGTCGTTGTTCTGTTCCAAGC |
pps-REV | CTGGTGCGTCTCGATTACTATTTGG |
pps-SEQR | TGTTCCCAATCGCACGCTCG |
snf-FOR | GTCTTAACAACTCATGTTGGGTGGC |
snf-REV | ATCCGAGGACTTGTGGGACTTACTC |
snf-SEQ | GTGCGATAACATATCGATAGACCGC |
ZnT63C-FOR | GATGTCACCGAAAAAGTGGTCAAAG |
ZnT63-REV | ACTCTGGCTGAATGGTGGTAGAGTG |
ZnT63-SEQ | GGATTTGATGAGTGCTTTCGTTATG |
whd-FOR | TTCTCGCTCTCCTCCCACTC |
whd-REV | CCAGCAAGGAAAAGGACCAG |
whd-Seq | GGGCGAAACAATAAAGAGCG |
ry-FOR | GCTAGCGTCGACCAGTGTTG |
ry-REV | ATTGGTGGGACGATACCAGG |
ry-Seq | AGACACAAATGCCCATGACG |
v-FOR | TGCGATTTTGGATATGCAGC |
v-REV | CCTGAAACTTGGCCCAGAAG |
v-Seq | TTTTGGGTTTTGGGAACTGC |
pr-FOR | GGTTAAGCCTGATTGAGCCG |
pr-REV | CACTGCTCAGATCGGTGAGG |
pr-Seq | GCGTCATCAGCACCTTCTTG |
FK506BP-FOR | TTTGGCACTGGATGAAAACG |
FK506BP-REV | TACTCACCTTGGTGCCATCG |
FK506BP-SEQ | TAGCTTCTACTCCGCTCGCC |
gig-FOR | CACCGAATCCGGAGAAGAAG |
gig-REV | CTCCTGATCCCCCAGATGAG |
gig-seq | TAATATCCGATCCGCCATCC |
chico-FOR | GAATCCACCGTCTCTGGAGG |
chico-REV | TCTGCCAGGTCATCCAAATG |
chicio-SEQ | TCGAATGACCGTTTGTTTCG |
CG4998-FOR | ATGTGCAAAATGTTGGAGCG |
CG4998-REV | CTGAAGTAGCTTGGTGGGGG |
CG4998-SEQ | AACAAAGCCCCATTTGGATG |
ApepP-FOR | CCTCAAGCCAGGCGAAATAC |
ApepP-REV | CTCATCCTCTTCGAATCCCG |
ApepP-Seq | GGAAAAAGGCGTACTGCTGG |
Cyp1-FOR | TTCATCGCCATTTTTAGCCC |
Cyp1-REV | CCAGGCTAGCCAGCCATATC |
Cyp1-SEQ | TGTGCCGAAACTGTCGAATC |
CG17807-FOR | GAACCTGGTCATCTCCTCGG |
CG17807-REV | TTGTGGCCATGAAGAAGGTG |
CG17807-SEQ | TGACCGCAACTATGGTCCAC |
CG12384-FOR | TTGGCAACAAATGAGGCTTG |
CG12384-REV | CGCGGACAACTTAAAGAGGG |
CG12384-SEQ | TCACATACAGTCCGCCAAGG |
CG4101-FOR | GCTTGAAAGTTCCATTCGGC |
CG4101-REV | GTCCATTCCAGGACTCAGCC |
CG4101-SEQ | TCCGGATTACTTTGGCCATC |
CG321-FOR | CACTCCTCGTCGCTGTCATC |
CG321-REV | GAGAATCACCTGACTGGGGC |
CG321-SEQ | ATCTCGCGCAACTTCTCCTC |
CG17734-FOR | GCCAAGGTCAAGATTAGCCG |
CG17734-REV | ACCGAGAATATTGGCGGATG |
CG17734-SEQ | GGGAAACCTTTCTGTGAGCG |
laza-FOR | TGTTTCTGTGTTTCACGCCC |
laza-REV | CGCTTATCTGGCCTACTCGG |
laza-SEQ | GGCGAATGTACCTAGCTCGC |
CG42717-FOR | TGCTTTTGCGATTGTTACCG |
CG42717-REV | ACCACACCAGTGGCATGAAG |
CG42717-SEQ | TTGGCCACAAACTTGCATTC |
CG17691-FOR2 | CCTGCAAAAACCTGTCCAAT |
CG17691-REV2 | CTGCCTTCAAAACAAGCACA |
CG17691-SEQ | TGTGATCGAGTCCACAAGGG |
CG5966-FOR | ATTGCTCAACGCATTCATGG |
CG5966-REV | CGTCTCGAGTGTGCCGTTAG |
CG5966-SEQ | TGTCGTGTATATCTCCGCCG |
The following target sites were not included in the analyses due to polymorphisms in the target site that would disrupt at least one nucleotide in the gRNA target: gig (GTGTAGAAATCTTGAATATTAGG and TTTGAGCTCGCGCGGACTTGAGG), ry (GACGAGCGACCGACGAGTGCAGG), pr (GCGGCGATCGATAGCAAAGCAGG), and whd (TTTAAATCAGGAATGCTCGTAGG).
Chromatogram analyses of Sanger sequencing files
The EditR program (http://baseeditr.com/)24 was used to estimate the percentage of each base at each position in the target sequences. The EditR program quantifies the area under each nucleotide peak at each position, and highlights regions likely targeted by base editing.
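The underlying arithmetic of a peak-area estimate can be sketched as follows. This is not EditR's implementation, which works directly on Sanger trace files and does more than the simple ratio shown here. The function name and the peak areas are hypothetical values chosen only to illustrate how the four areas at one position are turned into percentages.

```python
def base_percentages(peak_areas):
    """Convert the A/C/G/T peak areas at one position into percentages."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return {base: 100.0 * area / total for base, area in peak_areas.items()}


# Hypothetical areas at a single target cytosine (arbitrary units).
areas = {"A": 0.0, "C": 100.0, "G": 0.0, "T": 300.0}
percentages = base_percentages(areas)
print(percentages["C"], percentages["T"])  # 25.0 75.0
```

Under these made-up numbers, the position would be scored as 75% converted to thymine.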
Results and Discussion
The Drosophila base editor BE2 cassette was derived from Base Editor Version 2 (BE2) that consists of a fusion protein of CD, dCas9, and uracil DNA glycosylase inhibitor (UGI).18 BE2 uses a dCas9 that contains the mutations D10A and H840A, referred to here as dCas9m2 (with “m2” reflecting the inclusion of these two mutations into Cas9).
In experiments using dCas9 as the scaffold for tethering transcriptional activator domains in Drosophila, the function of dCas9m2 was compared with that of dCas9m4, a variant of dCas9 that contains the same mutations as dCas9m2 with the additional mutations D839A and N863A. dCas9m4 (with “m4” reflecting the four mutations in this Cas9 variant) was found to function more effectively as a tether for targeting loci in Drosophila than dCas9m2,19 although the reasons for this are unknown. Nonetheless, these results suggest that dCas9m4 may function as a better gRNA-targeted scaffold in Drosophila than dCas9m2, and as such, the dCas9m4 variant was used in place of the dCas9m2 variant found in BE2.
This generated the Drosophila version of the BE2 base editor used here (Fig. 1A). BE2 was cloned into a PhiC31 attB containing construct marked by miniwhite for the generation of transgenic flies.
The Liu group also developed Base Editor Version 3 (BE3) in which the catalytic H840A mutation in BE2 was reverted, allowing the mutated Cas9 protein to function as a nickase and cut the nonedited guanine strand opposite the uracil.18 This increased the efficiency of base editing in cell culture by roughly two to fivefold, but also led to indels at the targeted site. So although BE3 was more efficient at inducing base editing compared with BE2, this came at the expense of also generating indels.
When deciding to adapt BE2 or BE3 for base editing in Drosophila, we focused on the BE2 reagent to generate base editing without indels. We further reasoned that potential in vivo decreases in efficiency with BE2 could be easily compensated by increasing the number of Drosophila progeny screened. We also reasoned that potentially decreasing efficiency of BE2 might prove useful in future experiments attempting to isolate mutated flies containing a single targeted change.
To test the ability of the Drosophila base editor BE2 to induce changes in Drosophila, the Actin5C enhancer was used to direct its expression in all tissues. Using PhiC31 integrase, the pattB-Actin5C-BE2 construct was inserted onto the X chromosome at site attP-ZH-2A21 that is marked by 3xP3-RFP. A stable homozygous stock of Actin5C-BE2 was established.
To target BE2 to a variety of genomic loci in the Drosophila genome, the Harvard TRiP collection of transgenic U6:gRNAs22 was used with Actin5C-BE2 (Fig. 1C). This collection has been developed by the Perrimon laboratory to use transactivator variants of dCas9 to drive expression of endogenous genes.19,25 The gRNAs in this collection targeting putative enhancer sites were cloned into the pCFD3 or pCFD4 vectors,20 and integrated at the attP40 location on the second chromosome. Thirty transgenic U6:gRNAs from this collection were used to test the ability of Actin5C-BE2 to induce mutations. The gRNAs selected for examination were chosen as mutation of the target genes was suspected not to cause lethal effects (which would interfere with the analyses).
An example cross is shown in Figure 1C. Males containing the U6:gRNA transgene were crossed to virgins containing Actin5C-BE2 on the X chromosome. All male F1 progeny contain both components, and mutations should be induced in somatic and germline tissues. The advantage of using the Actin5C enhancer to direct expression is that F1 males or females can be directly examined for changes to somatic DNA. Mutant lines could be established by crossing F1 males to a suitable balancer chromosome stock to isolate F2 males with the targeted chromosome over a balancer (while also crossing out the Actin5C-BE2 on the X). A mutant line containing the edited base could be established in the F3 generation (Fig. 1C).
In the experiments presented here, only F1 males were tested for changes in DNA. Future experiments will be required to determine the extent of germline transmission of the induced mutation using the Actin5C enhancer, or by expressing BE2 in the germline using appropriate enhancers such as vasa or nos.12,20
To determine whether BE2 can induce mutations in the genome, the targeted genomic region was PCR amplified from pools of ∼10 F1 adult males, and the target region was sequenced by Sanger sequencing. To control for polymorphisms that might already be present at the targeted site, the same genomic region from F1 males of the same genetic background that carried a gRNA targeting a different genomic locus was also PCR amplified and sequenced by Sanger sequencing. Genomic loci that were found to contain multiple polymorphisms at the targeted site, or in the gRNA target, were excluded from further analyses.
Of the 30 targeted locations, 14 demonstrated clear changes to their DNA. The gRNAs not resulting in changes as detectable by Sanger sequencing are listed in Materials and Methods. All changes were C > T mutations as seen in the Sanger sequence chromatograms (Fig. 2A) as expected by the function of CD.
The cytosine residues showing changes in the chromatograms are shown in blue in Figure 2A. All genomic changes are summarized in Figure 2B in which cytosine residues that showed any changes in the chromatograms are highlighted in red. Mutations were not found outside this 20 bp region, with the caveat that any such mutations might be below the threshold of a Sanger sequencing method of detection.
The majority of mutations occurred at positions 2–9 in the 20 bp targeted region, although mutations were also found to occur at locations 17 and 19 (Fig. 2A, B), which was not reported in mammalian cell culture studies18 but has been associated with BE4 in a recent systematic study of base editors.26 These results are summarized in Figure 2C, which displays the frequency that any change was found at a cytosine residue at a certain location in the target. For example, the 14 gRNA-targeted regions contained four examples in which a cytosine was located at position 8 in the targeted genomic loci, and 4/4 of these targeted sites (100%) demonstrated changes to that cytosine in the Sanger sequencing chromatograms. This gives an estimate of how often a particular cytosine residue at a certain location in the 20 nucleotide target site would be targeted by BE2.
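The denominator of this per-position frequency can be reproduced from the gRNA sequences listed in Materials and Methods. The sketch below counts how many of the gRNAs that led to base editing carry a cytosine at each position of the 20-nt target. The function and variable names are hypothetical, and the numerator (how many of those cytosines were edited) comes from the chromatograms and is not computed here.

```python
# gRNAs that led to base editing, as listed in Materials and Methods.
EDITED_GRNAS = [
    ("Gclc", "AGCCCGCTCGCTCTACTTTC"),
    ("pps", "CGCCTCCGTAAACACCGAAC"),
    ("ATP_syn1", "GGTAATACCCAGATCTCTCC"),
    ("CG3321", "GTTTCCCCGCTGATCAAGTT"),
    ("CG17734", "ATCCTCCTCGCTGTCGAAAA"),
    ("CG4101", "CCTCGTCGCCGTACTTGTCC"),
    ("CG12384", "GAACAACCAAATCTTGTGGC"),
    ("CG4998", "TACCGCCTTAGTTTGCGAAT"),
    ("ApepP", "AGGCCAATACAAGCTTCCGA"),
    ("ApepP", "TCCGCCAGAACGTGAACAGG"),
    ("Cyp1", "TGCCCATTCCCCCATCAAAA"),
    ("ry", "TGCCTCGTAACCCCTAGACA"),
    ("v", "GTTCGCTGGGGTCTCATCTC"),
    ("FK506-bp2", "TCTTAAAGTCCGAGACACAT"),
    ("whd", "ACAAATCCACCATATGATCG"),
]


def cytosine_counts(sequences, length=20):
    """Count, for each 1-indexed position, how many sequences have a C there."""
    counts = {position: 0 for position in range(1, length + 1)}
    for seq in sequences:
        for position, base in enumerate(seq.upper()[:length], start=1):
            if base == "C":
                counts[position] += 1
    return counts


counts = cytosine_counts([seq for _, seq in EDITED_GRNAS])
with_c8 = [gene for gene, seq in EDITED_GRNAS if seq[7] == "C"]
print(counts[8], with_c8)  # 4 ['ATP_syn1', 'CG3321', 'CG12384', 'whd']
```

With four targets carrying a cytosine at position 8, and all four showing changes, the frequency reported for that position is 4/4.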
To estimate how efficiently this targeting might be occurring at each targeted site, the chromatograms were analyzed with the EditR program,24 which calculates the percentage at which each base is represented at each location. The chromatograms for all sequencing reactions were used to estimate the percentage of cytosine residues changed to a thymine at each location (Fig. 2D).
The cytosine residues at locations 3–9 were the most efficiently targeted, with some locations being targeted close to 100%, indicating that the somatic DNA in nearly all the adult tissues had been mutated. For example, this is evident in the chromatograms for CG4998 (the cytosine at position 7 has been nearly completely converted to thymine), and for Cyp1 (the cytosine at position 5 is nearly all thymine). Future work can extend these analyses with additional gRNAs (e.g., from the TRiP collection) to determine why certain locations are favored for mutation over others. These data demonstrate that base editing in Drosophila tissues can occur at high efficiencies.
The Actin5C-BE2 efficiently targeted cytosines in the 20 bp targeted region. However, surprising results were found when a genomic region was targeted by two gRNAs within ∼50 bp of one another. This was initially identified in experiments that used two gRNAs to target the GAL4 gene in transgenic Drosophila lines. The GAL4 gRNAs were located 52 bp apart, and had previously been validated to efficiently target GAL4 for knock-in using the HACK method and to induce double-stranded breaks and indels at these targeted GAL4 loci.23 The expectation for using these two gRNAs would be the efficient induction of C > T changes in the two targeted locations. However, this was not found.
Instead, cytosine or guanine residues (corresponding to changes of the cytosine on the bottom strand) between the gRNA targets were efficiently targeted (Fig. 3A). The GAL4-targeted region, and changes to the DNA, are highlighted in red in Figure 3A. No mutations were found in the 20 bp targeted sites, but were instead found between these targeted sites (Fig. 3B).
The most frequently targeted site was a cytosine located 22 bp upstream of GAL4-gRNA1 on the bottom strand (as seen by a G > A mutation in the top sequence strand). This suggests that the DNA between these two gRNAs was now within the targeting window of the tethered CD. In addition, since CD requires single-stranded DNA as a template, it suggests that the DNA stretch between the gRNAs-targeted sequence might also be unpaired or single stranded.
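The strand bookkeeping behind this reading can be made explicit with a short sketch: deaminating a cytosine on the bottom strand appears as a G > A change when the top strand is sequenced. The function names, the coordinate convention (0-indexed, end-exclusive, in top-strand coordinates), and the toy sequence are all hypothetical, and conversion is again assumed to be complete within the chosen interval.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def edit_bottom_strand(top, start, end):
    """Apply C>T on the bottom strand opposite top-strand bases [start, end).

    Returns the resulting top-strand sequence, where the edits read as G>A.
    """
    bottom = list(reverse_complement(top))
    n = len(bottom)
    for i in range(start, end):
        j = n - 1 - i  # bottom-strand index paired with top-strand index i
        if bottom[j] == "C":
            bottom[j] = "T"
    return reverse_complement("".join(bottom))


toy_top = "ATGGCCGA"  # hypothetical sequence
print(edit_bottom_strand(toy_top, 0, 4))  # ATAACCGA
```

Only the guanines within the chosen interval change, and the cytosines on the top strand are untouched, which matches how a bottom-strand edit would look in a top-strand chromatogram.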
The efficiency of the targeting was estimated using the Sanger sequence chromatograms with the EditR program (Fig. 3B). This confirmed that the bottom strand was being efficiently targeted (in some cases >90% of all guanine residues at this site were converted to an adenine residue). Mutations also occurred at many locations in the region between the gRNA-targeted loci, in particular a cytosine-rich region. The ability of a base editor to expand its range when two gRNAs were used to target a location has not to our knowledge been previously reported.
To determine whether this was specific to GAL4 sequences, or might be generalizable to other genomic locations, the available transgenic U6:gRNA collection from Bloomington was screened for gRNAs that targeted the top strand ∼50 bp apart. Three gRNA pairs were found to match these criteria, and were tested with Actin5C-BE2. In all three cases, mutations were found in the region between gRNAs, although the efficiencies were lower than that found with GAL4 targeting (Fig. 3C). Summary of the changes using these gRNA pairs with the EditR-estimated percentages is shown in red in Figure 3C. Some of the gRNA-targeted sites also demonstrated changes, in contrast to the GAL4 gRNA pairs.
Overall, these data suggest that the use of two gRNAs can expand the targeted region of DNA that might not otherwise be compatible for Cas9-directed mutation. In addition, since the region undergoing base editing is outside the gRNA-targeted region, repeated base editing might occur to completion in this area.
The data presented here indicate the accumulated editing that occurred throughout the hundreds of thousands of cells that make up an adult fly. It is likely that most base editing occurred early in development coinciding with expression of BE2 from the Actin5C promoter. Once a mutation is induced in the gRNA target site (such as during early development), that site should no longer be a target for the gRNA, and no additional editing should occur. However, the high rates of editing seen here (multiple residues >50%) suggest that base editing can occur at multiple cytosine residues during the same targeting event; this is consistent with in vitro oligomer assays of BE2 in mammalian cells.18
In recent years, a number of updates or additions to base editors have been released. To increase editing efficiency and reduce indels, BE3 was modified to Base Editor version 4 (BE4) that contains a longer linker between CD and the Cas9 nickase and two copies of UGI.27 This makes BE4 an attractive base editor for use in Drosophila, and future studies will be required to determine whether BE4, with its increased C > T efficiency and low frequency of indels, is preferable to BE2 as a base editor.
To broaden the editing window beyond the gRNA target sequence, the BE-PLUS system used dCas9 as a scaffold for 10 copies of the GCN4 peptide, which, in turn, could be complexed in vivo to a single chain variable fragment bound to base-editing molecular machinery.28 The introduction of adenine base editors (and dual cytosine/adenine base editors) further expands the range of point mutations that can now be induced using dCas9.29,30 In addition, alternative strategies for inducing mutations, such as the prime-editing technique,31 are also effective in Drosophila.
Base editing tools could be further paired with binary expression systems to limit CRISPR–Cas9 editing to specific adult tissues, such as the eye, for F1 mutagenesis screens. BE2, and the utilization of other base editing reagents, further enrich the genomic engineering toolbox in Drosophila.
Conclusions
Base editing in Drosophila somatic tissues occurs efficiently using CRISPR–Cas9. The introduction of base editing into the Drosophila genetic toolbox allows specific base pair mutations to be induced in a gene of interest, which could be used as a mechanism to evaluate the in vivo role of suspected critical residues. Base editing could further be applied as a method to systematically induce point mutations across an entire targeted gene as a way to identify gene functions that might have been missed by only examining null mutations. CRISPR–Cas9 base editing thus combines the advantages previously afforded by chemical mutagenesis with the experimental ability to select a gene target.
As collections of transgenic gRNA-expressing lines continue to grow, Drosophila could serve as a valuable model organism for testing the function of CRISPR–Cas9 reagents, and further elucidate how these reagents perform in a complex genome.
Acknowledgments
We thank Katie Robinson for assistance in conducting PCR analyses, and the Bloomington Drosophila Stock Center (NIH P40OD018537) for fly lines.
Authors' Contributions
Conceptualization of the study was done by C.J.P.; methodology was carried out by C.J.P. and E.M.; formal analysis was performed by C.J.P.; investigation was done by E.M.; writing, original draft, of the article was done by C.J.P.; writing, review and editing, of the article was done by E.M. and C.J.P.; visualization was performed by C.J.P.; funding acquisition was done by C.J.P.; supervision of the study was done by C.J.P. All authors approved the final article.
Disclaimer
The article has been submitted solely to this journal and is not published, in press, or submitted elsewhere.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
This study was partly supported by the NIH NIDCD (R01DC013070, C.J.P.).
References
- 1. Anzalone AV, Koblan LW, Liu DR. Genome editing with CRISPR-Cas nucleases, base editors, transposases and prime editors. Nat Biotechnol. 2020;38:824–844. DOI: 10.1038/s41587-020-0561-9.
- 2. Brouns SJ, Jore MM, Lundgren M, et al. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science. 2008;321:960–964. DOI: 10.1126/science.1159689.
- 3. Jinek M, East A, Cheng A, et al. RNA-programmed genome editing in human cells. eLife. 2013;2:e00471. DOI: 10.7554/eLife.00471.
- 4. Cong L, Ran FA, Cox D, et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–823. DOI: 10.1126/science.1231143.
- 5. Mali P, Yang L, Esvelt KM, et al. RNA-guided human genome engineering via Cas9. Science. 2013;339:823–826. DOI: 10.1126/science.1232033.
- 6. Gasiunas G, Barrangou R, Horvath P, et al. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc Natl Acad Sci U S A. 2012;109:E2579–E2586. DOI: 10.1073/pnas.1208507109.
- 7. Bassett AR, Tibbit C, Ponting CP, et al. Highly efficient targeted mutagenesis of Drosophila with the CRISPR/Cas9 system. Cell Rep. 2013;4:220–228. DOI: 10.1016/j.celrep.2013.06.020.
- 8. Gratz SJ, Cummings AM, Nguyen JN, et al. Genome engineering of Drosophila with the CRISPR RNA-guided Cas9 nuclease. Genetics. 2013;194:1029–1035. DOI: 10.1534/genetics.113.152710.
- 9. Kondo S, Ueda R. Highly improved gene targeting by germline-specific Cas9 expression in Drosophila. Genetics. 2013;195:715–721. DOI: 10.1534/genetics.113.156737.
- 10. Ren X, Sun J, Housden BE, et al. Optimized gene editing technology for Drosophila melanogaster using germ line-specific Cas9. Proc Natl Acad Sci U S A. 2013;110:19012–19017. DOI: 10.1073/pnas.1318481110.
- 11. Yu Z, Ren M, Wang Z, et al. Highly efficient genome modifications mediated by CRISPR/Cas9 in Drosophila. Genetics. 2013;195:289–291. DOI: 10.1534/genetics.113.153825.
- 12. Gratz SJ, Ukken FP, Rubinstein CD, et al. Highly specific and efficient CRISPR/Cas9-catalyzed homology-directed repair in Drosophila. Genetics. 2014;196:961–971. DOI: 10.1534/genetics.113.160713.
- 13. Nusslein-Volhard C, Wieschaus E. Mutations affecting segment number and polarity in Drosophila. Nature. 1980;287:795–801. DOI: 10.1038/287795a0.
- 14. Kaufman TC. A short history and description of Drosophila melanogaster classical genetics: Chromosome aberrations, forward genetic screens, and the nature of mutations. Genetics. 2017;206:665–689. DOI: 10.1534/genetics.117.199950. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Bikard D, Jiang W, Samai P, et al. . Programmable repression and activation of bacterial gene expression using an engineered CRISPR-Cas system. Nucleic Acids Res. 2013;41:7429–7437. DOI: 10.1093/nar/gkt520. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16. Gilbert LA, Larson MH, Morsut L, et al. . CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell. 2013;154:442–451. DOI: 10.1016/j.cell.2013.06.044. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17. Qi LS, Larson MH, Gilbert LA, et al. . Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell. 2013;152:1173–1183. DOI: 10.1016/j.cell.2013.02.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Komor AC, Kim YB, Packer MS, et al. . Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature. 2016;533:420–424. DOI: 10.1038/nature17946. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Lin S, Ewen-Campen B, Ni X, et al. . In vivo transcriptional activation using CRISPR/Cas9 in Drosophila. Genetics. 2015;201:433–442. DOI: 10.1534/genetics.115.181065. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20. Port F, Chen HM, Lee T, et al. . Optimized CRISPR/Cas tools for efficient germline and somatic genome engineering in Drosophila. Proc Natl Acad Sci U S A. 2014;111:E2967–E2976. DOI: 10.1073/pnas.1405500111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21. Bischof J, Maeda RK, Hediger M, et al. . An optimized transgenesis system for Drosophila using germ-line-specific phiC31 integrases. Proc Natl Acad Sci U S A. 2007;104:3312–3317. DOI: 10.1073/pnas.0611511104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Zirin J, Hu Y, Liu L, et al. . Large-scale transgenic drosophila resource collections for loss- and gain-of-function studies. Genetics. 2020;214:755–767. DOI: 10.1534/genetics.119.302964. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23. Lin CC, Potter CJ. Editing transgenic DNA components by inducible gene replacement in Drosophila melanogaster. Genetics. 2016;203:1613–1628. DOI: 10.1534/genetics.116.191783. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24. Kluesner MG, Nedveck DA, Lahr WS, et al. . EditR: A method to quantify base editing from sanger sequencing. CRISPR J. 2018;1:239–250. DOI: 10.1089/crispr.2018.0014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25. Chavez A, Scheiman J, Vora S, et al. . Highly efficient Cas9-mediated transcriptional programming. Nat Methods. 2015;12:326–328. DOI: 10.1038/nmeth.3312. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. Arbab M, Shen MW, Mok B, et al. . Determinants of base editing outcomes from target library analysis and machine learning. Cell. 2020;182:463–480 e430. DOI: 10.1016/j.cell.2020.05.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. Komor AC, Zhao KT, Packer MS, et al. . Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. Sci Adv. 2017;3:eaao4774. DOI: 10.1126/sciadv.aao4774. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28. Jiang W, Feng S, Huang S, et al. . BE-PLUS: A new base editing tool with broadened editing window and enhanced fidelity. Cell Res. 2018;28:855–861. DOI: 10.1038/s41422-018-0052-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Gaudelli NM, Komor AC, Rees HA, et al. . Programmable base editing of A*T to G*C in genomic DNA without DNA cleavage. Nature. 2017;551:464–471. DOI: 10.1038/nature24644. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30. Grunewald J, Zhou R, Lareau CA, et al. . A dual-deaminase CRISPR base editor enables concurrent adenine and cytosine editing. Nat Biotechnol. 2020;38:861–864. DOI: 10.1038/s41587-020-0535-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31. Bosch JA, Birchak G, Perrimon N. Precise genome engineering in Drosophila using prime editing. Proc Natl Acad Sci U S A. 2021;118. DOI: 10.1073/pnas.2021996118. [DOI] [PMC free article] [PubMed] [Google Scholar]