Abstract
Base editors have been reported to induce off-target mutations in cultured cells, mouse embryos and rice, but their long-term effects in vivo remain unknown. Here, we develop a Systematic evaluation Approach For gene Editing tools by Transgenic mIce (SAFETI), and evaluate the off-target effects of BE3, high fidelity version of CBE (YE1-BE3-FNLS) and ABE (ABE7.10F148A) in ~400 transgenic mice over 15 months. Whole-genome sequence analysis reveals BE3 expression generated de novo mutations in the offspring of transgenic mice. RNA-seq analysis reveals both BE3 and YE1-BE3-FNLS induce transcriptome-wide SNVs, and the numbers of RNA SNVs are positively correlated with CBE expression levels across various tissues. By contrast, ABE7.10F148A shows no detectable off-target DNA or RNA SNVs. Notably, we observe abnormal phenotypes including obesity and developmental delay in mice with permanent genomic BE3 overexpression during long-time monitoring, elucidating a potentially overlooked aspect of side effects of BE3 in vivo.
Subject terms: Genetic engineering, CRISPR-Cas9 genome editing
The potential off-target effects of long-term expression of base editors in vivo are unclear. Here the authors report SAFETI, Systematic evaluation Approach For gene Editing tools by Transgenic mIce, to examine off-target effects of base editors over time in mice, and see abnormal side effects.
Introduction
CRISPR-derived base editing is a genome editing method to introduce point mutations on DNA or RNA at the target loci. By fusing catalytically dead Cas9 (dCas9) or nickase Cas9 (nCas9) with cytidine deaminases or adenosine deaminases, cytosine base editors (CBEs)1,2 and adenine base editors (ABEs)3 were developed to install targeted C-to-T or A-to-G point mutations without generating double-strand breaks (DSBs). However, the application of CBEs and ABEs was limited by off-target DNA and RNA mutations4–7. Even though high-fidelity base editors were subsequently generated by protein engineering4,8–10, their specificity in vivo remains to be explored considering their constant and long-term expression through common delivery strategies11,12.
Here we generate transgenic mouse lines expressing BE3, high-fidelity YE1-BE3-FNLS (W90Y and R126E mutations in rAPOBEC1)8, or ABE7.10F148A (F148A in TadA)4, and comprehensively evaluate the side effects of base editors on DNA and RNA across various tissues. We also monitor the phenotypes of the transgenic mice over months to explore the adverse effects of long-term expression of BEs.
Results
Generation of transgenic mice expressing base editors
We generated transgenic mouse lines expressing BE31, YE1-BE3-FNLS8, hA3A-BE313, ABE7.103, or ABE7.10F148A4 using the piggyBac (PB) transposon integration system14 (Fig. 1a and Supplementary Fig. 1a). The transposon vectors consisted of a CAG promoter followed by the sequence encoding one of the base editors, which was linked (via a self-cleaving P2A peptide) to an enhanced green fluorescent protein reporter (eGFP; Supplementary Fig. 1a). The transposon vectors were injected into zygotes of C57BL/6J mice together with PB transposase enzyme (PBase) mRNA. The zygotes were cultured to the two-cell stage in vitro and then transferred into the oviducts of pseudopregnant females of the ICR strain. We found that the birth rates of F0 mice were not severely affected by injecting vectors encoding BE3, YE1-BE3-FNLS or ABE7.10F148A (Fig. 1b). In addition, fluorescence detection and PCR genotyping indicated that more than 75% of the pups born in these groups carried the corresponding integrated base editor (Fig. 1c, Supplementary Table 1). By contrast, mice injected with vectors for BE3-human APOBEC3A (hA3A-BE3) or ABE7.10 showed dramatically reduced birth rates and yielded no pups with successful integration (Fig. 1b, c), suggesting that hA3A-BE3 and ABE7.10 were toxic to embryos. F0 mice with successful integration of the various base editors were intercrossed to establish transgenic mouse founders, and cross-mating of the founders yielded four transgenic mouse lines expressing GFP, BE3, YE1-BE3-FNLS, or ABE7.10F148A (Fig. 1d). Analysis of transposon integration sites showed that the transgenes were integrated into intergenic or intronic regions at different locations across the mouse genome (Supplementary Fig. 1b, c, Supplementary Table 2). The average copy numbers in GFP, BE3, YE1-BE3-FNLS and ABE7.10F148A transgenic mice were 3.3, 4.3, 3.7 and 2.3, respectively, and no statistically significant difference was observed among the groups (Supplementary Fig. 1d, e).
We next performed RNA-seq analyses of eight tissues (brain, lung, heart, liver, kidney, ovary, muscle and adipose) from 8-week-old GFP, BE3, YE1-BE3-FNLS, and ABE7.10F148A transgenic mice. The mRNA expression of all three base editors was highest in muscle, followed by the heart, and low in other tissues, and we noted that YE1-BE3-FNLS was expressed at higher levels than BE3 or ABE7.10F148A across various tissues (Fig. 1e). Similar expression patterns of these base editors across tissues were observed by immunoblotting of organ extracts (Supplementary Fig. 1f). These findings are in line with previous reports of particularly high transgene expression in muscle and heart15,16, and may reflect preferential activation of the CAG promoter in these organs. Despite the high expression of base editors, we observed no obvious morphological abnormalities in any of the examined organs of the 8-week-old transgenic mice (Supplementary Fig. 2a).
We next examined the on-target editing performance in vivo in the base editor (BE) transgenic mouse lines. For the BE3 and YE1-BE3-FNLS mouse lines, we constructed a U6-Hpd sgRNA vector containing an sgRNA for C-to-T editing of the Hpd locus (Supplementary Table 3), which results in a stop codon that can rescue the lethal phenotype of hereditary tyrosinemia type 1 in mice17. This vector was packaged into AAV2 serotype eight particles (AAV8) and delivered to 8-week-old CBE transgenic mice by tail-vein injection (Fig. 1f). Two weeks after delivery, we conducted Sanger and targeted deep sequencing to examine on-target editing in hepatocytes isolated from transgenic mice by FACS (Supplementary Fig. 2b) and detected successful C-to-T editing of the targeted site in ~20% and ~30% of the BE3 and YE1-BE3-FNLS hepatocytes, respectively (Fig. 1g, h, Supplementary Table 4).
We assessed the A-to-G base editing activity of ABE7.10F148A transgenic mice using an sgRNA targeting the Dmd gene (Supplementary Table 3), whose dysfunction results in Duchenne muscular dystrophy (DMD)18. Specifically, the U6-Dmd sgRNA was packaged into AAV9 and delivered to the tibialis anterior (TA) muscle of 8-week-old ABE7.10F148A mice by localized intramuscular injection19 (Fig. 1i). We collected TA muscles from 3 ABE7.10F148A mice two weeks after delivery and found an average of 10.5% A-to-G editing at A6 of the targeted site by Sanger and targeted deep sequencing (Fig. 1j, k, Supplementary Table 4). Taken together, these results indicate that the genomically integrated BE3, YE1-BE3-FNLS, and ABE7.10F148A in our transgenic mouse lines can perform sgRNA-directed on-target editing in vivo.
Base editors induced substantial genome-wide mutations in vivo
To evaluate whether the BE3, YE1-BE3-FNLS and ABE7.10F148A editors introduce off-target edits in the transgenic mice, we performed whole genome sequencing (WGS) analysis of gDNA extracted from muscles (i.e., an organ with high base editor expression). To eliminate the influence of genetic background, we also performed WGS on the same tissues from GFP transgenic and wild-type (WT) mice. Mutations were called by three algorithms in the transgenic samples using the WT samples as the reference. In contrast to previous in vivo studies, which reported few off-target mutations19–23, we found an average of 1353 SNVs in muscles of BE3 transgenic mice, which is 5 times higher than that in the GFP mice (Fig. 2a). These mutations were specifically observed in the transgenic mice rather than in the WT mice (Supplementary Fig. 3a). In addition, the numbers of C-to-T or G-to-A mutations were highest in the BE3 transgenic mice, and much higher than those in the GFP group (Fig. 2b, Supplementary Fig. 3b). This mutation bias matched that of the cytosine deaminase rAPOBEC11, suggesting that these mutations were induced by BE3 expression. We also detected significantly higher numbers of indels and structural variations (SVs) in the BE3 group than in the GFP group (Supplementary Fig. 3c, d). By contrast, there were no differences in the numbers of SNVs or indels between the GFP mice and the BE transgenic mice expressing high-fidelity YE1-BE3-FNLS or ABE7.10F148A (Fig. 2a, b and Supplementary Fig. 3c, d).
To further eliminate the interference of genetic background, we next examined the de novo off-target mutations induced by constant expression of base editors in 43 parent-offspring trios. We used the PB transposition system to generate another transgenic mouse line with a genomically integrated sgRNA cassette targeting the Tyrosinase (Tyr) gene for pigmentation5,8 (Fig. 2c and Supplementary Table 3). The targeted C-to-T editing by CBE directed by this sgRNA has been shown to introduce a stop codon at the Tyr gene that results in an Albino phenotype5,8. We then established four breeding pairs by crossing the GFP and BE transgenic male mice (BE3, YE1-BE3-FNLS or ABE7.10F148A) with Tyr sgRNA transgenic female mice, and 4 pairs of the reciprocal cross. The offspring of BE3 or YE1-BE3-FNLS × Tyr sgRNA mice showed black-white mosaic hair phenotypes (Fig. 2d), indicating the efficient editing of CBEs at the Tyr gene. We next performed WGS on gDNA extracted from tails of the progenies expressing both base editors and Tyr sgRNA, and confirmed the on-target editing efficiency of BE3 (26%), YE1-BE3-FNLS (54%), and ABE7.10F148A (31%) at the Tyr locus (Fig. 2e).
To obtain the de novo generated mutations in the progenies, we performed WGS analysis on gDNA extracted from tails of 56 mice from 43 parent-offspring trios and called de novo variants in the offspring by filtering those shared with their parents or WT mice. Notably, we found that the number of de novo SNVs in the BE3 group was significantly higher than that in the GFP group (Fig. 2f, g and Supplementary Fig. 4a). The number of de novo SNVs in the YE1-BE3-FNLS and ABE7.10F148A group was also slightly higher than that of the GFP samples when the BE transgenic mice were the mother (Fig. 2g). Interestingly, we noted that a higher number of de novo SNVs were observed in the offspring from crosses in which the BE transgenic mice were the father (Fig. 2f, g). A similar trend was observed in the GFP group, consistent with previous studies showing a paternal bias for de novo mutations24. The de novo mutations in the BE transgenic mice were evenly distributed across the chromosomes (Supplementary Fig. 4b).
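The trio-based filtering described above can be illustrated with a minimal sketch. It assumes each sample's calls are already reduced to a set of (chromosome, position, ref, alt) tuples; the function name `de_novo_variants` and the toy coordinates are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def de_novo_variants(offspring: Set[Variant], father: Set[Variant],
                     mother: Set[Variant], wild_type: Set[Variant]) -> Set[Variant]:
    """Keep offspring variants that are absent from both parents and from WT mice."""
    inherited_or_background = father | mother | wild_type
    return offspring - inherited_or_background


# Toy example with made-up coordinates, for illustration only.
child = {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A"), ("chr3", 300, "A", "G")}
sire = {("chr1", 100, "C", "T")}   # variant inherited from the father
dam: Set[Variant] = set()
wt = {("chr3", 300, "A", "G")}     # background variant also seen in WT mice

print(de_novo_variants(child, sire, dam, wt))
```

In this toy case only the chr2 variant survives, because the other two are explained by the father or by the WT background.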
Strikingly, the percentages of C-to-T and G-to-A mutations in the BE3 and YE1-BE3-FNLS transgenic mice were significantly higher than those of the GFP group (Fig. 2h and Supplementary Fig. 4c). Moreover, we found a consensus motif WCW (W = A or T) among the de novo mutations identified in the BE3 and YE1-BE3-FNLS transgenic mouse lines (Fig. 2i), consistent with the mutation preferences of the deaminase rAPOBEC1 in CBE4,6. Together, our results demonstrate that the expression of BE3 induces off-target DNA mutations in vivo, while the high-fidelity YE1-BE3-FNLS and ABE7.10F148A induce few off-target DNA mutations.
Base editors induced transcriptome-wide mutations in vivo
We also assessed the off-target effects of BE3, YE1-BE3-FNLS, and ABE7.10F148A based on RNA-seq of the transcriptomes of eight tissues from 15 transgenic and WT mice. Compared with the GFP mice, all eight examined tissues of the BE3 transgenic mice showed higher numbers of RNA SNVs, and five of them showed significant differences (P < 0.05; Supplementary Fig. 5a). Among the eight tissues, the highest number of RNA SNVs was found in muscles (Fig. 3a). In addition, we found a positive correlation (R² = 0.4) between the RNA SNV numbers and the expression level of BE3 across various tissues in the transgenic mice (Fig. 3b). Moreover, genes with higher expression levels showed higher numbers of RNA SNVs in muscles of the BE3 mice (Fig. 3c), suggesting that highly expressed genes are prone to harbor RNA SNVs. YE1-BE3-FNLS muscles also had more RNA SNVs than GFP muscles, while the number of RNA mutations in muscles of the ABE7.10F148A transgenic mice was similar to that in the GFP mice (Fig. 3a and Supplementary Fig. 5a).
An average of 51% and 34% RNA SNVs detected in the BE3 and YE1-BE3-FNLS transgenic mice were shared by more than one sample in the group, respectively (Fig. 3d), and these common SNVs were significantly enriched in highly expressed genes (Supplementary Fig. 5b). Moreover, 88% and 68% of the RNA SNVs identified in the muscles of BE3 and YE1-BE3-FNLS transgenic mice were mutated from C-to-U or G-to-A, which was not observed in the GFP (24%) or the ABE7.10F148A (27%) group (Fig. 3e). The C-to-U and G-to-A mutation preference was also observed in the heart showing high expression of BE3 (Supplementary Fig. 5c and Fig. 1e). Consistently, we also found a motif WCW (W = A or T) enrichment of RNA SNVs in the muscles of BE3 and YE1-BE3-FNLS transgenic mice (Fig. 3f). Sanger sequencing verified the presence of C to U or G to A mutations on the RNA but not DNA level (Supplementary Fig. 6 and Supplementary Table 5).
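As a rough illustration of how the substitution-type fraction and the WCW motif could be checked, the sketch below treats each RNA SNV as a (ref, alt) pair on the genome reference strand, so that C-to-U appears as C>T and edits on antisense transcripts appear as G>A. The helper names and the toy inputs are assumptions for illustration, not the authors' scripts.

```python
from typing import Iterable, Tuple

# Complement table used to flip a G-centred context onto the C strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def fraction_c_to_u(snvs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of SNVs that are C>T (C-to-U in RNA) or G>A on the reference strand."""
    snvs = list(snvs)
    if not snvs:
        return 0.0
    hits = sum(1 for ref, alt in snvs if (ref, alt) in {("C", "T"), ("G", "A")})
    return hits / len(snvs)


def matches_wcw(context: str) -> bool:
    """True if a 3-nt context centred on the edited base fits WCW (W = A or T)."""
    context = context.upper()
    if len(context) != 3:
        return False
    if context[1] == "G":
        # Reverse-complement so the edited base is read as C.
        context = context.translate(COMPLEMENT)[::-1]
    return context[1] == "C" and context[0] in "AT" and context[2] in "AT"


# Toy input, for illustration only.
toy_snvs = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "T")]
print(fraction_c_to_u(toy_snvs))
print(matches_wcw("ACT"), matches_wcw("TGA"), matches_wcw("GCA"))
```

Because the complement of W is again W, a G>A change in a WGW context on the reference strand corresponds to a WCW context on the transcribed strand, which is why the G-centred case is reverse-complemented before testing.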
Given the extensive off-target edits at the RNA level, we then examined the potential adverse effects of these off-target SNVs by KEGG analysis. Genes with RNA SNVs in muscles of the BE3 mice were enriched for insulin signaling, lipid metabolism and cancer-related pathways (Fig. 3g). Taken together, these results indicate that BE3 and YE1-BE3-FNLS induced RNA SNVs in vivo, and that the number of RNA SNVs is positively correlated with the expression level of the base editor.
Abnormal phenotypes were observed in transgenic mice with CBE overexpression
Considering the substantial off-target DNA and RNA effects of base editors, we next monitored the phenotypes of ~400 transgenic mice over 15 months to explore the potential side effects of long-term BE expression. We first analyzed the fertility of the transgenic mice and found no difference in the total numbers of litters, pups per litter, sex ratio or birth weights between BE and GFP transgenic mice (Supplementary Fig. 7a–d). However, we found that a significantly higher proportion of BE3 transgenic mice showed abnormal phenotypes including obesity and developmental delay (Fig. 4a). Notably, most of the mice with developmental delay lived no longer than 12 weeks (Fig. 4b). Monitoring of body weight over 66 weeks revealed that BE3 transgenic mice were significantly heavier than the GFP group starting from 16 weeks in both sexes (Fig. 4c and Supplementary Fig. 7e–f). Specifically, the body weights of BE3 transgenic mice reached 45.6 ± 6.5 (mean ± s.e.m.) grams (g) in males and 32.7 ± 5.7 g in females at 30 weeks, significantly heavier than those of the GFP mice (34.4 ± 4.7 g in males and 26.0 ± 2.8 g in females) (Fig. 4d). The YE1-BE3-FNLS female mice also showed evidently higher weight gain than the GFP female mice after 30 weeks (Fig. 4c and Supplementary Fig. 7f). The average weights of ABE7.10F148A transgenic mice showed no difference compared with GFP mice in either sex (Fig. 4c and Supplementary Fig. 7e–f).
To further characterize the roles of BE3 expression in transgenic mice with obesity, we collected inguinal adipose and liver tissues from 30-week-old BE3 and GFP male mice (Fig. 4e). Compared with GFP group, BE3 transgenic mice showed larger size (2.8 g vs 0.9 g) and significantly heavier weight (2.9 g vs 1.4 g) of the inguinal adipose and liver tissues (Fig. 4f, g). Histopathological analysis showed that the adipocyte sizes were significantly larger in the adipose tissues of BE3 transgenic mice (Fig. 4h). Besides, fatty deposits and hepatic dysplasia were observed in livers of BE3 mice by hematoxylin and eosin (H&E) staining (Fig. 4i). We then performed RNA-seq on adipose tissues from the BE3 and GFP transgenic mice, and principal component analysis (PCA) revealed a clear distinction of transcriptome profiles between BE3 and GFP transgenic mice (Supplementary Fig. 8a). Further differential expression analysis identified 2765 upregulated and 1647 downregulated genes in the BE3 transgenic mice (Supplementary Fig. 8b). The downregulated genes were enriched in fatty acid and lipid metabolic pathways (Supplementary Fig. 8c).
To explore the underlying mechanism of the obesity phenotypes, we next performed WGS on adipose tissues from the BE3 and GFP transgenic mice. A significantly higher number of DNA mutations was identified in the genome of BE3 mice than in that of GFP transgenic mice (Fig. 4j). None of these mutations were located in the coding sequence of monogenic obesity genes25, but they were significantly enriched in metabolic process, insulin secretion, and lipid localization and storage pathways (Supplementary Fig. 9a). In addition, we randomly selected three BE3 transgenic mice showing obesity (60.3 g on average) and three with a normal weight phenotype (34.3 g on average) (Supplementary Fig. 9b). We first analyzed the targeted deep sequencing data to identify the integration sites and copy numbers of BE3 in the genome. The average copy numbers were similar between the obesity and normal weight groups (4.7 vs. 5.0; P = 0.77) (Supplementary Fig. 9c). In addition, BE3 was integrated into intergenic or intronic regions of the mouse genome in both overweight and normal weight mice, and all the integration sites in the overweight group overlapped with those in the normal weight group (Supplementary Table 6 and Supplementary Fig. 9d). These results suggested that the abnormal phenotype was not associated with the copy numbers or integration sites of BE3 in mice. We next compared the number of DNA SNVs from WGS data between the obesity and normal weight mice and found no significant difference (Fig. 4k). We then analyzed the RNA-seq data and found that the mRNA expression of BE3 showed no significant difference between the two groups in muscle, adipose and liver tissues (Supplementary Fig. 9e). Notably, the numbers of RNA SNVs in the liver of mice with obesity were significantly higher than those in mice with normal weight (Fig. 4l).
Furthermore, we performed GO and KEGG pathway enrichment analysis of genes containing the RNA SNVs uniquely identified in the liver of obese mice, and found that these genes were significantly enriched in fatty acid metabolic process, lipid catabolic process, and carbon metabolism pathways (Supplementary Fig. 9f, j). Specifically, we found that off-target C-to-T editing in obese mice generated premature stop codons and non-synonymous mutations in obesity and metabolic disorder-associated genes, e.g., Igfbp2, C3, Apoe, and Apoc3 (Fig. 4m). Obesity has been reported as a complex trait caused by multiple genomic sites25,26. Our results suggest that a higher number of BE3-induced off-target RNA mutations in obesity-associated genes might be responsible for the obesity phenotype in mice with BE3 overexpression.
We also found that two deceased mice with developmental delay in the BE3 group carried 5 times more DNA mutations than the GFP mice (Fig. 4b and Supplementary Fig. 10a), and the genes harboring these mutations were significantly enriched in nervous system, cell and tissue development-related pathways (Supplementary Fig. 10b). Taken together, the abnormal phenotypes including obesity and developmental delay observed in transgenic mice might be caused by the accumulation of BE3-induced off-target mutations.
Discussion
In gene therapy using the leading AAV delivery platform, BEs could persist in cells for 6–12 months or even for many years in vivo27. Therefore, long-term and comprehensive evaluation of the off-target effects of BEs is necessary for their clinical translation. Here we established the SAFETI method to evaluate the comprehensive and long-term effects of base editors on diverse tissues in vivo by generating transgenic mice. The SAFETI system has several advantages over previous off-target detection methods. First, it can evaluate genome-wide and transcriptome-wide off-target mutations in diverse tissues with high sensitivity. Indeed, the high-fidelity version of CBE, YE1-BE3-FNLS, was also found to induce numerous RNA mutations using the SAFETI system. Second, the SAFETI method can evaluate the phenotypic influence of base editors in vivo. We observed abnormal phenotypes including overweight and developmental delay in mice with permanent CBE overexpression. Third, this system enabled us to monitor the long-term effects of BE expression in vivo. We found RNA SNVs located in tumor-associated genes, consistent with the results in HEK293T cells. However, we observed no tumor occurrence in the transgenic mice during the 15 months of monitoring, suggesting that off-target RNA SNVs do not necessarily lead to tumor development in vivo. This observation also underscores the importance of assessing off-target effects in vivo for safety evaluation, and longer monitoring might better clarify the relationship between BE expression and aging-related diseases, e.g., cancer. In summary, our SAFETI system is a universal method to systematically evaluate the off-target effects of gene editing tools in vivo, including the newly developed optimized base editors ABE8e28, ABE8e-N108Q/L145T29, programmable C-to-G base editors30–33, and prime editors34,35.
Using the SAFETI system, we found that permanent genomic BE3 overexpression induced off-target SNVs with adverse phenotypic effects in transgenic mice, which was an overlooked aspect of the potential side effects of base editors. The obesity and developmental delay phenotypes were possibly caused by the accumulation of off-target mutations induced by base editing in vivo. These off-target mutations were randomly generated and might occur in many genes, the interaction of which might induce complex diseases36,37. Nevertheless, the adverse phenotypic effect might be induced by permanent BE3 overexpression in all cells of the body from early development onwards. This raised concerns about long-term expression of base editors in therapeutic applications, so we next delivered the base editors into mice using a clinically relevant AAV delivery method. We examined the transgene copy numbers, transgene expression and off-target RNA SNVs in mice with YE1-BE3-FNLS delivered by AAV, and found lower copy numbers and expression levels of the transgene as well as lower numbers of off-target RNA SNVs compared with those in YE1-BE3-FNLS transgenic mice (Supplementary Fig. 11). In combination with our results showing that the number of off-target mutations was positively correlated with BE3 expression level, this suggests that somatic therapies in which base editors are expressed in a tissue-specific manner at lower doses may not lead to abnormal phenotypes.
It is noteworthy that the expression of YE1-BE3-FNLS was much higher than that of BE3, but YE1-BE3-FNLS induced far fewer DNA and RNA mutations and no severe phenotypic effects. Similarly, the RNA SNV numbers in the ABE7.10F148A mouse line were similar to those in the GFP group. These results demonstrate that the high-fidelity versions of both CBE (YE1 mutation in rAPOBEC1) and ABE (F148A mutation in TadA) can significantly reduce off-target RNA SNVs in vivo as well as in vitro. With the rapid development of newly engineered CBEs and ABEs with low off-target deamination, further studies are needed to evaluate their specificity in vivo.
In this work, we develop the SAFETI method for systematically evaluating the off-target effects of gene editing tools in vivo, and our results point to two routes for lowering the off-target effects of gene editing tools in gene therapy: future protein engineering for high-fidelity deaminases and transient delivery methods.
Methods
Ethics statement
The study protocol of mouse care and experiments was approved by the guideline of the Life Sciences Ethics Committee of Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences. All animal studies complied with relevant ethical regulations for animal testing and research.
Generation of the transgenic mice
We generated seven mouse lines expressing GFP, BE3, YE1-BE3-FNLS, hA3A-BE3, ABE7.10, ABE7.10F148A, or Tyr sgRNA using piggyBac transposon integration system in the zygotes of C57BL/6J strain using procedures previously described14. The transposon vectors for GFP and base editors consist of a CAG promoter followed by one of the base editors linked via P2A peptide to an enhanced green fluorescent protein (eGFP). The Tyr sgRNA transposon vector included mCherry followed by U6-Tyr sgRNA cassette. The transposon vectors were injected into C57BL/6J zygotes together with PBase mRNA. Then the zygotes were cultured in KSOM medium at 37 °C under 5% CO2 in air for 24 h, and then transferred into oviducts of the pseudopregnant female ICR mice. We verified the successful integration of transgenes by fluorescence detection and PCR genotyping with DNA purified from tails of the offspring (Primers are detailed in Supplementary Table 1). Then the mice with successful integration of the corresponding transgenes were intercrossed to establish the germline-transmitted founders. Cross-mating of founders resulted in transgenic mouse lines expressing GFP or one of the base editors.
Analysis of transposon integration sites and copy numbers by HTS sequencing and PCR
Eighteen tail samples were taken from 8-week-old GFP, BE3, YE1-BE3-FNLS and ABE7.10F148A mice and from 30-week-old normal-weight and obese BE3 mice for gDNA extraction. Integration sites were recovered from gDNA by shearing the DNA with sonication, ligating DNA linkers to the broken DNA ends, and then carrying out nested PCR amplification using sample-specific primers (Supplementary Data 2) that bound to the ligated adapter and the 5′-arm/3′-arm sequences in the PB transposon vectors, as previously reported38. PCR products were purified using a universal DNA purification kit (TIANGEN) for HTS sequencing. The reads were mapped to the sequence of the 5′-arm or 3′-arm, and the unmapped flanking sequences of the split reads were mapped to the mouse genome. The integration sites were predicted from the mapping positions. The predicted integration sites at the 5′-arm or 3′-arm terminals were separately confirmed with integration site-specific primers (Supplementary Data 3). Integration site-specific forward and reverse primers binding the genomic sequence were then used to distinguish single from biallelic integration (Supplementary Data 3). The copy number of each sample was calculated as the total over its integration sites, counting a single integration as one copy and a biallelic integration as two copies.
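The copy-number rule stated above (one copy per single integration, two per biallelic integration) amounts to a weighted count. A minimal sketch follows; the function name `copy_number`, the dictionary layout and the site labels are hypothetical.

```python
from typing import Dict

# Copies contributed by each integration site, following the rule in the text.
COPIES_PER_SITE = {"single": 1, "biallelic": 2}


def copy_number(site_zygosity: Dict[str, str]) -> int:
    """Sum transgene copies over the integration sites of one sample.

    `site_zygosity` maps an integration-site label to "single" or "biallelic".
    """
    return sum(COPIES_PER_SITE[zygosity] for zygosity in site_zygosity.values())


# Toy sample with made-up site labels, for illustration only.
toy_sites = {"chr1:1000": "single", "chr5:2000": "biallelic", "chrX:3000": "single"}
print(copy_number(toy_sites))
```

For the toy sample, two single sites and one biallelic site give a copy number of 4.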
RNA isolation and RNA-seq
Total RNA of each sample was extracted using Trizol (Ambion) according to the standard protocol. The mRNAs were fragmented and converted to cDNA using random hexamers or oligo(dT) primers. The 5′ and 3′ ends of cDNA were ligated with adaptors, and correctly ligated cDNA fragments were enriched and amplified by PCR4. High-throughput mRNA sequencing was carried out using Illumina Novaseq 6000 and ~22 million reads were produced for each library.
FastQC (v0.11.8) and Trimmomatic (v0.39)39 were used for quality control. Qualified reads were mapped to the mouse reference genome (mm10) by STAR (v2.1.7)40 in two-pass mode. The BAM files were then sorted and PCR duplicates were marked using Picard (v2.5.7.0). The variants were called by GATK (v4.1.5). Variants with depth < 20, quality score < 30, QD < 2 and FS > 30 were filtered out. We excluded from subsequent analysis the variants detected in the same organ of WT mice, those in the dbSNP142 dataset, and those located in highly complex regions. When WGS was performed on the same sample, we also filtered out RNA variants found among the corresponding DNA variants to rule out the influence of genetic background. To functionally annotate the identified variants, we applied annovar (v2020-06-08) to the detected SNVs and indels using the RefSeq database of mm10. The adjacent 3-bp sequences of the off-target RNA SNVs were extracted from the reference genome and subjected to motif prediction using ggseqlogo41. HTseq (v0.11.3) and GenomicFeatures were used to estimate gene expression levels from the alignment files with default parameters, and gene abundances were reported in FPKM (Fragments Per Kilobase of exon model per Million mapped fragments)42.
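The hard-filter thresholds above can be expressed as a simple predicate. This sketch assumes that a variant failing any one of the four criteria is discarded, which is one reading of the text; the `RnaVariant` record and `passes_hard_filters` function are hypothetical stand-ins for fields that would normally be parsed from a VCF.

```python
from dataclasses import dataclass


@dataclass
class RnaVariant:
    depth: int    # read depth at the site
    qual: float   # variant quality score
    qd: float     # quality by depth
    fs: float     # Fisher strand bias score


def passes_hard_filters(v: RnaVariant, min_depth: int = 20, min_qual: float = 30.0,
                        min_qd: float = 2.0, max_fs: float = 30.0) -> bool:
    """Return True if the variant survives all four thresholds.

    Assumption: failing any single criterion removes the variant.
    """
    return (v.depth >= min_depth and v.qual >= min_qual
            and v.qd >= min_qd and v.fs <= max_fs)


# Toy records with made-up values, for illustration only.
good = RnaVariant(depth=45, qual=60.0, qd=12.5, fs=3.1)
low_depth = RnaVariant(depth=12, qual=60.0, qd=12.5, fs=3.1)
strand_biased = RnaVariant(depth=45, qual=60.0, qd=12.5, fs=41.0)

kept = [v for v in (good, low_depth, strand_biased) if passes_hard_filters(v)]
print(len(kept))
```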
Differential expression analysis was conducted using DESeq243, and differentially expressed genes were defined by the following criteria: absolute log2-transformed fold-change > 1 and P value < 0.05. GO and KEGG enrichment analyses were performed using clusterProfiler44.
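The two cut-offs can be applied to a results table as shown in this minimal sketch. It assumes the log2 fold-change and P value for each gene are already available (for example from a DESeq2 results export) and that a positive fold-change means higher expression in the test group; the function name and gene labels are hypothetical.

```python
def classify_gene(log2_fold_change: float, p_value: float,
                  lfc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> str:
    """Label a gene as 'up', 'down' or 'ns' (not significant).

    A gene is called differentially expressed when |log2FC| > lfc_cutoff
    and P < p_cutoff, matching the criteria stated in the text.
    """
    if p_value < p_cutoff and abs(log2_fold_change) > lfc_cutoff:
        return "up" if log2_fold_change > 0 else "down"
    return "ns"


# Toy results with made-up values, for illustration only.
toy_results = {
    "geneA": (2.3, 0.001),    # strong increase, significant
    "geneB": (-1.8, 0.02),    # strong decrease, significant
    "geneC": (0.4, 0.0001),   # significant but below the fold-change cut-off
    "geneD": (3.0, 0.2),      # large change but not significant
}
labels = {gene: classify_gene(lfc, p) for gene, (lfc, p) in toy_results.items()}
print(labels)
```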
Protein isolation and western blot
Proteins were isolated from the examined tissues using radioimmunoprecipitation assay lysis buffer (P0013C; Beyotime) and quantified using a BCA assay (P0012; Beyotime). Equal amounts of proteins (50–100 µg) were electrophoresed on SDS‐PAGE and transferred to PVDF membranes (ISFQ00010; Merck-Millipore), blocked with 5% nonfat milk for at least 1 h at room temperature, and reacted with primary antibodies including Cas9 (7A9-3A3) mouse mAb (14697S, Cell Signaling Technology, 1:1000) and HRP-conjugated Beta Actin (2D4H5) monoclonal antibody (HRP-66009, Proteintech, 1:5000) overnight at 4 °C. The PVDF membrane was then washed in Tris-buffered saline containing 0.1% Tween-20 (TBST) and incubated for 1 h with horseradish peroxidase-conjugated goat anti-mouse IgG (SA00001-1, Proteintech, 1:5000). After washing with TBST, the membrane was treated with the Pierce™ ECL western blot analysis substrate (32109; Thermo Fisher Scientific).
Histological staining
Tissues were fixed in 4% PFA overnight, washed with PBS, dehydrated in a gradient series of alcohol solutions, and embedded in paraffin. These paraffin-embedded tissues were then sectioned (5 μm) before being deparaffinized and rehydrated. Tissue sections were stained with hematoxylin, incubated in bluing solution, counterstained with eosin, dehydrated, and equilibrated with xylene. Glass coverslips were mounted with Permount Mounting Media (SP15-100; Thermo Fisher Scientific). Sections were photographed under bright-field microscope photograph system (Leica Microsystems).
The AAVs cloning, production, and in vivo injection
The AAV vectors include the U6-Hpd sgRNA vector, U6-Dmd sgRNA vector, N-YE1-BE3-FNLS vector, and C-YE1-BE3-FNLS vector, each cloned between AAV serotype 2 ITRs. The U6-Hpd sgRNA and U6-Dmd sgRNA vectors contain the U6 promoter for Hpd or Dmd sgRNA transcription, CAG promoter, tdTomato, WPRE and SV40 poly(A) sequence. The N-YE1-BE3-FNLS vector contains the U6 promoter for Hpd sgRNA transcription, EFS promoter, YE1-rApobec1, N-terminal nCas9 (573 sites), N-intein and SV40 poly(A). The C-YE1-BE3-FNLS vector contains C-intein, C-terminal nCas9 (574 sites), P2A, eGFP and bGH poly(A). The AAV vectors together with helper and pAAV8 or pAAV9 plasmids were transfected into HEK293T cells using polyethyleneimine (PEI). Viral particles from the media and cells were harvested, purified, and concentrated 3–5 d after transfection. AAV titers were determined by quantitative real-time PCR assays.
The 8-week-old CBE transgenic mice were injected intravenously with 5 × 10^11 viral genomes (vg) of AAV8:U6-Hpd sgRNA per mouse via the tail vein. The 8-week-old C57BL/6J mice were injected intravenously with 5 × 10^11 vg of AAV8:N-YE1-BE3-FNLS and 5 × 10^11 vg of AAV8:C-YE1-BE3-FNLS per mouse via the tail vein. CBE transgenic mice were sacrificed two weeks post-injection of AAV8:U6-Hpd sgRNA, and primary hepatocytes were isolated by the standard two-step collagenase perfusion method and purified with 40% Percoll (Sigma) by low-speed centrifugation (1000 rpm, 10 min)45. The GFP and tdTomato-positive hepatocytes were isolated using FACS (FlowJo v10.0.7) for PCR analysis. C57BL/6J mice were sacrificed at 1, 2, and 3 weeks post-injection of AAV8:N-YE1-BE3-FNLS and AAV8:C-YE1-BE3-FNLS, and livers were sampled for droplet digital PCR and RNA-seq. The 8-week-old ABE transgenic mice were locally administered 1 × 10^11 vg (per leg per mouse) into the tibialis anterior (TA) muscle. TA muscles were collected two weeks after AAV injection for Sanger and deep-sequencing analyses.
Amplification and HTS sequencing of genomic DNA samples
Genomic DNA of hepatocytes or target tissues was isolated using the DNeasy Blood and Tissue Kit (Qiagen). The genomic sequences of targeted sites were amplified by nested PCR using site-specific primers (Supplementary Table 4). PCR products were purified using a universal DNA purification kit (TIANGEN). The amplicons were ligated to adapters and sequenced on an Illumina NovaSeq 6000. Sequencing data were analyzed as previously reported8. Briefly, the raw data were demultiplexed by fastq-multx46, and the reads for each sample were aligned to the reference target sequence by CRISPResso2 (v2.0.32)47. The on-target editing efficiency for each target site was then calculated using in-house scripts.
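The in-house scripts are not described here, so the following is only a minimal sketch of how an on-target editing efficiency could be computed, assuming reads have already been aligned without gaps to the amplicon reference. The function name `editing_efficiency`, the 0-based `position` argument, and the gap-free read representation are hypothetical and chosen for illustration.

```python
from collections import Counter


def editing_efficiency(reads, reference, position, edited_base="T"):
    """Return the fraction of usable reads carrying `edited_base` at `position`.

    Assumptions (not from the paper): `reads` are strings already aligned
    gap-free to `reference`, and `position` is a 0-based index into it.
    Reads of a different length or with an ambiguous base are skipped.
    """
    if not 0 <= position < len(reference):
        raise ValueError("position lies outside the reference sequence")
    counts = Counter(
        read[position]
        for read in reads
        if len(read) == len(reference) and read[position] in "ACGT"
    )
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts[edited_base] / total
```

For a C-to-T edit, the efficiency is the share of reads with T at the target cytosine; for an ABE target one would pass `edited_base="G"`.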
WGS on muscle and data analysis
Genomic DNA was isolated from different tissues using the DNeasy Blood and Tissue Kit (Qiagen). WGS was performed at a mean coverage of 36× by MGI2000. The raw data were quality-filtered by SOAPnuke (v2.1.6)48 with default parameters. Qualified reads were mapped to the mouse reference genome (mm10) by BWA mem (v0.7.12)49. The BAM files were then sorted and PCR duplicates were marked using Picard (v2.5.7.0). To identify genome-wide de novo variants with high confidence, we called single-nucleotide variations using three algorithms: Mutect2 (v4.1.5), Lofreq (v2.1.5)50, and Strelka (v2.7.1)51. Whole-genome de novo indels were detected using Mutect2, Scalpel (v0.5.4)52, and Strelka in the same way. SNVs and indels supported by all three algorithms were retained for the following analysis. For detection of de novo mutations in the offspring, we called variants in each progeny with the parents as the normal control.
We excluded from the following analysis any variants that were present in the UCSC dbSNP142 dataset or located in highly complex regions, including UCSC repeat regions. To reduce germline variations, variants shared between two individuals expressing the same base editor were also filtered out. For functional annotation of the identified variants, we applied annovar (v2020-06-08) to both the detected SNVs and indels using the RefSeq database of mm10.
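The consensus and filtering logic described above can be sketched as plain set operations. This is an illustration under stated assumptions, not the authors' pipeline: variants are represented as hypothetical `(chrom, pos, ref, alt)` tuples, repeat regions as `(chrom, start, end)` tuples with inclusive coordinates, and the function names are invented for the example.

```python
def consensus_calls(*caller_calls):
    """Keep only variants reported by every caller (for example three SNV callers)."""
    sets = [set(calls) for calls in caller_calls]
    return set.intersection(*sets) if sets else set()


def in_intervals(chrom, pos, intervals):
    """True if (chrom, pos) falls inside any inclusive (chrom, start, end) interval."""
    return any(c == chrom and start <= pos <= end for c, start, end in intervals)


def filter_calls(calls, known_variants, repeat_intervals, other_individual_calls):
    """Drop known polymorphisms, repeat-region calls, and calls shared with
    another individual expressing the same editor (likely germline)."""
    return {
        v
        for v in calls
        if v not in known_variants
        and v not in other_individual_calls
        and not in_intervals(v[0], v[1], repeat_intervals)
    }
```

A linear scan over repeat intervals is used only to keep the sketch short; at genome scale an interval tree or a tool operating on sorted BED files would be the more practical choice.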
Structural variations (SVs) were called by Manta (v1.6.0)53, Lumpy (v0.2.13)54, and Delly (v0.7.6)55 using default parameters. To identify genome-wide SVs with high confidence, we retained only those supported by all three algorithms for the following analysis. To reduce germline SVs, the SVs shared between two individuals expressing the same base editor were filtered out.
Phenotype detection in the transgenic mice
Males and females aged 8–10 weeks from the WT, GFP, BE3, YE1-BE3-FNLS, or ABE7.10F148A group were intercrossed. Offspring confirmed to carry the corresponding integrated transgene were monitored and weighed. During the preweaning period (before 3 weeks), males and females were weighed together. After weaning at 3 weeks, male and female mice were housed in separate cages for continued observation, with body weight measured every two weeks over 66 weeks. For evaluation of fertility, three breeding pairs of the WT and four transgenic mice at 8–10 weeks old were mated for 6 months. The number of pups per litter, the total number of pups, and the total numbers of males and females were recorded.
Droplet digital PCR reactions
To compare transgene copy numbers between transgenic mice and AAV-injected mice, the transgene copy numbers in mice injected with AAV8:N-YE1-BE3-FNLS and AAV8:C-YE1-BE3-FNLS and in YE1-BE3-FNLS transgenic mice were measured by droplet digital PCR using the Bio-Rad QX200 ddPCR system (Bio-Rad)56. Vector genomes were normalized to mouse Sod2 (ref. 56). Sequences of primers and probes are listed in Supplementary Table 7.
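As a worked illustration of normalizing to a reference gene, the sketch below converts ddPCR concentrations into copies per diploid genome. It assumes, as is conventional for an autosomal single-copy reference but not stated in the text, that Sod2 is present at two copies per diploid genome; the function and parameter names are hypothetical.

```python
def copies_per_diploid_genome(
    target_copies_per_ul,
    reference_copies_per_ul,
    reference_copies_per_genome=2,
):
    """Normalize a ddPCR target concentration to a reference gene.

    Assumption (not from the paper): the reference gene is present at
    `reference_copies_per_genome` copies per diploid genome (2 by default).
    """
    if reference_copies_per_ul <= 0:
        raise ValueError("reference concentration must be positive")
    return reference_copies_per_genome * target_copies_per_ul / reference_copies_per_ul
```

For example, a target measured at 300 copies/μl alongside a reference at 200 copies/μl corresponds to 3 target copies per diploid genome under this assumption.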
Statistics and reproducibility
Sample sizes are reported in each figure legend. Data are presented as means ± SEM. P values were calculated by two-sided, unpaired t-test. Differences with a P value < 0.05 were considered statistically significant. Statistical details of analyses can be found in the figure legends. No statistical method was used to predetermine the sample size. No data were excluded from the analyses. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.
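The summary statistics named above can be sketched with the Python standard library. The text does not say whether equal variances were assumed, so the pooled-variance (Student) form used here is an assumption; the function names are hypothetical. The sketch returns the t statistic and degrees of freedom only. A two-sided P value would then be read from the t distribution, for instance with SciPy's `scipy.stats.ttest_ind`.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a sample of size >= 2."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def student_t(a, b):
    """Unpaired t statistic and degrees of freedom, assuming equal variances."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two values")
    # Pooled sample variance across both groups
    pooled = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, na + nb - 2
```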
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
We thank Sanwen Huang for helpful discussions and comments on this manuscript and all the colleagues in Zuo’s lab for their technical assistance and helpful discussions. This study was financially supported by the National Natural Science Foundation of China (Grant number 31922048 to E.Z., 32100487 to Y.S., 81870187 to W.W., 32101223 to N.Y., 32200449 to T.Y.), the Agricultural Science and Technology Innovation Program to E.Z., National Key R&D Program of China (Grant number 2021YFF0703802 to W.W., 2021YFC0863400 to T.Y.), China Postdoctoral Science Foundation (Grant number 2021M693452 to YS.S.), and the National Natural Science Foundation of Guangdong (Grant number 2022A1515012142 to YS.S.). This work was also supported by China National GeneBank (CNGB).
Author contributions
E.Z., N.Y., Y.S., and W.W. designed the study. H.F., E.Z., N.Y., Y.S., and W.W. performed data analysis. E.Z. generated the transgenic mice. N.Y., Y.X., YS.S., and H.Z. carried out the cloning and in vitro studies. YS.S. performed the AAV production and mouse experiments. N.Y., YS.S., H.L., J.Z., Z.Z., T.Y., N.L., and L.X. performed phenotypic analyses. Y.S., W.W., and H.F. performed WGS, RNA-seq, and HTS analysis. E.Z., Y.S., and W.W. supervised the project. N.Y., E.Z., Y.S., W.W., and C.H. wrote the paper.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
The raw data were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive database under accession code PRJNA831302. All raw sequence data are also available in the China National GeneBank DataBase (CNGBdb) with the accession number CNP0002932. Source data are provided with this paper.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Nana Yan, Hu Feng, Yongsen Sun, Ying Xin, Haihang Zhang.
Contributor Information
Wu Wei, Email: wuwei@lglab.ac.cn.
Yidi Sun, Email: ydsun@ion.ac.cn.
Erwei Zuo, Email: zuoerwei@caas.cn.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-023-37508-7.
References
- 1. Komor AC, Kim YB, Packer MS, Zuris JA, Liu DR. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature. 2016;533:420–424. doi: 10.1038/nature17946.
- 2. Nishida K, et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science. 2016;353:aaf8729. doi: 10.1126/science.aaf8729.
- 3. Gaudelli NM, et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature. 2017;551:464–471. doi: 10.1038/nature24644.
- 4. Zhou C, et al. Off-target RNA mutation induced by DNA base editing and its elimination by mutagenesis. Nature. 2019;571:275–278. doi: 10.1038/s41586-019-1314-0.
- 5. Zuo E, et al. Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos. Science. 2019;364:289–292. doi: 10.1126/science.aav9973.
- 6. Grünewald J, et al. Transcriptome-wide off-target RNA editing induced by CRISPR-guided DNA base editors. Nature. 2019;569:433–437. doi: 10.1038/s41586-019-1161-z.
- 7. Jin S, et al. Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice. Science. 2019;364:292–295. doi: 10.1126/science.aaw7166.
- 8. Zuo E, et al. A rationally engineered cytosine base editor retains high on-target activity while reducing both DNA and RNA off-target effects. Nat. Methods. 2020;17:600–604. doi: 10.1038/s41592-020-0832-x.
- 9. Doman JL, Raguram A, Newby GA, Liu DR. Evaluation and minimization of Cas9-independent off-target DNA editing by cytosine base editors. Nat. Biotechnol. 2020;38:620–628. doi: 10.1038/s41587-020-0414-6.
- 10. Grünewald J, et al. CRISPR DNA base editors with reduced RNA off-target and self-editing activities. Nat. Biotechnol. 2019;37:1041–1048. doi: 10.1038/s41587-019-0236-6.
- 11. Berns KI, Muzyczka N. AAV: an overview of unanswered questions. Hum. Gene Ther. 2017;28:308–313. doi: 10.1089/hum.2017.048.
- 12. Kotterman MA, Chalberg TW, Schaffer DV. Viral vectors for gene therapy: translational and clinical outlook. Annu. Rev. Biomed. Eng. 2015;17:63–89. doi: 10.1146/annurev-bioeng-071813-104938.
- 13. Wang X, et al. Efficient base editing in methylated regions with a human APOBEC3A-Cas9 fusion. Nat. Biotechnol. 2018;36:946–949. doi: 10.1038/nbt.4198.
- 14. Ding S, et al. Efficient transposition of the piggyBac (PB) transposon in mammalian cells and mice. Cell. 2005;122:473–483. doi: 10.1016/j.cell.2005.07.013.
- 15. Gemberling MP, et al. Transgenic mice for in vivo epigenome editing with CRISPR-based systems. Nat. Methods. 2021;18:965–974. doi: 10.1038/s41592-021-01207-2.
- 16. Ikawa M, et al. Green fluorescent protein as a marker in transgenic mice. Dev. Growth Differ. 1995;37:455–459. doi: 10.1046/j.1440-169X.1995.t01-2-00012.x.
- 17. Rossidis AC, et al. In utero CRISPR-mediated therapeutic editing of metabolic genes. Nat. Med. 2018;24:1513–1518. doi: 10.1038/s41591-018-0184-6.
- 18. Liang P, et al. Effective and precise adenine base editing in mouse zygotes. Protein Cell. 2018;9:808–813. doi: 10.1007/s13238-018-0566-z.
- 19. Ryu SM, et al. Adenine base editing in mouse embryos and an adult mouse model of Duchenne muscular dystrophy. Nat. Biotechnol. 2018;36:536–539. doi: 10.1038/nbt.4148.
- 20. Villiger L, et al. In vivo cytidine base editing of hepatocytes without detectable off-target mutations in RNA and DNA. Nat. Biomed. Eng. 2021;5:179–189. doi: 10.1038/s41551-020-00671-z.
- 21. Villiger L, et al. Treatment of a metabolic liver disease by in vivo genome base editing in adult mice. Nat. Med. 2018;24:1519–1525. doi: 10.1038/s41591-018-0209-1.
- 22. Koblan LW, et al. In vivo base editing rescues Hutchinson-Gilford progeria syndrome in mice. Nature. 2021;589:608–614. doi: 10.1038/s41586-020-03086-7.
- 23. Yeh WH, et al. In vivo base editing restores sensory transduction and transiently improves auditory function in a mouse model of recessive deafness. Sci. Transl. Med. 2020;12:eaay9101. doi: 10.1126/scitranslmed.aay9101.
- 24. Lindsay SJ, Rahbari R, Kaplanis J, Keane T, Hurles ME. Similarities and differences in patterns of germline mutation between mice and humans. Nat. Commun. 2019;10:4053. doi: 10.1038/s41467-019-12023-w.
- 25. Bouchard C. Genetics of obesity: what we have learned over decades of research. Obesity. 2021;29:802–820. doi: 10.1002/oby.23116.
- 26. Akbari P, et al. Sequencing of 640,000 exomes identifies GPR75 variants associated with protection from obesity. Science. 2021;373:eabf8683. doi: 10.1126/science.abf8683.
- 27. Wang D, Zhang F, Gao G. CRISPR-based therapeutic genome editing: strategies and in vivo delivery by AAV vectors. Cell. 2020;181:136–150. doi: 10.1016/j.cell.2020.03.023.
- 28. Richter MF, et al. Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity. Nat. Biotechnol. 2020;38:883–891. doi: 10.1038/s41587-020-0453-z.
- 29. Chen L, et al. Engineering a precise adenine base editor with minimal bystander editing. Nat. Chem. Biol. 2023;19:101–110. doi: 10.1038/s41589-022-01163-8.
- 30. Koblan LW, et al. Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning. Nat. Biotechnol. 2021;39:1414–1425. doi: 10.1038/s41587-021-00938-z.
- 31. Kurt IC, et al. CRISPR C-to-G base editors for inducing targeted DNA transversions in human cells. Nat. Biotechnol. 2021;39:41–46. doi: 10.1038/s41587-020-0609-x.
- 32. Zhao D, et al. Glycosylase base editors enable C-to-A and C-to-G base changes. Nat. Biotechnol. 2021;39:35–40. doi: 10.1038/s41587-020-0592-2.
- 33. Yuan T, et al. Optimization of C-to-G base editors with sequence context preference predictable by machine learning methods. Nat. Commun. 2021;12:4902. doi: 10.1038/s41467-021-25217-y.
- 34. Anzalone AV, et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature. 2019;576:149–157. doi: 10.1038/s41586-019-1711-4.
- 35. Chen PJ, et al. Enhanced prime editing systems by manipulating cellular determinants of editing outcomes. Cell. 2021;184:5635–5652.e5629. doi: 10.1016/j.cell.2021.09.018.
- 36. Plomin R, Haworth CM, Davis OS. Common disorders are quantitative traits. Nat. Rev. Genet. 2009;10:872–878. doi: 10.1038/nrg2670.
- 37. Mitchell KJ. What is complex about complex disorders? Genome Biol. 2012;13:237. doi: 10.1186/gb-2012-13-1-237.
- 38. Nguyen GN, et al. A long-term study of AAV gene therapy in dogs with hemophilia A identifies clonal expansions of transduced liver cells. Nat. Biotechnol. 2021;39:47–55. doi: 10.1038/s41587-020-0741-7.
- 39. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
- 40. Dobin A, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21.
- 41. Wagih O. ggseqlogo: a versatile R package for drawing sequence logos. Bioinformatics. 2017;33:3645–3647. doi: 10.1093/bioinformatics/btx469.
- 42. Putri GH, Anders S, Pyl PT, Pimanda JE, Zanini F. Analysing high-throughput sequencing data in Python with HTSeq 2.0. Bioinformatics. 2022;38:2943–2945. doi: 10.1093/bioinformatics/btac166.
- 43. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
- 44. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics. 2012;16:284–287. doi: 10.1089/omi.2011.0118.
- 45. Huang P, et al. Induction of functional hepatocyte-like cells from mouse fibroblasts by defined factors. Nature. 2011;475:386–389. doi: 10.1038/nature10116.
- 46. Aronesty EJ. Comparison of sequencing utility programs. Open Bioinformatics J. 2013;7:1–8.
- 47. Clement K, et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 2019;37:224–226. doi: 10.1038/s41587-019-0032-3.
- 48. Chen Y, et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. GigaScience. 2018;7:1–6. doi: 10.1093/gigascience/gix120.
- 49. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
- 50. Wilm A, et al. LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res. 2012;40:11189–11201. doi: 10.1093/nar/gks918.
- 51. Kim S, et al. Strelka2: fast and accurate calling of germline and somatic variants. Nat. Methods. 2018;15:591–594. doi: 10.1038/s41592-018-0051-x.
- 52. Fang H, et al. Indel variant analysis of short-read sequencing data with Scalpel. Nat. Protoc. 2016;11:2529–2548. doi: 10.1038/nprot.2016.150.
- 53. Chen X, et al. Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics. 2016;32:1220–1222. doi: 10.1093/bioinformatics/btv710.
- 54. Layer RM, Chiang C, Quinlan AR, Hall IM. LUMPY: a probabilistic framework for structural variant discovery. Genome Biol. 2014;15:R84. doi: 10.1186/gb-2014-15-6-r84.
- 55. Rausch T, et al. DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics. 2012;28:i333–i339. doi: 10.1093/bioinformatics/bts378.
- 56. Codner GF, et al. Aneuploidy screening of embryonic stem cell clones by metaphase karyotyping and droplet digital polymerase chain reaction. BMC Cell Biol. 2016;17:30. doi: 10.1186/s12860-016-0108-6.