Abstract
To correct a DNA mutation in the human genome for gene therapy, homology-directed repair (HDR) needs to be specific and have the lowest off-target effects to protect the human genome from deleterious mutations. Zinc finger nucleases, transcription activator-like effector nucleases (TALENs) and CRISPR/CAS9 systems have been engineered and used extensively to recognize and modify specific DNA sequences. Although TALEN and CRISPR/CAS9 could induce high levels of HDR in human cells, they also caused high levels of genotoxicity. Here, we report the creation of a monomeric endonuclease that can recognize at least 33 bp by fusing the DNA-recognizing domain of TALEN (TALE) to a re-engineered homing endonuclease I-SceI. After sequentially re-engineering I-SceI to recognize 18 bp of the human β-globin sequence, the re-engineered I-SceI induced HDR in human cells. When the re-engineered I-SceI was fused to TALE (TALE-ISVB2), the chimeric endonuclease induced the same HDR rate at the human β-globin gene locus as that induced by TALEN, but with significantly reduced genotoxicity. We further demonstrated that TALE-ISVB2 specifically targeted the β-globin sequence in human hematopoietic stem cells. Therefore, this monomeric endonuclease has the potential to be used in therapeutic gene targeting in human cells.
INTRODUCTION
Gene therapy has been developed to eradicate genetic diseases, such as β-thalassemia. However, conventional methods involving random insertions of viral vectors can cause chromosome translocations or mutations in the human genome, as shown in multiple gene therapy trials (1–3). To eliminate the risk of causing undesirable mutations in the human genome or changing the properties of cells, gene targeting specificity is essential. Sequence-specific endonucleases that can induce homologous repair at targeted sequences have been developed to fulfill this goal. Zinc finger nucleases (ZFNs) are the first generation of sequence-specific endonucleases that were developed for this reason. They are composed of a DNA-recognizing domain fused to a sequence non-specific endonuclease domain derived from FokI (4). Each zinc finger motif can recognize 3 bp of DNA in the major groove of the targeted sequence. By combining the left and right arms, ZFNs can recognize a range of sequence lengths. A more recently developed alternative to ZFNs is the TALEN. The overall design of these endonucleases is similar to that of ZFNs, in which both arms of the recognition motifs have to form a dimeric structure to cut targeted DNA sequences (5). The advantage of TALENs is the ease of designing and modifying the coded amino acids on the DNA-recognition motif. More recently, the CRISPR/CAS9 system, originally discovered in bacteria, has become another sequence-specific endonuclease system that is widely used by researchers (6). Using a 20-nt guide RNA complementary to the targeting sequence, this system relies on the CAS9 endonuclease to cut the targeted sequence and induce recombination (7). The ease of designing targeted sequences by modifying the guide RNA sequence instead of re-engineering amino acids, as required in ZFNs or TALENs, provides a more efficient method of gene targeting.
Although all three gene-targeting systems have been successfully used in targeting specific loci in human cells, off-target effects have also been reported, including a recent study that reported the off-target activity induced by CRISPR/CAS9 at the human β-globin gene (8–12). Researchers have improved the specificity of the nucleases by using various approaches (13–15), but it is important to determine if any of these systems have low genotoxicity in human cells. To develop a system that can correct mutations in the β-globin gene to treat β-thalassemia, we tested the gene-targeting efficiency of TALEN and CRISPR/CAS9 systems. Homology-directed repair (HDR) induced by these two systems was efficient, but the genotoxicity in the human cells tested was high. Therefore, we developed an alternative system that may reduce off-target effects. I-SceI, a monomeric homing endonuclease from the LAGLIDADG family, is one of the most specific natural nucleases discovered and has an 18-bp recognition sequence. It has been used to study DNA recombination in human cells (16) and causes very low genome toxicity (17–19) because of the lack of DNA sequences recognized by I-SceI in the human genome. Here, we re-engineered I-SceI and fused it to TALE, creating a monomeric endonuclease that recognizes 33 bp of the human β-globin gene sequence. The system was used to target DNA sequences close to mutation sites that are highly prevalent in β-thalassemia patients. The fusion protein increased the HDR rate to a similar extent as that induced by TALENs and showed significantly lower levels of genome toxicity than those associated with TALEN and CRISPR/CAS9 systems.
MATERIALS AND METHODS
Construction of vectors
Cas9D10A and gRNA plasmids were obtained from Addgene (plasmids #41816 and #41824), and gRNAs that specifically target the β-globin gene were generated as described (20). The targeting sequence is GAATAACAGTGATAATTTCTGG, as indicated in Figure 1a. The TALEN expression vector was a gift from Prof. Nieng Yan. We constructed the TALEN targeting system at the human β-globin gene as described previously (21), with the exception that the SpeI, NheI and HindIII restriction enzymes were replaced by BspEI, XmaI and SalI. Briefly, a single repeat-variable di-residue (RVD) unit was inserted into the P-easy-T3 vector (Transgene) by using the isocaudomers BspEI and XmaI, followed by SalI digestion and ligation to create a double unit RVD. Next, TALEN RVDs were constructed by assembling 14–17 repeat modules into intermediary arrays and joining the intermediary arrays into the pCS2 vector (Addgene plasmid #16331). All TALEN and Cas9D10A constructs were tagged with a Flag epitope at the N-terminus.
Figure 1.
TALEN and CRISPR/CAS9D10A induced HDR at β-globin sequence in human cells. (a) Schematic diagram showing how the eGFP reporter system measures HDR rates induced by TALEN or CRISPR/CAS9D10A. eGFP codon region driven by the EF1α promoter is interrupted by the 53-bp human β-globin sequence (blue dotted box). Partial DNA sequence of the human β-globin gene (chromosome 11) is highlighted with the targeting locations of TALEN (blue). Nucleotide C (highlighted with red) indicates the locus of one of the most frequent mutations of β-thalassemia in Chinese patients, IVS-II654c-t. This reporter vector was integrated into the genomes of 293FT cells by using lentiviral transduction. The donor sequence containing a truncated eGFP fragment was transfected into cells along with TALEN expression vectors. The HDR events induced at the inserted sites induced eGFP expression in the cells, which was monitored using FACS. (b) FACS results of HDR assays using TALENs or CRISPR/CAS9D10A. The control were cells transfected with the truncated eGFP fragment (Ctrl). (c) The average percentage of HDR induced by TALEN or CRISPR/CAS9D10A was determined from three independent transfection experiments, and the error bars denote the standard deviation. No HDR event was detected in the control group of 100 000 cells (NA). (d and e) TALENs and CRISPR/CAS9D10A induced high levels of genotoxicity in 293FT cells. γH2AX and 53BP1 immunostaining of 293FT cells induced by I-SceI, TALEN or CRISPR/CAS9D10A. (f) Percentage of cells with high γH2AX expression determined using FACS for three independent cell populations. The error bars indicate the standard deviation. TALEN and CRISPR/CAS9D10A samples exhibited significantly higher levels of γH2AX than that of the control. The asterisk indicates the statistical significance between the sample and control (P-value < 0.05).
eGFP reporters for measuring HDR were constructed by inserting β-globin sequences into the GFP codon sequence at position 329. Fifty-three base pairs were inserted for TALEN and CRISPR/CAS9 assays, and 18 bp was inserted at this eGFP position for I-SceI assays in Figures 1a and 2a. The disrupted eGFP fragments and EF1α promoter fragment were LR (attL×attR) recombined into p2k7 vectors to create a lentiviral vector (22). Lentiviral reporters were used to prepare lentivirus to infect 293FT cells. The donor template for correcting the defective eGFP gene was constructed by cloning the eGFP gene into pENTRY/DTOPO (Invitrogen). To prevent the expression of eGFP from the donor plasmid, the start codon (first 3 bp) was removed.
Figure 2.
Re-engineering I-SceI to target the β-globin sequence in human cells. (a) Schematic diagram of the eGFP reporter with a modified I-SceI recognition sequence. Two nucleotides were modified in the first round of selection as shown in the listed sequences. (b) A computational model of I-SceI binding to its target DNA sequence based on crystal structure data deposited in the Protein Data Bank, highlighting the interactions of the four amino acids (N152, K193, R48 and N15) directly contacting the nucleotides. (c–f) Sequential screening of I-SceI variants, which induced the highest HDR at the indicated recognition sequences. Nineteen variants and the wild-type sequence were tested for their ability to induce HDR at the targeting sequence indicated at the upper right corner. The blue arrowhead indicates the original amino acid, and the red arrowhead indicates the selected amino acid that yields the highest HDR in each round of screening. Wild-type (Wt) I-SceI was included in all screenings (N in the first round was the wild-type I-SceI).
I-SceI gene was amplified by polymerase chain reaction (PCR) from I-SceI-GR-GFP (Addgene plasmid #17654) (19) and inserted into vector pCS2. We modified I-SceI to recognize the β-globin sequence by changing amino acids in I-SceI and the recognized sequences as described in the results section. All I-SceI variants were generated by using site-directed mutagenesis and encoded 19 additional amino acids. In short, we constructed 76 variants of I-SceI after four rounds of selection with four different reporter vectors harboring the sequences indicated in Figure 2. For the TALE-I-SceI fusion protein, re-engineered I-SceI (ISVB2) was fused directly to the C-terminus of TALEN repeats recognizing the β-globin sequence.
Cell culture and fluorescence-activated cell sorting (FACS) analysis of HDR
293FT cells were cultured and passaged at 37°C with 5% CO2, in Dulbecco's modified Eagle's medium (DMEM) containing 10% fetal bovine serum (FBS), 0.1 mmol/l MEM non-essential amino acids, 2 mmol/l L-glutamine, 1% penicillin-streptomycin and 500 μg/ml geneticin (Invitrogen). 293FT cells were seeded in 6-well plates before transfection. Cells were transfected using Lipofectamine 2000 (Invitrogen) when confluency was within 80–90%, according to the manufacturer's protocol. For each well, the final concentration of each plasmid transfected was 500 ng/ml. The HDR was measured using FACS (FACSCalibur, BD) analysis of at least 100 000 cells after 36 h of transfection. One cell expressing eGFP was considered to be one successful HDR event.
Genotoxicity assays
Genotoxicity was examined at the level of γH2AX and 53BP1 foci. Etoposide was used to induce different levels of these two genotoxicity markers as positive controls for endonuclease-induced genotoxicity. Briefly, cells were grown in 24-well plates and 500 ng/ml vector expressing each nuclease (without donor plasmids) was transfected into cells in each well by using Lipofectamine 2000. After 20 h, cells were fixed with 4% paraformaldehyde for 10 min, followed by permeabilization with 0.2% Triton X-100 for 5 min. Cells were then blocked in phosphate buffered saline containing 10% FBS at 37°C for 1 h, and then incubated at 37°C for 1 h with the following primary antibodies: mouse monoclonal γH2AX at 1:1000, mouse monoclonal 53BP1 at 1:1000 (Abcam). Then secondary antibodies conjugated to Alexa Fluor 488 (Invitrogen) at 1:1000 were incubated for 1 h; followed by 4',6-diamidino-2-phenylindole (DAPI) staining. The images were captured using a digital inverted microscope (NIKON Ti-E system). Cells stained for 53BP1 were divided into 4 groups with 0, 1, 2 or ≥3 53BP1 foci. FACS analysis of cells stained with anti-γH2AX was performed to measure the percentage of cells with levels of γH2AX above background staining.
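The grouping of 53BP1-stained cells described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the example foci counts are hypothetical and are not taken from the study, which does not describe how the counting was automated, if at all.

```python
from collections import Counter

GROUPS = ("0", "1", "2", ">=3")  # the four 53BP1 foci groups used in the text


def foci_group(n_foci: int) -> str:
    """Assign one nucleus to a group based on its 53BP1 foci count."""
    if n_foci < 0:
        raise ValueError("foci count cannot be negative")
    return ">=3" if n_foci >= 3 else str(n_foci)


def group_fractions(foci_counts):
    """Return the fraction of nuclei falling in each of the four groups."""
    if not foci_counts:
        raise ValueError("at least one nucleus is required")
    tally = Counter(foci_group(n) for n in foci_counts)
    total = len(foci_counts)
    return {group: tally.get(group, 0) / total for group in GROUPS}


# Hypothetical per-nucleus foci counts, for illustration only.
example_counts = [0, 0, 1, 3, 5, 2, 0, 4]
print(group_fractions(example_counts))
```

Reporting fractions rather than raw tallies makes samples with different numbers of scored nuclei directly comparable.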
Western blot
The expression of endonucleases in 293FT cells was measured using anti-FLAG antibodies, as all of the nucleases were tagged with the Flag epitope. Briefly, 2 × 10⁵ 293FT cells were seeded in 6-well plates and transfected with endonuclease vectors at a concentration of 500 ng/ml. Forty-eight hours after transfection, the cells were harvested, rinsed with phosphate-buffered saline and lysed in 150 μl of ice-cold RIPA buffer composed of 50 mM Tris, 150 mM NaCl, 0.5% Na deoxycholate, 1% Nonidet P-40 and 0.1% sodium dodecyl sulfate (SDS). Then the cell lysates were centrifuged at 4°C for 5 min at 8000 × g. The supernatant was subjected to electrophoresis using 8% SDS-polyacrylamide gel electrophoresis and then transferred onto a polyvinylidene difluoride (PVDF) membrane by electroblotting. The membrane was blocked in 5% non-fat milk in Tris-Buffered Saline with 0.1% Tween 20 (TBST) blocking solution at room temperature for 1 h and subsequently incubated with FLAG-specific monoclonal antibody diluted 1:1000 (F1084, Sigma) in TBST. The membrane was then subjected to a 1-h incubation with horseradish peroxidase (HRP)-conjugated goat anti-mouse secondary antibody (Zhongshan Jinqiao) at 1:1000 in TBST, followed by detection using the chemiluminescence labeling detection reagent ECL Plus (GE healthcare).
Electrophoretic mobility shift assay (EMSA)
I-SceI and ISVB2 were tagged at the N-terminus with a GST epitope and were then purified using a GST agarose column (GE healthcare). 3′-Biotin-labeled DNA probes were ordered directly from Invitrogen. We followed the protocol developed by Ruff et al. (23), in which double-stranded DNA probes (5 nmol) were mixed with various concentrations of either GST-I-SceI or GST-ISVB2 in EMSA reaction buffer (20 mM Tris-HCl, pH 8.5, 50 mM NaCl, 2 μM ZnCl2, 12 mM MgCl2, 2% glycerol, 2 mg/ml BSA and 2 mM freshly prepared DTT). All reaction mixtures were incubated at 25°C for 40 min and subsequently analyzed on 4% non-denaturing polyacrylamide gels. Samples in the gels were then transferred to nylon membranes (Millipore) and ultraviolet-crosslinked at 2000 J. The crosslinked membranes were blocked in 5% non-fat milk, followed by a 1-h incubation in 1:10 000 diluted streptavidin-HRP conjugate (Thermo) and then washed three times. The chemiluminescent signal was detected using the chemiluminescence labeling detection reagent ECL Plus (GE healthcare) and analyzed using the Bio-Rad Chemidoc XRS+. The I-SceI and β-globin probes had the following sequences: 5′-TGCACCATTCTTAGGGATAACAGGGTAATTTTCTGGGTTA-3′ and 5′-TGCACCATTCTAAAGAATAACAGTGATAATTTTCTGGGTTA-3′, respectively.
Statistical analysis
The statistical significance of all data was determined by two-tailed Student's t-tests by using the PRISM program (P-value < 0.05).
Isolation, culturing and electroporation of human hematopoietic stem cells
Human hematopoietic stem cells were collected by isolating CD34+ cells from peripheral blood from consenting donors at the First Affiliated Hospital of Guangxi Medical University. CD34+ cells at 10⁶ cells per milliliter were cultured for 24 h in serum-free StemSpan medium (StemCell Technology), which contains 1% penicillin-streptomycin (Life Technology), 50 ng/ml Stem Cell Factor (R&D), 50 ng/ml FLT3 ligand (R&D), 50 ng/ml Thrombopoietin (R&D) and 50 ng/ml Interleukin 6 (R&D). CD34+ cells were electroporated with 1 μg/ml vectors carrying TALE-ISVB2-P2A-mKate. After electroporation, the CD34+ cells were cultured in the StemSpan medium for 48 h, followed by FACS collection of cells expressing TALE-ISVB2-P2A-mKate.
High-throughput sequencing and analysis
High-throughput sequencing was performed on cells sorted for CD34+ cells with or without (control) the electroporation of TALE-ISVB2-P2A-mKate. Genomic DNA of the CD34+ cells was collected by using the DNeasy blood and tissue Kit (QIAGEN). PCR primers were designed to target 150–200 bps on target locus (β-globin gene) or putative off-target sites (POS). The POS are highly homologous to the target locus. Note that 16 POS are one-nucleotide different and 2 POS are two-nucleotide different from the human β-globin sequence (Supplementary Table S1). PCR amplicons for both samples and controls were then subjected to adaptors ligation and an additional round of amplification. The PCR amplicon libraries were sequenced on HiSeq1500 using paired end 100 bp with a 6-bp index read. The reads were aligned to human reference genome 19 (hg19) by Burrows–Wheeler Alignment tool (24). Each aligned read pair was individually genotyped for the presence of indels for statistical analysis.
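One way to genotype aligned reads for indels is to inspect the CIGAR string that aligners such as BWA write for each read, where `I` marks an insertion and `D` a deletion relative to the reference. The sketch below is a minimal illustration under two assumptions that are not stated in the text: a read pair is counted as indel-positive when either mate carries an `I` or `D` operation, and the function names are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar: str) -> bool:
    """True if the alignment contains an insertion (I) or a deletion (D)."""
    return any(op in "ID" for _length, op in CIGAR_OP.findall(cigar))


def pair_has_indel(cigar_mate1: str, cigar_mate2: str) -> bool:
    """Assumption: a pair is indel-positive if either mate shows an indel."""
    return has_indel(cigar_mate1) or has_indel(cigar_mate2)


def indel_percent(pairs) -> float:
    """Percentage of read pairs called indel-positive."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no read pairs supplied")
    positive = sum(pair_has_indel(a, b) for a, b in pairs)
    return 100.0 * positive / len(pairs)


# Hypothetical CIGAR strings for four 100-bp read pairs.
example_pairs = [
    ("100M", "100M"),
    ("48M1D52M", "100M"),  # single-nucleotide deletion in mate 1
    ("100M", "100M"),
    ("5S95M", "100M"),     # soft clipping only, not an indel
]
print(indel_percent(example_pairs))
```

A production pipeline would additionally restrict calls to indels overlapping the expected cleavage window and filter on mapping quality; those steps are omitted here.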
RESULTS
TALEN and CRISPR/CAS9 systems can induce HDR at the β-globin sequence
To create an endonuclease with high HDR efficiency and low genome toxicity, we first tested whether TALEN or CRISPR/CAS9D10A could achieve these goals. We designed TALENs to recognize 33-bp sequences of the human β-globin gene close to a highly prevalent mutation site associated with β-thalassemia (Figure 1a). The HDR rate was measured using a reporter system created by inserting 53 bp of the β-globin sequence into the coding region of the eGFP sequence, which was subsequently integrated into human 293FT cells via lentiviral infection (Figure 1a). When template DNA containing only truncated eGFP was provided to cells harboring the β-globin HDR reporter, no cells expressing eGFP were observed using FACS (Figure 1b) in the 10⁵ cells examined. However, when TALENs were transfected with the template DNA for HDR, a population of cells expressing eGFP was detected using FACS (Figure 1b). On average, TALENs induced HDR in ∼200 eGFP+ cells/100 000 cells (or 0.20%) in our system (Figure 1b and c). Similar levels of HDR (0.37%) were observed in another study that used TALEN and an integrated GFP reporter in 293T cells (20).
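The conversion from FACS counts to an HDR percentage, and the summary of replicate transfections as a mean with standard deviation, can be sketched as follows. The replicate counts are hypothetical and chosen only for illustration, and the use of the sample standard deviation is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev


def hdr_percent(egfp_positive: int, total_cells: int) -> float:
    """HDR rate as a percentage, counting one eGFP+ cell as one HDR event."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return 100.0 * egfp_positive / total_cells


# Figure quoted in the text: ~200 eGFP+ cells per 100 000 cells analyzed.
print(hdr_percent(200, 100_000))

# Hypothetical eGFP+ counts from three independent transfections,
# each analyzed on 100 000 cells (illustrative values only).
replicate_counts = [190, 200, 210]
rates = [hdr_percent(count, 100_000) for count in replicate_counts]
print(round(mean(rates), 3), round(stdev(rates), 3))
```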
In addition, we created gRNA that targeted 20 bp in the same locus and utilized the CRISPR/CAS9D10A system as described previously (15,20). This system relies on the mutated CAS9 that is acting as nickase to increase the targeting specificity although its targeting efficiency is expected to be lower than the wild-type CAS9. Similarly, gRNA targeting at the β-globin locus along with CRISPR/CAS9D10A induced HDR by 0.20% using the same reporter (Figure 1a–c).
TALENs and CRISPR/CAS9 systems show high levels of genome toxicity
Although TALENs and CRISPR/CAS9D10A induced HDR at the human β-globin DNA sequence, the genotoxicity induced by these two endonucleases was still unknown. To examine the genotoxicity induced by the endonucleases, we measured the levels of γH2AX and 53BP1 in 293FT cells because these two markers have been widely used as indicators of DNA damage, including double-strand breaks (DSBs)(17,25–27) and single-strand breaks (28,29). First, we used etoposide, a chemical known to induce DSB in cells (27), to demonstrate the sensitivity of γH2AX and 53BP1 as indicators of genotoxicity. At 2μM, we observed significant staining of γH2AX induced by etoposide (Supplementary Figure S1a), and cells with many foci (≥3) of 53BP1 (Supplementary Figure S1b). In general, γH2AX foci were more diffuse and less distinct than 53BP1 foci in strongly positive nuclei. Increasing concentration of etoposide to 10 μM simultaneously and significantly increased the levels of γH2AX and 53BP1 (Supplementary Figure S1a and b). Therefore, γH2AX is a sensitive DNA damage indicator with a more diffused pattern and 53BP1 clearly indicates different levels of DSBs via staining. Together, these two markers could measure and illustrate genotoxicity in the genome. We also observed a background level of cells possessing low levels of γH2AX and 53BP1 without etoposide treatment, probably reflecting cells undergoing DNA repair during DNA replication or apoptosis.
To examine whether TALEN targeting at the β-globin sequence induced any genotoxicity, we measured the levels of γH2AX and 53BP1 in cells transfected with these endonucleases. We determined that ∼14% of TALEN-transfected cells had a high γH2AX signal and ∼16.5% of cells had more than 3 foci of 53BP1 (Figure 1d–f). Surprisingly, CAS9D10A also induced γH2AX expression by 16.3% and increased the number of cells with more 53BP1 foci by 14.5% (Figure 1d–f). This suggested that TALEN and CAS9D10A targeting at the β-globin sequence could cause unwanted DSBs or DNA damage at sites other than the targeting locus. To compare the effects of genotoxicity, we also examined I-SceI, a homing endonuclease that has low levels of genotoxicity (17). As expected, cells transfected with I-SceI showed background levels of γH2AX expression and 53BP1 foci in contrast to other endonucleases expressed at the same level (Figure 1d–f). These results indicated that using TALENs or CAS9D10A to induce HDR at the β-globin site would cause DNA damage and harmful recombination in the human genome. Therefore, we sought an alternative targeting method to reduce genotoxicity.
Re-engineering I-SceI to induce HDR at the β-globin sequence
Because I-SceI showed only background levels of genotoxicity when transfected into human cells, we speculated that it might be an ideal endonuclease to be re-engineered for gene targeting. First, to confirm whether I-SceI can induce HDR in 293FT cells, we tested the HDR efficiency induced by I-SceI using the eGFP reporter system inserted with 18 bp of the I-SceI-recognizing sequence (Supplementary Figure S2a). I-SceI induced HDR by 0.21% compared to that of its original recognizing sequence in human cells, which is similar to percentage of HDR induced by TALENs or CAS9D10A (Supplementary Figure S2b). HDR induced by I-SceI showed dose-dependency on the I-SceI vector and template (Supplementary Figure S2c and d). Moreover, measuring eGFP-positive repair events at 36 and 48 h after transfection yielded similar levels of HDR (Supplementary Figure S2e). To reduce potential genotoxicity, we used 500 ng/ml of template DNA and 500 ng/ml of the vector expressing I-SceI, and then measured HDR events 36 h after transfection in the subsequent experiments.
Next, we re-engineered I-SceI to recognize the β-globin sequence by changing amino acids in I-SceI and the recognized DNA sequences simultaneously. Then, the mutated I-SceI sequences that induced the highest HDR were sequentially selected (Supplementary Figure S3). First, we chose 18 bp in the β-globin sequence, in which 7 bp are different from the original I-SceI recognition sequence (Figure 2a). Based on the crystal structure of I-SceI bound to its DNA substrate (30), we first modified N152 and two nucleotides at positions −7 and −5 of the I-SceI recognition sequence (Figure 2b). We created 19 variants of I-SceI corresponding to the 19 alternative amino acids at position 152. To select the variant that induced the highest HDR level, we compared the number of eGFP-positive cells induced by the 19 variants with that induced by the original I-SceI, using the eGFP reporter containing the modified I-SceI recognition sequence. We found that changing N152 to K or Q yielded the highest HDR rates, at ∼500 cells/10⁵ transfected cells, whereas the wild-type I-SceI induced HDR in only 80 cells/10⁵ transfected cells (Figure 2c). When the two variants with the highest HDR were tested again, Q showed the highest HDR (Supplementary Figure S4a). Hence, we chose the Q variant for the next round of modifications. In the second round of screening, K193 of the I-SceI variant containing N152Q was changed to each of the 19 other amino acids, and the variants were tested with an eGFP reporter containing a newly modified I-SceI recognition sequence that had A instead of G at position −3 (Figure 2d). Because the original K variant induced the highest HDR in the second round of modifications, we did not change K193. In the third round, we changed R48 in the I-SceI sequence as well as +6 and +8 of the recognition sequence. Then, N15 in the I-SceI sequence as well as +9 and +11 of the recognition sequence were changed in the fourth round.
The third and fourth rounds of screening resulted in changing R48 to V and N15 to K (Figure 2e and f) based on the highest HDR rates induced by the variants, which was confirmed by repeating the tests with variants that induced the three highest HDR rates (Supplementary Figure S4b and c). After four rounds of screening amino acids that directly interact with the 7 nucleotides of the recognition sequences, we obtained a modified I-SceI (named ISVB2) that could induce HDR in 520 cells/10⁵ transfected cells with a reporter containing 18 bp of the β-globin sequence.
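The statement that 7 of the 18 bp differ between the two recognition sequences can be checked against the EMSA probe sequences listed in the Methods. The sketch below rests on an assumption that is not stated in the text: that each 18-bp site starts immediately after the flank shared by both probes. Positions are reported 1-based from the start of that window and are not mapped onto the −/+ numbering used above.

```python
# EMSA probe sequences as given in the Methods (5' to 3').
ISCEI_PROBE = "TGCACCATTCTTAGGGATAACAGGGTAATTTTCTGGGTTA"
BGLOBIN_PROBE = "TGCACCATTCTAAAGAATAACAGTGATAATTTTCTGGGTTA"
SITE_LENGTH = 18  # length of the I-SceI recognition sequence


def common_prefix_length(a: str, b: str) -> int:
    """Number of leading characters shared by the two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def mismatch_positions(a: str, b: str):
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# Assumption: the site begins right after the flank common to both probes.
flank = common_prefix_length(ISCEI_PROBE, BGLOBIN_PROBE)
iscei_site = ISCEI_PROBE[flank:flank + SITE_LENGTH]
bglobin_site = BGLOBIN_PROBE[flank:flank + SITE_LENGTH]

differences = mismatch_positions(iscei_site, bglobin_site)
print(iscei_site, bglobin_site, differences, len(differences))
```

Under this windowing assumption the two sites differ at seven positions, in agreement with the count given in the text.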
The apparent equilibrium dissociation constant (Kd) for the binding of ISVB2 to the β-globin sequence was determined using an EMSA (Supplementary Figure S5a). Biotinylated DNA probes (5 nmol) were mixed with GST-purified I-SceI or ISVB2 at various concentrations. The Kd values for DNA probes containing the original I-SceI or β-globin sequences were compared before and after the re-engineering of I-SceI. We found that the Kd for the binding of the I-SceI protein to the I-SceI sequence was ∼8.2 × 10⁻⁸ M, while the Kd for ISVB2 binding to the β-globin sequence was ∼5.4 × 10⁻⁸ M (Supplementary Figure S5a and b). When the DNA probes were switched, the Kd for binding of the I-SceI protein to the β-globin sequence became ∼3.9 × 10⁻⁷ M, while that for ISVB2 binding to the I-SceI sequence was ∼3.7 × 10⁻⁷ M (Supplementary Figure S5c and d). Therefore, ISVB2 showed higher affinity for the β-globin sequence than for the original I-SceI sequence, suggesting high specificity when expressed in human cells.
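The four Kd values above can be summarized as a fold preference of each protein for its own target site over the swapped site. The sketch below simply divides the reported constants; the dictionary layout and function name are hypothetical, and the ratio is a descriptive summary rather than an analysis performed in the study.

```python
# Apparent Kd values (in M) reported in the text, keyed by (protein, probe).
KD_M = {
    ("I-SceI", "I-SceI site"): 8.2e-8,
    ("ISVB2", "beta-globin site"): 5.4e-8,
    ("I-SceI", "beta-globin site"): 3.9e-7,
    ("ISVB2", "I-SceI site"): 3.7e-7,
}


def fold_preference(protein: str, cognate: str, noncognate: str) -> float:
    """Ratio Kd(non-cognate) / Kd(cognate); values above 1 favor the cognate site."""
    return KD_M[(protein, noncognate)] / KD_M[(protein, cognate)]


iscei_fold = fold_preference("I-SceI", "I-SceI site", "beta-globin site")
isvb2_fold = fold_preference("ISVB2", "beta-globin site", "I-SceI site")
print(round(iscei_fold, 1), round(isvb2_fold, 1))
```

A lower Kd means tighter binding, so each protein binds its own site roughly five- to seven-fold more tightly than the swapped site.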
Fusion of TALE to modified I-SceI increased its HDR rate and caused low genotoxicity
Although the re-engineered I-SceI, ISVB2, showed a comparable HDR rate to that of TALEN or CRISPR/CAS9D10A, we sought to test whether the HDR rate of ISVB2 could be further enhanced, while reducing the genotoxicity of TALEN by fusing ISVB2 to TALE. First, we confirmed that the FokI domain of dimeric TALEN contributed to the genotoxicity (Supplementary Figure S6), because γH2AX and 53BP1 expression levels were significantly reduced when the FokI domain is replaced by a catalytically inactive mutant of FokI as previously reported (17). Hence, we replaced the FokI endonuclease domain with ISVB2 to create a monomeric fusion endonuclease (Figure 3a). Fifteen base pairs of the β-globin sequence are recognized by the TALE domain in addition to the 18-bp β-globin sequence specifically bound by ISVB2. Therefore, the fusion endonuclease recognizes a 33-bp sequence combining both of the TALE and ISVB2 domains. Using the reporter system that contains 53 bp of the β-globin sequences described above in the TALEN and CRISPR/CAS9D10A experiments, we tested whether TALE-ISVB2 could induce the HDR at the β-globin sequence. ISVB2 induced HDR by only ∼50% compared to that of TALEN (Figure 3b). However, ISVB2 fused to TALE showed a similar level of HDR to that of TALEN, indicating that TALE-DNA recognition domain we designed helped to increase the targeting efficiency of ISVB2. To determine whether the length of the linker between the TALE domain and ISVB2 affects the HDR efficiency, we tested four different linker lengths composed of 16, 28, 39 or 63 amino acids (Figure 3c). Only the 16-amino-acid linker showed a significantly lower HDR than other linkers, and the 28-amino-acid linker induced the highest HDR (Figure 3c). Therefore, we used the linker with 28 amino acids.
Figure 3.
Fusing re-engineered I-SceI to TALE induces similar HDR at the β-globin sequence. (a) Diagram depicting the design of the monomeric TALE fused to ISVB2 in contrast to the dimeric TALEN recognizing the β-globin sequence. The brown and yellow boxes highlight the sequences recognized by TALE or ISVB2. (b) The HDR rate induced by ISVB2, TALENs and TALE-ISVB2. The values were obtained from three independent FACS experiments, and the error bars denote the standard deviation. (c) The effects of linker length on the HDR rate. Linkers with various numbers of amino acids (15, 28, 39 and 63) were inserted between TALE and ISVB2 and tested for their effects on the HDR rate. (d–f) Indels generated by the indicated endonucleases at the endogenous β-globin locus in 293FT cells. The sequencing results of indels at the β-globin locus obtained from the negative control (NC, without endonuclease transfection) and cell populations transfected with ISVB2, TALEN or TALE-ISVB2. The frequencies of indels for ISVB2, TALEN and TALE-ISVB2 were 10% (n = 80), 19% (n = 59) and 13% (n = 78), respectively, as determined by Sanger sequencing.
To examine the targeting efficiency at the endogenous β-globin locus, we PCR amplified genomic DNA of 293FT cells transfected with the endonucleases. We found that ISVB2, TALEN and TALE-ISVB2 induced 10% (n = 80), 19% (n = 59) and 13% (n = 78) indels, respectively (Figure 3d–f), demonstrating the different levels of targeting efficiency corresponding to non-homologous end joining (NHEJ) induced by these endonucleases.
When we examined the genome toxicity of these endonucleases, we observed that ISVB2 induced lower expression of γH2AX than that of TALENs, even though they were expressed at the same level (Figure 4a, c and d). In addition, the fusion endonuclease TALE-ISVB2 induced similar levels of γH2AX as ISVB2. A similar trend was observed in immunostained 53BP1 for these endonucleases (Figure 4b and d). In summary, replacing FokI of TALEN with ISVB2 lowered the genotoxicity while maintaining its HDR efficiency.
Figure 4.
ISVB2 induced low genotoxicity. (a and b) ISVB2 and TALE-ISVB2 induced low levels of genotoxicity in human cells. γH2AX and 53BP1 immunostained 293FT cells induced by TALEN, ISVB2 and TALE-ISVB2. The pictures of TALEN treated cells are the same as those in Figure 1 for comparison. (c) Percentage of cells with high γH2AX expression determined by FACS of three independent cell populations. The error bars indicate standard deviation. TALE-ISVB2 and ISVB2 samples both have significantly lower γH2AX expression than cells exposed to TALENs. The asterisk indicates the statistical significance between the samples (P-value < 0.05). (d) Western blot results showed that the levels of ISVB2, TALEN and TALE-ISVB2 expression were similar.
TALE-ISVB2 targets the β-globin sequence in human hematopoietic stem cells
To confirm that TALE-ISVB2 can target the β-globin sequence in human hematopoietic stem cells, we expressed TALE-ISVB2 in CD34+ cells and examined the cleavage rate by high-throughput sequencing. A CD34+ fraction of 3.2% was isolated from the peripheral blood samples (Figure 5a and b). The FACS-enriched CD34+ cells were electroporated with TALE-ISVB2-P2A-mKate vectors. mKate was expressed from the same polycistronic transcript, but the fluorescent protein would be cleaved from TALE-ISVB2 owing to the P2A self-cleaving peptide sequence. This allowed us to isolate CD34+ cells that expressed TALE-ISVB2 by FACS. CD34+ cells without electroporation and CD34+ cells expressing TALE-ISVB2-P2A-mKate were independently collected for high-throughput sequencing (Figure 5c). We examined the genomic sequences at the β-globin sequence and 18 POS. These POS were selected based on the high homology of their DNA sequences to the β-globin sequence. Sixteen POS differ from the β-globin sequence by one nucleotide and two POS differ by two nucleotides (because there is no sequence differing by a single nucleotide at the corresponding position in the human genome; Supplementary Table S1). In total, 4.85% of CD34+ cells expressing TALE-ISVB2 carried indels at the β-globin sequence, whereas 17 POS showed 0.00–0.14% indels (Figure 5d), indicating specific cleavage at the β-globin sequence by TALE-ISVB2. Notably, the majority of indels (144/240) at the β-globin sequence were single-nucleotide deletions at the expected cleavage sequence (-ATAA-) of ISVB2 (Figure 5e), further supporting the specific cleavage by this re-engineered nuclease.
Figure 5.
TALE-ISVB2 targeted the β-globin sequence in human CD34+ cells. (a) FACS isolation of CD34+ cells from peripheral blood samples. CD34+ cells (3.2%) were isolated by FACS from human peripheral blood samples. The control was the same peripheral blood sample without addition of anti-CD34 antibody. (b) Phase contrast pictures showing the purity of CD34+ cells after FACS enrichment from the peripheral blood samples. (c) CD34+ cells were subjected to electroporation with TALE-ISVB2-P2A-mKate vectors, and mKate-expressing cells were isolated by FACS. (d) Indels detected at the targeted β-globin sequence by high-throughput sequencing. The red numbers indicate the counts of the specific indels on the right. (e) Percent of indels at the β-globin sequence and 18 other POS (POS1–18) in the human genome. The percentages were calculated by subtracting the percent of indels in the CD34+ cells without TALE-ISVB2 from the percent of indels in the CD34+ cells expressing TALE-ISVB2. A percent of indels <0.004 after subtraction is depicted as 0.00 on the graph. The nucleotide differences from the targeted sequence are listed below each POS. The location and sequence of each POS are listed in Supplementary Table S1. The total reads and indel reads are listed in Supplementary Table S2. No reads were obtained for POS5, which was hence labeled NA.
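The background subtraction described in the Figure 5 legend can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function names and the read counts in the usage lines are hypothetical, and only the subtraction rule, the 0.004 display threshold and the NA label for sites without reads are taken from the legend.

```python
from typing import Optional, Tuple

ReadCounts = Tuple[int, int]  # (indel_reads, total_reads); hypothetical layout


def indel_percent(indel_reads: int, total_reads: int) -> Optional[float]:
    """Percent of reads carrying an indel, or None when no reads were obtained."""
    if total_reads == 0:
        return None
    return 100.0 * indel_reads / total_reads


def corrected_indel_percent(treated: ReadCounts, control: ReadCounts) -> Optional[float]:
    """Treated percent minus control (background) percent, as in the legend."""
    treated_pct = indel_percent(*treated)
    control_pct = indel_percent(*control)
    if treated_pct is None or control_pct is None:
        return None
    return treated_pct - control_pct


def display_value(pct: Optional[float], floor: float = 0.004) -> str:
    """Label used on the graph: NA without reads, 0.00 below the threshold."""
    if pct is None:
        return "NA"
    if pct < floor:
        return "0.00"
    return f"{pct:.2f}"


if __name__ == "__main__":
    # Hypothetical read counts, for illustration only.
    print(display_value(corrected_indel_percent((50, 1000), (1, 1000))))
    print(display_value(corrected_indel_percent((0, 0), (3, 900))))
```

Returning None for a site without reads keeps a missing measurement distinct from a measured value of zero, which mirrors the NA label used for POS5.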
DISCUSSION
Our findings show that TALENs can induce high rates of HDR at the human β-globin sequence in human cells. However, the genotoxicity of these endonucleases was relatively high, raising concerns about their use in therapeutic gene targeting. Recent studies have focused on using CRISPR/CAS9 as a potential method for gene targeting in human cells, but studies have shown that the off-target effects of this system could also introduce unwanted mutations in human cells. A nickase mutant of CAS9, CAS9D10A, has been created to increase specificity, but its off-target binding was more promiscuous than initially thought (12), and the need to find target sequences that fulfill the protospacer adjacent motif (PAM) requirement for both gRNAs within the same region limits the utility of this system. Moreover, CAS9D10A showed similar genotoxicity to TALENs in our study. In comparison, designing DNA recognition sequences for TALENs is not as limited as designing gRNAs, although the FokI nuclease appears to cause off-target effects. By fusing TALE to re-engineered I-SceI, we created a monomeric endonuclease that can induce HDR as efficiently as TALENs or CRISPR/CAS9, while significantly reducing genotoxicity.
In this study, we modified four amino acids in I-SceI to recognize the β-globin sequence, which differs by seven nucleotides from the original I-SceI recognition sequence. As shown in our EMSA results, changing four amino acids in I-SceI caused a more than 10-fold decrease in binding affinity to the original recognition sequence, indicating the important roles of these amino acids in determining binding specificity. The modified I-SceI, ISVB2, showed a binding affinity to the β-globin sequence similar to that of I-SceI to its original sequence. Thus, we demonstrated that it is feasible to re-engineer a homing endonuclease, such as I-SceI, to recognize DNA sequences that differ from its native sequence, and we tested its binding specificity in human cells.
In theory, we could modify more than four amino acids in I-SceI to target other DNA sequences at other genomic loci if necessary. If the four nucleotides (-ATAA-) at the core of the recognition sequence are kept unchanged, the sequence is expected to occur on average once every 256 bp (4⁴), and the frequency of this four-nucleotide sequence in the human genome is ∼1.1 × 10⁷. Therefore, many mutations associated with human diseases are likely to be targetable by modifying I-SceI amino acids. This system does not require both arms of recognition sequences to be designed at targeting sites and thus reduces the potential off-target errors caused by dimerization of nucleases, such as FokI. A similar fusion protein design, consisting of TALE and a variant of another homing endonuclease, I-OnuI, was recently reported to specifically target the T-cell receptor alpha gene (31). A completely matching sequence for a homing endonuclease is rare or nonexistent in the human genome. Hence, our approach of re-engineering a homing endonuclease to recognize a desired sequence will widen the usage of these endonucleases. Finally, we tested the cleavage specificity of TALE-ISVB2 in human hematopoietic cells and found that it specifically cleaved the β-globin sequence. This is consistent with its low genotoxicity measured by γH2AX and 53BP1. Hence, using a monomeric TALE-I-SceI endonuclease to target DNA sequences in the human genome provides a potential option for therapeutic gene targeting.
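The spacing estimate for the four-nucleotide core can be reproduced with a short Python sketch. It assumes that the four bases are equally frequent and independent, which is only an approximation for the human genome, and it uses an assumed haploid genome length of about 3.1 × 10⁹ bp that is not stated in this article. The function names are illustrative.

```python
def expected_spacing(motif_length: int, alphabet_size: int = 4) -> int:
    """Average distance in bp between occurrences of one fixed motif,
    assuming equally frequent, independent bases."""
    return alphabet_size ** motif_length


def expected_occurrences(genome_length: int, motif_length: int) -> float:
    """Expected number of occurrences of the motif on one strand."""
    return genome_length / expected_spacing(motif_length)


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of the motif on the given strand, including overlaps."""
    sequence = sequence.upper()
    motif = motif.upper()
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count


if __name__ == "__main__":
    ASSUMED_GENOME_LENGTH = 3_100_000_000  # assumption, not from the article
    print(expected_spacing(4))
    print(expected_occurrences(ASSUMED_GENOME_LENGTH, 4))
    print(count_motif("ATAATAA", "ATAA"))
```

Under these assumptions the expected count is about 1.2 × 10⁷ per strand, the same order of magnitude as the ∼1.1 × 10⁷ reported above. An exact figure would require counting the motif in the reference sequence, for example with count_motif applied to each chromosome.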
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
Acknowledgments
We thank Nieng Yan for providing the TALEN plasmid to us and Babak Javid for critical comments on the manuscript.
FUNDING
Research funding is provided by the National Natural Science Foundation of China [81470011]; the Ministry of Science and Technology of China [2012CB966702]. Funding for open access charge: the National Natural Science Foundation of China [81470011]; the Ministry of Science and Technology of China [2012CB966702].
Conflict of interest statement. None declared.
REFERENCES
- 1.Ginn S.L., Alexander I.E., Edelstein M.L., Abedi M.R., Wixon J. Gene therapy clinical trials worldwide to 2012 - an update. J. Gene Med. 2013;15:65–77. doi: 10.1002/jgm.2698.
- 2.Hacein-Bey-Abina S., von Kalle C., Schmidt M., Le Deist F., Wulffraat N., McIntyre E., Radford I., Villeval J.L., Fraser C.C., Cavazzana-Calvo M., et al. A serious adverse event after successful gene therapy for X-linked severe combined immunodeficiency. N. Engl. J. Med. 2003;348:255–256. doi: 10.1056/NEJM200301163480314.
- 3.Hacein-Bey-Abina S., Garrigue A., Wang G.P., Soulier J., Lim A., Morillon E., Clappier E., Caccavelli L., Delabesse E., Beldjord K., et al. Insertional oncogenesis in 4 patients after retrovirus-mediated gene therapy of SCID-X1. J. Clin. Invest. 2008;118:3132–3142. doi: 10.1172/JCI35700.
- 4.Urnov F.D., Rebar E.J., Holmes M.C., Zhang H.S., Gregory P.D. Genome editing with engineered zinc finger nucleases. Nat. Rev. Genet. 2010;11:636–646. doi: 10.1038/nrg2842.
- 5.Bogdanove A.J., Voytas D.F. TAL effectors: customizable proteins for DNA targeting. Science. 2011;333:1843–1846. doi: 10.1126/science.1204094.
- 6.Wiedenheft B., Sternberg S.H., Doudna J.A. RNA-guided genetic silencing systems in bacteria and archaea. Nature. 2012;482:331–338. doi: 10.1038/nature10886.
- 7.Mussolino C., Cathomen T. RNA guides genome engineering. Nat. Biotechnol. 2013;31:208–209. doi: 10.1038/nbt.2527.
- 8.Pattanayak V., Ramirez C.L., Joung J.K., Liu D.R. Revealing off-target cleavage specificities of zinc-finger nucleases by in vitro selection. Nat. Methods. 2011;8:765–770. doi: 10.1038/nmeth.1670.
- 9.Hockemeyer D., Wang H., Kiani S., Lai C.S., Gao Q., Cassady J.P., Cost G.J., Zhang L., Santiago Y., Miller J.C., et al. Genetic engineering of human pluripotent cells using TALE nucleases. Nat. Biotechnol. 2011;29:731–734. doi: 10.1038/nbt.1927.
- 10.Fu Y., Foden J.A., Khayter C., Maeder M.L., Reyon D., Joung J.K., Sander J.D. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat. Biotechnol. 2013;31:822–826. doi: 10.1038/nbt.2623.
- 11.Cradick T.J., Fine E.J., Antico C.J., Bao G. CRISPR/Cas9 systems targeting beta-globin and CCR5 genes have substantial off-target activity. Nucleic Acids Res. 2013;41:9584–9592. doi: 10.1093/nar/gkt714.
- 12.Wu X., Scott D.A., Kriz A.J., Chiu A.C., Hsu P.D., Dadon D.B., Cheng A.W., Trevino A.E., Konermann S., Chen S., et al. Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nat. Biotechnol. 2014;32:670–676. doi: 10.1038/nbt.2889.
- 13.Doyon Y., Vo T.D., Mendel M.C., Greenberg S.G., Wang J., Xia D.F., Miller J.C., Urnov F.D., Gregory P.D., Holmes M.C. Enhancing zinc-finger-nuclease activity with improved obligate heterodimeric architectures. Nat. Methods. 2011;8:74–79. doi: 10.1038/nmeth.1539.
- 14.Hsu P.D., Scott D.A., Weinstein J.A., Ran F.A., Konermann S., Agarwala V., Li Y., Fine E.J., Wu X., Shalem O. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol. 2013;31:827–832. doi: 10.1038/nbt.2647.
- 15.Ran F.A., Hsu P.D., Lin C.Y., Gootenberg J.S., Konermann S., Trevino A.E., Scott D.A., Inoue A., Matoba S., Zhang Y., et al. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell. 2013;154:1380–1389. doi: 10.1016/j.cell.2013.08.021.
- 16.Jasin M. Genetic manipulation of genomes with rare-cutting endonucleases. Trends Genet. 1996;12:224–228. doi: 10.1016/0168-9525(96)10019-6.
- 17.Szczepek M., Brondani V., Buchel J., Serrano L., Segal D.J., Cathomen T. Structure-based redesign of the dimerization interface reduces the toxicity of zinc-finger nucleases. Nat. Biotechnol. 2007;25:786–793. doi: 10.1038/nbt1317.
- 18.Pruett-Miller S.M., Reading D.W., Porter S.N., Porteus M.H. Attenuation of zinc finger nuclease toxicity by small-molecule regulation of protein levels. PLoS Genet. 2009;5:e1000376. doi: 10.1371/journal.pgen.1000376.
- 19.Soutoglou E., Dorn J.F., Sengupta K., Jasin M., Nussenzweig A., Ried T., Danuser G., Misteli T. Positional stability of single double-strand breaks in mammalian cells. Nat. Cell Biol. 2007;9:675–682. doi: 10.1038/ncb1591.
- 20.Mali P., Yang L., Esvelt K.M., Aach J., Guell M., DiCarlo J.E., Norville J.E., Church G.M. RNA-guided human genome engineering via Cas9. Science. 2013;339:823–826. doi: 10.1126/science.1232033.
- 21.Huang P., Xiao A., Zhou M., Zhu Z., Lin S., Zhang B. Heritable gene targeting in zebrafish using customized TALENs. Nat. Biotechnol. 2011;29:699–700. doi: 10.1038/nbt.1939.
- 22.Kee K., Angeles V.T., Flores M., Nguyen H.N., Reijo Pera R.A. Human DAZL, DAZ and BOULE genes modulate primordial germ-cell and haploid gamete formation. Nature. 2009;462:222–225. doi: 10.1038/nature08562.
- 23.Ruff P., Koh K.D., Keskin H., Pai R.B., Storici F. Aptamer-guided gene targeting in yeast and human cells. Nucleic Acids Res. 2014;42:e61. doi: 10.1093/nar/gku101.
- 24.Li H., Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
- 25.Rogakou E.P., Pilch D.R., Orr A.H., Ivanova V.S., Bonner W.M. DNA double-stranded breaks induce histone H2AX phosphorylation on serine 139. J. Biol. Chem. 1998;273:5858–5868. doi: 10.1074/jbc.273.10.5858.
- 26.Rappold I., Iwabuchi K., Date T., Chen J. Tumor suppressor p53 binding protein 1 (53BP1) is involved in DNA damage-signaling pathways. J. Cell Biol. 2001;153:613–620. doi: 10.1083/jcb.153.3.613.
- 27.Miller J.C., Holmes M.C., Wang J., Guschin D.Y., Lee Y.L., Rupniewski I., Beausejour C.M., Waite A.J., Wang N.S., Kim K.A., et al. An improved zinc-finger nuclease architecture for highly specific genome editing. Nat. Biotechnol. 2007;25:778–785. doi: 10.1038/nbt1319.
- 28.Matsumoto M., Yaginuma K., Igarashi A., Imura M., Hasegawa M., Iwabuchi K., Date T., Mori T., Ishizaki K., Yamashita K., et al. Perturbed gap-filling synthesis in nucleotide excision repair causes histone H2AX phosphorylation in human quiescent cells. J. Cell Sci. 2007;120:1104–1112. doi: 10.1242/jcs.03391.
- 29.Narciso L., Fortini P., Pajalunga D., Franchitto A., Liu P., Degan P., Frechet M., Demple B., Crescenzi M., Dogliotti E. Terminally differentiated muscle cells are defective in base excision DNA repair and hypersensitive to oxygen injury. Proc. Natl. Acad. Sci. U.S.A. 2007;104:17010–17015. doi: 10.1073/pnas.0701743104.
- 30.Moure C.M., Gimble F.S., Quiocho F.A. The crystal structure of the gene targeting homing endonuclease I-SceI reveals the origins of its target site specificity. J. Mol. Biol. 2003;334:685–695. doi: 10.1016/j.jmb.2003.09.068.
- 31.Boissel S., Jarjour J., Astrakhan A., Adey A., Gouble A., Duchateau P., Shendure J., Stoddard B.L., Certo M.T., Baker D., et al. megaTALs: a rare-cleaving nuclease architecture for therapeutic genome engineering. Nucleic Acids Res. 2014;42:2591–2601. doi: 10.1093/nar/gkt1224.