Abstract
The repurposed CRISPR-Cas9 system has recently emerged as a revolutionary genome-editing tool. Here we report a modification in the expression of the guide RNA (gRNA) required for targeting that greatly expands the targetable genome. gRNA expression through the commonly used U6 promoter requires a guanosine nucleotide to initiate transcription, thus constraining genomic targeting sites to GN19NGG. We demonstrate the ability to modify endogenous genes using H1 promoter-expressed gRNAs, which can be used to target both AN19NGG and GN19NGG genomic sites. AN19NGG sites occur ~15% more frequently than GN19NGG sites in the human genome, and the increase in targeting space is also enriched at human genes and disease loci. Together, our results enhance the versatility of the CRISPR technology by more than doubling the number of targetable sites within the human genome and other eukaryotic species.
Genome-editing technologies such as zinc-finger nucleases (ZFNs)1–4 and transcription activator–like effector nucleases (TALENs)4–10 have made it possible to generate targeted genome modifications and offer the potential to correct disease mutations with precision. While effective, these technologies are encumbered by practical limitations, as both ZFN and TALEN pairs require synthesizing large and unique recognition proteins for a given DNA target site. Several groups have recently reported high-efficiency genome editing through the use of an engineered type II CRISPR-Cas9 system that circumvents these key limitations11–15. Unlike ZFNs and TALENs, which are relatively time-consuming and arduous to make, the CRISPR constructs, which rely upon the nuclease activity of the Cas9 protein coupled with a synthetic guide RNA (gRNA), are simple and fast to synthesize and can be multiplexed. However, despite the relative ease of their synthesis, CRISPRs have technological restrictions related to their access to targetable genome space, which is a function of both the properties of Cas9 itself and the synthesis of its gRNA.
Cleavage by the CRISPR system requires complementary base pairing of the gRNA to a 20-nucleotide DNA sequence and the requisite protospacer-adjacent motif (PAM), a short nucleotide motif found 3’ to the target site16. One can, theoretically, target any unique N20-PAM sequence in the genome using CRISPR technology. The DNA binding specificity of the PAM sequence, which varies depending upon the species of origin of the specific Cas9 employed, provides one constraint. Currently, the least restrictive and most commonly used Cas9 protein is from S. pyogenes, which recognizes the sequence NGG, and thus, any unique 21-nucleotide sequence in the genome followed by two guanosine nucleotides (N20NGG) can be targeted. Consequently, expansion of the targeting space imposed by the protein component is limited to the discovery and use of novel Cas9 proteins with altered PAM requirements11,17, or awaits the generation of novel Cas9 variants via mutagenesis or directed evolution. The second technological constraint of the CRISPR system arises from gRNA expression initiating at a 5’ guanosine nucleotide. The type III class of RNA polymerase III promoters has been particularly amenable to gRNA expression because these short non-coding transcripts have well-defined ends, and all the necessary elements for transcription, with the exclusion of the +1 nucleotide, are contained in the upstream promoter region. However, since the commonly used U6 promoter requires a guanosine nucleotide to initiate transcription, use of the U6 promoter has further constrained genomic targeting sites to GN19NGG13,18. Alternative approaches, such as in vitro transcription by T7, T3, or SP6 promoters, would also require initiating guanosine nucleotide(s)19–21.
To overcome the current limitations of CRISPR-Cas9 targeting, we tested whether, instead of U6, we could use the H1 pol III promoter22. Because H1 can express transcripts with either purine (nucleotide R) located at the +1 position, we hypothesized that, together with the S. pyogenes Cas9, we could expand the CRISPR targeting space by allowing for cleavage at both AN19NGG and GN19NGG sites (Fig. 1a). To demonstrate site-specific cleavage by H1-expressed gRNAs, we developed a reporter assay to measure CRISPR-mediated cleavage of a GFP target gene integrated at the AAVS-1 locus in the H7 human embryonic stem cell line (hESC)23 (Fig. 1b). We measured the loss of GFP fluorescence, due to coding sequence disruption, as a proxy for error-prone non-homologous end joining (NHEJ) frequency; notably, our assay would underestimate NHEJ, as in-frame mutations or indels that do not disrupt GFP fluorescence would not be detected (Fig. 1b and c). H7 cells were electroporated with equimolar ratios of Cas9 and gRNA expression plasmids, and cells were visualized for GFP fluorescence after colony formation. In contrast to the negative control electroporation, all gRNA constructs from the U6 and H1 promoters we tested showed a mosaic loss of GFP signal in cells undergoing targeted mutation (Fig. 1c and data not shown). Quantitation of total cell number with a nuclear stain enabled cell-based analysis of GFP fluorescence by flow cytometry. Although 100% of constructs resulted in NHEJ, as demonstrated by loss of GFP fluorescence, the range of efficiencies varied for both U6 and H1 constructs (Fig. 1c, right and data not shown). These results demonstrate that gRNAs expressed from the U6 or H1 promoter can direct mutagenesis of the GFP gene at GN19NGG or AN19NGG sites, respectively.
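The 5’ nucleotide rule described above can be expressed as a small check. The following is a minimal sketch, assuming only what the text states: U6 initiates at G, H1 initiates at either purine, and the target is a 23-nt N20NGG site. The function name `compatible_promoters` is hypothetical and is not part of any published tool.

```python
import re

# A 23-mer target: 20-nt protospacer followed by an NGG PAM (S. pyogenes Cas9).
SITE = re.compile(r"^[ACGT]{21}GG$")


def compatible_promoters(site):
    """Return the promoters that can express a perfectly matched gRNA
    for this 23-nt site, based only on the +1 nucleotide rule in the text."""
    site = site.upper()
    if not SITE.match(site):
        raise ValueError("expected a 23-nt N20NGG site")
    promoters = []
    if site[0] == "G":
        promoters.append("U6")  # U6 requires a 5' guanosine
    if site[0] in "AG":
        promoters.append("H1")  # H1 accepts either purine at +1
    return promoters


if __name__ == "__main__":
    # VEGFA-T1 from Table 1 starts with G, so both promoters qualify.
    print(compatible_promoters("GGGTGGGGGGAGTTTGCTCCtGG"))
```

A site beginning with a pyrimidine returns an empty list under these assumptions, since neither promoter is described here as initiating efficiently at C or T.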
To confirm and broaden these results with another cell line, we targeted a HEK-293 cell line expressing GFP at the same locus with the same gRNA constructs as above. By Surveyor analysis, we detected a range of efficiencies varying by promoter type and targeting location (Fig. 1d and Supplementary Fig. 1). By using unmodified IMR90.4 induced pluripotent cells (hiPSC), we also confirmed the ability to modify an endogenous gene by targeting the AAVS-1 locus within the intronic region of the PPP1R12C gene. Targeted cleavage from H1- and U6-driven gRNAs was observed with comparable efficiencies as measured by the Surveyor Assay (Supplementary Fig. 2).
To determine the potential increase in targeting space, we performed a bioinformatic analysis of the available CRISPR sites in the human genome. While AN19NGG sites might be predicted to occur at roughly the same frequency as GN19NGG sites, we found that they are actually 15% more common (Fig. 2 and Supplementary Fig. 3); thus, changing specificity from GN19NGG to RN19NGG more than doubles the number of available sites. With a few exceptions (chr16, chr17, chr19, chr20, and chr22), AN19NGG sites are present at higher frequencies than GN19NGG sites on each chromosome. To compare the average genome-wide targeting densities, we calculated the mean distances between adjacent CRISPR sites in the genome for GN19NGG (59bp), AN19NGG (47bp), and RN19NGG sites (26bp) (Fig. 2b). Additionally, AN19NGG sites were even more enriched at relevant regions of targeting in the human genome. We found a 20% increase in AN19NGG sites in human genes, and a 21% increase at disease loci obtained from the OMIM database (Fig. 2c). We also examined 1165 miRNA genes from the human genome and found that 221 of these genes could be targeted through one or more AN19NGG sites, but not through a GN19NGG site (data not shown). Given that the efficiency of homologous recombination negatively correlates with increasing distance from cut sites, the increase in CRISPR targeting sites by use of the H1 promoter should facilitate more precise genomic targeting and mutation correction24.
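The arithmetic behind "more than doubles" can be made explicit. The sketch below uses only the figures quoted above; the helper names are hypothetical, and the spacing calculation assumes that site densities (sites per bp) simply add when the two site classes are pooled, which is an approximation rather than the authors' method.

```python
def fold_increase(an_excess):
    """Fold change in site count when AN19NGG sites are added to GN19NGG sites.

    an_excess is the fractional excess of AN over GN sites (0.15 for 15%).
    """
    return 1.0 + (1.0 + an_excess)


def pooled_mean_gap(*mean_gaps_bp):
    """Approximate mean spacing of pooled site classes.

    Assumption: densities (1 / mean gap) are additive across classes.
    """
    return 1.0 / sum(1.0 / gap for gap in mean_gaps_bp)


if __name__ == "__main__":
    print(f"RN19NGG vs GN19NGG sites: {fold_increase(0.15):.2f}-fold")
    print(f"Pooled mean gap from 59 bp and 47 bp: {pooled_mean_gap(59, 47):.1f} bp")
```

Under this additive-density assumption, the GN19NGG and AN19NGG spacings combine to roughly 26 bp, in line with the RN19NGG value reported in the text.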
As CRISPR technology is increasingly used for genomic engineering across a wide array of model organisms, we sought to determine the potential impact of the H1 promoter in other genomes. We carried out this analysis on 5 other vertebrate genomes that had high genomic conservation at the H1 promoter (mouse, rat, chicken, cow, and zebrafish). In all cases, we found a higher number of AN19NGG than GN19NGG sites: +9% cow; +14% chicken; +19% rat; +21% mouse; and +32% zebrafish (Fig. 2c). One explanation for this prevalence could be the higher AT content (Supplementary Fig. 4). In the human genome, normalizing the GN19NGG and AN19NGG site occurrences to AT content brings the frequencies closer to parity, although this does not hold true for all genomes (Supplementary Fig. 4a and 4f). Nevertheless, this demonstrates the utility of the H1 promoter, which more than doubles the currently available CRISPR targeting space in the human genome, and similarly in all other genomes tested.
We next sought to demonstrate the ability to target an AN19NGG site in an endogenous gene with the H1 promoter construct. Using H7 cells, we targeted the second exon of the MERTK locus, a gene that is involved in phagocytosis in the retinal pigment epithelium and macrophages and that, when mutated, causes retinal degeneration25 (Fig. 3a and 3b). To estimate the overall targeting efficiency, we harvested genomic DNA from a population of electroporated cells and performed the Surveyor Assay. We amplified the region surrounding the target sites with two independent PCR reactions and calculated indel frequencies of 9.5% and 9.7% (Fig. 3b). Next, 42 randomly chosen clones were isolated and tested for mutation by Surveyor analysis (data not shown). Sequencing revealed that 7/42 (16.7%) harbored mutations clustering within 3-4 nucleotides upstream of the target PAM site. Six of the seven clones had unique mutations (one clone was redundant), and three of these were bi-allelic frame-shift mutations resulting in a predicted null MERTK allele that was confirmed by Western blot analysis (Fig. 3c and 3d). Taken together, these results demonstrate the ability to effectively target an AN19NGG site located at an endogenous locus.
To quantitatively determine the extent of off-targeting from the GFP gRNA constructs, we used Surveyor analysis to examine three genomic loci that were bioinformatically predicted to be off-target sites (GFP_11-33, GFP_219-197, and GFP_315-293). Two of these constructs (GFP_219-197 and GFP_315-293) were GN19NGG target sites, allowing for expression with both promoters. One (GFP_11-33), an AN19NGG site, was expressed from the U6 promoter by appending a 5’-G nucleotide. In all three off-target loci we examined, we were unable to detect any off-target cleavage (data not shown). However, the lack of detectable off-targets could result from our initial selection of the GFP gRNA targets, in which sites were selected based upon low homology to other genomic loci. Thus, we reasoned that a more stringent challenge would be to compare gRNA expression from H1 and U6 promoters at targeting sites specifically known to elicit high levels of off-target hits26–28. Furthermore, the 5’ nucleotide flexibility of the H1 promoter allowed for a direct comparison of identical gRNAs targeting GN19NGG sites between U6 and H1 promoters, and we tested two sites previously reported by Fu et al. (2013): VEGFA site 1 (T1) and VEGFA site 3 (T3) (Table 1 and Supplementary Fig. 5)26,28. An additional benefit of the H1 promoter over the U6 promoter may be increased specificity through reduced spurious cleavage. Because increased gRNA and Cas9 concentrations have been shown to result in increased off-target hits26,27,29, we reasoned that the lower gRNA expression level from the H1 promoter30–32 might also reduce off-target effects. Using qRT-PCR, we tested the levels of the VEGFA T1 gRNA from either the H1 or U6 promoter, confirming the reduced level of expression of the gRNA from H1 (Supplementary Fig. 5a). For the VEGFA T1 site, we tested the efficiency of cutting at the on-target locus, as well as at four off-target loci.
In comparison with the U6 promoter, cutting at the on-target locus was comparable or slightly reduced; however, the H1 promoter-expressed gRNAs were notably more stringent at the examined off-target loci, indicating greater specificity (Off-target 1: 8% vs. 25%; Off-target 2: undetectable vs. 20%; and Off-target 4: 9% vs. 26%) (Table 1 and Supplementary Fig. 5). We detected equal targeting between the two promoter constructs at the VEGFA T3 site (26%), but again, lower levels of off-target cutting with the H1 promoter (Table 1 and Supplementary Fig. 5). While further studies on H1 and U6 promoter-expressed gRNAs need to be performed, our data suggest greater specificity from H1-expressed gRNAs.
Table 1.
Target | Promoter | Full-length Target | Indel mutation Frequency |
---|---|---|---|
VEGFA-T1 | U6 | GGGTGGGGGGAGTTTGCTCCtGG | 24% |
VEGFA-T1 | H1 | GGGTGGGGGGAGTTTGCTCCtGG | 16% |
OT1-3 | U6 | GGATGGAGGGAGTTTGCTCCtGG | 25% |
OT1-3 | H1 | GGATGGAGGGAGTTTGCTCCtGG | 8% |
OT1-4 | U6 | GGGAGGGTGGAGTTTGCTCCtGG | 20% |
OT1-4 | H1 | GGGAGGGTGGAGTTTGCTCCtGG | Not Detected |
OT1-6 | U6 | CGGGGGAGGGAGTTTGCTCCtGG | Not Detected |
OT1-6 | H1 | CGGGGGAGGGAGTTTGCTCCtGG | Not Detected |
OT1-11 | U6 | GGGGAGGGGAAGTTTGCTCCtGG | 26% |
OT1-11 | H1 | GGGGAGGGGAAGTTTGCTCCtGG | 9% |
VEGFA-T3 | U6 | GGTGAGTGAGTGTGTGCGTGtGG | 26% |
VEGFA-T3 | H1 | GGTGAGTGAGTGTGTGCGTGtGG | 26% |
OT3-1 | U6 | GGTGAGTGAGTGTGTGTGTGaGG | 20% |
OT3-1 | H1 | GGTGAGTGAGTGTGTGTGTGaGG | 13% |
OT3-4 | U6 | GCTGAGTGAGTGTATGCGTGtGG | 16% |
OT3-4 | H1 | GCTGAGTGAGTGTATGCGTGtGG | 11% |
OT3-18 | U6 | TGTGGGTGAGTGTGTGCGTGaGG | Not Detected |
OT3-18 | H1 | TGTGGGTGAGTGTGTGCGTGaGG | Not Detected |
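One way to read Table 1 is to normalize each off-target indel frequency to the on-target frequency for the same promoter. The sketch below does this with the values transcribed from the table; the off-to-on ratio is an illustrative metric chosen here and is not one reported in the study, and "Not Detected" entries are kept as `None` rather than treated as zero.

```python
# Indel frequencies (%) transcribed from Table 1; None marks "Not Detected".
TABLE1 = {
    "VEGFA-T1": {
        "on": {"U6": 24, "H1": 16},
        "off": {
            "OT1-3": {"U6": 25, "H1": 8},
            "OT1-4": {"U6": 20, "H1": None},
            "OT1-6": {"U6": None, "H1": None},
            "OT1-11": {"U6": 26, "H1": 9},
        },
    },
    "VEGFA-T3": {
        "on": {"U6": 26, "H1": 26},
        "off": {
            "OT3-1": {"U6": 20, "H1": 13},
            "OT3-4": {"U6": 16, "H1": 11},
            "OT3-18": {"U6": None, "H1": None},
        },
    },
}


def off_to_on_ratios(target):
    """Return off-target / on-target indel ratios per promoter for one target.

    A ratio is None when the off-target frequency was not detected.
    """
    entry = TABLE1[target]
    ratios = {}
    for site, freqs in entry["off"].items():
        ratios[site] = {
            promoter: None if freq is None else freq / entry["on"][promoter]
            for promoter, freq in freqs.items()
        }
    return ratios


if __name__ == "__main__":
    for target in TABLE1:
        for site, by_promoter in off_to_on_ratios(target).items():
            print(target, site, by_promoter)
```

A lower ratio means less off-target cutting relative to on-target cutting for that promoter, which gives a rough sense of specificity that accounts for differences in on-target efficiency.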
Accumulating evidence for S. pyogenes Cas9 targeting in vitro and in vivo indicates that Cas9:gRNA recognition extends throughout the entire 20 base pair targeting site. First, in testing >10¹² distinct variants for gRNA specificity in vitro, one study found that the +1 nucleotide plays a role in target recognition. Furthermore, positional specificity calculations from these data show that the 5’ nucleotide contributes a greater role in target recognition than its 3’ neighbor, indicating that the “seed” model for CRISPR specificity might overly simplify the contribution of PAM-proximal nucleotides27. Secondly, alternative uses such as CRISPR interference (CRISPRi), which repurposes the CRISPR system for transcriptional repression, found that 5’ truncations in the gRNA severely compromised repression, and 5’ extensions with mismatched nucleotides – such as mismatched G bases for U6 expression – also reduce the repression efficiency, suggesting that both length (20 nt) and 5’ nucleotide context are important for proper Cas9 targeting24,33–36. Finally, crystal structure data further support the experimental data and the importance of the 5’ nucleotide in Cas9, as significant contacts are made with the 5’ nucleotide of the gRNA and the 3’ end of the target DNA37,38.
For increased targeting space, the use of alternate Cas9 proteins has been shown to be effective, as in N. meningitidis and S. thermophilus, yet the PAM restrictions of other type II systems reported so far are more stringent and therefore reduce the sequence space available for targeting when used alone (data not shown and 11,17). In contrast, modified gRNA expression by use of the H1 promoter would be expected to greatly expand the targeting repertoire with any Cas9 protein irrespective of PAM differences. When we quantitated the respective gRNA targets for orthologous Cas9 proteins (AN23NNNNGATT vs. GN23NNNNGATT for N. meningitidis and AN17NNAGAAW vs. GN17NNAGAAW for S. thermophilus), we found a 64% and 69% increase in the gRNA sites with a 5’-A nucleotide, indicating an even greater expansion of targeting space through use of the H1 promoter with alternate Cas9 proteins (Supplementary Table 1). As suggested in plants, use of different promoters can expand the frequency of CRISPR sites. While the U6 promoter is restricted to a 5’ guanosine nucleotide, the U3 promoter from rice is constrained to a 5’ adenosine nucleotide, further highlighting the need for different promoters in different systems to increase targeting space36. Conveniently, sole use of the H1 promoter can be leveraged to target AN19NGG and GN19NGG sites (and possibly CN19NGG or TN19NGG sites39) via a single promoter system (Supplementary Fig. 6). This in turn can be employed to expand the targeting space of both current and future Cas9 variants with altered site restrictions.
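The site shorthand used above (for example AN19NGG or AN17NNAGAAW) can be translated mechanically into a search pattern. The following is a minimal sketch, assuming standard IUPAC codes (N = any base, R = purine, W = A or T) and treating a number as a repeat count for the preceding symbol. The helper names are hypothetical, and only the forward strand is counted here.

```python
import re

# Subset of IUPAC nucleotide codes needed for the patterns in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "W": "[AT]"}

TOKEN = re.compile(r"([A-Z])(\d*)")


def site_regex(shorthand):
    """Translate shorthand such as 'AN19NGG' into a regular expression."""
    parts = []
    for symbol, count in TOKEN.findall(shorthand):
        parts.append(IUPAC[symbol] + ("{%s}" % count if count else ""))
    return "".join(parts)


def site_length(shorthand):
    """Total length in nucleotides of a site written in shorthand."""
    return sum(int(count) if count else 1
               for _, count in TOKEN.findall(shorthand))


def count_sites(shorthand, sequence):
    """Count overlapping forward-strand matches using a lookahead."""
    pattern = "(?=" + site_regex(shorthand) + ")"
    return len(re.findall(pattern, sequence.upper()))


if __name__ == "__main__":
    for name in ("AN19NGG", "AN23NNNNGATT", "AN17NNAGAAW"):
        print(name, site_length(name), site_regex(name))
```

Swapping the leading A for G in each shorthand gives the U6-compatible counterpart, so the same helper can compare 5’-A and 5’-G site counts for any PAM.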
As with ZFN or TALEN technologies, one approach to mitigate potential off-target effects might be to employ cooperative offset nicking with the Cas9 mutant (D10A)24,35. This requires identification of two flanking CRISPR sites, oriented on opposing strands and within ~20bp of the cut site24, and thus the additional targeting density provided by AN19NGG sites would be expected to augment this approach. An added benefit over the U6 promoter may also be to reduce spurious cleavage; as several groups have reported that increased gRNA and Cas9 concentrations correlate with an increase in the propensity for off-target mutations26,27,29, the lower level of expression provided by the H1 promoter may result in reduced off-target cutting.
With enhanced CRISPR targeting through judicious site selection, improved Cas9 variants, optimized gRNA architecture, or additional cofactors, an increase in specificity throughout the targeting sequence will likely result, placing greater importance on the identity of the 5’ nucleotide. As a research tool, this will allow for greater manipulation of the genome while minimizing confounding mutations, and for future clinical applications, high targeting densities coupled with high-fidelity target recognition will be paramount to delivering safe and effective therapeutics.
Methods
Plasmid construction
To generate the H1 gRNA-expressing construct, overlapping oligos were assembled to create the H1 promoter fused to the 76bp gRNA scaffold and pol III termination signal. Between the H1 promoter and the gRNA scaffold, a BamHI site was incorporated to allow for the insertion of the targeting sequence. The H1::gRNA scaffold::pol III terminator sequence was then TOPO cloned into pCR4-Blunt (Invitrogen) and sequence verified; the resulting vector is in the reverse orientation (see below). To generate the various gRNAs used in this study, overlapping oligos were annealed and amplified by PCR using two-step amplification with Phusion Flash DNA polymerase (Thermo Scientific), and subsequently purified using Carboxylate-Modified Sera-Mag Magnetic Beads (Thermo Scientific) mixed with 2X volume 25% PEG and 1.5M NaCl. The purified PCR products were then resuspended in H2O and quantitated using a NanoDrop 1000. The gRNA-expressing constructs were generated using Gibson Assembly40 (NEB), with slight modifications, into either the AflII-digested plasmid (Addgene #41824) for U6 expression or the BamHI-digested plasmid described above for H1 expression. The total reaction volume was reduced from 20µl to 2µl.
Cell culture
The hESC line H7 and IMR-90 iPS cells (WiCell) were maintained by clonal propagation on growth factor reduced Matrigel (BD Biosciences) in mTeSR1 medium (Stem Cell Technologies), in a 10% CO2/5% O2 incubator according to previously described protocols41,42. For passaging, hESC colonies were first incubated with 5µM blebbistatin (Sigma) in mTeSR1, and then collected after 5–10 min treatment with Accutase (Sigma). Cell clumps were gently dissociated into a single cell suspension and pelleted by centrifugation. Thereafter, hPSCs were re-suspended in mTeSR1 with blebbistatin and plated at approximately 1,000–1,500 cells/cm². Two days after passage, medium was replaced with mTeSR1 (without blebbistatin) and changed daily.
Human embryonic kidney (HEK) cell line 293T (Life Technologies) was maintained at 37°C with 5% CO2 / 20% O2 in Dulbecco’s modified Eagle’s Medium (DMEM) (Invitrogen) supplemented with 10% fetal bovine serum (Gibco) and 2mM GlutaMAX (Invitrogen).
Gene targeting of H7 cells
hESCs were cultured in 10µM Rho Kinase inhibitor (DDD00033325, EMD Millipore) for 24h prior to electroporation. Electroporation was performed using the Neon kit (Invitrogen), according to the manufacturer’s instructions. Briefly, on the day of electroporation, hESCs were digested with Accutase (Sigma) for 1–2 minutes until colonies lifted. Importantly, colonies were not dissociated into a single cell suspension. After colonies were harvested, wet pellets were kept on ice for 15 min, and then resuspended in electroporation buffer containing gene targeting plasmids. Electroporation parameters were as follows: voltage: 1400 V; interval: 30 ms; 1 pulse. Following electroporation, cell colonies were slowly transferred to mTeSR1 medium containing 10µM Rho Kinase inhibitor, and then kept at room temperature for 20 min before plating on Matrigel-coated dishes and further culture.
For analysis of clonally derived colonies, electroporated hESC were grown to sub-confluence, passaged as described in the previous paragraph and plated at a density of 500 cells per 35mm dish. Subsequently, single colonies were isolated by manual picking and further cultured.
For 293T cell transfection, ~100,000 cells/well were seeded in 24-well plates (Falcon) 24 hours prior to transfection. Cells were transfected in quadruplicate using Lipofectamine LTX Plus Reagent (Invitrogen) according to the manufacturer’s recommended protocol. For each well of a 24-well plate, 400ng of the Cas9 plasmid and 200ng of the gRNA plasmid were mixed with 0.5µl of Plus Reagent and 1.5µl of Lipofectamine LTX reagent.
Generation of constitutively expressed GFP ESC lines
The H7 human ESC line (WiCell) was maintained in mTeSR1 (Stem Cell Technologies) media on Matrigel substrate. Prior to cell passaging, cells were subjected to a brief pre-treatment with blebbistatin (>5 minutes) to increase cell viability, treated with Accutase for 7 minutes, triturated to a single cell suspension, quenched with an equal volume of mTeSR1, pelleted at 80xg for 5 minutes and resuspended in mTeSR1 containing blebbistatin. 1×10⁶ cells were pelleted, media carefully removed and cells placed on ice for 10–15 minutes. 10µg of AAV-CAGGS-EGFP donor vector (Addgene #22212) containing homology to the AAVS1 safe-harbor locus, plus 5µg each of hAAVS1 1R + L TALENs (Addgene #35431 and #35432)23,43 in R-buffer were electroporated with a 100µl tip using the Neon Transfection System (Life Technologies, Grand Island, NY) with the following parameters: 1500V, 20ms pulse width, 1 pulse. Cells were then added gently to 1 ml of medium, incubated at room temperature for 15 minutes, and then plated onto Matrigel-coated 35mm dishes containing mTeSR1 and 5µM blebbistatin. After 2 days, cells were seeded at a density of 1×10⁴, after which stable clonal sublines were manually selected using a Nikon TS100 epifluorescence microscope.
Surveyor assay and sequencing analysis for genome modification
For Surveyor analysis, genomic DNA was extracted by resuspending cells in QuickExtract solution (Epicentre), incubating at 65°C for 15 minutes, and then at 98°C for 10 minutes. The extract solution was cleaned using DNA Clean and Concentrator (Zymo Research) and quantitated by NanoDrop. The genomic region surrounding the CRISPR target sites was amplified from 100ng of genomic DNA using Phusion DNA polymerase (NEB). Multiple independent PCR reactions were pooled and purified using a Qiagen MinElute Spin Column following the manufacturer’s protocol. An 8µl volume containing 400ng of the PCR product in 12.5mM Tris-HCl (pH 8.8), 62.5mM KCl and 1.875mM MgCl2 was denatured and slowly re-annealed to allow for the formation of heteroduplexes: 95°C for 10 minutes, 95°C to 85°C ramped at −1.0°C/sec, 85°C for 1 sec, 85°C to 75°C ramped at −1.0°C/sec, 75°C for 1 sec, 75°C to 65°C ramped at −1.0°C/sec, 65°C for 1 sec, 65°C to 55°C ramped at −1.0°C/sec, 55°C for 1 sec, 55°C to 45°C ramped at −1.0°C/sec, 45°C for 1 sec, 45°C to 35°C ramped at −1.0°C/sec, 35°C for 1 sec, 35°C to 25°C ramped at −1.0°C/sec, and then held at 4°C. 1µl of Surveyor Enhancer and 1µl of Surveyor Nuclease (Transgenomic) were added to each reaction and incubated at 42°C for 60 min, after which 1µl of the Stop Solution was added to the reaction. 1µl of the reaction was quantitated on the 2100 Bioanalyzer using the DNA 1000 chip (Agilent). For gel analysis, 2µl of 6X loading buffer (NEB) was added to the remaining reaction and loaded onto a 3% agarose gel containing ethidium bromide. Gels were visualized on a Gel Logic 200 Imaging System (Kodak) and quantitated using ImageJ v. 1.46. NHEJ frequencies were calculated using the binomial-derived equation: $\%\,\text{gene modification} = 100 \times \left(1 - \sqrt{1 - \frac{a + b}{a + b + c}}\right)$, where the values of “a” and “b” are equal to the integrated area of the cleaved fragments after background subtraction and “c” is equal to the integrated area of the un-cleaved PCR product after background subtraction44.
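The NHEJ calculation can be sketched in a few lines. This example assumes the binomial-derived estimate commonly used for Surveyor assays, 100 × (1 − sqrt(1 − (a + b)/(a + b + c))), with a, b, and c defined as in the text; the function name `nhej_frequency` is hypothetical and the band areas in the usage example are made up for illustration.

```python
import math


def nhej_frequency(a, b, c):
    """Estimate percent gene modification from Surveyor band intensities.

    a, b: background-subtracted integrated areas of the two cleaved fragments
    c:    background-subtracted integrated area of the uncleaved PCR product
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("band areas must sum to a positive value")
    cleaved_fraction = (a + b) / total
    # The square root accounts for random re-annealing of mutant and
    # wild-type strands into heteroduplexes during the slow cool-down.
    return 100.0 * (1.0 - math.sqrt(1.0 - cleaved_fraction))


if __name__ == "__main__":
    # Illustrative band areas only: 19% of the signal in cleaved fragments.
    print(f"{nhej_frequency(9.5, 9.5, 81.0):.1f}% estimated indel frequency")
```

Note that the estimate is lower than the raw cleaved fraction, because a single mutant allele can contribute to more than one cleavable heteroduplex.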
Flow Cytometry
Following blebbistatin treatment, sub-confluent hESC colonies were harvested by Accutase treatment, dissociated into a single cell suspension and pelleted. Cells were then resuspended in Live Cell Solution (Invitrogen) containing Vybrant DyeCycle ruby stain (Invitrogen) and analyzed on an Accuri C6 flow cytometer.
Quantitative real-time PCR
293T cells were seeded at 250,000 cells/well in 12-well plates (Falcon) 24 hours prior to transfection. Cells were transfected in triplicate using Lipofectamine LTX with Plus Reagent (Invitrogen) according to the manufacturer’s recommended protocol with a 6-dose titration of the gRNA plasmid: 0ng, 31.25ng, 62.5ng, 125ng, 250ng, or 500ng in each well. 48 hours post-transfection, total RNA was isolated using RNAzol RT (Molecular Research Center) and purified using Direct-zol RNA MiniPrep (Zymo). 500ng of total RNA was treated with dsDNase (ArticZymes; Plymouth Meeting, PA USA) to remove residual genomic DNA contamination and reverse transcribed in a 20µl reaction using Superscript III reverse transcriptase (Invitrogen) following the manufacturer’s recommendations. For each reaction, 0.1µM of the following oligonucleotides were used to prime each reaction: gRNA scaffold-CTTCGATGTCGACTCGAGTCAAAAAGCACCGACTCGGTGCCAC, U6 snRNA-AAAATATGGAACGCTTCACGAATTTG. The underlined scaffold sequence denotes an anchor sequence added for transcript stability. Each qPCR reaction was carried out in a Biorad CFX 96 real-time PCR machine in a 10µl volume using the SsoAdvanced™ Universal SYBR® Green Supermix (Biorad) containing 250nM of oligonucleotide primers and 1µl of a 1:15 dilution of the RT reaction product from above. Reactions were carried out for 40 cycles with 95°C denaturation, 54°C annealing and 60°C extension steps. The following primers were used for detecting the guide RNA (F1for-GTTTTAGAGCTAGAAATAGCAAGTTAA and guideRNAscaffrev-AAGCACCGACTCGGTGCCAC) and the reference gene (U6snRNAF-CTCGCTTCGGCAGCACATATACT and U6snRNARev-ACGCTTCACGAATTTGCGTGTC). Relative normalized expression for each guide RNA sample and the s.e.m. were calculated using Biorad’s integrated CFX Manager software.
Bioinformatics
To determine all the potential CRISPR sites in the human genome, we used a custom Perl script to search both strands, including overlapping occurrences, for the 23-mer CRISPR sequence sites GN19NGG or AN19NGG. To calculate the mean and median distance values, we first defined the predicted CRISPR cut site as occurring between the third and fourth bases upstream of the PAM sequence. After sorting the sequences, we then calculated the distances between all adjacent gRNAs in the genome. These data were imported into R to calculate the mean and median statistical values, and to plot the data. To calculate the mean density, the gRNA cut sites were binned across the genome and the frequency of occurrences in each bin was calculated. These data were plotted in R using the ggplot2 package, or with Circos to generate a circular plot45. To calculate the occurrences in human genes or at disease loci, we used the BEDTools utility IntersectBED46 to find the occurrence of overlaps with either a RefSeq BED file retrieved from the UCSC Genome Browser or a BED file from OMIM (Online Mendelian Inheritance in Man, OMIM. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD), 2013. World Wide Web URL: http://omim.org/). The genomes used in this study were human (hg19), mouse (mm10), rat (rn5), cow (bosTau7), chicken (galGal4), zebrafish (dr7), drosophila (dm3), C. elegans (ce10), and S. cerevisiae (sacCer3).
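The site search and distance calculation described above can be illustrated as follows. This is a Python stand-in written for illustration, not the authors' Perl script: it scans both strands for overlapping RN19NGG sites, places the cut between the third and fourth bases upstream of the PAM, and reports distances between adjacent cut sites. The function names and the coordinate convention (a cut is recorded as the number of forward-strand bases to its left) are assumptions of this sketch.

```python
import re
from statistics import mean, median

# Lookahead so that overlapping 23-mers are all reported; the captured
# group is the site itself (A or G at +1, 20 further bases, then GG).
SITE = re.compile(r"(?=([AG][ACGT]{20}GG))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def cut_sites(seq):
    """Return sorted (cut_position, strand, first_base) tuples for both strands."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    for m in SITE.finditer(seq):
        # Protospacer spans start..start+19, so the cut lies after base start+16.
        hits.append((m.start() + 17, "+", m.group(1)[0]))
    for m in SITE.finditer(revcomp(seq)):
        # Convert the reverse-strand boundary back to forward coordinates.
        hits.append((n - (m.start() + 17), "-", m.group(1)[0]))
    return sorted(hits)


def adjacent_distances(hits):
    """Distances in bp between neighbouring cut sites."""
    positions = [pos for pos, _, _ in hits]
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    toy = "A" + "C" * 19 + "TGG" + "TTTTT" + "CCA" + "G" * 19 + "T"
    hits = cut_sites(toy)
    gaps = adjacent_distances(hits)
    print(hits)
    if gaps:
        print("mean gap:", mean(gaps), "median gap:", median(gaps))
```

Filtering the returned tuples on `first_base` separates AN19NGG from GN19NGG sites, so the same scan can supply the per-class and pooled spacing statistics.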
Supplementary Material
Acknowledgements
This work was funded by the National Institutes of Health (NIH) (5T32EY007143, R01EY009769, 5P30EY001765), Maryland Stem Cell Research Foundation, Foundation Fighting Blindness, Research to Prevent Blindness, BrightFocus Foundation, and generous gifts from the Guerrieri Family Foundation and Mr. and Mrs. Robert and Clarice Smith.
Footnotes
Author contributions
V.R. conceived the study, designed the experiments, and analyzed the data with input from D.J.Z. V.R. generated the constructs and performed the biochemistry. V.R. and J.M. performed the cell-culture work and flow cytometry. K.W. generated and validated the integrated reporter lines used in this study and performed the qRT-PCR experiment with V.R. V.R. performed the bioinformatics and statistical analysis. V.R. wrote the paper with input from D.J.Z.
Competing financial interests: The authors declare no competing financial interests.
REFERENCES
- 1.Porteus MH, Baltimore D. Chimeric nucleases stimulate gene targeting in human cells. Science. 2003;300:763. doi: 10.1126/science.1078395. [DOI] [PubMed] [Google Scholar]
- 2.Miller JC, et al. An improved zinc-finger nuclease architecture for highly specific genome editing. Nat Biotechnol. 2007;25:778–785. doi: 10.1038/nbt1319. [DOI] [PubMed] [Google Scholar]
- 3.Sander JD, et al. Selection-free zinc-finger-nuclease engineering by context-dependent assembly (CoDA) Nature methods. 2011;8:67–69. doi: 10.1038/nmeth.1542. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. Wood AJ, et al. Targeted genome editing across species using ZFNs and TALENs. Science. 2011;333:307. doi: 10.1126/science.1207773.
- 5. Boch J, et al. Breaking the code of DNA binding specificity of TAL-type III effectors. Science. 2009;326:1509–1512. doi: 10.1126/science.1178811.
- 6. Moscou MJ, Bogdanove AJ. A simple cipher governs DNA recognition by TAL effectors. Science. 2009;326:1501. doi: 10.1126/science.1178817.
- 7. Christian M, et al. Targeting DNA double-strand breaks with TAL effector nucleases. Genetics. 2010;186:757–761. doi: 10.1534/genetics.110.120717.
- 8. Miller JC, et al. A TALE nuclease architecture for efficient genome editing. Nat Biotechnol. 2011;29:143–148. doi: 10.1038/nbt.1755.
- 9. Zhang F, et al. Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription. Nat Biotechnol. 2011;29:149–153. doi: 10.1038/nbt.1775.
- 10. Reyon D, et al. FLASH assembly of TALENs for high-throughput genome editing. Nat Biotechnol. 2012;30:460–465. doi: 10.1038/nbt.2170.
- 11. Cong L, et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–823. doi: 10.1126/science.1231143.
- 12. Jinek M, et al. RNA-programmed genome editing in human cells. eLife. 2013;2:e00471. doi: 10.7554/eLife.00471.
- 13. Mali P, et al. RNA-guided human genome engineering via Cas9. Science. 2013;339:823–826. doi: 10.1126/science.1232033.
- 14. Cho SW, Kim S, Kim JM, Kim JS. Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat Biotechnol. 2013;31:230–232. doi: 10.1038/nbt.2507.
- 15. Hwang WY, et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol. 2013;31:227–229. doi: 10.1038/nbt.2501.
- 16. Jinek M, et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–821. doi: 10.1126/science.1225829.
- 17. Hou Z, et al. Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proc Natl Acad Sci U S A. 2013. doi: 10.1073/pnas.1313587110.
- 18. Ding Q, et al. Enhanced Efficiency of Human Pluripotent Stem Cell Genome Editing through Replacing TALENs with CRISPRs. Cell stem cell. 2013;12:393–394. doi: 10.1016/j.stem.2013.03.006.
- 19. Adhya S, Basu S, Sarkar P, Maitra U. Location, function, and nucleotide sequence of a promoter for bacteriophage T3 RNA polymerase. Proc Natl Acad Sci U S A. 1981;78:147–151. doi: 10.1073/pnas.78.1.147.
- 20. Melton DA, et al. Efficient in vitro synthesis of biologically active RNA and RNA hybridization probes from plasmids containing a bacteriophage SP6 promoter. Nucleic Acids Res. 1984;12:7035–7056. doi: 10.1093/nar/12.18.7035.
- 21. Pleiss JA, Derrick ML, Uhlenbeck OC, et al. T7 RNA polymerase produces 5' end heterogeneity during in vitro transcription from certain templates. RNA. 1998;4:1313–1317. doi: 10.1017/s135583829800106x.
- 22. Baer M, Nilsen TW, Costigan C, Altman S. Structure and transcription of a human gene for H1 RNA, the RNA component of human RNase P. Nucleic Acids Res. 1990;18:97–103. doi: 10.1093/nar/18.1.97.
- 23. Hockemeyer D, et al. Efficient targeting of expressed and silent genes in human ESCs and iPSCs using zinc-finger nucleases. Nat Biotechnol. 2009;27:851–857. doi: 10.1038/nbt.1562.
- 24. Ran FA, et al. Double Nicking by RNA-Guided CRISPR Cas9 for Enhanced Genome Editing Specificity. Cell. 2013. doi: 10.1016/j.cell.2013.08.021.
- 25. D'Cruz PM, et al. Mutation of the receptor tyrosine kinase gene Mertk in the retinal dystrophic RCS rat. Human molecular genetics. 2000;9:645–651. doi: 10.1093/hmg/9.4.645.
- 26. Fu Y, et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol. 2013. doi: 10.1038/nbt.2623.
- 27. Pattanayak V, et al. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nat Biotechnol. 2013. doi: 10.1038/nbt.2673.
- 28. Cho SW, et al. Analysis of off-target effects of CRISPR/Cas-derived RNA-guided endonucleases and nickases. Genome research. 2014;24:132–141. doi: 10.1101/gr.162339.113.
- 29. Hsu PD, et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013. doi: 10.1038/nbt.2647.
- 30. Boden D, et al. Promoter choice affects the potency of HIV-1 specific RNA interference. Nucleic Acids Res. 2003;31:5033–5038. doi: 10.1093/nar/gkg704.
- 31. An DS, et al. Optimization and functional effects of stable short hairpin RNA expression in primary human lymphocytes via lentiviral vectors. Molecular therapy: the journal of the American Society of Gene Therapy. 2006;14:494–504. doi: 10.1016/j.ymthe.2006.05.015.
- 32. Makinen PI, et al. Stable RNA interference: comparison of U6 and H1 promoters in endothelial cells and in mouse brain. The journal of gene medicine. 2006;8:433–441. doi: 10.1002/jgm.860.
- 33. Larson MH, et al. CRISPR interference (CRISPRi) for sequence-specific control of gene expression. Nature protocols. 2013;8:2180–2196. doi: 10.1038/nprot.2013.132.
- 34. Qi LS, et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell. 2013;152:1173–1183. doi: 10.1016/j.cell.2013.02.022.
- 35. Mali P, et al. CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nat Biotechnol. 2013. doi: 10.1038/nbt.2675.
- 36. Shan Q, et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat Biotechnol. 2013;31:686–688. doi: 10.1038/nbt.2650.
- 37. Jinek M, et al. Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science. 2014;343:1247997. doi: 10.1126/science.1247997.
- 38. Nishimasu H, et al. Crystal structure of cas9 in complex with guide RNA and target DNA. Cell. 2014;156:935–949. doi: 10.1016/j.cell.2014.02.001.
- 39. Tuschl T. Expanding small RNA interference. Nat Biotechnol. 2002;20:446–448. doi: 10.1038/nbt0502-446.
- 40. Gibson DG, et al. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nature methods. 2009;6:343–345. doi: 10.1038/nmeth.1318.
- 41. Walker A, et al. Non-muscle myosin II regulates survival threshold of pluripotent stem cells. Nat Commun. 2010;1:71. doi: 10.1038/ncomms1074.
- 42. Maruotti J, et al. A simple and scalable process for the differentiation of retinal pigment epithelium from human pluripotent stem cells. Stem cells translational medicine. 2013;2:341–354. doi: 10.5966/sctm.2012-0106.
- 43. Sanjana NE, et al. A transcription activator-like effector toolbox for genome engineering. Nature protocols. 2012;7:171–192. doi: 10.1038/nprot.2011.431.
- 44. Guschin DY, et al. A rapid and general assay for monitoring endogenous gene modification. Methods in molecular biology. 2010;649:247–256. doi: 10.1007/978-1-60761-753-2_15.
- 45. Krzywinski M, et al. Circos: an information aesthetic for comparative genomics. Genome research. 2009;19:1639–1645. doi: 10.1101/gr.092759.109.
- 46. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
- 47. Cermak T, et al. Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Res. 2011;39:e82. doi: 10.1093/nar/gkr218.
Associated Data