Abstract
The past decade has witnessed rapid progress in identifying more versatile clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein (Cas) nucleases and their functional variants, as well as in developing precise CRISPR/Cas-derived genome editors. The programmable and robust features of the genome editors provide an effective RNA-guided platform for fundamental life science research and subsequent applications in diverse scenarios, including biomedical innovation and targeted crop improvement. One of the most essential principles is to alter genomic sequences or genes in the intended manner without undesired off-target impacts, which strongly depends on the efficiency and specificity of single guide RNA (sgRNA)-directed recognition of targeted DNA sequences. Recent advances in empirical scoring algorithms and machine learning models have facilitated sgRNA design and off-target prediction. In this review, we first briefly introduce the different features of CRISPR/Cas tools that should be taken into consideration to achieve specific purposes. Secondly, we focus on the computer-assisted tools and resources that are widely used in designing sgRNAs and analyzing CRISPR/Cas-induced on- and off-target mutations. Thirdly, we provide insights into the limitations of available computational tools, which may help researchers in this field optimize them further. Lastly, we suggest a simple but effective workflow for choosing and applying web-based resources and tools for CRISPR/Cas genome editing.
Keywords: Genome editing, Efficiency and specificity, CRISPR/Cas9, sgRNA, Computational tool, Algorithm
Introduction
The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein (Cas) system was discovered as part of the adaptive immune system of bacteria and archaea, which employs ∼ 20-bp RNA CRISPR arrays for guiding Cas nucleases to specifically recognize and cleave the invader’s nucleic acid sequences [1], [2]. In the last decade, this system was developed into a robust genome editing tool for generating sequence-specific mutations at desired genomic sites in a wide range of organisms, including both plants and animals [3], [4], [5], [6], [7], [8], [9]. CRISPR/Cas genome editing tools have since been rapidly modified to broaden their application potential [10] (Figure 1). After Cas9, the first Cas nuclease to be discovered, was used for CRISPR genome editing, other types of Cas nucleases and their orthologues were also shown to have potential for genome editing. Meanwhile, scientists are engineering and modifying the existing Cas nucleases to enhance CRISPR/Cas applications. Currently, a variety of CRISPR/Cas-derived genome editors, including base editors and prime editors, provide more options for selecting genome editing tools [11], [12] (Figure 1). Because CRISPR/Cas-based genome editing is precise, robust, and powerful, it has become a revolutionary approach for both foundational and applied research, including clinical CRISPR gene therapy and crop improvement [10], [13], [14].
Although different types of CRISPR/Cas systems exhibit similarities in their genome editing patterns, their recognition and cleavage mechanisms and the underlying machineries differ, which directly determines how to choose the optimal CRISPR/Cas tool for an individual experimental purpose. To simplify and accelerate CRISPR/Cas-related research, many laboratories have developed computational tools and resources for designing single guide RNAs (sgRNAs) and analyzing genome editing results, including both on- and off-target effects. CRISPR/Cas tools are no longer restricted to genome modification: the high-efficiency binding of dead Cas9 (dCas9), Cas9 nickase (nCas9), and their variants allows them to be fused with other functional enzymes to achieve gene regulation, including CRISPR/dCas-mediated transcriptional modulation and epigenetic modification [12], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24]. In this review, we summarize current knowledge of CRISPR/Cas genome editing and the key parameters involved in choosing suitable computational tools. In addition, we systematically summarize the features of several major web-accessible tools for designing sgRNAs and analyzing post-genome editing data; these tools are widely used in both animal and plant genome editing.
Workflow for performing genome editing experiments
Rapid evolution of CRISPR/Cas genome editing techniques offers more diverse applications that are not just limited to the targeted mutations in desired genomic DNA sequences by inducing double strand breaks (DSBs). Basically, the purpose of applying CRISPR/Cas genome tools is to target and modify a genome sequence, which is subsequently used for identifying gene functions and potential applications, such as human therapeutic purposes and crop genetic improvement [10], [14], [25]. To precisely edit a specific genome sequence by CRISPR/Cas, several key procedures need to be taken into consideration.
Different CRISPR/Cas genome editing techniques have distinct features for achieving certain types of experimental purposes. The common purposes for using CRISPR/Cas tools include: (1) impairing gene functions by creating targeted mutagenesis in their functional domains, which can be achieved by inducing high-frequency DSBs by using the traditional CRISPR/Cas genome editors; (2) remodeling gene roles by precisely modifying specific nucleotide base sequences, which preferably uses base editor and prime editor; and (3) modulating gene expression, in which CRISPR/Cas-based gene activation and repression approaches are usually employed.
CRISPR/Cas genome editing experiments mainly consist of three major steps (Figure 2): (1) designing sgRNAs to target a gene of interest; (2) choosing an efficient transformation method to deliver the CRISPR/Cas reagents into targeted cells; and (3) screening mutations and analyzing genome editing events. All three steps are essential for CRISPR/Cas genome editing. The sgRNA supplies the sequence complementary to the genomic site of the gene being targeted. An ideal sgRNA not only binds to the target sequence with high efficiency but also minimizes recognition of other sequence sites, which would cause off-target effects. Many computational tools have now been developed to design sgRNAs. These web-based computational tools and databases provide a public platform for researchers to identify suitable sgRNAs and to predict possible off-target effects.
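As a minimal illustration of the first step, the sketch below enumerates candidate SpCas9 target sites in a short DNA sequence by scanning both strands for a 20-nt protospacer followed by an NGG PAM. The function name `find_spcas9_targets` and the coordinate convention are our own assumptions for illustration; the web tools discussed in this review additionally rank such candidates by predicted efficiency and specificity.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_spcas9_targets(seq: str, guide_len: int = 20) -> List[Tuple[int, str, str, str]]:
    """List every protospacer that is immediately followed by an NGG PAM.

    Each hit is (start, strand, protospacer, pam). `start` is the 0-based
    leftmost coordinate, on the input strand, of the whole
    protospacer-plus-PAM site (a convention assumed for this sketch).
    """
    seq = seq.upper()
    site_len = guide_len + 3
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - site_len + 1):
            pam = s[i + guide_len : i + site_len]
            if pam[1:] != "GG":
                continue
            protospacer = s[i : i + guide_len]
            # Convert minus-strand positions back to input-strand coordinates.
            start = i if strand == "+" else len(s) - (i + site_len)
            hits.append((start, strand, protospacer, pam))
    return hits
```

For example, `find_spcas9_targets("ACGTACGTACGTACGTACGTTGG")` reports a single plus-strand site whose PAM is TGG.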
Delivery of CRISPR/Cas reagents into targeted cells is always required. Without it, a sgRNA cannot bind to the target site and the Cas enzyme cannot recognize and edit the specific sequences. Many transgenic approaches have been developed for delivering CRISPR/Cas reagents into targeted cells for different purposes. For plants, CRISPR/Cas constructs can be transferred into plant cells by Agrobacterium-mediated T-DNA transgene methods, although exogenous fragments can then be integrated into plant genomes. CRISPR/Cas ribonucleoprotein (RNP) complexes can also be delivered directly, as has been demonstrated in mammalian and plant cells. The sgRNAs and Cas proteins are degraded after mutagenesis, which helps reduce off-target effects. Many excellent reviews have summarized transgene techniques in detail [26], [27].
After CRISPR/Cas targets the specific sequences, it is necessary to screen the editing events and estimate the potential off-target impacts. Thus, evaluation of the genome editing efficacy is a crucial part of applying CRISPR/Cas genome editing techniques. Successful genome editing should specifically modify the targeted genome sequences without off-target effects on other genome locations. To identify mutation types, many experiment-based methods and high-throughput screening strategies have been developed.
Best practices for sgRNA design
Efficiency and specificity are two main criteria for CRISPR/Cas genome editing. Efficiency describes how well a sgRNA targets the specific sequence and guides a Cas enzyme to edit it; it is usually expressed as the percentage of cells that are edited. Specificity describes whether the editing events are confined to the intended site or also occur at off-target sites. Many factors affecting CRISPR/Cas genome editing efficiency and specificity have been integrated into sgRNA design [28]. The affinity between the RNP complex and the targeted DNA sequences depends on the hybridization of sgRNAs and DNA sequences through sequence complementarity. Previous studies suggest that different binding sites result in large differences in cleavage efficiency and specificity among different organisms [29], [30], [31], [32]. Several web-accessible databases have been established by collecting sgRNA data from large-scale CRISPR/Cas experiments [33], [34], [35], [36], [37] (Table 1). These databases not only provide practical resources for sgRNA selection but also reveal the key factors that affect sgRNA efficacy and specificity, which should facilitate further optimization of sgRNA design.
Table 1.
Name | Organism | Cas nuclease | Major feature | Database or web server | Website | Refs. |
---|---|---|---|---|---|---|
CRISPOR | > 100 species | > 30 Cas9 orthologues and Cas variants | Designing, evaluating, and cloning guide sequences for the CRISPR/Cas9 system; providing primers for vector construction; indicating mismatch number; and linking off-target to genome browser | Web server | https://crispor.tefor.net/ | [38] |
CHOPCHOP | > 100 species | Cas9, Cas12, Cas13, and TALEN | Providing multiple predictive models; visualizing genomic location of targets and genes; and providing primers | Web server | https://chopchop.cbu.uib.no/ | [46], [47] |
CRISPR RGEN Tools | > 100 species | > 20 Cas9 orthologues and Cas variants | Providing multiple predictive models; downloadable and standalone; and predicting potential off-target number via Cas-OFFinder, and out-of-frame scores via Microhomology-Predictor | Web server | https://www.rgenome.net/cas-designer/ | [84], [93] |
E-CRISP | > 50 species | SpCas9 | Feasibly creating genome-scale libraries; downloadable; and frequently updated | Web server | https://www.e-crisp.org/E-CRISP/index.html | [49] |
GUIDES | Human and mouse | SpCas9 | Feasibly designing CRISPR knockout libraries; downloadable; and step-by-step | Web server | https://guides.sanjanalab.org/ and https://github.com/sanjanalab/GUIDES | [11] |
CRISPRscan | > 10 species | Cas9 and Cas12 | Designing sgRNAs for protein-coding genes; ready-to-inject sgRNA sequence; tracks for genome browser; and searching whole-genome off-target impacts | Web server | https://www.crisprscan.org/ | [45] |
CCTop | > 100 species | > 10 Cas9 orthologues and Cas variants | Searching for single and multiple queries; indicating mismatch number; predicting off-target impacts; and predicting sgRNA efficiency using CRISPRater with custom in vitro transcription selection | Web server | https://cctop.cos.uni-heidelberg.de/ | [39] |
CRISTA | > 100 species | SpCas9 | Providing machine learning framework, including DNA/RNA bulge genomic context and RNA thermodynamics; detecting off-targets; and ranking targets | Web server | https://crista.tau.ac.il/ | [56] |
DeepCRISPR | Human | SpCas9 | Incorporating epigenetic information; and predicting off-target impacts | Web server | https://www.deepcrispr.net/ | [57] |
DRSC Find CRISPRs | Drosophila | SpCas9 | Providing off-target stringency from 3 to 5 mismatches; and separating target region and potential off-targets by different tracts | Web server | https://www.flyrnai.org/crispr/ and https://www.flyrnai.org/crispr3/web/ | [72] |
EuPaGDT | Eukaryotic pathogens | > 10 Cas9 orthologues and Cas variants | Providing wide compatibility for eukaryotic pathogen genomes | Web server | https://grna.ctegd.uga.edu/ | [73] |
WU-CRISPR | Human and mouse | SpCas9 | Providing machine learning algorithm trained by experimental data; providing custom sequence between 26 bp and 30,000 bp with one sequence per time; and downloadable results | Web server | https://crispr.wustl.edu/ | [155], [156] |
GPP sgRNA Designer | Human, mouse, and rat | SpCas9, SaCas9, and AsCpf1 | Inputting up to 200 transcript IDs or gene IDs; maximizing on-target activity and minimizing off-target activity; and scoring on-targeting efforts | Web server | https://portals.broadinstitute.org/gpp/public/analysis-tools/sgrna-design | [48] |
CRISPR-GE | > 40 plant species | SpCas9, FnCpf1, and AsCpf1 | Providing software toolkits, primer design for vector construction, on-target amplification, and PCR sequencing result analysis | Web server | https://skl.scau.edu.cn/ | [94] |
CRISPR-P | 49 plant species | > 14 Cas9 and variants | Supporting wide range of plant species; providing on-target and off-target scoring; and providing gRNA sequence analysis | Web server | https://crispr.hzau.edu.cn/CRISPR2/ | [95], [96] |
CRISPR-PLANT V2 | 7 plant species | SpCas9 | Supporting main model and crop plant species; providing selection of chromosome and locations with clear instruction | Web server | https://www.genome.arizona.edu/crispr2/ | [157] |
CRISPRz | Zebrafish, human, and mouse | SpCas9 | Supporting a wide variety of cell lines and organisms, including zebrafish; and providing validated sgRNA database | Web server | https://research.nhgri.nih.gov/CRISPRz/ | [82] |
CRISPRlnc | 10 species | SpCas9 | Providing downloadable validated sgRNA database for lncRNAs | Database | https://www.crisprlnc.org/ | [81] |
FORECasT | Human | SpCas9 | Predicting the mutational outcomes | Web server | https://partslab.sanger.ac.uk/FORECasT | [87] |
AsCRISPR | Human and mouse | SpCas9, AsCpf1, AaCas12b, CasX, and variants | Designing sgRNAs for allele-specific genetic elements | Web server | https://www.genemed.tech/ascrispr/ascrispr | [98] |
SNP-CRISPR | 9 plant and animal species | NGG and NAG PAM | Designing sgRNAs for targeting SNPs or Indel variants | Web server | https://www.flyrnai.org/tools/snp_crispr/web/ | [99] |
SSC | N/A | Cas9 | For both CRISPR knockout and CRISPRa/CRISPRi | Web server | https://cistrome.org/SSC/ | [35] |
DeepHF | N/A | SpCas9 and Cas9HF | gRNA designer and efficiency prediction | Web server and database | https://www.DeepHF.com/ | [158] |
PnB Designer | 6 species | Cas9 | Designing pegRNAs for prime editors and sgRNAs for base editors | Web server | https://fgcz-shiny.uzh.ch/PnBDesigner/ | [99], [100] |
inDelphi | Human | SpCas9 | Predicting the mutational outcomes | Web server | https://www.crisprindelphi.design/ | [86] |
Note: Cas, CRISPR-associated protein; CRISPR, clustered regularly interspaced short palindromic repeats; CRISPRa, CRISPR activation; CRISPRi, CRISPR interference; gRNA, guide RNA; lncRNA, long non-coding RNA; pegRNA, prime editing guide RNA; sgRNA, single guide RNA; TALEN, transcription activator-like effector nuclease; N/A, not available.
To systemically characterize the relationship between sgRNA features and cleavage efficiency, Zhang and coworkers assessed more than 700 sgRNA variants and over 100 potential target sites in human cells [33]. Their results suggested that the total number, position, and distribution of mismatched bases were crucial to determine the cleavage activity of CRISPR/Cas9 targets [33]. In addition, a mismatched single-base located in the protospacer adjacent motif (PAM)-proximal region is more sensitive than the PAM-distal counterparts [33]. To refine sgRNA efficacy and its prediction, Labuhn and colleagues employed fluorescent reporter knockout assays to test the target efficacies of 430 sgRNAs; based on their experimental results, they developed a linear model-based discrete system, called CRISPRater, for predicting sgRNA efficiency [36]. Currently, this algorithm has been integrated with other sgRNA designing programs, such as CRISPOR [38] and CCTop [39].
Effect of nucleotide composition and location on sgRNA design
The nucleotide composition of a sgRNA, particularly GC content, is essential to determine its efficiency and specificity. One of the most important applications of CRISPR/Cas tools is to perform whole-genome screening for gene functional analysis [31], which also provides important information for uncovering nucleotide preference of sgRNAs. Based on analyzing the data of 1841 sgRNAs designed for targeting endogenous mouse and human genes, Doench and colleagues developed a predictive model (named Rule Set 1, which is based on sgRNA sequence features) to clarify general rules for designing highly active sgRNAs [40]. After quantification of the sequence features correlated with the activities of sgRNAs, they found that the GC content of a sgRNA did not display a positive correlation with the sgRNA activity in genome editing; both high and low GC contents of sgRNAs led to less efficient genome editing [40]. A similar rule was also identified in performing genome-scale functional screens using human cells and zebrafish [31], [41]. Additionally, several large-scale datasets suggest that the type of nucleobase is important for sgRNA activity [40], [42]. The nucleotide at the position 20, located immediately upstream of PAM, is a key determinant. Guanine was highly favorable whereas cytosine was strongly unfavorable [31], [40], [41]. In contrast, the position 16, the last nucleotide of the seed region, preferred cytosine over guanine [40], [42]. Theoretically, the transcription of sgRNAs relies on RNA polymerase III that recognizes uracil-rich sequences for termination [43], [44]. The uracil-rich sequence structure might lead to early termination of sgRNAs and then impair expression [42]. Thus, sgRNA sequences with thymine-rich nucleobase are not favorable at their 3′ end region. Additionally, adenine is preferable in the middle of a sgRNA, whereas cytosine has negative effects at the position 3 [31], [40].
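The sequence preferences described above can be turned into a simple screening function. The sketch below computes the GC content of a 20-nt guide and reports the position-specific features mentioned in the text (positions 3, 16, and 20, and thymine-rich stretches). The GC thresholds (30% and 80%), the run length of four thymines, and all flag names are illustrative assumptions rather than values taken from Rule Set 1 or any specific tool.

```python
from typing import List, Tuple


def guide_flags(
    guide: str,
    gc_low: float = 0.30,    # assumed lower GC threshold
    gc_high: float = 0.80,   # assumed upper GC threshold
    poly_t_run: int = 4,     # assumed length of a problematic T run
) -> Tuple[float, List[str]]:
    """Return (gc_fraction, flags) for a 20-nt guide written 5'->3'.

    Position 20 is the base immediately upstream of the PAM.
    """
    guide = guide.upper()
    if len(guide) != 20 or set(guide) - set("ACGT"):
        raise ValueError("expected a 20-nt guide containing only A/C/G/T")

    gc = (guide.count("G") + guide.count("C")) / len(guide)
    flags = []
    if gc < gc_low:
        flags.append("low_gc")
    elif gc > gc_high:
        flags.append("high_gc")

    # Position 20: guanine favorable, cytosine unfavorable.
    if guide[19] == "G":
        flags.append("pos20_G_favorable")
    elif guide[19] == "C":
        flags.append("pos20_C_unfavorable")

    # Position 16: cytosine preferred over guanine.
    if guide[15] == "C":
        flags.append("pos16_C_favorable")
    elif guide[15] == "G":
        flags.append("pos16_G_unfavorable")

    # Position 3: cytosine has a negative effect.
    if guide[2] == "C":
        flags.append("pos3_C_unfavorable")

    # T-rich stretches may terminate RNA polymerase III transcription early.
    if "T" * poly_t_run in guide:
        flags.append("poly_T")
    return gc, flags
```

Such flags are only a coarse filter; the predictive models discussed in this section weight these and many other features jointly.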
Zebrafish is an ideal model organism for performing large-scale analysis of sgRNA activity. To dissect the sgRNA molecular features affecting the efficacy of CRISPR/Cas9 in vivo, a sgRNA pool was constructed by introducing 1280 sgRNAs to target 128 genes in the zebrafish genome [45]. The researchers found that sgRNA stability in vivo plays a critical role in determining sgRNA activity. The formation of a guanine-quadruplex structure, which contains at least eight guanines, can significantly increase sgRNA stability. Additionally, several sequence features were identified by statistical analysis of the most efficient sgRNAs, such as guanine enrichment in the region of positions 1–14, cytosine enrichment between the position 15 and the position 18, and overall depletion of thymidine and adenine except the positions 9 and 10 [45]. Taken together, a linear regression-based predictive sgRNA-scoring algorithm, named CRISPRscan (http://CRISPRscan.org), was proposed for detecting the most active sgRNAs in vivo [45]. The CRISPRscan model is also implemented in other web-based sgRNA design tools, such as CHOPCHOP [46], [47] and CRISPOR [38].
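To show the general shape of a linear, position-specific scoring scheme of this kind, the sketch below sums per-position nucleotide coefficients for a guide. It is a simplified illustration only: the function `linear_guide_score` is hypothetical, it considers single-nucleotide features within the guide alone, and the coefficients must be supplied by the user. No fitted CRISPRscan coefficients are reproduced here.

```python
from typing import Dict, Tuple


def linear_guide_score(
    guide: str,
    weights: Dict[Tuple[int, str], float],
    intercept: float = 0.0,
) -> float:
    """Score a guide as intercept + sum of matching feature weights.

    `weights` maps (1-based position, nucleotide) to a coefficient;
    features that are absent from the mapping contribute zero.
    """
    score = intercept
    for position, nucleotide in enumerate(guide.upper(), start=1):
        score += weights.get((position, nucleotide), 0.0)
    return score


# Toy coefficients chosen only to exercise the function (not fitted values).
toy_weights = {(20, "G"): 0.5, (20, "C"): -0.5, (16, "C"): 0.25}
toy_score = linear_guide_score("ACCGATCGATCGATCCGATG", toy_weights)
```

In a real model the coefficients would be estimated by regression against measured sgRNA activities, as described for the zebrafish dataset above.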
Given the hypothesis that sgRNA activity could be influenced by several other features, such as position-independent nucleotides, the location of the target site in the gene, and the thermodynamic properties of a sgRNA, the Rule Set 1 predictive model was further improved by integrating new prediction algorithms, yielding “Rule Set 2”. Rule Set 2 employs improved algorithms for on- and off-target activity prediction and a gradient-boosted regression tree model with an augmented feature set trained on the combined dataset; it has been used to design sgRNA libraries not only for general genome editing purposes (gene knockout and knock-in) but also for CRISPR activation (CRISPRa) and CRISPR interference (CRISPRi) [37]. Rule Sets 1 and 2 have been widely implemented in many websites and computational tools for designing sgRNAs, including CHOPCHOP [46], [47], CRISPOR [38], GPP sgRNA Designer [48], and E-CRISP [49].
Some other factors also affect Cas nuclease binding and cleavage. It has been suggested that both sequence composition and locus accessibility are important to determine sgRNA activity, which subsequently influence the sgRNA design tools, such as sgRNAScorer [50], [51]. Additionally, chromatin accessibility [52], [53], [54], [55] and asymmetric sgRNA–DNA interactions also affect CRISPR/Cas cutting specificity [37], [56]. Currently, many groups have integrated these algorithms into their web-based applications, such as DeepCRISPR, CRISTA [56], [57], predictSGRNA [58], and uCRISPR [59]. GuidePro is a two-layer ensemble predictor for sgRNA efficiency prediction that enables the integration of multiple factors for the prioritization of sgRNAs for gene knockout [60].
Designing prime editing guide RNAs for prime editing
Prime editing is a new application of CRISPR/Cas technology in which a short genetic sequence is altered without requiring a donor DNA template. In the prime editing system, the traditional sgRNA is replaced by a prime editing guide RNA (pegRNA), which additionally contains a primer binding site (PBS) and a reverse transcriptase (RT) template sequence. After nCas9 nicks one strand of the target DNA, the nicked strand anneals to the PBS and is extended by the RT using the RT template, and the newly synthesized sequence is then incorporated in place of the original DNA sequence [61]. Thus, prime editing can introduce all types of base substitutions, as well as small insertions and deletions, without a donor DNA template. Because of these advantages, prime editors have great potential for genome editing. However, evaluating prime editing efficiency is time- and labor-intensive. To address this problem, Kim and colleagues used deep learning to build a computational model for predicting the efficiency of pegRNAs based on high-throughput evaluation of 54,836 pegRNA–target pairs in human cells [62]. This computational tool and the associated resources are publicly available at http://deepcrispr.info/DeepPE/.
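The relationship between the target sequence and the 3′ extension of a pegRNA can be sketched in a few lines. In the example below, the PBS is taken as the reverse complement of the bases immediately 5′ of the nick on the nicked strand, and the RT template as the reverse complement of the desired edited sequence 3′ of the nick. The function name, the default PBS length of 13 nt, and the use of the DNA alphabet (T instead of U) are assumptions made for illustration; the sketch does not attempt to predict efficiency.

```python
from typing import Dict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def peg_extension(
    nicked_strand: str,
    nick: int,
    edited_downstream: str,
    pbs_len: int = 13,  # assumed default; PBS length is usually optimized per target
) -> Dict[str, str]:
    """Derive the PBS and RT template of a pegRNA 3' extension.

    nicked_strand: target sequence, 5'->3', on the strand that is nicked.
    nick: 0-based index; the nick lies between nicked_strand[nick - 1]
          and nicked_strand[nick].
    edited_downstream: desired sequence, 5'->3', to be synthesized
          starting at the nick (it carries the intended edit).
    """
    if not 0 < pbs_len <= nick <= len(nicked_strand):
        raise ValueError("nick must leave room for the PBS within the target")
    nicked_strand = nicked_strand.upper()
    # The 3' end of the nicked strand anneals to the PBS.
    pbs = reverse_complement(nicked_strand[nick - pbs_len : nick])
    # The RT copies this template to write the edited sequence.
    rt_template = reverse_complement(edited_downstream.upper())
    # Written 5'->3', the extension is RT template followed by PBS.
    return {"pbs": pbs, "rt_template": rt_template, "extension_5to3": rt_template + pbs}
```

A quick consistency check is that the reverse complement of `extension_5to3` equals the PBS-matching target bases followed by the edited downstream sequence.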
Off-target consideration
One of the main concerns about sgRNA design is off-target effects that are normally generated by unexpected cleavage at genomic sites similar to the target sequences [33], [63]. Thus, traditional short sequence alignment tools, such as Burrows-Wheeler Alignment Tool (BWA) and Bowtie [64], [65], [66], have been used to predict potential off-target sites [38], [49]. Given that BWA and Bowtie are originally designed for aligning short DNA reads to large reference genomes [64], [65], there are several innate defects for predicting off-target effects. For instance, CRISPR/Cas has been suggested to tolerate more mismatches than traditional BWA or Bowtie alignment allows [33], [67], [68]. Additionally, nucleotide positions are important for target specificity, and atypical PAM could be recognized by CRISPR/Cas9 as well [33], [37]. To overcome these problems, many improved off-target prediction tools have been reported. For example, CCTop can predict potential off-target sites with four mismatches differently distributed in the targeted genomic sites [39], and Cas-OFFinder is not limited by the number of mismatches and allows variations in PAM sequences [67].
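The idea of a mismatch-tolerant, PAM-aware search can be sketched as follows. The function scans both strands of a sequence for sites that carry an accepted PAM and differ from the guide by at most a chosen number of mismatches. This brute-force scan is only an illustration of the search criterion, not the algorithm used by CCTop or Cas-OFFinder; the function names, the default of four mismatches, and the requirement that all PAM patterns have equal length are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def pam_matches(pam: str, pattern: str) -> bool:
    """Compare a PAM with a pattern in which N matches any base."""
    return len(pam) == len(pattern) and all(
        p == "N" or p == base for p, base in zip(pattern, pam)
    )


def find_off_targets(
    genome: str,
    guide: str,
    max_mismatches: int = 4,
    pams: Sequence[str] = ("NGG", "NAG"),  # patterns must share one length
) -> List[Tuple[int, str, str, str, int]]:
    """Return (start, strand, site, pam, mismatches) for every candidate site.

    The perfect on-target match (0 mismatches) is included in the output.
    `start` is the leftmost input-strand coordinate of the site plus PAM.
    """
    genome, guide = genome.upper(), guide.upper()
    n, pam_len = len(guide), len(pams[0])
    hits = []
    for strand, s in (("+", genome), ("-", reverse_complement(genome))):
        for i in range(len(s) - n - pam_len + 1):
            pam = s[i + n : i + n + pam_len]
            if not any(pam_matches(pam, p) for p in pams):
                continue
            site = s[i : i + n]
            mismatches = sum(a != b for a, b in zip(guide, site))
            if mismatches <= max_mismatches:
                start = i if strand == "+" else len(s) - (i + n + pam_len)
                hits.append((start, strand, site, pam, mismatches))
    return hits
```

On whole genomes such a scan is far too slow in pure Python, which is one reason dedicated tools rely on indexed or parallelized searches.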
To predict off-target sites more accurately, several computational models were built based on large amounts of experimental data. After evaluating more than 100 predicted genomic off-target loci in two human embryonic kidney cell lines [33], several rules were proposed to minimize off-target effects, including that (1) potential off-target sequences should not be followed by a PAM with either a 5′-NGG or 5′-NAG sequence, and (2) potential off-target sites should differ from the sgRNA by at least three mismatches, preferably with at least two of them in the PAM-proximal region. These rules have been implemented in a specificity score, termed MIT, which has subsequently been implemented in web-accessible applications, such as CHOPCHOP [46], [47] and CRISPOR [38]. Another commonly used specificity score is Cutting Frequency Determination (CFD), proposed by Doench and colleagues [37]. In addition to the position of mismatches in the sgRNA and the effect of atypical PAMs, the identities of mismatched nucleotides and insertion and deletion (indel) variants can significantly affect sgRNA activity. When evaluated against data from GUIDE-seq, an unbiased experimental method for detecting sgRNA off-target effects, CFD predicted most off-target sites and performed better than MIT and CCTop [69]. CFD has been implemented in CRISPOR, GPP sgRNA Designer, GUIDES, and other web-based tools.
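The two rules listed above can be expressed as a simple check on a single candidate off-target site. The sketch below reflects one reading of those rules: a site is considered unlikely to be cleaved if it lacks an NGG or NAG PAM, or if it carries at least three mismatches with at least two in the PAM-proximal region. The 12-nt definition of the PAM-proximal region and the function names are assumptions; this is not the MIT or CFD score, both of which assign position- and identity-dependent weights.

```python
from typing import Dict, List, Union


def mismatch_profile(
    guide: str, site: str, proximal_len: int = 12  # assumed PAM-proximal window
) -> Dict[str, Union[int, List[int]]]:
    """Count mismatches between a guide and an equally long site.

    Positions are 1-based from the 5' end, so the highest positions
    are the ones closest to the PAM.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    positions = [
        i + 1 for i, (g, s) in enumerate(zip(guide.upper(), site.upper())) if g != s
    ]
    proximal = [p for p in positions if p > len(guide) - proximal_len]
    return {"total": len(positions), "proximal": len(proximal), "positions": positions}


def unlikely_off_target(
    guide: str,
    site: str,
    pam: str,
    min_total: int = 3,
    min_proximal: int = 2,
) -> bool:
    """Apply the two rules to one candidate site (True means low concern)."""
    # Rule 1: sites without an NGG or NAG PAM are not a concern.
    if pam.upper()[1:] not in ("GG", "AG"):
        return True
    # Rule 2: require enough mismatches, concentrated near the PAM.
    profile = mismatch_profile(guide, site)
    return profile["total"] >= min_total and profile["proximal"] >= min_proximal
```

A guide would then be preferred when every candidate site returned by an off-target search passes this check.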
Currently, there are many computational programs for designing sgRNAs and predicting their genome editing efficiency and specificity. To comprehensively benchmark these techniques and tools, several available on-target design tools, genome-wide off-target cleavage site (OTS) detection techniques, and in silico genome-wide OTS prediction tools have been systematically evaluated [70], [71]. A one-stop platform, named integrated Genome-Wide Off-target cleavage Search platform (iGWOS), was constructed by integrating these available OTS prediction algorithms and datasets [70], [71].
Web-based tools and resources available for designing sgRNAs
The growing application of CRISPR/Cas techniques provides more data to optimize computational analysis models. As shown in Table 1, a large number of available sgRNA design tools have been compared and the majority of them displayed different features.
Because genetic and epigenetic features of the genome are essential to sgRNA efficacy, many comprehensive sgRNA design websites are constructed for diverse genomes, such as CHOPCHOP, CRISPOR, CRISPR RGEN Tools, and E-CRISP. Some are compatible with dozens or even hundreds of organisms (Table 1). However, other tools are restricted to a certain type of genome background. For instance, CRISPR-PLANT, CRISPR-P, and CRISPR-GE are online sgRNA design resources that mainly serve plant species. DRSC Find CRISPRs was designed for genome editing of Drosophila [72]. EuPaGDT is a tailored website tool for eukaryotic pathogens [73]. In contrast to the comprehensive websites that only offer sgRNA design services, these organism-specialized tools usually provide empirical CRISPR/Cas vectors and protocols that are very useful for wet lab experiments. Moreover, CRISPy-web implements sgRNA design with a user-provided microbial genome [74]. Thus, based on individual research objectives, the first step is always to design an appropriate sgRNA by selecting a suitable sgRNA design tool.
Selecting a genome editing system also depends on the experimental purpose. Constructing genome-scale CRISPR/Cas9 knockout libraries has been achieved in certain organisms, such as human cells [31], [34], [75], mouse [76], [77], zebrafish [78], and rice [79], [80]. To this end, Graphical User Interface for DNA Editing Screens (GUIDES) provides a website application for constructing genome-wide CRISPR/Cas-mediated mutation libraries in human and mouse genomes [11]. Additionally, CRISPRlnc and CRISPRz web tools are established by collecting experimentally validated sgRNAs generated from large-scale mutagenesis data and published sources [81], [82], which can be directly chosen for subsequent experiments. However, for small-scale genome editing experiments, PAM requirements should be one of the most important limitations for designing sgRNAs. Some websites only support SpCas9, whereas others have many Cas nuclease options and relatively broad ranges of PAM variants available for diverse experimental purposes. Additionally, certain tools, such as CHOPCHOP, provide an “Option” menu that can customize PAM types.
As summarized above, many predictive models and scoring algorithms have been developed for predicting sgRNA specificity and efficiency, and these may use distinct scoring systems. CRISPOR and CHOPCHOP integrate multiple scoring models into their web tools. For example, ten efficiency scores and two specificity scores are combined in CRISPOR, whereas CHOPCHOP employs six efficiency scores and two specificity scores.
Predicting CRISPR/Cas outcomes is a relatively new development for increasing the accuracy of sgRNA design. Non-homologous end-joining (NHEJ) is a central mechanism for repairing CRISPR/Cas-generated DSBs. Since NHEJ simply rejoins break ends together without using a homologous sequence for guidance template, this error-prone repair approach has been considered as the major method for inducing indel mutations at the DSB sites. Previous studies have demonstrated that NHEJ-mediated error-prone repair is nonrandom and strongly biased by short and homologous sequences around the DSBs, termed microhomology mediated end joining (MMEJ) [83], [84], [85]. FORECasT and inDelphi are two recommended CRISPR/Cas predictive tools that were developed by training with large-scale experimental data [86], [87].
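To make the role of microhomology concrete, the sketch below enumerates the deletions that could arise when identical short sequences on either side of a DSB anneal, and marks whether each deletion would shift the reading frame. It is a simplified illustration under our own assumptions (a minimum microhomology of 2 nt, exact matches only, one entry per distinct repair product) and does not reproduce the weighted scores computed by Microhomology-Predictor, FORECasT, or inDelphi.

```python
from typing import List, NamedTuple


class MmejDeletion(NamedTuple):
    microhomology: str
    left_start: int       # start of the copy upstream of the cut
    right_start: int      # start of the copy downstream of the cut
    deletion_length: int
    out_of_frame: bool
    product: str          # sequence after repair


def microhomology_deletions(seq: str, cut: int, min_len: int = 2) -> List[MmejDeletion]:
    """Enumerate microhomology-mediated deletions around a cut site.

    `cut` is a 0-based index; the break lies between seq[cut - 1] and seq[cut].
    Longer microhomologies are visited first, so shorter matches that
    give the same repair product are not reported again.
    """
    seq = seq.upper()
    left, right = seq[:cut], seq[cut:]
    seen = set()
    results = []
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        for i in range(len(left) - k + 1):
            mh = left[i : i + k]
            j = right.find(mh)
            while j != -1:
                # Keep the upstream copy; drop everything up to and
                # including the downstream copy.
                product = seq[: i + k] + seq[cut + j + k :]
                if product not in seen:
                    seen.add(product)
                    length = cut + j - i
                    results.append(
                        MmejDeletion(mh, i, cut + j, length, length % 3 != 0, product)
                    )
                j = right.find(mh, j + 1)
    return results
```

For example, cutting `ATGCAGTTTCAGGA` after position 7 yields a single 6-bp, in-frame deletion mediated by the CAG repeat.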
Because human therapeutic treatments and crop genetic improvement are two main application areas of CRISPR/Cas technology, several web-based tools that are commonly used in animal and plant genome editing are recommended below.
CRISPOR
CRISPOR provides multiple tools, including efficiency prediction, specificity prediction, and primer design for vector construction as well as for on-target and off-target detection. CRISPOR incorporates almost all empirical algorithms for predicting efficiency, such as Rule Set 2 [37], [40], CRISPRscan [45], Wang et al. [31], Chari et al. [51], and Xu and coworkers [35]. It also applies the “deepCpf1” and “Najm et al.” scores to predict Cas12a and SaCas9 efficiencies [88], [89], [90], respectively. The predictions from these models are clearly displayed in the results. For specificity prediction, CRISPOR includes MIT and CFD, the two mainstream specificity scores. CRISPOR also integrates two CRISPR/Cas outcome predictive models, out-of-frame score and frameshift ratio [84], [85], to help users choose guides that are more likely to disrupt the reading frame. In addition, several critical factors such as the GC content and the type and number of mismatches (0–4 nt) are labeled in the results. CRISPOR covers hundreds of organisms, and different nucleases and PAM types are available for selection. These features allow the majority of researchers to use CRISPOR for designing different CRISPR/Cas genome editing experiments.
CHOPCHOP
CHOPCHOP is also a comprehensive website for sgRNA design. Both CRISPR/Cas and transcription activator-like effector nuclease (TALEN) systems are supported by CHOPCHOP. Additionally, CHOPCHOP provides various targeting systems, such as knockout, knock-in, gene activation, and gene repression. Similar to CRISPOR, CHOPCHOP also provides multiple predictive models, and the user can choose one of them to predict cutting specificity and efficiency. In addition, CHOPCHOP has a “Custom PAM” option that is convenient for choosing different PAM sequences. It has been reported that cell types may affect the DSB repair pathway and then influence CRISPR/Cas genome editing outcomes [91], [92]. Several cell types, including mESC, U2OS, HEK293, HCT116, and K562, are optional in the CHOPCHOP website for accurate outcome prediction. It is also important that CHOPCHOP is compatible with more than 200 genomes. It allows researchers to design sgRNAs in a specific region of a gene, such as 5′ UTR, 3′ UTR, promoter, or the coding region.
CRISPR RGEN Tools
CRISPR RGEN Tools is a CRISPR/Cas library platform that contains multiple sgRNA design tools. For example, CRISPR RGEN Tools employs Cas-designer for conventional CRISPR/Cas nucleases, BE-Designer for CRISPR base editing, and PE-Designer for CRISPR prime editing [93]. PE-Designer supports only SpCas9, whereas both Cas-designer and BE-Designer have wide PAM compatibility. More than 100 organisms are supported by these three tools. Microhomology-Predictor is an outcome-predictive tool that uses an out-of-frame score algorithm to evaluate potential in-frame deletions caused by the MMEJ repair approach [84]. In addition to CRISPR/Cas, this tool also supports other programmable nucleases, such as zinc finger nucleases (ZFNs) and TALENs, and an out-of-frame score over 66 is recommended. Thus, users can apply these tools to different experimental purposes and to design sgRNAs with high accuracy.
CRISPR-GE
CRISPR-GE is a web-based tool for designing sgRNAs in plants [94]. CRISPR-GE covers 41 plant genomes, including several agriculturally important crops, such as rice (Oryza sativa japonica), corn (Zea mays), and grape (Vitis vinifera). This tool also includes multiple Cas nucleases, such as SpCas9, FnCas12a, and AsCas12a, to help users design sgRNAs for different CRISPR/Cas systems. Additionally, CRISPR-GE provides a “User defined” option that allows users to customize PAM sequences (including 5′ and 3′ PAMs) and the length of target sites. CRISPR-GE issues warning notes that flag a “bad site”, such as one with very low or very high GC content, poly-T stretch(es), or contiguous base-pairing with the sgRNA. CRISPR-GE implements the CFD model to predict the specificity of a target site and also provides a primer design tool to assist vector construction and mutant detection.
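The kind of “bad site” warning described above can be illustrated with a minimal Python sketch. It is not the CRISPR-GE implementation: the function name, the GC thresholds (30% and 80%), and the poly-T length (four consecutive T) are assumptions chosen for illustration only, and the check for base-pairing with the sgRNA is omitted.

```python
def flag_bad_site(spacer, gc_low=0.30, gc_high=0.80, poly_t_len=4):
    """Return a list of warnings for a candidate spacer (DNA, 5'->3').

    The thresholds are illustrative assumptions, not values taken from
    CRISPR-GE or any other published tool.
    """
    spacer = spacer.upper()
    if not spacer:
        raise ValueError("spacer must not be empty")
    warnings = []
    # Fraction of G and C bases in the spacer
    gc = (spacer.count("G") + spacer.count("C")) / len(spacer)
    if gc < gc_low:
        warnings.append("low GC")
    elif gc > gc_high:
        warnings.append("high GC")
    # A run of consecutive T bases anywhere in the spacer
    if "T" * poly_t_len in spacer:
        warnings.append("poly-T")
    return warnings


if __name__ == "__main__":
    for candidate in ("GACGTTCAGGCATCAGTCAA", "ATATATATTTTTATATATAT"):
        print(candidate, flag_bad_site(candidate))
```

In practice such flags would be combined with a specificity score rather than used alone to accept or reject a target site.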
CRISPR-P
CRISPR-P is another web-based tool for designing sgRNAs in plants [95], [96]. It covers 75 plant genomes, the majority of which are important grain crops. Compared with CRISPR-GE, CRISPR-P offers more CRISPR/Cas PAM types, including NGG (SpCas9), NNAGAAW (St1Cas9), N4GMTT (NmCas9), NNGRRT (SaCas9), and NG (xCas9). Additionally, CRISPR-P allows users to choose U3 or U6 promoter-driven sgRNA expression cassettes when designing sgRNAs. Users can submit a gene ID/name, a position on a scaffold or chromosome, or a FASTA-format sequence as input. CRISPR-P implements Rule Set 1/2 and CFD to predict on-target and off-target effects. The sgRNA prediction outputs are well visualized and include sgRNA GC content, restriction endonuclease sites, sgRNA secondary structure [97], and microhomology score [84].
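The PAM types listed above are written with IUPAC degenerate base codes (for example, R = A or G and W = A or T). As a minimal sketch of how a design tool might locate candidate target sites for an arbitrary 3′ PAM, the Python example below translates a PAM into a regular expression and scans the forward strand only. The function names are hypothetical, reverse-strand and 5′ PAM searches are left out, and N4GMTT must be written out as NNNNGMTT.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def pam_to_regex(pam):
    """Translate an IUPAC-coded PAM such as 'NNGRRT' into a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_targets(seq, pam="NGG", spacer_len=20):
    """Return (start, spacer, pam) tuples for 3'-PAM sites on the forward strand.

    A lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(
        "(?=([ACGT]{%d})(%s))" % (spacer_len, pam_to_regex(pam))
    )
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    demo = "A" * 20 + "TGG"
    print(find_targets(demo, pam="NGG"))
    print(find_targets(demo, pam="NG"))
```

Relaxing the PAM from NGG to NG increases the number of candidate sites in the same sequence, which is why PAM choice directly affects the targetable space.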
AsCRISPR
AsCRISPR is a comprehensive web tool for designing sgRNAs for allele-specific genome elements, which can be used to discriminate between alleles. This tool is specifically designed for targeting dominant single nucleotide variants (SNVs) retrieved from ClinVar and OMIM databases [98]. In this publicly available web tool, several Cas enzymes, such as SpCas9, AsCas12a, and Cas12v, as well as CasX and their variants, can be selected. Currently, this web tool is only for targeting SNVs in the human and mouse genomes.
SNP-CRISPR
SNP-CRISPR is a web-based computational program for designing sgRNAs based on public variant datasets or user-identified variants [99]. It can be used for both model species and non-reference genomes as well as across varying genetic backgrounds, particularly for SNP-containing alleles. SNP-CRISPR also calculates the efficiency and specificity scores for sgRNA designs targeting both the variants and the reference.
PnB Designer
PnB Designer is a web-based tool for designing sgRNAs for both prime and base editors, two recently developed CRISPR/Cas genome editors [100]. PnB Designer designs sgRNAs for both single and multiple genome targets in several plant and animal species.
Sequence scan for CRISPR
Sequence scan for CRISPR (SSC; https://cistrome.org/SSC/) is an online web server for scanning sgRNA spacers [35]. It designs sgRNAs not only for CRISPR knockout but also for CRISPR inhibition or activation, with sgRNA efficiency prediction.
In addition to academic-developed publicly available computational tools, certain CRISPR companies have also developed several useful computational tools and resources for the public. These design tools include, but are not limited to: IDT (https://www.idtdna.com/site/order/designtool/index/CRISPR_CUSTOM), Horizon (https://horizondiscovery.com/en/ordering-and-calculation-tools/crispr-design-tool), and Synthego (https://www.synthego.com/products/bioinformatics/crispr-design-tool).
Best practice for downstream analysis and tools/resources available for performing downstream analysis
To identify desired genome editing events after CRISPR/Cas genome editing experiments, many experiment-based methods and computational tools have been developed for detecting the indels induced by genome editing enzymes in the targeted sequences. In 1995, Mashal and colleagues developed a method based on hetero-duplexed DNA (hdDNA) that is now frequently used to determine the level of activity of an sgRNA [101]. In this assay, reagents are transfected into the cells, and the genomic DNA surrounding the target locus is amplified by polymerase chain reaction (PCR). The PCR products are then denatured by heating and re-annealed by slow cooling. If an aberrant NHEJ event has occurred, heteroduplexes form between mutant and wild-type amplicons of different lengths. These heteroduplexes distort the DNA, and the distortion is recognized and cleaved by T7 endonuclease I (T7E1). This method has been widely adopted to test CRISPR/Cas9 genome editing events. However, the accuracy of the T7E1 assay has been questioned because of its low dynamic range and its requirement for heteroduplex formation, which can lead to incorrect estimates of sgRNA activity [102].
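For gel-based readouts of this assay, the cleaved fraction is often converted into an estimated indel frequency. The short Python sketch below uses a widely used approximation, indel (%) = 100 × (1 − √(1 − f_cut)), where f_cut is the fraction of PCR product that was cleaved. This formula is not given in the works cited above, and the function and parameter names are hypothetical.

```python
import math


def t7e1_indel_percent(uncut, cut_bands):
    """Estimate indel frequency (%) from band intensities on a T7E1 gel.

    uncut     -- intensity of the full-length (uncleaved) band
    cut_bands -- intensities of the cleavage product bands
    """
    cut = sum(cut_bands)
    total = uncut + cut
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    f_cut = cut / total
    # Square-root correction for random re-annealing of mutant and
    # wild-type strands into heteroduplexes
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


if __name__ == "__main__":
    print(round(t7e1_indel_percent(50, [25, 25]), 2))
```

Because the estimate depends on heteroduplex formation, it shares the limitations of the assay noted above and should be treated as a rough guide rather than a precise measurement.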
Decoding Sanger sequencing of on-target sites
To enable easy quantification of CRISPR/Cas9 genome editing products, several methods have been developed that directly decode Sanger sequencing data (Table 2). For example, tracking of indels by decomposition (TIDE) is a decomposition algorithm that precisely determines the indel spectrum and frequency of targeted mutations generated by CRISPR/Cas9 genome editing [103]. It is a simple and effective method for assessing the efficiency of well-performing sgRNAs. It requires only standard molecular biology reagents and involves three steps: a standard PCR reaction, Sanger sequencing, and decoding of the raw sequencing data with the TIDE web tool. The algorithm accurately reconstructs the spectrum of indels from the sequence traces, and the web tool reports the identity of the detected indels and their frequencies [104]. Moreover, it is highly effective at predicting indels of all sizes in sample clones as well as tracing indels in heterozygotes [102]. TIDE has been further extended to decompose the sequence data produced by template-directed CRISPR/Cas genome editing experiments [105]. Since the majority of CRISPR/Cas-induced mutations in plants are biallelic (two distinct variations), homozygous (two identical mutations), or heterozygous (wild-type/single mutation) [106], Liu and colleagues established a web-based tool, termed DSDecode, to automatically decode the superimposed sequencing chromatograms of CRISPR/Cas PCR products [107].
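The three mutation classes named above can be made concrete with a small Python sketch that assigns a genotype once the two allele sequences have been decoded. It does not decode chromatograms and is not part of DSDecode; the function name and the returned labels are hypothetical.

```python
def classify_genotype(allele_1, allele_2, wild_type):
    """Classify a diploid genotype from two decoded allele sequences."""
    a1, a2, wt = allele_1.upper(), allele_2.upper(), wild_type.upper()
    if a1 == wt and a2 == wt:
        return "wild-type"
    if a1 == wt or a2 == wt:
        # One wild-type allele plus one mutated allele
        return "heterozygous"
    if a1 == a2:
        # Two identical mutated alleles
        return "homozygous"
    # Two distinct mutated alleles
    return "biallelic"


if __name__ == "__main__":
    wt = "ACGTTGCA"
    print(classify_genotype("ACGTGCA", "ACGTTTGCA", wt))
```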
Table 2.
| Type | Name | Description | Website | Web server or standalone tool | Ref. |
|---|---|---|---|---|---|
| Decoding Sanger sequencing of on-target sites | TIDE | Quantifying non-templated CRISPR/Cas9 mutations | https://tide.nki.nl | Web server | [103] |
| | TIDER | Quantifying the indels of templated CRISPR/Cas9 editing | https://tide.nki.nl | Web server | [105] |
| | EditR | Quantifying base editing results | https://baseeditr.com/ | Standalone tool | [159] |
| | Poly peak parser | Quantifying heterozygous indels | http://yost.genetics.utah.edu/software.php | Standalone tool | [160] |
| | DSDecode | Automatically decoding the sequencing chromatograms | http://skl.scau.edu.cn/dsdecode/ | Standalone tool | [107] |
| NGS evaluation of targeted amplicon sequences | BATCH-GE | Detecting the on- and off-target impacts by analyzing deep sequencing data and calculating mutagenesis efficiencies | https://github.com/WouterSteyaert/BATCH-GE | Standalone tool | [115] |
| | CRISPR-GA | Quantifying and characterizing indels and homologous recombination events | https://crispr-ga.net | Web server | [108] |
| | CRISPResso2 | Enabling the users to analyze, visualize, and compare CRISPR outputs from hundreds of experiments using batch functionality | https://crispresso.pinellolab.partners.org/submission | Web server | [110] |
| | Cas-Analyzer | Measuring the frequencies of mutations induced by CRISPR/Cas9 and other programmable nucleases for NGS data analysis | http://www.rgenome.net/cas-analyzer/ | Web server | [109] |
| | CRIS.py | Providing a Python-based software to analyze NGS data for both knockout and knock-in (multiple users specified) modifications from one to thousands of samples at once | https://github.com/patrickc01/CRIS.py; https://s.stjude.org/video/player.html?videoId=6000021936001 | Standalone tool | [111] |
| | CRISPRpic | Providing precise mutation calling and ultrafast analysis of the sequencing results | https://github.com/compbio/CRISPRpic | Web server | [161] |
| | CRISPR-DAV | Providing high-throughput analysis of amplicon-based NGS data | https://github.com/pinetree1/crispr-dav | Standalone tool | [162] |
| | GNL-Scorer | Combining optimal datasets, models, and features, to address the cross-species problem | https://github.com/TerminatorJ/GNL_Scorer | Standalone tool | [114] |
| | CrispRVariants | Quantifying Sanger sequencing and high-throughput amplicon sequencing | https://www.bioconductor.org/packages/CrispRVariants | Standalone tool | [112] |
| NGS evaluation of pooled CRISPR/Cas9 libraries | CRISPRCloud2 | Providing accurately mapping short reads to CRISPR library; statistically aggregating the information across multiple sgRNAs targeting the same gene; providing a user-friendly data visualization and query interface; easy linking with other tools and bioinformatic resources for target preference | https://crispr.nrihub.org | Web server | [125] |
| | CRISPRAnalyzeR | Featuring with eight hit calling strategies including DESeq2, MAGeCK, edgeR, sgRSEA, Z-Ratio, Mann-Whitney test, ScreenBEAM, and BAGEL; exploring the pooled CRISPR/Cas9 screens | https://www.crispr-analyzer.org; https://www.github.com/boutroslab/CRISPRAnalyzeR | Standalone tool | [123] |
| | PinAPL-Py | Providing a comprehensive workflow covering quality control, automated sgRNA sequence extraction and alignment, sgRNA enrichment/depletion analysis, and gene ranking | https://pinapl-py.ucsd.edu | Web server | [124] |
| | MAGeCK | Providing analysis of large-scale screens | https://bitbucket.org/liulab/mageck/src/master/ | Standalone tool | [117] |
| | MAGeCK-VISPR | Providing analysis of large-scale screens | https://bitbucket.org/liulab/mageck-vispr | Standalone tool | [163] |
| | BAGEL | Providing analysis of large-scale screens | https://bagel-for-knockout-screens.sourceforge.net/ | Standalone tool | [121] |
| | HiTSelect | Providing analysis of large-scale screens | https://github.com/diazlab/HiTSelect | Standalone tool | [119] |
| | caRpools | Providing analysis of large-scale screens | https://github.com/boutroslab/caRpools | Standalone tool | [118] |
| | CHANGE-seq | Measuring the genome-wide activity of Cas9 | | Standalone tool | [130] |
| | ScreenBEAM | Providing analysis of large-scale screens | https://github.com/jyyu/ScreenBEAM | Standalone tool | [120] |
| | CERES | Providing CRISPR screen analysis | https://depmap.org/ceres/ | Standalone tool | [164] |
| | PBNPA | Providing analysis of large-scale screens | https://cran.r-project.org/web/packages/PBNPA/ | Standalone tool, database | [122] |
| NGS evaluation of off-target effects | DISCOVER-Seq | Detecting unbiasedly off-targets by precise tracking of MRE11; exploring molecular nature of Cas activity in cell with single-base resolution | N/A | Standalone tool | [129] |
| | HTGTS | Providing robust detection of DSBs generated by engineered nucleases based on their translocation to other endogenous or ectopic DSBs | weblogo.berkeley.edu | Standalone tool | [165] |
| | IDLVs | Detecting off-target cleavages with a frequency as low as 1%; providing frequent off-target sites up to 13 mismatches between the sgRNA and its genomic target | N/A | Standalone tool | [127] |
| | BLESS | Mapping DNA DSBs at nucleotide resolution by detecting telomere ends, Sce endonuclease-induced DSBs, and complex genome-wide DSB landscapes | N/A | Standalone tool | [126] |
| | BLISS | Measuring the location and frequency of DSBs in genome by direct labeling of DSBs in fixed cells or tissues; quantifying DSBs through unique molecular identifiers; low input requirement | N/A | Standalone tool | [166] |
| | GUIDE-seq | Providing unbiased and global detection of DSBs induced by CRISPR RNA-guided nucleases | N/A | Standalone tool | [69] |
| | GOTI | Providing comparison of edited and non-edited cells distinguished by Cre-loxP recombination system | https://github.com/sydaileen/GOTI-seq | Standalone tool | [167] |
| | SITE-Seq | Identifying off-targets in vitro by integrating biochemical assay to increase the enrichment of CRISPR/Cas cleavage fragments | N/A | Standalone tool | [168] |
| | Digenome-seq | Providing deep sequencing of in vitro Cas9-digested genomes | N/A | Standalone tool | [169] |
| | CRISPR-net | Quantifying CRISPR off-target activities with mismatches and indels | https://codeocean.com/capsule/9553651/tree/v1 | Standalone tool | [170] |
| | CIRCLE-seq | Providing a sensitive and unbiased in vitro genome-wide off-target identification strategy optimized by using restriction enzyme for circularization of randomly sheared genome DNA | N/A | Standalone tool | [128] |
| Evaluation and prediction of repair outcomes | inDelphi | Predicting the mutational outcomes | https://www.crisprindelphi.design/ | Web server, database | [86] |
| | SPROUT | Predicting the length, probability, and sequences of indels caused by CRISPR/Cas gene editing | https://zou-group.github.io/SPROUT | Web server | [113] |
Note: DSB, double strand break; NGS, next-generation sequencing.
Evaluation of targeted sequences by next-generation sequencing
With the rapid adoption of genome editing technology, massively parallel sequencing methods have been employed for assessing CRISPR/Cas post-experimental data. Next-generation sequencing (NGS) strategies have been developed for deeper quantification of targeted amplicon sequences. The CRISPR Genome Analyzer (CRISPR-GA) evaluates the NGS dataset and quantifies and characterizes the indels and homologous recombination events [108]. NGS also provides information regarding the selected locus, including quantification of edited sites and other mutations detected. After scanning the reads, locating indels, and computing the allelic replacements, CRISPR-GA provides the user with a combined report card that summarizes the detected genome editing events. Similarly, CRISPResso2 and Cas-Analyzer provide web-accessible tools for evaluating deep sequencing outcomes of CRISPR/Cas genome editing experiments [109], [110]; CRISPResso2 additionally provides specific optimizations for analyzing base editing outcomes [110].
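As a rough illustration of the quantification step, the Python sketch below summarizes amplicon reads by their net length difference from the reference amplicon. It is a deliberate simplification and not the algorithm of CRISPR-GA, CRISPResso2, or Cas-Analyzer: it assumes that every read spans the full amplicon, it performs no alignment, and it therefore misses substitutions and combined insertion/deletion events of equal size. All names are hypothetical.

```python
from collections import Counter


def indel_spectrum(reads, reference):
    """Return {net indel size: fraction of reads}.

    Negative sizes are net deletions, positive sizes are net insertions,
    and 0 means the read has the same length as the reference amplicon.
    """
    if not reads:
        raise ValueError("at least one read is required")
    sizes = Counter(len(read) - len(reference) for read in reads)
    total = sum(sizes.values())
    return {size: count / total for size, count in sorted(sizes.items())}


def editing_efficiency(spectrum):
    """Fraction of reads whose length differs from the reference."""
    return 1.0 - spectrum.get(0, 0.0)


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    reads = [ref, ref, ref[:4] + ref[6:], ref[:5] + "T" + ref[5:]]
    spectrum = indel_spectrum(reads, ref)
    print(spectrum, editing_efficiency(spectrum))
```

Dedicated tools align each read around the expected cut site before calling alleles, which is what allows them to report the position and sequence of each indel in addition to its size.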
Programming languages such as Python and R underpin several efficient bioinformatic tools that accurately detect modifications in edited genomes from NGS datasets. For example, “CRIS.py” is a simple and highly versatile program that analyzes NGS data and identifies knockout and multiple user-defined knock-in alterations from one up to thousands of CRISPR/Cas9-edited samples [111]. CrispRVariants is an R-based toolkit for evaluating and visualizing mutant allele types, locations, and frequencies [112]. The repair outcomes of CRISPR/Cas9-generated DSBs were recently studied extensively in human primary T cells, in which Leenay and colleagues sequenced the repair outcomes at 1656 on-target genomic sites [113]; they then used the sequencing data to develop and train a machine learning model, termed CRISPR Repair OUTcome (SPROUT). SPROUT includes all the datasets generated from the 1656 CRISPR on-target sites and can be used to predict the length, probability, and sequences of indels generated by CRISPR/Cas9 [113]. In another study, Wang and colleagues collected 13 datasets from previously reported CRISPR/Cas genome editing experiments in six different species, including human, mouse, zebrafish, Drosophila, Ciona intestinalis, and C. elegans; after machine learning and featurization with eight different models, they developed an algorithm, called GNL-Scorer, for predicting CRISPR target activities [114]. GNL-Scorer, in both its GNL and GNL-Human versions, is a computational model based on Bayesian Ridge Regression (BRR), which combines optimal datasets and features to address the cross-species problem. Both SPROUT and GNL-Scorer will promote CRISPR sgRNA design and enhance the application of CRISPR/Cas-based genome editing.
BATCH-GE is another easy-to-use computational tool for identifying CRISPR/Cas-derived indel mutations and other precise genome editing events, including both on- and off-target impacts, by analyzing the large datasets generated by deep sequencing [115], [116].
NGS evaluation of pooled CRISPR/Cas9 libraries
Given the size and diversity of the data generated by pooled CRISPR/Cas9 screens, the majority of conventional methods are not sufficient to evaluate them. To this end, several algorithms have been specifically developed for interpreting raw sequencing outputs of CRISPR/Cas9 screens, such as Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout (MAGeCK) [117], caRpools [118], HiTSelect [119], Screening Bayesian Evaluation and Analysis Method (ScreenBEAM) [120], Bayesian Analysis of Gene Essentiality (BAGEL) [121], and Permutation Based Non-Parametric Analysis of CRISPR/Cas9 screen data (PBNPA) [122]. Since these analysis methods were developed for users skilled in bioinformatics, they are difficult for many biologists or researchers with less programming background to implement. To simplify the analysis procedure, web-based interfaces have been developed that enable users to evaluate pooled CRISPR/Cas9 screening data. CRISPRAnalyzeR is the first end-to-end analysis pipeline that integrates eight different algorithms for the identification of candidate genes. In addition, CRISPRAnalyzeR is built in R and can be easily installed locally [123]. The PinAPL-Py workflow offers various statistical models, sequence quality checks, automated sgRNA sequence extraction, precise sequence alignment, sgRNA enrichment or depletion analysis, and gene ranking [124]. It supports a variety of well-known sgRNA libraries as well as easily uploadable custom libraries. Importantly, it can analyze multiple CRISPR/Cas editing experiments. PinAPL-Py ranks both sgRNAs and genes, and it provides ready-to-publish plots. However, both CRISPRAnalyzeR and PinAPL-Py have several rate-limiting steps, such as the long time needed for raw FASTQ file transfer and complicated parameter tuning for alignment. CRISPRCloud2 employs Amazon Web Services to decrease processing time and satisfy data-privacy requirements. Additionally, an adaptive hash-mapping algorithm was introduced into CRISPRCloud2 to increase alignment speed and accuracy [125].
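The enrichment/depletion analysis and gene ranking steps shared by these pipelines can be sketched in a few lines of Python. The example is a generic simplification and does not reproduce the statistical model of MAGeCK, BAGEL, PinAPL-Py, or any other tool named above: it normalizes sgRNA read counts to counts per million, computes a log2 fold change with a pseudocount, and summarizes each gene by the median of its sgRNAs. The function names, the pseudocount, and the use of the median are assumptions.

```python
import math
from statistics import median


def normalize(counts):
    """Scale raw sgRNA read counts to counts per million."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total read count must be positive")
    return {sgrna: count / total * 1e6 for sgrna, count in counts.items()}


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Per-sgRNA log2 fold change (treated vs. control) on normalized counts."""
    c, t = normalize(control), normalize(treated)
    return {
        sgrna: math.log2((t.get(sgrna, 0.0) + pseudocount) / (c[sgrna] + pseudocount))
        for sgrna in c
    }


def gene_scores(lfc, sgrna_to_gene):
    """Summarize each gene by the median log2 fold change of its sgRNAs."""
    per_gene = {}
    for sgrna, value in lfc.items():
        per_gene.setdefault(sgrna_to_gene[sgrna], []).append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


if __name__ == "__main__":
    control = {"sgA1": 100, "sgA2": 100, "sgB1": 100, "sgB2": 100}
    treated = {"sgA1": 25, "sgA2": 25, "sgB1": 175, "sgB2": 175}
    mapping = {"sgA1": "GENE_A", "sgA2": "GENE_A", "sgB1": "GENE_B", "sgB2": "GENE_B"}
    scores = gene_scores(log2_fold_changes(control, treated), mapping)
    for gene, score in sorted(scores.items(), key=lambda item: item[1]):
        print(gene, round(score, 3))
```

A negative gene score indicates depletion of its sgRNAs and a positive score indicates enrichment; the published tools add variance modeling and significance testing on top of this basic idea.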
NGS evaluation of off-target effects
Off-target impact is one of the major challenges for CRISPR/Cas application in gene therapy and crop improvement as well as in other areas, such as gene function studies. To reduce potential off-target impacts, many strategies have been developed, which include but are not limited to selecting high-affinity Cas enzymes, designing better sgRNAs, and using the right CRISPR/Cas reagent delivery system. However, identifying all potential off-targets is still a challenge. Identifying and quantifying unexpected genome targeting events are essential to assess the fidelity of genome editing tools as well as to guarantee the safety of gene therapeutic applications. Currently, NGS has proved to be a reliable technology for identifying potential off-target impacts as well as targeted and cleaved genome sites. However, NGS generates a vast number of sequence reads, which require special computational programs to identify off-target sequences. To solve this problem, several research laboratories have in recent years developed computational tools that use NGS data to highlight off-target activities in addition to the intended edits in the genome (Table 2). Crosetto and colleagues presented a method called “direct in situ breaks-labeling enrichment on streptavidin and next-generation sequencing (BLESS)” that scans for DSBs at the whole-genome level by using Instant-seq software for Illumina sequencing data [126]. The efficiency of BLESS was tested in human and mouse cells by using various DSB-inducing reagents and sequencing platforms. This method can identify telomere ends, Sce endonuclease-induced DSBs, and complex genome-wide DSBs. In human cells, BLESS identified more than 2000 unevenly distributed aphidicolin-sensitive regions (ASRs), which served as the principal proof of the utility of BLESS at the whole-genome level.
Genome-wide unbiased identification of DSBs enabled by sequencing (GUIDE-seq) is an experimental approach for the global detection of DNA DSBs, which identifies off-target cleavage generated by Cas nucleases and potentially other nucleases, such as TALENs [69]. To identify off-target sequences by GUIDE-seq, the authors customized a bin-consensus variant-calling algorithm based on molecular indexes and SAMtools; this computational program distinguishes off-target sequences from the reference sequences. This method can detect off-target cleavage activities that previous computational methods or chromatin immunoprecipitation sequencing (ChIP-seq) could not detect. GUIDE-seq also detects Cas-independent genomic DSB hotspots. Given that linear double-stranded integrase-defective lentiviral vectors (IDLVs) integrate preferentially into nuclease-induced DSBs through the NHEJ repair pathway, they have been employed to detect CRISPR/Cas-induced off-target cleavages at frequencies as low as 1% [127]. IDLV-based analysis also shows that the Cas9 protein induces frequent off-target cleavages at sites with a 1-bp bulge or up to 13 mismatches between the sgRNA and its genomic DNA target, which may help in refining sgRNA design [127]. Circularization for in vitro reporting of cleavage effects by sequencing (CIRCLE-seq) identifies off-targets at the genome-wide level by mapping paired-end reads and searching for off-target sites using bwa mem and samtools mpileup. This NGS and computational approach can be used not only for organisms with reference genome sequences but also for organisms without reference genomes [128]. However, off-target discovery methods that use purified genomic DNA or specific cellular models are not capable of direct in vivo detection.
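The notion of mismatches between an sgRNA and a candidate genomic site can be illustrated with a minimal Python sketch that counts position-wise differences and keeps the sites below a chosen threshold. This is not the site-calling procedure of GUIDE-seq, IDLV capture, or CIRCLE-seq: the function names and the default threshold of four mismatches are assumptions, and bulges are not handled because only sequences of equal length are compared.

```python
def count_mismatches(spacer, site):
    """Number of positions at which two equal-length sequences differ."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return sum(1 for a, b in zip(spacer.upper(), site.upper()) if a != b)


def filter_candidates(spacer, sites, max_mismatches=4):
    """Return (site, mismatches) pairs within the threshold, best match first."""
    hits = []
    for site in sites:
        n = count_mismatches(spacer, site)
        if n <= max_mismatches:
            hits.append((site, n))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    spacer = "GACGTTCAGGCATCAGTCAA"
    sites = ["GACGTTCAGGCATCAGTCTT", "TTTTTTTTTTTTTTTTTTTT", spacer]
    print(filter_candidates(spacer, sites))
```

A fixed similarity threshold of this kind would not retrieve sites with a bulge or with many mismatches, which is one reason why the experimental, unbiased detection methods described here complement in silico prediction.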
To overcome this issue, a recently developed and universally applicable approach called “discovery of in situ Cas off-targets and verification by sequencing (DISCOVER-Seq)” can be used to detect off-target effects in vivo [129]. This unbiased off-target identification approach relies on the recruitment of DNA repair factors in both cells and organisms. By tracking one of these factors, MRE11 [a subunit of the MRE11–RAD50–NBS1 (MRN) complex, which is tightly distributed around the Cas9 cut site], this approach can detect off-target activities with single-base resolution. Moreover, DISCOVER-Seq works with several sgRNA formats and different types of Cas proteins, which enables the characterization of new genome editing tools. Based on large-scale data analysis and a machine learning model, Lazzarotto and colleagues developed a “circularization for high-throughput analysis of nuclease genome-wide effects by sequencing (CHANGE-seq)” method for measuring the genome-wide activity of Cas9 in vitro, which includes both genetic and epigenetic impacts as well as off-target effects. Using this method, the authors identified 201,934 off-target sites from 110 sgRNA targets across 13 therapeutically relevant loci in human primary T cells [130]. From this study, they also observed that CRISPR/Cas9-induced off-target impacts were more likely to occur near active promoters, enhancers, and transcribed regions. With the rapid development of these NGS-based off-target detection approaches, more data can be produced from living therapeutic cells, which will boost the evolution of machine learning models and enhance alignment algorithms for identifying off-target impacts of CRISPR/Cas at the whole-genome level.
Conclusion and perspectives
Given the versatility and robustness of CRISPR/Cas-based genome editing, many interdisciplinary scientists have been working to enhance this technology, including screening functionally active CRISPR/Cas nucleases, clarifying key determinants of sgRNA specificity, and reducing off-target potentials. The rapid development of computational algorithms and tools greatly accelerates the application of CRISPR/Cas9 genome editing technology, particularly through the design of optimal sgRNAs and post-editing data analysis. Up to now, many computational tools have been developed for designing sgRNAs and analyzing the potential on- and off-target impacts of different CRISPR/Cas genome editing systems. Some of these programs are publicly available and have web servers for quick operation. To meet the new applications of the CRISPR/Cas systems, new computational tools for performing and analyzing CRISPR/Cas events have also been recently developed, such as scMAGeCK [131], CRISPRO [132], and ProTiler [133]. scMAGeCK links genotypes with multiple phenotypes in single-cell CRISPR screens [131]. CRISPRO maps functional scores associated with guide RNAs to genomes, transcripts, and protein coordinates and structures, which can be used to predict improved sgRNA efficacy [132]. ProTiler is used for the analysis and visualization of CRISPR screens with a tiling-sgRNA design [133]. However, several gaps remain in developing new sgRNA analysis tools to meet the needs of rapidly evolving CRISPR/Cas genome editing techniques.
The parameters used for building sgRNA scoring algorithms are mainly based on the data generated by CRISPR/Cas9 and CRISPR/Cas12a genome editing systems [37], [40], [89], which create targeted DNA mutagenesis via DSBs. Currently, numerous precise genome editors, such as prime editors and epigenetic editors, have been developed that are capable of rewriting genome sequences without inducing DSBs or requiring donor DNA templates, and these are especially promising tools for executing high-throughput screening and modifying base mutations [12], [134]. Given that prime editors are capable of achieving desired sequence insertions, deletions, and all 12 types of base conversions, they have been rapidly adopted in many organisms. Unlike conventional sgRNAs, the binding and sequence-specific conversion in prime editing rely on an engineered multifunctional pegRNA [61]. In addition to the common sgRNA features, pegRNAs have a programmable 3′ end, which is composed of an RT template that functions to guide DNA repair and a PBS that anneals to the nicked target DNA strand [61]. A previous study suggests that both PBS length and RT template length are important for prime editing efficiency. The suggested PBS length range is 8–15 nt, whereas RT templates are typically 10–20 nt in length [12], [61]. In addition, GC content and RT template secondary structure may affect editing efficiency as well. Due to the complex combination matrix of possible PBS and RT lengths, the best method for designing pegRNAs still depends mainly on experience [12], [61]. Thus, a comprehensive study of the key determinants of prime editing efficiency based on large-scale experimental data would be an effective approach for constructing pegRNA design tools. Additionally, as more Cas enzymes are discovered and refined, new sgRNA design programs are also needed for these newly developed CRISPR/Cas systems.
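The size of this combination matrix can be made explicit with a short Python sketch that enumerates every PBS/RT-template length pair within the ranges quoted above (8–15 nt and 10–20 nt). The function name is hypothetical, and the sketch only enumerates lengths; it does not build pegRNA sequences or predict editing efficiency.

```python
from itertools import product


def pegrna_length_grid(pbs_range=(8, 15), rt_range=(10, 20)):
    """List all (PBS length, RT template length) pairs; ranges are inclusive."""
    pbs_lengths = range(pbs_range[0], pbs_range[1] + 1)
    rt_lengths = range(rt_range[0], rt_range[1] + 1)
    return list(product(pbs_lengths, rt_lengths))


if __name__ == "__main__":
    grid = pegrna_length_grid()
    print(len(grid), grid[0], grid[-1])
```

With 8 possible PBS lengths and 11 possible RT template lengths, a single target site already yields 88 candidate designs before GC content or secondary structure is considered, which illustrates why empirical design currently dominates.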
Constructing sgRNA-directed mutation libraries is one of the most effective strategies to identify gene function and regulatory gene interaction networks. Commonly used empirical algorithms are primarily derived from large-scale sgRNA analyses in human cells and the zebrafish model, but many studies demonstrate that genome editing efficiency and specificity vary widely among different organisms. Indeed, the probability of off-targets is generally lower in plant species than in animals [68], [135], [136], [137], [138], [139]. In addition to sequence features, various other factors that affect sgRNA activity have been identified, such as chromatin accessibility, gene position, nucleosomes, and epigenomic markers [55], [140], [141], [142]. Chromatin accessibility has been demonstrated to play a dominant role in determining genome-wide binding of dCas9-sgRNA [42]. However, chromatin accessibility varies among organisms [143], [144]. Thus, comprehensive analysis of sgRNA sequence features and chromatin data across organisms might provide new insights into further optimizing scoring algorithms and computational tools.
With the rapid development of CRISPR/Cas-based genome editing, its use is no longer limited to creating targeted mutagenesis in protein-coding regions. Genome editing of upstream open reading frames (uORFs) provides a new way to fine-tune gene translation by means of endogenous regulatory elements. Although uORFs are found widely in eukaryotic genomes, their roles remain to be elucidated [145], [146], [147], [148]. Additionally, small RNAs are an extensive class of widespread gene regulators in eukaryotic organisms and are implicated in various regulatory processes [149], [150], [151], [152], [153]. High-throughput, genome-wide functional identification by genome editing of uORFs and small RNAs has great potential to dissect the mechanisms of gene regulation. Although a number of uORF and small RNA databases are available for a wide range of eukaryotic organisms, they are not integrated into sgRNA design platforms. Currently, there are no computational tools for designing sgRNAs for genome editing of small RNAs and uORFs. To quickly elucidate the roles of small RNAs, particularly microRNAs (miRNAs), scientists from both wet and dry labs should work together to develop a powerful strategy for designing sgRNAs for small RNA genome editing based on the characteristics of miRNAs, such as stem-loop structures and miRNA biogenesis [153].
The active maintenance and optimization of current computational tools is another main concern. Doench and coworkers analyzed 26,000 website-based computational tools and found that about 30% of them were inaccessible [154]. As the mechanisms underlying CRISPR/Cas binding and cleavage are clarified, the parameters and algorithms used for sgRNA scoring need to be updated continuously. With the growing accumulation of experiment-based data, the existing predictive models will be further trained, which will in turn accelerate the evolution of CRISPR/Cas applications. Frequent updates of currently available computational resources and tools will enhance the application of CRISPR/Cas-based genome editing.
Additionally, a large number of computational tools are now available, including sgRNA design databases and tools for predicting CRISPR/Cas genome editing efficiency as well as for on- and off-target analyses. Different tools have different advantages and disadvantages and are suited to different organisms. Thus, selecting the right tool for a specific CRISPR/Cas genome editing experiment is critical. When selecting a computational tool, researchers first need to know which species and even which cell types they are working on and which Cas enzymes they are using. In many cases, multiple computational tools can be used, and they may perform differently because they were designed based on different datasets and criteria. It is also important that further investigations uncover the causes of the differences among tools. In a recent paper, Yan and colleagues presented a way to choose a tool for designing on-target sgRNAs, and they suggest that different computational tools may be recommended in different scenarios [70]. Developing a learning-based model that also incorporates other features, such as sgRNA sequences and their structures, is the right direction for designing a good sgRNA and predicting sgRNA efficiency [70]. With the help of computational tools and resources, CRISPR/Cas-based genome editing will move forward more quickly than expected.
Competing interests
The authors have declared no competing interests.
CRediT authorship contribution statement
Chao Li: Conceptualization, Writing – original draft, Visualization, Writing – review & editing. Wen Chu: Writing – original draft. Rafaqat Ali Gill: Writing – original draft. Shifei Sang: Writing – original draft. Yuqin Shi: Writing – review & editing. Xuezhi Hu: Writing – review & editing. Yuting Yang: Visualization. Qamar U. Zaman: Writing – review & editing. Baohong Zhang: Conceptualization, Supervision, Funding acquisition, Writing – review & editing. All authors have read and approved the final manuscript.
Acknowledgments
We greatly appreciate Dr. Jeffrey McKinnon for his thoughtful proofreading and helpful suggestions on this manuscript. We also greatly appreciate the scientific community for making huge progress in this field. We have tried to cite as many references as possible. However, due to page limitations, some important works may not be cited here; we apologize for this. The work in Dr. Baohong Zhang’s Laboratory is supported in part by Cotton Incorporated and the National Science Foundation, the United States (Grant No. 1658709). This work was also supported by the National Natural Science Foundation of China (Grant No. 31700316), the Fundamental Research Funds for the Central Nonprofit Scientific Institution (Grant No. 1610172018009), and the Natural Science Foundation of Hubei Province, China (Grant No. 2018CFB543).
Handled by Xiaole Shirley Liu
Footnotes
Peer review under responsibility of Beijing Institute of Genomics, Chinese Academy of Sciences / China National Center for Bioinformation and Genetics Society of China
Contributor Information
Chao Li, Email: lichao01@caas.cn.
Baohong Zhang, Email: zhangb@ecu.edu.
References
- 1.Cebrian-Serrano A., Davies B. CRISPR-Cas orthologues and variants: optimizing the repertoire, specificity and delivery of genome engineering tools. Mamm Genome. 2017;28:247–261. doi: 10.1007/s00335-017-9697-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Mojica F.J.M., Díez-Villaseñor C., García-Martínez J., Almendros C. Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology (Reading) 2009;155:733–740. doi: 10.1099/mic.0.023960-0. [DOI] [PubMed] [Google Scholar]
- 3.Cho S.W., Kim S., Kim J.M., Kim J.S. Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat Biotechnol. 2013;31:230–232. doi: 10.1038/nbt.2507. [DOI] [PubMed] [Google Scholar]
- 4.Hwang W.Y., Fu Y., Reyon D., Maeder M.L., Tsai S.Q., Sander J.D., et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol. 2013;31:227–229. doi: 10.1038/nbt.2501. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Jiang W., Bikard D., Cox D., Zhang F., Marraffini L.A. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat Biotechnol. 2013;31:233–239. doi: 10.1038/nbt.2508. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Xie K., Yang Y. RNA-guided genome editing in plants using a CRISPR-Cas system. Mol Plant. 2013;6:1975–1983. doi: 10.1093/mp/sst119. [DOI] [PubMed] [Google Scholar]
- 7.Li J.F., Norville J.E., Aach J., McCormack M., Zhang D., Bush J., et al. Multiplex and homologous recombination-mediated genome editing in Arabidopsis and Nicotiana benthamiana using guide RNA and Cas9. Nat Biotechnol. 2013;31:688–691. doi: 10.1038/nbt.2654. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Nekrasov V., Staskawicz B., Weigel D., Jones J.D.G., Kamoun S. Targeted mutagenesis in the model plant Nicotiana benthamiana using Cas9 RNA-guided endonuclease. Nat Biotechnol. 2013;31:691–693. doi: 10.1038/nbt.2655. [DOI] [PubMed] [Google Scholar]
- 9.Cong L., Ran F.A., Cox D., Lin S., Barretto R., Habib N., et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–823. doi: 10.1126/science.1231143. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Li C., Brant E., Budak H., Zhang B. CRISPR/Cas: a Nobel Prize award-winning precise genome editing technology for gene therapy and crop improvement. J Zhejiang Univ Sci B. 2021;22:253–284. doi: 10.1631/jzus.B2100009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Meier J.A., Zhang F., Sanjana N.E. GUIDES: sgRNA design for loss-of-function screens. Nat Methods. 2017;14:831–832. doi: 10.1038/nmeth.4423. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Anzalone A.V., Koblan L.W., Liu D.R. Genome editing with CRISPR-Cas nucleases, base editors, transposases and prime editors. Nat Biotechnol. 2020;38:824–844. doi: 10.1038/s41587-020-0561-9. [DOI] [PubMed] [Google Scholar]
- 13.Chen K., Wang Y., Zhang R., Zhang H., Gao C. CRISPR/Cas genome editing and precision plant breeding in agriculture. Annu Rev Plant Biol. 2019;70:667–697. doi: 10.1146/annurev-arplant-050718-100049. [DOI] [PubMed] [Google Scholar]
- 14.Zhang B. CRISPR/Cas gene therapy. J Cell Physiol. 2020;236:2459–2481. doi: 10.1002/jcp.30064. [DOI] [PubMed] [Google Scholar]
- 15.Shalem O., Sanjana N.E., Zhang F. High-throughput functional genomics using CRISPR-Cas9. Nat Rev Genet. 2015;16:299–311. doi: 10.1038/nrg3899. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Dominguez A.A., Lim W.A., Qi L.S. Beyond editing: repurposing CRISPR-Cas9 for precision genome regulation and interrogation. Nat Rev Mol Cell Biol. 2016;17:5–15. doi: 10.1038/nrm.2015.2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Thakore P.I., Black J.B., Hilton I.B., Gersbach C.A. Editing the epigenome: technologies for programmable transcription and epigenetic modulation. Nat Methods. 2016;13:127–137. doi: 10.1038/nmeth.3733. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Adli M. The CRISPR tool kit for genome editing and beyond. Nat Commun. 2018;9:1911. doi: 10.1038/s41467-018-04252-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Pickar-Oliver A., Gersbach C.A. The next generation of CRISPR-Cas technologies and applications. Nat Rev Mol Cell Biol. 2019;20:490–507. doi: 10.1038/s41580-019-0131-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Chen S.P., Wang H.H. An engineered Cas-Transposon system for programmable and site-directed DNA transpositions. CRISPR J. 2019;2:376–394. doi: 10.1089/crispr.2019.0030. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Chaikind B., Bessen J.L., Thompson D.B., Hu J.H., Liu D.R. A programmable Cas9-serine recombinase fusion protein that operates on DNA sequences in mammalian cells. Nucleic Acids Res. 2016;44:9758–9770. doi: 10.1093/nar/gkw707. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Kearns N.A., Genga R.M.J., Enuameh M.S., Garber M., Wolfe S.A., Maehr R. Cas9 effector-mediated regulation of transcription and differentiation in human pluripotent stem cells. Development. 2014;141:219–223. doi: 10.1242/dev.103341. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Hilton I.B., D'Ippolito A.M., Vockley C.M., Thakore P.I., Crawford G.E., Reddy T.E., et al. Epigenome editing by a CRISPR-Cas9-based acetyltransferase activates genes from promoters and enhancers. Nat Biotechnol. 2015;33:510–517. doi: 10.1038/nbt.3199. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Choudhury S.R., Cui Y., Lubecka K., Stefanska B., Irudayaraj J. CRISPR-dCas9 mediated TET1 targeting for selective DNA demethylation at BRCA1 promoter. Oncotarget. 2016;7:46545–46556. doi: 10.18632/oncotarget.10234. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Zhang D., Zhang Z., Unver T., Zhang B. CRISPR/Cas: a powerful tool for gene function study and crop improvement. J Adv Res. 2021;29:207–221. doi: 10.1016/j.jare.2020.10.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Tong S., Moyo B., Lee C.M., Leong K., Bao G. Engineered materials for in vivo delivery of genome-editing machinery. Nat Rev Mater. 2019;4:726–737. doi: 10.1038/s41578-019-0145-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Yin H., Kauffman K.J., Anderson D.G. Delivery technologies for genome editing. Nat Rev Drug Discov. 2017;16:387–399. doi: 10.1038/nrd.2016.280. [DOI] [PubMed] [Google Scholar]
- 28.Chuai G.H., Wang Q.L., Liu Q. In silico meets in vivo: towards computational CRISPR-based sgRNA design. Trends Biotechnol. 2017;35:12–21. doi: 10.1016/j.tibtech.2016.06.008. [DOI] [PubMed] [Google Scholar]
- 29.Tang X., Ren Q., Yang L., Bao Y., Zhong Z., He Y., et al. Single transcript unit CRISPR 2.0 systems for robust Cas9 and Cas12a mediated plant genome editing. Plant Biotechnol J. 2019;17:1431–1445. doi: 10.1111/pbi.13068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Tang X., Zheng X., Qi Y., Zhang D., Cheng Y., Tang A., et al. A single transcript CRISPR-Cas9 system for efficient genome editing in plants. Mol Plant. 2016;9:1088–1091. doi: 10.1016/j.molp.2016.05.001. [DOI] [PubMed] [Google Scholar]
- 31.Wang T., Wei J.J., Sabatini D.M., Lander E.S. Genetic screens in human cells using the CRISPR-Cas9 system. Science. 2014;343:80–84. doi: 10.1126/science.1246981. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Yang H., Wang H., Shivalila C.S., Cheng A.W., Shi L., Jaenisch R. One-step generation of mice carrying reporter and conditional alleles by CRISPR/Cas-mediated genome engineering. Cell. 2013;154:1370–1379. doi: 10.1016/j.cell.2013.08.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Hsu P.D., Scott D.A., Weinstein J.A., Ran F.A., Konermann S., Agarwala V., et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013;31:827–832. doi: 10.1038/nbt.2647. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Shalem O., Sanjana N.E., Hartenian E., Shi X., Scott D.A., Mikkelsen T., et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 2014;343:84–87. doi: 10.1126/science.1247005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Xu H., Xiao T., Chen C.H., Li W., Meyer C.A., Wu Q., et al. Sequence determinants of improved CRISPR sgRNA design. Genome Res. 2015;25:1147–1157. doi: 10.1101/gr.191452.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Labuhn M., Adams F.F., Ng M., Knoess S., Schambach A., Charpentier E.M., et al. Refined sgRNA efficacy prediction improves large- and small-scale CRISPR-Cas9 applications. Nucleic Acids Res. 2018;46:1375–1385. doi: 10.1093/nar/gkx1268. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Doench J.G., Fusi N., Sullender M., Hegde M., Vaimberg E.W., Donovan K.F., et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol. 2016;34:184–191. doi: 10.1038/nbt.3437. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Haeussler M., Schonig K., Eckert H., Eschstruth A., Mianne J., Renaud J.B., et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 2016;17:148. doi: 10.1186/s13059-016-1012-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Stemmer M., Thumberger T., Keyer M.D., Wittbrodt J., Mateo J.L. CCTop: an intuitive, flexible and reliable CRISPR/Cas9 target prediction tool. PLoS One. 2015;10:e0124633. doi: 10.1371/journal.pone.0124633. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Doench J.G., Hartenian E., Graham D.B., Tothova Z., Hegde M., Smith I., et al. Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation. Nat Biotechnol. 2014;32:1262–1267. doi: 10.1038/nbt.3026. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Gagnon J.A., Valen E., Thyme S.B., Huang P., Ahkmetova L., Pauli A., et al. Efficient mutagenesis by Cas9 protein-mediated oligonucleotide insertion and large-scale assessment of single-guide RNAs. PLoS One. 2014;9:e98186. doi: 10.1371/journal.pone.0098186. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Wu X., Scott D.A., Kriz A.J., Chiu A.C., Hsu P.D., Dadon D.B., et al. Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nat Biotechnol. 2014;32:670–676. doi: 10.1038/nbt.2889. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Orioli A., Pascali C., Quartararo J., Diebel K.W., Praz V., Romascano D., et al. Widespread occurrence of non-canonical transcription termination by human RNA polymerase III. Nucleic Acids Res. 2011;39:5499–5512. doi: 10.1093/nar/gkr074. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Nielsen S., Yuzenkova Y., Zenkin N. Mechanism of eukaryotic RNA polymerase III transcription termination. Science. 2013;340:1577–1580. doi: 10.1126/science.1237934. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Moreno-Mateos M.A., Vejnar C.E., Beaudoin J.D., Fernandez J.P., Mis E.K., Khokha M.K., et al. CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo. Nat Methods. 2015;12:982–988. doi: 10.1038/nmeth.3543. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Montague T.G., Cruz J.M., Gagnon J.A., Church G.M., Valen E. CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing. Nucleic Acids Res. 2014;42:W401–W407. doi: 10.1093/nar/gku410. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Labun K., Montague T.G., Gagnon J.A., Thyme S.B., Valen E. CHOPCHOP v2: a web tool for the next generation of CRISPR genome engineering. Nucleic Acids Res. 2016;44:W272–W276. doi: 10.1093/nar/gkw398. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Fusi N., Smith I., Doench J., Listgarten J. In silico predictive modeling of CRISPR/Cas9 guide efficiency. bioRxiv. 2015:021568. [Google Scholar]
- 49.Heigwer F., Kerr G., Boutros M. E-CRISP: fast CRISPR target site identification. Nat Methods. 2014;11:122–123. doi: 10.1038/nmeth.2812. [DOI] [PubMed] [Google Scholar]
- 50.Chari R., Yeo N.C., Chavez A., Church G.M. sgRNA Scorer 2.0: a species-independent model to predict CRISPR/Cas9 activity. ACS Synth Biol. 2017;6:902–904. doi: 10.1021/acssynbio.6b00343. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Chari R., Mali P., Moosburner M., Church G.M. Unraveling CRISPR-Cas9 genome engineering parameters via a library-on-library approach. Nat Methods. 2015;12:823–826. doi: 10.1038/nmeth.3473. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Listgarten J., Weinstein M., Kleinstiver B.P., Sousa A.A., Joung J.K., Crawford J., et al. Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs. Nat Biomed Eng. 2018;2:38–47. doi: 10.1038/s41551-017-0178-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Singh R., Kuscu C., Quinlan A., Qi Y., Adli M. Cas9-chromatin binding information enables more accurate CRISPR off-target prediction. Nucleic Acids Res. 2015;43:e118. doi: 10.1093/nar/gkv575. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Lee C.M., Davis T.H., Bao G. Examination of CRISPR/Cas9 design tools and the effect of target site accessibility on Cas9 activity. Exp Physiol. 2018;103:456–460. doi: 10.1113/EP086043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Horlbeck M.A., Witkowsky L.B., Guglielmi B., Replogle J.M., Gilbert L.A., Villalta J.E., et al. Nucleosomes impede Cas9 access to DNA in vivo and in vitro. Elife. 2016;5:e12677. doi: 10.7554/eLife.12677. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Abadi S., Yan W.X., Amar D., Mayrose I. A machine learning approach for predicting CRISPR-Cas9 cleavage efficiencies and patterns underlying its mechanism of action. PLoS Comput Biol. 2017;13:e1005807. doi: 10.1371/journal.pcbi.1005807. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Chuai G., Ma H., Yan J., Chen M., Hong N., Xue D., et al. DeepCRISPR: optimized CRISPR guide RNA design by deep learning. Genome Biol. 2018;19:80. doi: 10.1186/s13059-018-1459-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Kuan P.F., Powers S., He S.Y., Li K.Q., Zhao X.Y., Huang B. A systematic evaluation of nucleotide properties for CRISPR sgRNA design. BMC Bioinformatics. 2017;18:297. doi: 10.1186/s12859-017-1697-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Zhang D., Hurst T., Duan D., Chen S.J. Unified energetics analysis unravels SpCas9 cleavage activity for optimal gRNA design. Proc Natl Acad Sci USA. 2019;116:8693–8698. doi: 10.1073/pnas.1820523116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.He W., Wang H., Wei Y., Jiang Z., Tang Y., Chen Y., et al. GuidePro: a multi-source ensemble predictor for prioritizing sgRNAs in CRISPR/Cas9 protein knockouts. Bioinformatics. 2021;37:134–136. doi: 10.1093/bioinformatics/btaa1068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Anzalone A.V., Randolph P.B., Davis J.R., Sousa A.A., Koblan L.W., Levy J.M., et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature. 2019;576:149–157. doi: 10.1038/s41586-019-1711-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Kim H.K., Yu G., Park J., Min S., Lee S., Yoon S., et al. Predicting the efficiency of prime editing guide RNAs in human cells. Nat Biotechnol. 2021;39:198–206. doi: 10.1038/s41587-020-0677-y. [DOI] [PubMed] [Google Scholar]
- 63.Zhang X.H., Tee L.Y., Wang X.G., Huang Q.S., Yang S.H. Off-target effects in CRISPR/Cas9-mediated genome engineering. Mol Ther Nucleic Acids. 2015;4:e264. doi: 10.1038/mtna.2015.37. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Langmead B., Trapnell C., Pop M., Salzberg S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Li H., Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Bae S., Park J., Kim J.S. Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics. 2014;30:1473–1475. doi: 10.1093/bioinformatics/btu048. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Fu Y., Foden J.A., Khayter C., Maeder M.L., Reyon D., Joung J.K., et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol. 2013;31:822–826. doi: 10.1038/nbt.2623. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Tsai S.Q., Zheng Z., Nguyen N.T., Liebers M., Topkar V.V., Thapar V., et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nat Biotechnol. 2015;33:187–197. doi: 10.1038/nbt.3117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Yan J., Chuai G., Zhou C., Zhu C., Yang J., Zhang C., et al. Benchmarking CRISPR on-target sgRNA design. Brief Bioinform. 2018;19:721–724. doi: 10.1093/bib/bbx001. [DOI] [PubMed] [Google Scholar]
- 71.Yan J., Xue D., Chuai G., Gao Y., Zhang G., Liu Q. Benchmarking and integrating genome-wide CRISPR off-target detection and prediction. Nucleic Acids Res. 2020;48:11370–11379. doi: 10.1093/nar/gkaa930. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Housden B.E., Hu Y., Perrimon N. Design and generation of Drosophila single guide RNA expression constructs. Cold Spring Harb Protoc. 2016;2016:prot090779. doi: 10.1101/pdb.prot090779. [DOI] [PubMed] [Google Scholar]
- 73.Peng D., Tarleton R. EuPaGDT: a web tool tailored to design CRISPR guide RNAs for eukaryotic pathogens. Microb Genom. 2015;1:e000033. doi: 10.1099/mgen.0.000033. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Blin K., Pedersen L.E., Weber T., Lee S.Y. CRISPy-web: an online resource to design sgRNAs for CRISPR applications. Synth Syst Biotechnol. 2016;1:118–121. doi: 10.1016/j.synbio.2016.01.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Zhou Y., Zhu S., Cai C., Yuan P., Li C., Huang Y., et al. High-throughput screening of a CRISPR/Cas9 library for functional genomics in human cells. Nature. 2014;509:487–491. doi: 10.1038/nature13166. [DOI] [PubMed] [Google Scholar]
- 76.Shen Y.J., Manier S., Park J., Mishima Y., Capelletti M., Roccaro A.M., et al. In vivo genome-wide Crispr library screen in a xenograft mouse model of tumor growth and metastasis of multiple myeloma. Blood. 2016;128:1137. [Google Scholar]
- 77.Chen S., Sanjana N.E., Zheng K., Shalem O., Lee K., Shi X., et al. Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell. 2015;160:1246–1260. doi: 10.1016/j.cell.2015.02.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Shah A.N., Davey C.F., Whitebirch A.C., Miller A.C., Moens C.B. Rapid reverse genetic screening using CRISPR in zebrafish. Nat Methods. 2015;12:535–540. doi: 10.1038/nmeth.3360. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Lu Y., Ye X., Guo R., Huang J., Wang W., Tang J., et al. Genome-wide targeted mutagenesis in rice using the CRISPR/Cas9 system. Mol Plant. 2017;10:1242–1245. doi: 10.1016/j.molp.2017.06.007. [DOI] [PubMed] [Google Scholar]
- 80.Meng X., Yu H., Zhang Y., Zhuang F., Song X., Gao S., et al. Construction of a genome-wide mutant library in rice using CRISPR/Cas9. Mol Plant. 2017;10:1238–1241. doi: 10.1016/j.molp.2017.06.006. [DOI] [PubMed] [Google Scholar]
- 81.Chen W., Zhang G., Li J., Zhang X., Huang S., Xiang S., et al. CRISPRlnc: a manually curated database of validated sgRNAs for lncRNAs. Nucleic Acids Res. 2019;47:D63–D68. doi: 10.1093/nar/gky904. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Varshney G.K., Zhang S., Pei W., Adomako-Ankomah A., Fohtung J., Schaffer K., et al. CRISPRz: a database of zebrafish validated sgRNAs. Nucleic Acids Res. 2016;44:D822–D826. doi: 10.1093/nar/gkv998. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Sfeir A., Symington L.S. Microhomology-mediated end joining: a back-up survival mechanism or dedicated pathway? Trends Biochem Sci. 2015;40:701–714. doi: 10.1016/j.tibs.2015.08.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Bae S., Kweon J., Kim H.S., Kim J.S. Microhomology-based choice of Cas9 nuclease target sites. Nat Methods. 2014;11:705–706. doi: 10.1038/nmeth.3015. [DOI] [PubMed] [Google Scholar]
- 85.Chen W., McKenna A., Schreiber J., Haeussler M., Yin Y., Agarwal V., et al. Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair. Nucleic Acids Res. 2019;47:7989–8003. doi: 10.1093/nar/gkz487. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Shen M.W., Arbab M., Hsu J.Y., Worstell D., Culbertson S.J., Krabbe O., et al. Predictable and precise template-free CRISPR editing of pathogenic variants. Nature. 2018;563:646–651. doi: 10.1038/s41586-018-0686-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Allen F., Crepaldi L., Alsinet C., Strong A.J., Kleshchevnikov V., De Angeli P., et al. Predicting the mutations generated by repair of Cas9-induced double-strand breaks. Nat Biotechnol. 2019;37:64–72. doi: 10.1038/nbt.4317. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Luo J., Chen W., Xue L., Tang B. Prediction of activity and specificity of CRISPR-Cpf1 using convolutional deep learning neural networks. BMC Bioinformatics. 2019;20:332. doi: 10.1186/s12859-019-2939-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Kim H.K., Min S., Song M., Jung S., Choi J.W., Kim Y., et al. Deep learning improves prediction of CRISPR-Cpf1 guide RNA activity. Nat Biotechnol. 2018;36:239–241. doi: 10.1038/nbt.4061. [DOI] [PubMed] [Google Scholar]
- 90.Najm F.J., Strand C., Donovan K.F., Hegde M., Sanson K.R., Vaimberg E.W., et al. Orthologous CRISPR-Cas9 enzymes for combinatorial genetic screens. Nat Biotechnol. 2018;36:179–189. doi: 10.1038/nbt.4048. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Shibata A. Regulation of repair pathway choice at two-ended DNA double-strand breaks. Mutat Res. 2017;803–805:51–55. doi: 10.1016/j.mrfmmm.2017.07.011. [DOI] [PubMed] [Google Scholar]
- 92.Leenay R.T., Aghazadeh A., Hiatt J., Tse D., Hultquist J.F., Krogan N., et al. Systematic characterization of genome editing in primary T cells reveals proximal genomic insertions and enables machine learning prediction of CRISPR-Cas9 DNA repair outcomes. bioRxiv. 2018:404947. [Google Scholar]
- 93.Park J., Bae S., Kim J.S. Cas-Designer: a web-based tool for choice of CRISPR-Cas9 target sites. Bioinformatics. 2015;31:4014–4016. doi: 10.1093/bioinformatics/btv537. [DOI] [PubMed] [Google Scholar]
- 94.Xie X., Ma X., Zhu Q., Zeng D., Li G., Liu Y.G. CRISPR-GE: a convenient software toolkit for CRISPR-based genome editing. Mol Plant. 2017;10:1246–1249. doi: 10.1016/j.molp.2017.06.004. [DOI] [PubMed] [Google Scholar]
- 95.Liu H., Ding Y., Zhou Y., Jin W., Xie K., Chen L.L. CRISPR-P 2.0: an improved CRISPR-Cas9 tool for genome editing in plants. Mol Plant. 2017;10:530–532. doi: 10.1016/j.molp.2017.01.003. [DOI] [PubMed] [Google Scholar]
- 96.Lei Y., Lu L., Liu H.Y., Li S., Xing F., Chen L.L. CRISPR-P: a web tool for synthetic single-guide RNA design of CRISPR-system in plants. Mol Plant. 2014;7:1494–1496. doi: 10.1093/mp/ssu044. [DOI] [PubMed] [Google Scholar]
- 97.Lorenz R., Luntzer D., Hofacker I.L., Stadler P.F., Wolfinger M.T. SHAPE directed RNA folding. Bioinformatics. 2016;32:145–147. doi: 10.1093/bioinformatics/btv523. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Zhao G., Li J., Tang Y. AsCRISPR: a web server for allele-specific single guide RNA design in precision medicine. CRISPR J. 2020;3:512–522. doi: 10.1089/crispr.2020.0071. [DOI] [PubMed] [Google Scholar]
- 99.Chen C.L., Rodiger J., Chung V., Viswanatha R., Mohr S.E., Hu Y., et al. SNP-CRISPR: a web tool for SNP-specific genome editing. G3 (Bethesda) 2020;10:489–494. doi: 10.1534/g3.119.400904. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Siegner S.M., Karasu M.E., Schröder M.S., Kontarakis Z., Corn J.E. PnB Designer: a web application to design prime and base editor guide RNAs for animals and plants. BMC Bioinformatics. 2021;22:101. doi: 10.1186/s12859-021-04034-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Mashal R.D., Koontz J., Sklar J. Detection of mutations by cleavage of DNA heteroduplexes with bacteriophage resolvases. Nat Genet. 1995;9:177–183. doi: 10.1038/ng0295-177. [DOI] [PubMed] [Google Scholar]
- 102.Sentmanat M.F., Peters S.T., Florian C.P., Connelly J.P., Pruett-Miller S.M. A survey of validation strategies for CRISPR-Cas9 editing. Sci Rep. 2018;8:888. doi: 10.1038/s41598-018-19441-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Brinkman E.K., Chen T., Amendola M., van Steensel B. Easy quantitative assessment of genome editing by sequence trace decomposition. Nucleic Acids Res. 2014;42:e168. doi: 10.1093/nar/gku936. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Etard C., Joshi S., Stegmaier J., Mikut R., Strahle U. Tracking of indels by decomposition is a simple and effective method to assess efficiency of guide RNAs in zebrafish. Zebrafish. 2017;14:586–588. doi: 10.1089/zeb.2017.1454. [DOI] [PubMed] [Google Scholar]
- 105.Brinkman E.K., Kousholt A.N., Harmsen T., Leemans C., Chen T., Jonkers J., et al. Easy quantification of template-directed CRISPR/Cas9 editing. Nucleic Acids Res. 2018;46:e58. doi: 10.1093/nar/gky164. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Ma X., Chen L., Zhu Q., Chen Y., Liu Y.G. Rapid decoding of sequence-specific nuclease-induced heterozygous and biallelic mutations by direct sequencing of PCR products. Mol Plant. 2015;8:1285–1287. doi: 10.1016/j.molp.2015.02.012. [DOI] [PubMed] [Google Scholar]
- 107.Liu W., Xie X., Ma X., Li J., Chen J., Liu Y.G. DSDecode: a web-based tool for decoding of sequencing chromatograms for genotyping of targeted mutations. Mol Plant. 2015;8:1431–1433. doi: 10.1016/j.molp.2015.05.009. [DOI] [PubMed] [Google Scholar]
- 108.Guell M., Yang L., Church G.M. Genome editing assessment using CRISPR genome analyzer (CRISPR-GA). Bioinformatics. 2014;30:2968–2970. doi: 10.1093/bioinformatics/btu427. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Park J., Lim K., Kim J.S., Bae S. Cas-analyzer: an online tool for assessing genome editing results using NGS data. Bioinformatics. 2017;33:286–288. doi: 10.1093/bioinformatics/btw561. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Clement K., Rees H., Canver M.C., Gehrke J.M., Farouni R., Hsu J.Y., et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat Biotechnol. 2019;37:224–226. doi: 10.1038/s41587-019-0032-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Connelly J.P., Pruett-Miller S.M. CRIS.py: a versatile and high-throughput analysis program for CRISPR-based genome editing. Sci Rep. 2019;9:4194. [DOI] [PMC free article] [PubMed]
- 112.Lindsay H., Burger A., Biyong B., Felker A., Hess C., Zaugg J., et al. CrispRVariants charts the mutation spectrum of genome engineering experiments. Nat Biotechnol. 2016;34:701–702. doi: 10.1038/nbt.3628. [DOI] [PubMed] [Google Scholar]
- 113.Leenay R.T., Aghazadeh A., Hiatt J., Tse D., Roth T.L., Apathy R., et al. Large dataset enables prediction of repair after CRISPR-Cas9 editing in primary T cells. Nat Biotechnol. 2019;37:1034–1037. doi: 10.1038/s41587-019-0203-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Wang J., Xiang X., Bolund L., Zhang X., Cheng L., Luo Y. GNL-Scorer: a generalized model for predicting CRISPR on-target activity by machine learning and featurization. J Mol Cell Biol. 2020;12:909–911. doi: 10.1093/jmcb/mjz116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Boel A., Steyaert W., De Rocker N., Menten B., Callewaert B., De Paepe A., et al. BATCH-GE: batch analysis of next-generation sequencing data for genome editing assessment. Sci Rep. 2016;6:30330. doi: 10.1038/srep30330. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Steyaert W., Boel A., Coucke P., Willaert A. BATCH-GE: analysis of NGS data for genome editing assessment. Methods Mol Biol. 2018;1865:83–90. doi: 10.1007/978-1-4939-8784-9_6. [DOI] [PubMed] [Google Scholar]
- 117.Li W., Xu H., Xiao T., Cong L., Love M.I., Zhang F., et al. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol. 2014;15:554. doi: 10.1186/s13059-014-0554-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Winter J., Breinig M., Heigwer F., Brugemann D., Leible S., Pelz O., et al. caRpools: an R package for exploratory data analysis and documentation of pooled CRISPR/Cas9 screens. Bioinformatics. 2016;32:632–634. doi: 10.1093/bioinformatics/btv617. [DOI] [PubMed] [Google Scholar]
- 119.Diaz A.A., Qin H., Ramalho-Santos M., Song J.S. HiTSelect: a comprehensive tool for high-complexity-pooled screen analysis. Nucleic Acids Res. 2015;43:e16. doi: 10.1093/nar/gku1197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Yu J., Silva J., Califano A. ScreenBEAM: a novel meta-analysis algorithm for functional genomics screens via Bayesian hierarchical modeling. Bioinformatics. 2016;32:260–267. doi: 10.1093/bioinformatics/btv556.
- 121.Hart T., Moffat J. BAGEL: a computational framework for identifying essential genes from pooled library screens. BMC Bioinformatics. 2016;17:164. doi: 10.1186/s12859-016-1015-8.
- 122.Jia G., Wang X., Xiao G. A permutation-based non-parametric analysis of CRISPR screen data. BMC Genomics. 2017;18:545. doi: 10.1186/s12864-017-3938-5.
- 123.Winter J., Schwering M., Pelz O., Rauscher B., Zhan T., Heigwer F., et al. CRISPRAnalyzeR: interactive analysis, annotation and documentation of pooled CRISPR screens. bioRxiv. 2017:109967.
- 124.Spahn P.N., Bath T., Weiss R.J., Kim J., Esko J.D., Lewis N.E., et al. PinAPL-Py: a comprehensive web-application for the analysis of CRISPR/Cas9 screens. Sci Rep. 2017;7:15854. doi: 10.1038/s41598-017-16193-9.
- 125.Jeong H.H., Kim S.Y., Rousseaux M.W.C., Zoghbi H.Y., Liu Z. Beta-binomial modeling of CRISPR pooled screen data identifies target genes with greater sensitivity and fewer false negatives. Genome Res. 2019;29:999–1008. doi: 10.1101/gr.245571.118.
- 126.Crosetto N., Mitra A., Silva M.J., Bienko M., Dojer N., Wang Q., et al. Nucleotide-resolution DNA double-strand break mapping by next-generation sequencing. Nat Methods. 2013;10:361–365. doi: 10.1038/nmeth.2408.
- 127.Wang X., Wang Y., Wu X., Wang J., Wang Y., Qiu Z., et al. Unbiased detection of off-target cleavage by CRISPR-Cas9 and TALENs using integrase-defective lentiviral vectors. Nat Biotechnol. 2015;33:175–178. doi: 10.1038/nbt.3127.
- 128.Tsai S.Q., Nguyen N.T., Malagon-Lopez J., Topkar V.V., Aryee M.J., Joung J.K. CIRCLE-seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets. Nat Methods. 2017;14:607–614. doi: 10.1038/nmeth.4278.
- 129.Wienert B., Wyman S.K., Richardson C.D., Yeh C.D., Akcakaya P., Porritt M.J., et al. Unbiased detection of CRISPR off-targets in vivo using DISCOVER-Seq. Science. 2019;364:286–289. doi: 10.1126/science.aav9023.
- 130.Lazzarotto C.R., Malinin N.L., Li Y., Zhang R., Yang Y., Lee G., et al. CHANGE-seq reveals genetic and epigenetic effects on CRISPR-Cas9 genome-wide activity. Nat Biotechnol. 2020;38:1317–1327. doi: 10.1038/s41587-020-0555-7.
- 131.Yang L., Zhu Y., Yu H., Cheng X., Chen S., Chu Y., et al. scMAGeCK links genotypes with multiple phenotypes in single-cell CRISPR screens. Genome Biol. 2020;21:19. doi: 10.1186/s13059-020-1928-4.
- 132.Schoonenberg V.A.C., Cole M.A., Yao Q., Macias-Treviño C., Sher F., Schupp P.G., et al. CRISPRO: identification of functional protein coding sequences based on genome editing dense mutagenesis. Genome Biol. 2018;19:169. doi: 10.1186/s13059-018-1563-5.
- 133.He W., Zhang L., Villarreal O.D., Fu R., Bedford E., Dou J., et al. De novo identification of essential protein domains from CRISPR-Cas9 tiling-sgRNA knockout screens. Nat Commun. 2019;10:4541. doi: 10.1038/s41467-019-12489-8.
- 134.Li C., Zhang R., Meng X., Chen S., Zong Y., Lu C., et al. Targeted, random mutagenesis of plant genes with dual cytosine and adenine base editors. Nat Biotechnol. 2020;38:875–882. doi: 10.1038/s41587-019-0393-7.
- 135.Ren C., Liu X., Zhang Z., Wang Y., Duan W., Li S., et al. CRISPR/Cas9-mediated efficient targeted mutagenesis in Chardonnay (Vitis vinifera L.). Sci Rep. 2016;6:32289. doi: 10.1038/srep32289.
- 136.Zhang H., Zhang J., Wei P., Zhang B., Gou F., Feng Z., et al. The CRISPR/Cas9 system produces specific and homozygous targeted gene editing in rice in one generation. Plant Biotechnol J. 2014;12:797–807. doi: 10.1111/pbi.12200.
- 137.Sun X., Hu Z., Chen R., Jiang Q., Song G., Zhang H., et al. Targeted mutagenesis in soybean using the CRISPR-Cas9 system. Sci Rep. 2015;5:10342. doi: 10.1038/srep10342.
- 138.Li C., Unver T., Zhang B. A high-efficiency CRISPR/Cas9 system for targeted mutagenesis in cotton (Gossypium hirsutum L.). Sci Rep. 2017;7:43902. doi: 10.1038/srep43902.
- 139.Pattanayak V., Lin S., Guilinger J.P., Ma E., Doudna J.A., Liu D.R. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nat Biotechnol. 2013;31:839–843. doi: 10.1038/nbt.2673.
- 140.Jensen K.T., Floe L., Petersen T.S., Huang J., Xu F., Bolund L., et al. Chromatin accessibility and guide sequence secondary structure affect CRISPR-Cas9 gene editing efficiency. FEBS Lett. 2017;591:1892–1901. doi: 10.1002/1873-3468.12707.
- 141.Uusi-Makela M.I.E., Barker H.R., Bauerlein C.A., Hakkinen T., Nykter M., Ramet M. Chromatin accessibility is associated with CRISPR-Cas9 efficiency in the zebrafish (Danio rerio). PLoS One. 2018;13:e0196238. doi: 10.1371/journal.pone.0196238.
- 142.Yarrington R.M., Verma S., Schwartz S., Trautman J.K., Carroll D. Nucleosomes inhibit target cleavage by CRISPR-Cas9 in vivo. Proc Natl Acad Sci U S A. 2018;115:9351–9358. doi: 10.1073/pnas.1810062115.
- 143.Lu Z., Marand A.P., Ricci W.A., Ethridge C.L., Zhang X., Schmitz R.J. The prevalence, evolution and chromatin signatures of plant regulatory elements. Nat Plants. 2019;5:1250–1259. doi: 10.1038/s41477-019-0548-z.
- 144.Wu J., Huang B., Chen H., Yin Q., Liu Y., Xiang Y., et al. The landscape of accessible chromatin in mammalian preimplantation embryos. Nature. 2016;534:652–657. doi: 10.1038/nature18606.
- 145.Zhang F., Voytas D.F. Modulating gene translational control through genome editing. Natl Sci Rev. 2019;6:391. doi: 10.1093/nsr/nwy123.
- 146.Si X., Zhang H., Wang Y., Chen K., Gao C. Manipulating gene translation in plants by CRISPR-Cas9-mediated genome editing of upstream open reading frames. Nat Protoc. 2020;15:338–363. doi: 10.1038/s41596-019-0238-3.
- 147.Hellens R.P., Brown C.M., Chisnall M.A.W., Waterhouse P.M., Macknight R.C. The emerging world of small ORFs. Trends Plant Sci. 2016;21:317–328. doi: 10.1016/j.tplants.2015.11.005.
- 148.Zhang H., Si X., Ji X., Fan R., Liu J., Chen K., et al. Genome editing of upstream open reading frames enables translational control in plants. Nat Biotechnol. 2018;36:894–898. doi: 10.1038/nbt.4202.
- 149.Chen X. Small RNAs and their roles in plant development. Annu Rev Cell Dev Biol. 2009;25:21–44. doi: 10.1146/annurev.cellbio.042308.113417.
- 150.Moazed D. Small RNAs in transcriptional gene silencing and genome defence. Nature. 2009;457:413–420. doi: 10.1038/nature07756.
- 151.Fritz J.V., Heintz-Buschart A., Ghosal A., Wampach L., Etheridge A., Galas D., et al. Sources and functions of extracellular small RNAs in human circulation. Annu Rev Nutr. 2016;36:301–336. doi: 10.1146/annurev-nutr-071715-050711.
- 152.Zhang B., Pan X., Cobb G.P., Anderson T.A. MicroRNAs as oncogenes and tumor suppressors. Dev Biol. 2007;302:1–12. doi: 10.1016/j.ydbio.2006.08.028.
- 153.Zhang B., Unver T. A critical and speculative review on microRNA technology in crop improvement: current challenges and future directions. Plant Sci. 2018;274:193–200. doi: 10.1016/j.plantsci.2018.05.031.
- 154.Hanna R.E., Doench J.G. Design and analysis of CRISPR-Cas experiments. Nat Biotechnol. 2020;38:813–823. doi: 10.1038/s41587-020-0490-7.
- 155.Hiranniramol K., Chen Y., Liu W., Wang X. Generalizable sgRNA design for improved CRISPR/Cas9 editing efficiency. Bioinformatics. 2020;36:2684–2689. doi: 10.1093/bioinformatics/btaa041.
- 156.Wong N., Liu W., Wang X. WU-CRISPR: characteristics of functional guide RNAs for the CRISPR/Cas9 system. Genome Biol. 2015;16:218. doi: 10.1186/s13059-015-0784-0.
- 157.Minkenberg B., Zhang J., Xie K., Yang Y. CRISPR-PLANT v2: an online resource for highly specific guide RNA spacers based on improved off-target analysis. Plant Biotechnol J. 2019;17:5–8. doi: 10.1111/pbi.13025.
- 158.Wang D., Zhang C., Wang B., Li B., Wang Q., Liu D., et al. Optimized CRISPR guide RNA design for two high-fidelity Cas9 variants by deep learning. Nat Commun. 2019;10:4284. doi: 10.1038/s41467-019-12281-8.
- 159.Kluesner M.G., Nedveck D.A., Lahr W.S., Garbe J.R., Abrahante J.E., Webber B.R., et al. EditR: a method to quantify base editing from Sanger sequencing. CRISPR J. 2018;1:239–250. doi: 10.1089/crispr.2018.0014.
- 160.Hill J.T., Demarest B.L., Bisgrove B.W., Su Y.C., Smith M., Yost H.J. Poly peak parser: method and software for identification of unknown indels using Sanger sequencing of polymerase chain reaction products. Dev Dyn. 2014;243:1632–1636. doi: 10.1002/dvdy.24183.
- 161.Lee H., Chang H.Y., Cho S.W., Ji H.P. CRISPRpic: fast and precise analysis for CRISPR-induced mutations via prefixed index counting. NAR Genom Bioinform. 2020;2:lqaa012.
- 162.Wang X., Tilford C., Neuhaus I., Mintier G., Guo Q., Feder J.N., et al. CRISPR-DAV: CRISPR NGS data analysis and visualization pipeline. Bioinformatics. 2017;33:3811–3812. doi: 10.1093/bioinformatics/btx518.
- 163.Li W., Koster J., Xu H., Chen C.H., Xiao T., Liu J.S., et al. Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR. Genome Biol. 2015;16:281. doi: 10.1186/s13059-015-0843-6.
- 164.Meyers R.M., Bryan J.G., McFarland J.M., Weir B.A., Sizemore A.E., Xu H., et al. Computational correction of copy number effect improves specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nat Genet. 2017;49:1779–1784. doi: 10.1038/ng.3984.
- 165.Frock R.L., Hu J.Z., Meyers R.M., Ho Y.J., Kii E., Alt F.W. Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases. Nat Biotechnol. 2015;33:179–186. doi: 10.1038/nbt.3101.
- 166.Yan W.X., Mirzazadeh R., Garnerone S., Scott D., Schneider M.W., Kallas T., et al. BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks. Nat Commun. 2017;8:15058. doi: 10.1038/ncomms15058.
- 167.Zuo E., Sun Y., Wei W., Yuan T., Ying W., Sun H., et al. GOTI, a method to identify genome-wide off-target effects of genome editing in mouse embryos. Nat Protoc. 2020;15:3009–3029. doi: 10.1038/s41596-020-0361-1.
- 168.Cameron P., Fuller C.K., Donohoue P.D., Jones B.N., Thompson M.S., Carter M.M., et al. Mapping the genomic landscape of CRISPR-Cas9 cleavage. Nat Methods. 2017;14:600–606. doi: 10.1038/nmeth.4284.
- 169.Kim D., Bae S., Park J., Kim E., Kim S., Yu H.R., et al. Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects in human cells. Nat Methods. 2015;12:237–243. doi: 10.1038/nmeth.3284.
- 170.Lin J., Zhang Z., Zhang S., Chen J., Wong K.C. CRISPR-Net: a recurrent convolutional network quantifies CRISPR off-target activities with mismatches and indels. Adv Sci. 2020;7:1903562.