Plant Communications. 2023 Dec 24;5(3):100783. doi: 10.1016/j.xplc.2023.100783

CrisprStitch: Fast evaluation of the efficiency of CRISPR editing systems

Yangshuo Han 1,2,3,4, Guanqing Liu 1,2,4, Yuechao Wu 1,2, Yu Bao 1,2, Yong Zhang 3,∗∗, Tao Zhang 1,2
PMCID: PMC10943576  PMID: 38146164

Dear Editor,

Targeted genome-editing technology using designed nucleases has been rapidly advancing and catalyzing significant breakthroughs in the life sciences. Custom-designed CRISPR editing experiments can be efficiently evaluated through next-generation sequencing (NGS). Nevertheless, NGS data analysis currently lacks a user-friendly pipeline capable of automatically calculating mutations and evaluating editing efficiency in bulk, as well as providing a more comprehensive visualization of results.

Since the initial demonstrations of programmable DNA cleavage by the Cas9 nuclease, there has been an explosion in the discovery, engineering, and application of CRISPR–Cas genome-editing tools. CRISPR–Cas systems can easily be engineered and deployed by exploiting various target sequences within a guide RNA molecule if the target sequence is proximate to a suitable PAM (Gasiunas et al., 2012; Jinek et al., 2012). This process introduces single-nucleotide variations and insertions or deletions through non-homologous end joining and induces accurate replacement via homology-directed repair (Hille et al., 2018).

Because of its benefits in terms of cost effectiveness, extensive coverage, and accuracy, high-throughput sequencing has been used to screen the outcomes of CRISPR nuclease editing in various scenarios (Tang et al., 2017). Several bioinformatics tools have emerged for screening CRISPR editing results using NGS data. Noteworthy among these tools are Hi-TOM (Liu et al., 2019), CRISPResso2 (Clement et al., 2019), CRISPR-GRANT (Fu et al., 2023), CRIS.py (Connelly and Pruett-Miller, 2019), CRISPRpic (Lee et al., 2020), and CRISPRMatch (You et al., 2018). Command-line tools such as CRISPRMatch, CRIS.py, and CRISPRpic lack user-friendly interfaces and require specific computer environments and bioinformatics expertise for operation. CRISPR-GRANT provides a cross-platform GUI for evaluation of genome-editing results from both amplicon and whole-genome sequencing data, but it lacks batch-processing ability. In addition to analysis of genome-editing results from canonical CRISPR–Cas systems, CRISPResso2 also enables analysis of base-editing and prime-editing outcomes with an improved Needleman–Wunsch algorithm. However, online platforms such as CRISPResso2 and Hi-TOM require the upload of size-limited files and involve prolonged processing times. CrisprStitch has therefore been optimized to address these challenges (Figure 1A).

Figure 1. Overview of CrisprStitch and its use in CRISPR editing analysis.

(A) Comparison of major features for NGS-based CRISPR editing analysis among different tools.

(B) Overall workflow of the CrisprStitch process. (1) Merging: paired-end FASTQ files provided by the user are merged for each read pair. (2) Haplotyping: reads are categorized by barcodes representing each sample. The read count for each haplotype is grouped and calculated. (3) Alignment and calculation: haplotype sequences are mapped to target sites. Mutation rates at each locus are calculated and visualized.

(C) All calculations are performed locally on the user’s computer using web browser engines, without uploading data to any remote servers.

(D) After data processing, the application provides an overview of different mutation results in tables and plots. Deletions, deletion sizes, insertions, insertion sizes, and substitutions are included. Data in bar charts are represented as mean ± SD.

In this study, we endeavored to fulfill user requirements by developing CrisprStitch, a web application that enables researchers to quickly obtain thorough insight into genome-editing outcomes. It processes high-throughput amplicon sequencing data from CRISPR editing experiments with integrated analysis steps, including paired-end read merging, read mapping, read count normalization, mutation frequency calculation (deletions and insertions), evaluation of genome-editing efficiency and accuracy, and results visualization. Leveraging the capabilities of modern web browsers, CrisprStitch operates as a server-less application. It performs analysis in local web browsers, eliminating the need to upload files to a remote server and thus ensuring data safety. Our findings demonstrate that CrisprStitch is a promising tool for quickly assessing the efficiency of CRISPR editing systems.

An overview of CrisprStitch

Here, we describe CrisprStitch, a server-less web application designed to provide non-bioinformaticians with a user-friendly platform for quantification and visualization of genome-editing efficiency statistics within a target region. CrisprStitch accepts raw FASTQ sequencing files as input and generates histograms that illustrate genome-editing efficiency. It can be accessed at either https://zhangtaolab.org/software/crisprstitch or https://bioinfor.yzu.edu.cn/software/crisprstitch. We also provide an installation package for use on a local computer. Detailed usage examples can be found on the webpage or within the software.

Both the web app and desktop versions of CrisprStitch process data and output results exclusively using the web browser on the user’s computer. All procedures are performed locally, eliminating the need for a server, which not only conserves network resources but also enhances analysis speed. Importantly, there are no restrictions on the size of input files. At present, CrisprStitch is compatible with all modern web browsers, including Microsoft Edge, Mozilla Firefox, Safari, and Google Chrome. For users seeking an alternative to the web app version, CrisprStitch is available as a desktop version that is compatible with the three major platforms: Microsoft Windows, macOS, and Linux. Depending on the platform, the size of the desktop version of CrisprStitch ranges from 378.4 to 403.3 MB. The desktop versions for Windows, macOS, and Linux are available at https://github.com/zhangtaolab/CrisprStitch/ and https://zhangtaolab.org/software/crisprstitch.
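Handling input files of any size on the user's own machine generally depends on reading them incrementally rather than loading them whole. The following is a minimal sketch of that idea, written in Python purely for illustration; CrisprStitch itself runs JavaScript in the browser, and this is not its code. The names iter_fastq_records and read_in_chunks, the 1 MiB chunk size, and the assumption of plain four-line FASTQ records (no wrapped sequence lines) are ours.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_fastq_records(chunks: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from text chunks of arbitrary size.

    Assumes four-line FASTQ records. A record or even a single line may be
    split across two chunks, so unfinished text is carried over.
    """
    pending = ""             # text after the last newline seen so far
    record: List[str] = []   # lines collected for the current record
    for chunk in chunks:
        pending += chunk
        lines = pending.split("\n")
        pending = lines.pop()            # last piece may be an incomplete line
        for line in lines:
            line = line.rstrip("\r")     # tolerate Windows line endings
            if not line and not record:
                continue                 # skip blank lines between records
            record.append(line)
            if len(record) == 4:
                yield record[0], record[1], record[3]
                record = []
    tail = pending.rstrip("\r")          # final line without trailing newline
    if tail:
        record.append(tail)
    if len(record) == 4:
        yield record[0], record[1], record[3]


def read_in_chunks(path: str, chunk_size: int = 1 << 20) -> Iterator[str]:
    """Read a text file piece by piece so it never sits in memory whole."""
    with open(path, "r") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Small demonstration: the same data, deliberately cut into 5-character chunks.
demo_text = "@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nFFFF\n"
demo_chunks = [demo_text[i:i + 5] for i in range(0, len(demo_text), 5)]
for header, sequence, quality in iter_fastq_records(demo_chunks):
    print(header, sequence, quality)
```

Only the current chunk plus at most one unfinished record is held at a time, which is why the input size does not need to be capped. For a real file, the parser would be fed with iter_fastq_records(read_in_chunks(path)).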

The required files, including amplicon sequencing data, sample information, and target region sequences, are loaded to the user’s computer within the calculation pipeline. CrisprStitch accepts paired-end FASTQ sequencing data and also supports pre-merged sequencing data. Sample information can be either imported from a local file (in CSV/TSV/other Microsoft Excel–readable formats) or entered separately via a form, enabling users to import multiple samples simultaneously. The target region sequences are used in the subsequent mapping phase.

Case study: Rapid evaluation of different CRISPR–Cas editing data in rice

With a versatile tool designed for comprehensive analysis—including read mapping, mutation frequency calculation, evaluation of genome-editing system efficiency, and generation of visualized results—we gained the capability to quickly evaluate any newly developed editing system. As a proof of concept, we analyzed a set of CRISPR–Cas12a data from rice protoplasts (Tang et al., 2017) (NCBI BioProject: PRJNA356347), selecting eight samples from the experiment, labeled TX-12 to TX-20 (detailed usage of the tool is shown in the Supplemental Note). Despite the substantial volume of data (348.4 MB for both ends of FASTQ pair files), CrisprStitch successfully completed the calculation in only 28.7 s (Supplemental Figures 1 and 2; Supplemental Tables 1 and 2).

To assess the performance of CrisprStitch on various CRISPR–Cas systems, we extended our evaluation beyond the conventional TTTV-PAM Cas12a. We used CrisprStitch to analyze an engineered CRISPR–Cas9 system from Faecalibaculum rodentium that recognizes the NNTA PAM, which quickly yielded credible results (Supplemental Figure 3; Supplemental Table 3).

Detailed application principles

The integrated pipeline automates a series of analytical steps for user convenience. First, it joins paired-end sequencing reads when the two reads of a pair overlap by more than 6 bp; read pairs without such an overlap are discarded. Large FASTQ files are read in chunks through the JavaScript ReadableStream object, which limits the amount of data held in memory at any time. The joined reads are then organized by their custom barcode sequences, and identical reads carrying the same barcode are collapsed into a single haplotype with an associated read count, which makes subsequent processing more efficient. Second, all haplotypes are mapped to the target editing region using bioseq.js (https://github.com/lh3/bioseq-js). The alignment results are then sorted by the number of haplotypes and the types of genome-editing system, and target regions for mutation calculation are defined. Third, single-nucleotide variations and insertions or deletions are detected using the mapped reads. All reads are classified according to their mutation type for frequency calculation. Finally, CrisprStitch generates plots summarizing mutation frequency, detailed mutation types, and genome-editing efficiency at each position (Figure 1B). Importantly, all calculations are performed locally on the user’s computer, so no data are transmitted to external servers (Figure 1C).
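The read-joining rule in the first step can be illustrated with a short Python sketch. It is not CrisprStitch's implementation (which runs in JavaScript); it only shows one way to apply the "more than 6 bp" criterion. The function names, the use of exact matching within the overlap, and the preference for the longest overlap are assumptions made for this example. Real mergers usually tolerate mismatches and use base qualities.

```python
from typing import Optional

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1: str, read2: str, min_overlap: int = 7) -> Optional[str]:
    """Join a read pair on the longest exact suffix/prefix overlap.

    read2 is reverse-complemented so that both reads are on the same strand.
    min_overlap=7 encodes "more than 6 bp". Returns None when no overlap of
    at least min_overlap bases exists, in which case the pair is dropped.
    Pairs whose amplicon is shorter than a single read are not handled here.
    """
    mate = reverse_complement(read2)
    longest_possible = min(len(read1), len(mate))
    for size in range(longest_possible, min_overlap - 1, -1):
        if read1[-size:] == mate[:size]:
            return read1 + mate[size:]
    return None


# Demonstration: a 23-nt amplicon sequenced as two 15-nt reads (7-nt overlap).
amplicon = "ACGTTGCAAGGCTTACCGATGGA"
forward_read = amplicon[:15]
reverse_read = reverse_complement(amplicon[-15:])
print(merge_pair(forward_read, reverse_read) == amplicon)
```

Searching from the longest candidate overlap downward avoids stopping at a short, coincidental match when a longer true overlap exists.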

The results of read alignment are displayed as a formatted HTML element, showcasing the top 15 haplotypes with the highest read counts in the target regions. In addition, CrisprStitch offers a range of histograms that depict genome-editing outcomes within the target region. Plots illustrating deletions, substitutions, and insertions are generated using ECharts (Li et al., 2018) (Figure 1D). All results, including PDF plots, text-based alignment results, and comprehensive descriptions in Excel format, can be downloaded for future reference and analysis.
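As an illustration of how such summaries can be derived, the Python sketch below ranks haplotypes by read count and classifies the differences in a gapped pairwise alignment as deletions, insertions, or substitutions. It is a simplified stand-in, not CrisprStitch's code: the function names are hypothetical, the alignment is assumed to be already available as two equal-length strings with "-" marking gaps, and every substitution is counted as a mutation even though some may be sequencing errors.

```python
from collections import Counter
from typing import List, Tuple


def top_haplotypes(reads: List[str], limit: int = 15) -> List[Tuple[str, int]]:
    """Collapse identical reads into haplotypes and keep the most abundant."""
    return Counter(reads).most_common(limit)


def call_mutations(ref_aln: str, read_aln: str) -> List[Tuple[str, int, int]]:
    """Return (kind, reference_position, size) events from a gapped alignment.

    Positions are 0-based on the ungapped reference. An insertion is reported
    at the reference position that immediately follows it.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have equal length")
    events: List[Tuple[str, int, int]] = []
    ref_pos = 0
    i = 0
    while i < len(ref_aln):
        if ref_aln[i] == "-":            # bases present only in the read
            j = i
            while j < len(ref_aln) and ref_aln[j] == "-":
                j += 1
            events.append(("insertion", ref_pos, j - i))
            i = j
        elif read_aln[i] == "-":         # reference bases missing in the read
            j = i
            while j < len(read_aln) and read_aln[j] == "-":
                j += 1
            events.append(("deletion", ref_pos, j - i))
            ref_pos += j - i
            i = j
        else:
            if ref_aln[i] != read_aln[i]:
                events.append(("substitution", ref_pos, 1))
            ref_pos += 1
            i += 1
    return events


def mutation_frequency(aligned: List[Tuple[str, str, int]]) -> float:
    """Fraction of reads whose haplotype carries at least one event.

    Each item is (ref_aln, read_aln, read_count) for one haplotype.
    """
    total = sum(count for _, _, count in aligned)
    if total == 0:
        return 0.0
    edited = sum(count for ref_aln, read_aln, count in aligned
                 if call_mutations(ref_aln, read_aln))
    return edited / total


# Demonstration with one wild-type and one 2-bp deletion haplotype.
reference = "ACGTACGTAC"
print(call_mutations(reference, "ACG--CGTAC"))
print(mutation_frequency([(reference, reference, 60),
                          (reference, "ACG--CGTAC", 40)]))
```

Working on haplotypes with read counts, rather than on individual reads, means each distinct sequence is classified only once regardless of how many reads support it.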

Although CrisprStitch holds promise as a tool for analyzing NGS data from CRISPR genome-editing experiments, it is important to acknowledge certain drawbacks that could influence its overall effectiveness. A notable limitation is the absence of quality-control mechanisms for raw data, which could potentially result in the introduction of erroneous or noisy data into the analysis. Users must therefore perform standalone quality-control procedures before inputting data to CrisprStitch.

In summary, CrisprStitch enables automated and swift analysis of NGS data derived from CRISPR genome-editing experiments. It evaluates the effectiveness of various CRISPR–Cas systems and guide RNAs, presenting mutation events in a secure, effective, and user-friendly manner. This software is useful for genome engineering research because of its simplicity and practicality. We anticipate that CrisprStitch will enhance the user experience and make a valuable contribution to the establishment of a standard analysis pipeline for high-throughput genome-editing data.

Funding

This work was financially supported by the National Natural Science Foundation of China (grant no. 32270585), the Key R&D Program of Jiangsu Province (Modern Agriculture) (BE2022335), the Project of Zhongshan Biological Breeding Laboratory (BM2022008-02), the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) to T.Z., and the National Natural Science Foundation of China (award no. 32270433) to Y.Z.

Author contributions

T.Z. and Y.Z. conceived the project. Y.H., G.L., and T.Z. designed the application and website. Y.H., G.L., Y.W., Y.B., and T.Z. performed application and website tests and debugging. Y.H., G.L., Y.Z., and T.Z. wrote the manuscript. All authors approved the manuscript.

Acknowledgments

We thank Dr. Yiping Qi, Dr. Xu Tang, Tingting Fan, Qiurong Ren, Yao He, Zhaohui Zhong, Shanyue Liao, and Shishi Liu for their invaluable guidance and advice. No conflict of interest is declared.

Published: December 24, 2023

Footnotes

Published by the Plant Communications Shanghai Editorial Office in association with Cell Press, an imprint of Elsevier Inc., on behalf of CSPB and CEMPS, CAS.

Supplemental information is available at Plant Communications Online.

Contributor Information

Yong Zhang, Email: zhangyong916@swu.edu.cn.

Tao Zhang, Email: zhangtao@yzu.edu.cn.

Supplemental information

Document S1. Supplemental Figures 1–3, Supplemental Tables 1–3, and Supplemental Note
mmc1.pdf (1.2MB, pdf)
Document S2. Article plus supplemental information
mmc2.pdf (2.7MB, pdf)

References

  1. Clement K., Rees H., Canver M.C., Gehrke J.M., Farouni R., Hsu J.Y., Cole M.A., Liu D.R., Joung J.K., Bauer D.E., et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 2019;37:224–226. doi: 10.1038/s41587-019-0032-3.
  2. Connelly J.P., Pruett-Miller S.M. CRIS.py: A Versatile and High-throughput Analysis Program for CRISPR-based Genome Editing. Sci. Rep. 2019;9:4194. doi: 10.1038/s41598-019-40896-w.
  3. Fu H., Shan C., Kang F., Yu L., Li Z., Yin Y. CRISPR-GRANT: a cross-platform graphical analysis tool for high-throughput CRISPR-based genome editing evaluation. BMC Bioinf. 2023;24:219. doi: 10.1186/s12859-023-05333-w.
  4. Gasiunas G., Barrangou R., Horvath P., Siksnys V. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl. Acad. Sci. USA. 2012;109:E2579–E2586. doi: 10.1073/pnas.1208507109.
  5. Hille F., Richter H., Wong S.P., Bratovic M., Ressel S. The Biology of CRISPR-Cas: Backward and Forward. Cell. 2018;172:1239–1259. doi: 10.1016/j.cell.2017.11.032.
  6. Jinek M., Chylinski K., Fonfara I., Hauer M., Doudna J.A., Charpentier E. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–821. doi: 10.1126/science.1225829.
  7. Lee H.J., Chang H.Y., Cho S.W., Ji H.P. CRISPRpic: fast and precise analysis for CRISPR-induced mutations via prefixed index counting. NAR Genom. Bioinform. 2020;2:lqaa012. doi: 10.1093/nargab/lqaa012.
  8. Li D., Mei H., Shen Y., Su S., Zhang W., Wang J., Zu M., Chen W. ECharts: A declarative framework for rapid construction of web-based visualization. Vis. Inform. 2018;2:136–146.
  9. Liu Q., Wang C., Jiao X., Zhang H., Song L., Li Y., Gao C., Wang K. Hi-TOM: a platform for high-throughput tracking of mutations induced by CRISPR/Cas systems. Sci. China Life Sci. 2019;62:1–7. doi: 10.1007/s11427-018-9402-9.
  10. Tang X., Lowder L.G., Zhang T., Malzahn A.A., Zheng X., Voytas D.F., Zhong Z., Chen Y., Ren Q., Li Q., et al. A CRISPR-Cpf1 system for efficient genome editing and transcriptional repression in plants. Nat. Plants. 2017;3 doi: 10.1038/nplants.2017.103.
  11. You Q., Zhong Z., Ren Q., Hassan F., Zhang Y., Zhang T. CRISPRMatch: An Automatic Calculation and Visualization Tool for High-throughput CRISPR Genome-editing Data Analysis. Int. J. Biol. Sci. 2018;14:858–862. doi: 10.7150/ijbs.24581.
