2024 Jan 8;11:1335901. doi: 10.3389/fbioe.2023.1335901

TABLE 4.

ML- and DL-based tools for genome editing applications.

| Ref | ML models used | Dataset | Description and key contribution | Performance evaluation metrics | Limitation | Target |
| --- | --- | --- | --- | --- | --- | --- |
| Chuai et al. (2018) | DeepCRISPR (DCDNN) | Thirteen distinct human cell lines produced a total of 0.68 billion sgRNA sequences | This computational framework surpasses existing in silico tools by combining sgRNA on-target and off-target site prediction into a single DL system | Spearman: 0.246, AUROC: 0.804, AUPRC: 0.303 | The model acquires an understanding of which attributes are crucial for improved sgRNA structure, even when trained with a limited number of samples | On- and off-target |
| Lin and Wong (2018) | CNN and FNN, Random Forest, GBTs, and LR | GUIDE-seq (Tsai et al., 2015), CRISPOR (Concordet and Haeussler, 2018) | Development and implementation of a deep CNN for accurately predicting off-target mutations in CRISPR-Cas9 gene editing | AUROC: 97.2% for CNN, AUROC: 97% for FNN | | Off-target |
| Xue et al. (2018) | DeepCas9 (1D CNN) | Wang (Wang et al., 2014), Doench V1 (Doench et al., 2014), Doench V2 (Doench et al., 2016), C. elegans (F et al., 2015), HCT116 (Hart et al., 2015), Zebrafish (Gagnon et al., 2014; Moreno-Mateos et al., 2015; Varshney et al., 2015), Chari (Chari et al., 2015), Haeussler (Haeussler et al., 2016), HL-60 (Xu et al., 2015) | The first DL technique that can recognize CRISPR-Cas9 sgRNA activity directly from genetic sequences without the need for feature input | Spearman: 0.23-0.61 | These datasets’ sgRNA activity was completely limited to clinical assays, where the measured cleavage efficiency served as a clear indicator of KO efficacy | On-target |
| Liu et al. (2019) | SeqCrispr (RNN + CNN + transfer learning) | DeepCRISPR (Chuai et al., 2018), CRISPR-Cpf1 (Zaidi et al., 2017) | SeqCrispr is a DL model that integrates context-specific gene network features | Spearman: 0.77 | Limited knowledge of gene activity and its fluctuating effects on phenotype, together with the difficulty of interpreting computational models biologically, restricts the predictive model’s efficiency | On-target |
| Wang et al. (2019) | DeepHF (RNN) | With approximately 50,000 gRNAs, the DeepHF dataset is the largest gRNA on-target activity set for mammalian cells | To create the final model, DeepHF extracts features using a Bi-LSTM and combines them with manually created biological features. The study identified important sequence characteristics linked to gRNA activity and assessed several ML algorithms for gRNA activity prediction | Spearman: 0.867, 0.862, and 0.860 | Because of the small amount of data available, the authors could not determine which algorithm performed best on endogenous sites | On-target |
| Shrawgi and Sisodia (2019) | DeepSgRNA (CNN with hierarchical feature generation abilities) | 40,000 sgRNA sequence examples taken from the GenomeCRISPR project database | DeepSgRNA finds and forecasts RNA guides to improve performance. The suggested model requires no manual feature creation | Spearman: 0.82, AUROC: 0.85 | Off-target effects of specific sgRNAs were not considered in this investigation | On-target |
| Wang and Zhang (2019) | CNN with 5 layers + transfer learning | Cas9, eSpCas9, Cas9 (ΔrecA) (Yue et al., 2020) | Development of a 5-layer CNN (CNN_5 layers) for predicting sgRNA activity in prokaryotic and eukaryotic species. The model takes 43-nt DNA sequences as input and predicts on-target activity | Spearman: 0.582, 0.7105, 0.360 | The model does not perform well in predicting on-target activity for the Cas9 (ΔrecA) scenario | On-target |
| Aktas et al. (2019) | CNN, MLP, Bi-LSTM | DeepCRISPR (Chuai et al., 2018) | sgRNA target estimation for CRISPR/Cas9 was carried out with DL to reduce genomic aberrations | Accuracy: 96.7% | Some of the mistargeted positions caused unwanted genome distortions | Off-target and on-target |
| Kim et al. (2019) | DeepSpCas9 (3 1D-CNN) | DeepSpCas9 | It accurately predicted the activity of the SpCas9 enzyme | Spearman: 0.73 | The size of the training datasets was not ideal | On-target |
| Liu et al. (2020) | CnnCrispr (Bi-LSTM and CNN) | DeepCRISPR (Chuai et al., 2018) | CnnCrispr was proposed to forecast the off-target tendency of sgRNA at particular DNA fragments | AUROC: 0.957, AUPRC: 0.429 | RNNs can implement memory functions, but their capacity is restricted by the possibility of exploding or vanishing gradients | Off-target |
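
The table reports performance mainly as Spearman correlation (for on-target activity regression) and AUROC (for off-target classification). As a reading aid, the following is a minimal pure-Python sketch of how these two metrics are commonly computed. It is not taken from any of the cited tools; the helper names (`average_ranks`, `spearman`, `auroc`) and the toy numbers in the usage note are illustrative assumptions, not data from the studies above.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic (labels are 0 or 1)."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)
```

For example, with hypothetical measured activities `[0.1, 0.4, 0.35, 0.8, 0.7]` and predictions `[0.2, 0.3, 0.5, 0.9, 0.6]`, `spearman` compares only the orderings of the two lists, so a model can score well without matching the absolute activity values. `auroc` is the probability that a randomly chosen true off-target site receives a higher score than a randomly chosen non-target site.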