2024 Jan 11;25:13. doi: 10.1186/s13059-023-03153-y

Fig. 1.

Automated machine learning and data fusion predict depletion in CRISPRi essentiality screens. A Overview of CRISPRi essentiality screens. gRNAs are designed to target every gene in the genome and cloned into an appropriate plasmid for expression. This plasmid collection is then transformed into the target bacteria, and depletion is measured by sequencing as the change in guide frequency over growth, relative to a set of non-targeting gRNAs. The measured depletion (logFC) therefore reflects both the fitness effect of the gene knockdown and the efficiency of silencing itself. B Spearman correlation between actual and predicted guide depletion in tenfold cross-validation (CV) for the best model trained with Auto-Sklearn on different feature combinations, using data from [21]. C The ten most predictive features, determined using TreeExplainer on the optimal random forest model trained with Auto-Sklearn on 138 guide and gene features. Mean absolute SHAP value (left) provides a global measure of feature importance, while the beeswarm plot (right) shows the effect of each feature on each individual gRNA prediction. CDS: coding sequence. D Distribution of logFCs of gRNAs targeting essential genes from three CRISPRi genome-wide essentiality screens in E. coli. E Spearman correlation from tenfold CV of the best Auto-Sklearn-trained model on one dataset or on the three integrated datasets.
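
The depletion measure described in panel A can be illustrated with a minimal sketch. It assumes raw read counts per guide before and after growth, a pseudocount of 1, library-size normalisation, and centring on the median of the non-targeting guides. The function name `guide_log_fc`, the guide names, and the counts are hypothetical, and the exact normalisation used in the original screens may differ.

```python
import math
from statistics import median


def guide_log_fc(counts_before, counts_after, non_targeting, pseudocount=1.0):
    """Return the log2 fold change per guide, centred on non-targeting guides.

    Assumptions (not taken from the source): a pseudocount avoids log(0),
    counts are scaled by total library size, and the baseline is the median
    logFC of the non-targeting guides.
    """
    total_before = sum(counts_before.values())
    total_after = sum(counts_after.values())

    raw = {}
    for guide, before in counts_before.items():
        after = counts_after.get(guide, 0)
        freq_before = (before + pseudocount) / total_before
        freq_after = (after + pseudocount) / total_after
        raw[guide] = math.log2(freq_after / freq_before)

    # Express depletion relative to guides that should have no fitness effect.
    baseline = median(raw[g] for g in non_targeting)
    return {guide: value - baseline for guide, value in raw.items()}


# Hypothetical counts: geneA_g1 is strongly depleted, geneB_g1 is not.
counts_before = {"nt_1": 100, "nt_2": 100, "geneA_g1": 100, "geneB_g1": 100}
counts_after = {"nt_1": 120, "nt_2": 120, "geneA_g1": 15, "geneB_g1": 115}

log_fc = guide_log_fc(counts_before, counts_after, ["nt_1", "nt_2"])
for guide, value in log_fc.items():
    print(f"{guide}: {value:.2f}")
```

A strongly negative value in this sketch could come from an essential target gene, from a highly efficient guide, or from both, which is the mixture of fitness effect and silencing efficiency that the caption refers to.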