(a) Distribution of edit types in the original Easy-Prime test dataset. (b) Evaluation of the Easy-Prime PE2 model (XGBoost) on the original Easy-Prime test dataset (ref. 5), with 1-bp edits at position 5 of the RTT filtered out to eliminate the bias towards this edit type. n = 585. (c-g) Evaluation of the Easy-Prime PE2 model on datasets generated in this study. (c) Library 1 in HEK293T, n = 92,423. (d) Library 2 (editing with PE2 and pegRNAs without tevopreQ1) in HEK293T, n = 915. (e) Library 2 (editing with PE2 and pegRNAs without tevopreQ1) in K562, n = 876. (f,g) Endogenous loci from Fig. 4a,b in HEK293T (f) and K562 (g), n = 45. (h) Intended editing efficiency rank of the best-predicted pegRNA for each pathogenic locus in library 1 (PRIDICT and Easy-Prime). Pathogenic loci with multiple pegRNAs tied at rank 1 (identical efficiency) and loci with fewer than 3 pegRNAs were excluded from this analysis. PRIDICT predictions were taken from 5 different cross-validations to ensure that no prediction was made for a pegRNA included in the training set. n = 12,189. (i) Intended editing efficiency rank of the best-predicted pegRNA for each endogenous locus (PRIDICT and Easy-Prime). n = 15.
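The rank analysis in panels (h) and (i) can be illustrated with a minimal sketch. This is not the code used in the study: the function name `rank_of_best_predicted`, the record layout, and the example values are hypothetical. It also assumes that rank 1 denotes the pegRNA with the highest measured efficiency, and that if two pegRNAs share the highest prediction, the first one listed is taken.

```python
from collections import defaultdict


def rank_of_best_predicted(records, min_pegrnas=3):
    """Return {locus: measured-efficiency rank of the best-predicted pegRNA}.

    records: iterable of (locus, measured_efficiency, predicted_efficiency).
    Rank 1 is the pegRNA with the highest measured efficiency (assumption).
    """
    by_locus = defaultdict(list)
    for locus, measured, predicted in records:
        by_locus[locus].append((measured, predicted))

    ranks = {}
    for locus, pegs in by_locus.items():
        # Exclude loci with fewer than `min_pegrnas` pegRNAs.
        if len(pegs) < min_pegrnas:
            continue
        # Exclude loci where several pegRNAs share the top measured efficiency.
        top = max(m for m, _ in pegs)
        if sum(1 for m, _ in pegs if m == top) > 1:
            continue
        # pegRNA with the highest predicted efficiency (first one on ties).
        best_measured, _ = max(pegs, key=lambda p: p[1])
        # Its rank: 1 + number of pegRNAs with strictly higher measured efficiency.
        ranks[locus] = 1 + sum(1 for m, _ in pegs if m > best_measured)
    return ranks


# Hypothetical example values, for illustration only.
records = [
    ("locusA", 40.0, 0.2), ("locusA", 25.0, 0.9), ("locusA", 10.0, 0.1),
    ("locusB", 30.0, 0.8), ("locusB", 30.0, 0.5), ("locusB", 5.0, 0.1),
    ("locusC", 50.0, 0.7), ("locusC", 20.0, 0.3),
    ("locusD", 12.0, 0.1), ("locusD", 8.0, 0.2), ("locusD", 33.0, 0.95),
]
print(rank_of_best_predicted(records))  # {'locusA': 2, 'locusD': 1}
```

In this toy input, locusB is dropped because two pegRNAs are tied at rank 1 and locusC because it has fewer than 3 pegRNAs, mirroring the exclusion criteria stated for panel (h).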