Fig. 1.
Overview of predicting mutations in E. coli using a data-driven approach. a A compendium of mutation profiles was constructed across 178 conditions, with 83 features that capture attributes of the strain, medium and stress, from experiments reported in 95 publications. b We built three individual predictors, namely an Artificial Neural Network (ANN), Support Vector Machines (SVM) and a Naive Bayes (NB) model, which are integrated into a single Ensemble method. c Predictions from the three individual predictors and the Ensemble method are assessed by forward validation in a novel experimental setting, through the evolution and whole-genome resequencing of 35 cell lines.