Fig. 5. TurNuP predictions are more accurate for enzymes similar to proteins in the training set and outperform an existing deep learning model.
a Coefficients of determination (R²) on the test sets for our TurNuP model (black) and the previously published DLKcat model (ref. 16; magenta), grouped by the maximal sequence identity of each test enzyme to the enzymes in the training set. Numbers next to points give the number of test data points in each category. The horizontal dashed line corresponds to a model that predicts the same mean kcat value for all test data points. b Mean squared errors (MSE) of predicted absolute proteome data compared to experimental data. Proteome predictions were obtained with enzyme-constrained genome-scale models parameterized with kcat values predicted by TurNuP (black) or by the DLKcat model (magenta). Proteome data were predicted for four yeast species (Sce, Saccharomyces cerevisiae; Kla, Kluyveromyces lactis; Kmx, Kluyveromyces marxianus; Yli, Yarrowia lipolytica) in 21 different culture conditions (see Methods for details). Source data are provided as a Source Data file.
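As a reading aid for the two metrics in this figure, the following minimal sketch computes a coefficient of determination and a mean squared error using their standard definitions. It is not the authors' evaluation code. The function names `r_squared` and `mse` and all numeric values are hypothetical and chosen only for illustration. It assumes that the constant baseline uses the mean of the test values themselves, in which case R² is exactly 0, the reference level that a dashed line of this kind marks.

```python
from statistics import fmean

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))

# Hypothetical measured values (illustration only, not data from the paper).
y_true = [0.5, 1.2, 2.0, -0.3, 1.6]

# Baseline that predicts the same mean value for every test point
# (assumption: the mean is taken over these test values).
baseline = [fmean(y_true)] * len(y_true)

print(r_squared(y_true, baseline))  # constant-mean baseline
print(r_squared(y_true, y_true))    # perfect predictions
print(mse([1.0, 2.0], [1.0, 4.0]))  # one error of size 2 over two points
```

Under this definition, a model with R² above 0 explains more variance than the constant-mean baseline, and a model with R² below 0 does worse than it.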
