[Preprint]. 2024 Oct 25:2024.10.23.619915. [Version 1] doi: 10.1101/2024.10.23.619915

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.

Figure 8. Mean test-set Spearman's rho for different models, averaged across 30 samplings, plotted against training dataset size for (A) k_cat and (B) K_M. Models include Random Forest (RF), Support Vector Regression (SVR), ProteinNPT (PNPT) (107), and Convolutional Linear Regression (CLR) (107). For input representations, SVR used a one-hot-encoded MSA, RF used ESM-2 embeddings, and CLR and PNPT used Tranception (112) embeddings. Embeddings and other model hyperparameters were selected based on aggregate (mean) performance across both k_cat and K_M prediction. Shaded regions represent 95% confidence intervals across the 30 training/test set samplings at each dataset size (Methods). A zero-shot evaluation of the Tranception PLM (112) is plotted as a dashed orange line. DLKcat (109) performance, evaluated on all 175 sequences, is plotted in (A) as a dashed black line.
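
The evaluation summarized in the caption (repeated train/test samplings at each training-set size, scored by Spearman's rho, with a 95% confidence interval on the mean) might be sketched as follows. This is a minimal illustration, not the authors' pipeline. The data are synthetic, the closed-form ridge regressor is a placeholder for the RF, SVR, PNPT, and CLR models, the feature count and training sizes are arbitrary, and the normal-approximation interval is an assumption, since the caption defers the exact procedure to the Methods. Only the 30 samplings and the 175 sequences are taken from the caption.

```python
import numpy as np


def spearman_rho(a, b):
    """Spearman rank correlation, assuming no ties (continuous values)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression; a placeholder for RF, SVR, etc."""
    n_features = X_train.shape[1]
    gram = X_train.T @ X_train + alpha * np.eye(n_features)
    weights = np.linalg.solve(gram, X_train.T @ y_train)
    return X_test @ weights


def learning_curve(X, y, train_sizes, n_samplings=30, seed=0):
    """Return {train_size: (mean_rho, ci_low, ci_high)} over random splits."""
    rng = np.random.default_rng(seed)
    results = {}
    for size in train_sizes:
        rhos = []
        for _ in range(n_samplings):
            # Draw a fresh random split: `size` training points, rest for testing.
            perm = rng.permutation(len(y))
            train, test = perm[:size], perm[size:]
            pred = ridge_fit_predict(X[train], y[train], X[test])
            rhos.append(spearman_rho(pred, y[test]))
        rhos = np.asarray(rhos)
        mean = float(rhos.mean())
        # Assumed: normal-approximation 95% CI on the mean across samplings.
        half_width = 1.96 * float(rhos.std(ddof=1)) / np.sqrt(n_samplings)
        results[size] = (mean, mean - half_width, mean + half_width)
    return results


# Synthetic stand-in for 175 sequences with 20 hypothetical features each.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(175, 20))
y = X @ data_rng.normal(size=20) + data_rng.normal(scale=0.5, size=175)

curve = learning_curve(X, y, train_sizes=[20, 40, 80, 160])
for size, (mean, low, high) in curve.items():
    print(f"n={size:3d}  mean rho={mean:.3f}  95% CI=({low:.3f}, {high:.3f})")
```

Plotting the mean against training size, with the interval as a shaded band, would give a curve of the kind shown in each panel. A real analysis would replace the synthetic arrays with sequence embeddings and measured k_cat or K_M values.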