Impact statement
We have developed the GReedy Accumulated strategy for Protein Engineering (GRAPE) to improve enzyme stability across various applications, combining advanced computational methods with a unique clustering and greedy accumulation approach to efficiently explore epistatic effects with minimal experimental effort. To make this strategy accessible to nonexperts, we introduced GRAPE‐WEB, an automated, user‐friendly web server that allows the design, inspection, and combination of stabilizing mutations without requiring extensive bioinformatics knowledge. GRAPE‐WEB's robust performance and accessibility provide a comprehensive and adaptable approach to protein thermostability design, suitable for both newcomers and experienced practitioners in the field. The web server is accessible at https://grape.wulab.xyz.
Nature has evolved a wide array of enzymes crucial for numerous essential biochemical functions. While these natural enzymes are rarely optimal for industrial application, protein engineering can enhance their performance by improving their physical and catalytic properties 1. However, redesigning enzymes for new functions often reduces their stability and may even cause misfolding of the target enzyme 2. Consequently, there is a growing demand for robust enzymes that not only meet industrial requirements but also possess evolutionary potential for further optimization.
In recent decades, directed evolution has been used to successfully tailor enzyme properties, but the costly, time‐ and labor‐demanding nature of this approach makes in silico screening of protein variants a highly attractive alternative 3. Various computational methods, including energy‐based, phylogeny‐based, and machine learning methods, have been used to design single‐point mutations for improved protein stability. However, the accuracy of individual algorithms is still limited by insufficient conformational sampling, inaccurate force fields, and imbalanced protein data sets, which increase prediction errors, especially for multiple‐point mutants 1.
To address these limitations, researchers have developed hybrid methods that incorporate complementary approaches, reducing the sampling bias of individual approaches 4. However, the resulting expanded library of beneficial mutations has made it more challenging to identify optimal combinations 5, 6. When coupled mutations interact epistatically, predictions often lead to nonconvergent optimization processes, failing to achieve near‐optimal mutation combinations.
To address this challenge, we have developed the GReedy Accumulated strategy for Protein Engineering (GRAPE) to search for beneficial combination pathways in a large single‐point mutation library 7. Leveraging hybrid methods, clustering, and greedy algorithms, GRAPE maximizes the exploration of epistatic effects while minimizing the experimental effort needed to identify efficient accumulation paths. GRAPE (Figure 1) employs a hybrid design approach that integrates the force field‐based algorithms FoldX 8 and Rosetta 9 with the statistics‐based algorithm ABACUS 10 to design potentially stabilizing single‐point mutations. After prediction, visual inspection of the predicted mutant structures eliminates variants with known pitfalls. The refined list of candidates then undergoes experimental validation to confirm their stabilizing effects. Beneficial variants are clustered into groups based on their ΔTm improvements, presumed effects, and the positions of the Cα atoms of each mutation. After clustering, we recommend experimentally accumulating mutations within each cluster, following a greedy strategy to achieve the desired functional outcomes.
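The greedy accumulation step within a single cluster can be sketched as follows. This is a minimal illustration under stated assumptions, not the GRAPE implementation: `measure_tm` is a hypothetical stand‐in for the experimental ΔTm measurement of a combined variant, the mutation names `M1`–`M3` are placeholders, and the toy scoring function (additive gains with one negative epistatic pair) is invented purely for the example.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def greedy_accumulate(
    mutations: Iterable[str],
    measure_tm: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Greedily add the mutation that most improves the measured value.

    Stops as soon as no remaining mutation improves on the current variant.
    """
    remaining = set(mutations)
    path: List[str] = []
    best = measure_tm(frozenset())
    while remaining:
        # Score every one-step extension of the current combined variant.
        scored = [(measure_tm(frozenset(path) | {m}), m) for m in sorted(remaining)]
        score, pick = max(scored)
        if score <= best:
            break  # no extension helps; keep the current variant
        path.append(pick)
        remaining.remove(pick)
        best = score
    return path, best


# Hypothetical single-mutation gains (placeholder values, not measured data).
GAINS = {"M1": 3.0, "M2": 5.0, "M3": 2.0}


def toy_tm(variant: FrozenSet[str]) -> float:
    """Toy stand-in for an experiment: additive gains minus one epistatic penalty."""
    total = sum(GAINS[m] for m in variant)
    if {"M1", "M3"} <= variant:
        total -= 4.0  # invented negative epistasis between M1 and M3
    return total


path, final = greedy_accumulate(GAINS, toy_tm)
print(path, final)
```

In this toy case the search stops after two additions because adding `M3` on top of `M1` lowers the score, which is the kind of epistatic dead end the greedy path is meant to avoid testing exhaustively.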
Figure 1. General workflow of the GReedy Accumulated strategy for Protein Engineering (GRAPE)‐WEB server. (A) Input from user. (B) Single‐mutation design. (C) Greedy accumulation.
To validate GRAPE's performance, we utilized a high‐quality data set comprising 2648 mutant variants across 131 proteins sourced from ProTherm 11, assessed using the established PoPMuSiC method 12, as detailed in Tables S1–S3. For benchmarking against prevalent algorithms, we selected a subset of 350 mutants from 67 distinct proteins not previously used in related algorithm training or testing. We derived ΔΔG P values for PoPMuSiC‐2.0, CUPSAT 13, Dmutant 14, Eris 15, and Imutant 16 from the PoPMuSiC data set 12. Threshold values for these algorithms were based on their precision metrics. As shown in Table S4, we set thresholds for FoldX, Rosetta, and ABACUS in GRAPE‐WEB to –1.5 kcal/mol, –1.0 REU, and –2.5 AEU, respectively.
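To make the role of these cutoffs concrete, the sketch below applies the three default thresholds to a set of predicted scores. It assumes that a mutation is retained when at least one algorithm scores it strictly below that algorithm's cutoff (a union of the three predictors); the input layout, the function name `select_candidates`, and the example scores are hypothetical and are not taken from the server.

```python
from typing import Dict, List

# Default cutoffs quoted in the text: FoldX (kcal/mol), Rosetta (REU), ABACUS (AEU).
DEFAULT_THRESHOLDS: Dict[str, float] = {"FoldX": -1.5, "Rosetta": -1.0, "ABACUS": -2.5}


def select_candidates(
    predictions: Dict[str, Dict[str, float]],
    thresholds: Dict[str, float] = DEFAULT_THRESHOLDS,
) -> Dict[str, List[str]]:
    """Map each retained mutation to the algorithms that flagged it.

    Assumption: a mutation is kept if any predictor scores it below its cutoff.
    """
    selected: Dict[str, List[str]] = {}
    for mutation, scores in predictions.items():
        hits = [
            algorithm
            for algorithm, cutoff in thresholds.items()
            if algorithm in scores and scores[algorithm] < cutoff
        ]
        if hits:
            selected[mutation] = hits
    return selected


# Placeholder scores for illustration only (not real predictions).
example = {
    "A10V": {"FoldX": -2.0, "Rosetta": -0.3, "ABACUS": -1.0},
    "G25P": {"FoldX": -0.5, "Rosetta": -1.4, "ABACUS": -3.1},
    "K40E": {"FoldX": 0.2, "Rosetta": -0.1, "ABACUS": -2.5},
}
candidates = select_candidates(example)
print(candidates)
```

Passing a different `thresholds` dictionary mirrors the user‐defined threshold option; stricter cutoffs trade recall for precision.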
Notably, distinct mutations exclusive to each algorithm underscored their complementary roles (Table S5). Beyond mere accuracy, practical applications demand a broader set of beneficial mutations, and GRAPE, navigating through sequence space via distinct approaches, significantly expanded the repertoire of stabilizing mutations. Aiming to construct an extensive mutation library, we focused primarily on F1‐scores, encompassing both precision and recall, to ensure a broad capture of stabilizing mutations possibly overlooked by individual methods. As shown in Table 1, GRAPE demonstrated superior performance, achieving the highest F1‐score and accuracy compared to other algorithms.
Table 1. Comparison of GRAPE with other commonly used algorithms for predicting protein stability.
| Algorithm | Threshold | Sensitivity | Specificity | Precision | Accuracy | F1‐score |
|---|---|---|---|---|---|---|
| PoPMuSiC‐2.0 | –0.5 | 0.105 | 1.000 | 1.000 | 0.757 | 0.190 |
| CUPSAT | –1.5 | 0.105 | 0.980 | 0.657 | 0.743 | 0.182 |
| Dmutant | –1.5 | 0.200 | 0.957 | 0.633 | 0.751 | 0.304 |
| Eris | –2.5 | 0.211 | 0.965 | 0.690 | 0.760 | 0.323 |
| Imutant | –1 | 0.053 | 0.957 | 0.313 | 0.711 | 0.090 |
| GRAPE | –1.5/–1/–2.5 | 0.389 | 0.906 | 0.607 | 0.766 | 0.474 |
The threshold values were chosen based on the precision results. GRAPE, GReedy Accumulated strategy for Protein Engineering.
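The F1‐score is the harmonic mean of precision and sensitivity (recall). As a consistency check, the short script below recomputes it from the sensitivity and precision columns of Table 1; the numbers are copied from the table, while the function and variable names are ours.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# (sensitivity, precision, reported F1-score) copied from Table 1.
TABLE_1 = {
    "PoPMuSiC-2.0": (0.105, 1.000, 0.190),
    "CUPSAT": (0.105, 0.657, 0.182),
    "Dmutant": (0.200, 0.633, 0.304),
    "Eris": (0.211, 0.690, 0.323),
    "Imutant": (0.053, 0.313, 0.090),
    "GRAPE": (0.389, 0.607, 0.474),
}

recomputed = {name: f1_score(p, s) for name, (s, p, _) in TABLE_1.items()}
for name, (_, _, reported) in TABLE_1.items():
    print(f"{name:13s} reported={reported:.3f} recomputed={recomputed[name]:.3f}")
```

The recomputed values agree with the reported column to within 0.001, the small differences coming from the three‐decimal rounding of the inputs, and GRAPE retains the highest F1‐score.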
The successful transformation of PETase from Ideonella sakaiensis 201‐F6 (IsPETase) (PDB ID: 5XH3) highlights the potential application of GRAPE for enhancing thermal stability. A library of designed single‐point mutations was obtained, with the ABACUS, Rosetta, and FoldX algorithms providing 100, 65, and 61 mutations, respectively. Additionally, consensus analysis generated 54 single‐point mutations. After structural inspection, 85 candidates were chosen for experimental validation, and 21 mutants displayed increased stability (ΔTm ≥ 1.5°C). We further grouped the stabilizing mutations into three clusters. The detailed clustering results are presented in Table S6. Greedy accumulation was experimentally performed to combine the single‐point mutations in each cluster. Ultimately, only 65 combined variants were explored, resulting in the most thermostable variant, referred to as DuraPETase, with significantly enhanced thermostability (ΔTm = 31°C). See sections 5 and 6 of the Supporting Information and Cui et al. 7 for more details of this case study.
In addition to enhancing thermal stability, GRAPE was applied to increase the organic solvent stability of a peptidylamidoglycolate lyase from Exiguobacterium sp. 17. Following the design of single‐point mutations and structural analysis, 62 predicted mutants were experimentally evaluated, with 17 mutants exhibiting increased resistance to denaturing agents like guanidine hydrochloride. After clustering and greedy accumulation, a final mutant (PAL14) containing 14 mutations was obtained. PAL14 demonstrated significant improvements in denaturant tolerance, achieving a 24‐fold increase in peptide C‐terminal amidation activity under 2.5 M guanidine hydrochloride. These experimental validations underscore the efficacy of GRAPE in augmenting enzyme robustness. See section 7 of the Supporting Information and Zhu et al. 17 for more details of this case study.
Although GRAPE has demonstrated significant potential for enhancing protein thermostability in previous engineering projects, it has so far been available only as a stand‐alone tool and requires extensive structural and bioinformatics experience to implement workflows. To address this limitation, we developed a web version of GRAPE, GRAPE‐WEB (https://grape.wulab.xyz), for automatically improving protein thermostability. The main difference between the GRAPE‐WEB server and other stability engineering web tools is its ability to interact with experimental results. With easy access to the online web server, nonexpert users can design potential stabilizing candidates, filter mutations with pitfalls through structural inspection, and perform clustering on the server without programming knowledge or software installation.
The web server was designed to have both “Stabilizing mutation design” and “Stabilizing mutation clustering” sections. In the “Stabilizing mutation design” section, users can initiate the process by submitting sequence or structure data for the target protein (Figure 1A). For proteins with available experimental or computationally predicted structures, submission of a PDB file is allowed. Otherwise, GRAPE‐WEB employs ESMFold 18 to predict the structure of the target protein, sparing users the task of prediction. Users can manually choose specific chains of the target protein for design. If ESMFold is used, the chain ID must be set to “A”. The calculations allow for the application of either default or user‐defined thresholds in designing potential stabilizing mutations, with the latter offering customization for advanced users. Configuration of settings is completed by clicking the “Prepare” button, followed by job submission by clicking the “Run” button (Figure S1A). Upon submission, the single‐mutation prediction process will be launched (Figure 1B). A unique job ID will be sent to the provided email address and displayed on the screen. The “job hashrun” will be used to retrieve predicted stabilizing mutations.
Once the job is completed, users can access the “Stabilizing mutation results” section to conduct a structural analysis by replacing the demo job hash with their assigned identifier. An example provided in the demo job features the design outcomes for a limonene‐1,2‐epoxide hydrolase (PDB ID: 1NWW) (Figure S1B). Entering the user's unique identifier would reveal mutations with ABACUS energy or folding free energy (∆∆G fold) below the user‐defined thresholds. In the results table, details of the predicted mutations are presented in six columns. The initial four columns include the mutation name and its residue index, while the subsequent two columns show the utilized design algorithms and their predicted energies. Users can visualize the structure of the mutant in the Mol* 19 viewer plugin by entering the mutation name. Within the Mol* interface, selecting the desired mutation from the sequence list using the setting button reveals its interactions with nearby amino acids, aiding in the exclusion of mutations that might lead to biophysical problems such as internal cavities, disrupted hydrogen bonds, or the exposure of hydrophobic residues, thus refining the quality of the library and minimizing screening efforts. Since all mutant structures can be directly accessed via the application programming interface (API), code snippets are provided to enable users to locally load the mutants and highlight their surroundings in PyMOL.
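The server supplies its own PyMOL snippets; the sketch below is not a copy of them. It only illustrates the general idea of highlighting the surroundings of a mutated residue, under several assumptions: the mutant structure has already been saved as a local PDB file, mutation names follow a wild‐type letter, residue number, mutant letter pattern (for example `A123V`, a hypothetical name), and the designed chain is “A”. The function `pymol_script` is our own helper, which writes standard PyMOL commands (`load`, `select`, `show`, `color`, `zoom`) into a script.

```python
import re


def pymol_script(pdb_path: str, mutation: str, chain: str = "A", radius: float = 5.0) -> str:
    """Build a PyMOL (.pml) script that highlights residues around a mutation.

    Assumes mutation names look like 'A123V' (hypothetical naming convention).
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation name: {mutation!r}")
    resi = match.group(2)
    lines = [
        f"load {pdb_path}, mutant",
        "hide everything, mutant",
        "show cartoon, mutant",
        # The mutated residue itself.
        f"select mut_site, mutant and chain {chain} and resi {resi}",
        # Whole residues with any atom within `radius` angstroms of the site.
        f"select mut_env, byres (mutant within {radius} of mut_site)",
        "show sticks, mut_env",
        "color yellow, mut_site and elem C",
        "zoom mut_env",
    ]
    return "\n".join(lines)


script = pymol_script("mutant.pdb", "A123V")
print(script)
```

Saving the printed text to a file such as `view.pml` and opening it with PyMOL should display the mutated residue and its neighbours as sticks, which is a convenient view for spotting cavities, lost hydrogen bonds, or exposed hydrophobic side chains.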
For users choosing the clustering step after experimental validation, the “Stabilizing mutation clustering” section (Figure 1C) requires submitting beneficial mutations and their associated ΔTm values in either .txt or .csv format (Figure S2). Additionally, the target protein's sequence data are needed. The clustering calculations leverage features automatically generated by the server and reported in a feature table. The first three columns of this table provide the mutation and ΔTm improvements. The next three columns list the potential effects caused by the mutations, including changes in hydrogen bonding, hydrophobic interactions, and conformational entropy. The values in the “Hbond” (hydrogen bonds) and “hydrophobic” (hydrophobic interactions) columns quantify the changes in mutant interactions compared to the wild type. In the “entropy” column, only a value of 1 or 0 is displayed, where 1 indicates a potential change in entropy and 0 indicates no change. The final three columns contain the Euclidean coordinates of the Cα atoms of the mutations.
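How such a feature table can drive clustering is sketched below. The text above does not name the clustering algorithm, so the use of k‐means on z‐scored features is an assumption made for illustration, as are the feature order (ΔTm, Hbond, hydrophobic, entropy, Cα x, y, z), the placeholder mutation names, and all numerical values.

```python
from math import dist, sqrt
from typing import List, Sequence


def zscore(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid errors.
    stds = [sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0 for c, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points: Sequence[Sequence[float]], k: int, iterations: int = 100) -> List[int]:
    """Plain k-means with deterministic initialization (first k points as centroids)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for step in range(iterations):
        # Assign each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if step > 0 and new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Placeholder rows: [dTm, Hbond, hydrophobic, entropy, CA_x, CA_y, CA_z].
names = ["M1", "M2", "M3", "M4"]
features = [
    [2.0, 1, 0.5, 0, 10.0, 12.0, 8.0],
    [4.0, 0, 2.0, 1, 30.0, 31.0, 29.0],
    [2.5, 1, 0.4, 0, 11.0, 13.0, 9.0],
    [3.5, 0, 1.8, 1, 29.0, 30.0, 28.0],
]
labels = kmeans(zscore(features), k=2)
print(dict(zip(names, labels)))
```

Standardizing the columns first keeps the coordinate values, which span tens of ångströms, from dominating the binary entropy flag and the small interaction counts.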
In summary, GRAPE‐WEB is a user‐friendly web server based on GRAPE for protein stabilization. Both in silico analysis and experimental verifications support the broad applications of this strategy. By integrating complementary methods, GRAPE further enriches the beneficial mutation library. FoldX and Rosetta rely on energy functions to describe the strength of physical interactions between atoms. However, these methods suffer from inaccuracies in energy functions and insufficient sampling. Meanwhile, ABACUS uses a statistical energy function approach. This data‐driven method is influenced by the uneven distribution of protein structure data, potentially leading to over‐smoothing in sparsely populated regions. These methods have been shown to complement one another (Table S5). By combining these tools, GRAPE leverages their strengths, increasing the number of beneficial mutations and enriching the pool of stabilizing mutations for further combination.
There are also areas for improvement in the current methodology. While molecular dynamics (MD) simulations provide detailed insights into protein dynamics, they are too computationally intensive to be integrated into GRAPE‐WEB without significantly limiting its service capacity. Due to this limitation, GRAPE‐WEB may not perform optimally for large or highly dynamic proteins. To address this, we offer a local version of GRAPE that allows users to automate MD simulations. Additionally, advancements in mutation stability prediction using deep learning methods have shown higher success rates for predicting single mutations without MD‐based filtering 20. However, we did not incorporate these deep learning methods into GRAPE‐WEB due to limited experimental success. It is likely that the successful design of more stable enzymes will become more accessible in the future, and we plan to optimize the combination of mutation predictors as the field progresses.
AUTHOR CONTRIBUTIONS
Jinyuan Sun: Software (equal). Wenyu Shi: Software (equal). Zhihui Xing: Software (equal). Guomei Fan: Software (equal). Qinglan Sun: Software (equal). Linhuan Wu: Software (equal). Juncai Ma: Software (equal). Yinglu Cui: Conceptualization (equal); methodology (equal); resources (equal); software (equal). Bian Wu: Conceptualization (equal); supervision (equal); writing—review and editing (equal).
ETHICS STATEMENT
Ethics statement is not applicable to this study.
CONFLICT OF INTERESTS
The authors declare no conflict of interests.
Supporting information
Supporting information.
ACKNOWLEDGMENTS
This work was supported by the National Key R&D Program of China (grant no. 2021YFC2103600), the National Natural Science Foundation of China (31822002, 32170033, and 32422001), the Key Research Program of Frontier Sciences (ZDBS‐LY‐SM014), the Biological Resources Program (KFJ‐BRP‐009 and KFJ‐BRP‐017‐58) from the Chinese Academy of Sciences, the Informatization Plan of Chinese Academy of Sciences (CAS‐WX2021SF‐0111), and the Youth Innovation Promotion Association CAS (2022086).
Sun J, Shi W, Xing Z, Fan G, Sun Q, Wu L, et al. GRAPE‐WEB: an automated computational redesign web server for improving protein thermostability. mLife. 2024;3:527–531. 10.1002/mlf2.12152
Contributor Information
Yinglu Cui, Email: cuiyinglu@im.ac.cn.
Bian Wu, Email: wub@im.ac.cn.
DATA AVAILABILITY
The data that support the findings of this study are available at https://grape.wulab.xyz/.
REFERENCES
- 1. Musil M, Konegger H, Hon J, Bednar D, Damborsky J. Computational design of stable and soluble biocatalysts. ACS Catal. 2018;9:1033–1054.
- 2. Goldenzweig A, Fleishman SJ. Principles of protein stability and their application in computational design. Annu Rev Biochem. 2018;87:105–129.
- 3. Fang S, Wei R, Cui Y, Su L. Advancing AI protein structure prediction and design: from amino acid “bones” to new era of all‐atom “flesh”. Green Carbon. 2024;2:209–210.
- 4. Bednar D, Beerens K, Sebestova E, Bendl J, Khare S, Chaloupkova R, et al. FireProt: energy‐and evolution‐based computational design of thermostable multiple‐point mutants. PLoS Comput Biol. 2015;11:e1004556.
- 5. Wijma HJ, Floor RJ, Jekel PA, Baker D, Marrink SJ, Janssen DB. Computationally designed libraries for rapid enzyme stabilization. Protein Eng Des Sel. 2014;27:49–58.
- 6. Mu Q, Cui Y, Tian Y, Hu M, Tao Y, Wu B. Thermostability improvement of the glucose oxidase from Aspergillus niger for efficient gluconic acid production via computational design. Int J Biol Macromol. 2019;136:1060–1068.
- 7. Cui Y, Chen Y, Liu X, Dong S, Tian Y, Qiao Y, et al. Computational redesign of a PETase for plastic biodegradation under ambient condition by the GRAPE strategy. ACS Catal. 2021;11:1340–1350.
- 8. Delgado J, Radusky LG, Cianferoni D, Serrano L. FoldX 5.0: working with RNA, small molecules and a new graphical interface. Bioinformatics. 2019;35:4168–4169.
- 9. Alford RF, Leaver‐Fay A, Jeliazkov JR, O'Meara MJ, DiMaio FP, Park H, et al. The Rosetta all‐atom energy function for macromolecular modeling and design. J Chem Theory Comput. 2017;13:3031–3048.
- 10. Xiong P, Wang M, Zhou X, Zhang T, Zhang J, Chen Q, et al. Protein design with a comprehensive statistical energy function and boosted by experimental selection for foldability. Nat Commun. 2014;5:5330.
- 11. Bava KA, Gromiha MM, Uedaira H, Kitajima K, Sarai A. ProTherm, version 4.0: thermodynamic database for proteins and mutants. Nucleic Acids Res. 2004;32:120D–121D.
- 12. Dehouck Y, Grosfils A, Folch B, Gilis D, Bogaerts P, Rooman M. Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC‐2.0. Bioinformatics. 2009;25:2537–2543.
- 13. Parthiban V, Gromiha MM, Schomburg D. CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res. 2006;34:W239–W242.
- 14. Zhou H, Zhou Y. Distance‐scaled, finite ideal‐gas reference state improves structure‐derived potentials of mean force for structure selection and stability prediction. Prot Sci. 2002;11:2714–2726.
- 15. Yin S, Ding F, Dokholyan NV. Eris: an automated estimator of protein stability. Nat Methods. 2007;4:466–467.
- 16. Capriotti E, Fariselli P, Casadio R. I‐Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res. 2005;33:W306–W310.
- 17. Zhu T, Sun J, Pang H, Wu B. Computational enzyme redesign enhances tolerance to denaturants for peptide C‐terminal amidation. JACS Au. 2024;4:788–797.
- 18. Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, et al. Evolutionary‐scale prediction of atomic‐level protein structure with a language model. Science. 2023;379:1123–1130.
- 19. Sehnal D, Bittrich S, Deshpande M, Svobodová R, Berka K, Bazgier V, et al. Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures. Nucleic Acids Res. 2021;49:W431–W437.
- 20. Sun J, Zhu T, Cui Y, Wu B. Structure‐based self‐supervised learning enables ultrafast prediction of stability changes upon mutation at the protein universe scale. bioRxiv. 2023. https://www.biorxiv.org/content/10.1101/2023.08.09.552725v1
