Stabilizing proteins, simplified: A Rosetta‐based webtool for predicting favorable mutations

David F Thieker; Jack B Maguire; Stephan T Kudlacek; Andrew Leaver‐Fay; Sergey Lyskov; Brian Kuhlman

doi:10.1002/pro.4428

Protein Science. 2022 Sep 21;31(10):e4428. doi: 10.1002/pro.4428

PMCID: PMC9490798 PMID: 36173174

Abstract

Many proteins have low thermodynamic stability, which can lead to low expression yields and limit functionality in research, industrial and clinical settings. This article introduces two web‐based tools that use the high‐resolution structure of a protein along with the Rosetta molecular modeling program to predict stabilizing mutations. The protocols were recently applied to three genetically and structurally distinct proteins and successfully predicted mutations that improved thermal stability and/or protein yield. In all three cases, combining the stabilizing mutations raised the protein unfolding temperatures by more than 20°C. The first protocol evaluates point mutations and can generate a site saturation mutagenesis heatmap. The second identifies mutation clusters around user‐defined positions. Both applications require only a protein structure and are particularly valuable when a deep multiple sequence alignment is not available. These tools were created to simplify protein engineering and enable research that would otherwise be infeasible due to poor expression and stability of the native molecule.

Keywords: molecular modeling, protein design, protein engineering, protein stability, Rosetta molecular modeling program

1. INTRODUCTION

Recombinant proteins are critical reagents for biomedical research and continue to become increasingly important as therapeutics. ¹ , ² However, native proteins only develop the level of stability required by the local environment in which they evolve, and this level is often insufficient for the non‐native contexts in which scientists apply them. Whether the intrinsically low thermostability of a molecule limits production within a recombinant expression system, or an enzyme aggregates when exposed to necessary reaction conditions, native proteins often require stabilization for ex situ applications. The field of protein engineering represents a path toward overcoming these barriers and can be generally separated into experimental and computational approaches for selecting favorable mutations. Directed evolution applies artificial selection pressures to a system in vitro and can yield significant improvements in stability. ³ Although in silico methods miss some mutations that would be identified in an experimental screen, computational approaches can be faster, less expensive, and sample larger sequence spaces. ⁴

Computational methods for predicting stabilizing mutations have become increasingly successful. These approaches can be divided into bioinformatics and physics‐based approaches. The first compares homologous protein sequences to identify the most evolutionarily conserved amino acid at each position. Mutating a target protein toward this consensus sequence often improves stability. ⁵ The second approach considers the atomic interactions for a protein structure before and after mutagenesis. ⁶ , ⁷ , ⁸ Consensus sequence design is a conservative means for predicting mutations and is less prone to deleterious amino acid substitutions than structure‐guided approaches; however, only considering sequences exposed to natural selection pressures precludes many stabilizing mutations. Combining sequence‐ and structure‐based methods has yielded significant improvements in stability by introducing large numbers of mutations simultaneously. ⁹ , ¹⁰ Although this combined approach benefits from a deep multiple sequence alignment, such an alignment is not required for all systems. ¹¹ The advent of model generation techniques driven by machine learning ensures users are not limited by the availability of protein structures. ¹² , ¹³ There are already examples of successful protein engineering efforts starting from these models rather than experimentally determined structures. ¹⁴

To improve access to our most recent structure‐guided protocols for stabilizing proteins within the Rosetta software package, we present two webtools within the newest framework for the Rosetta Online Server that Includes Everyone (ROSIE2, https://r2.graylab.jhu.edu/). ¹⁵ The first protocol characterizes point mutations (PM) while the second assesses mutation clusters (MC) in a local, combinatorial manner. We recently tested these protocols on three structurally distinct proteins and identified a variety of point mutations and mutation clusters that increased thermostability as measured with fluorescence‐based thermal unfolding experiments. ¹⁶ , ¹⁷ In each case, after experimental assessment of the initial predictions the best performing mutations were combined to generate variants with 5–10 mutations that raised thermal stability by over 20°C and in two cases raised protein expression yields by more than 10‐fold. These initial studies required a trained researcher to perform the simulations on a high‐performance computer cluster. The following web tools make the protocols accessible to a broader community of molecular biologists and biochemists.

2. METHODS

2.1. Protein modeling concepts

Sequence optimization protocols in Rosetta rely on two key components of the software: an energy function that is used to evaluate the favorability of a given structure and sequence, and sampling protocols for identifying alternative structures and sequences with lower energy. The web tools described here make use of the default full‐atom energy function in Rosetta. ¹⁸ The energy function is a linear sum of terms that represent physical phenomena. Key terms include (a) a Lennard–Jones potential that models dispersion forces and steric repulsion and provides key information for packing the interior of a protein, (b) an implicit solvation model that penalizes the burial of polar groups, ¹⁹ (c) an orientation‐dependent hydrogen bonding term, (d) short‐range electrostatics, ²⁰ and (e) backbone and side chain torsion angle preferences.
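
The "linear sum of terms" can be illustrated with a minimal sketch. The term names, weights, and per‐term values below are hypothetical placeholders chosen for illustration; they are not the terms or weights of the Rosetta energy function.

```python
from typing import Dict

# Hypothetical term names and weights for illustration only;
# these are NOT the weights of the Rosetta energy function.
ILLUSTRATIVE_WEIGHTS: Dict[str, float] = {
    "lennard_jones": 1.0,
    "solvation": 1.0,
    "hydrogen_bond": 1.0,
    "electrostatics": 1.0,
    "torsion": 0.5,
}


def total_energy(
    term_values: Dict[str, float],
    weights: Dict[str, float] = ILLUSTRATIVE_WEIGHTS,
) -> float:
    """Return the weighted linear sum of the per-term energies."""
    return sum(weights[name] * value for name, value in term_values.items())


# Made-up per-term values for a single model.
example_terms = {
    "lennard_jones": -10.0,
    "solvation": 4.0,
    "hydrogen_bond": -3.0,
    "electrostatics": -1.0,
    "torsion": 2.0,
}
example_total = total_energy(example_terms)
print(f"total energy = {example_total:.2f}")
```

Because the total is a plain weighted sum, the contribution of each physical term to a score difference can be inspected separately.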

When evaluating the favorability of an amino acid mutation with the Rosetta energy function, the protocols described here first allow the protein structure to adjust its conformation to accommodate the mutation. Two types of conformational sampling are performed: optimization of side chain rotamers (called “packing” in Rosetta) ²¹ and gradient‐based minimization of backbone and side chain torsion angles. During packing, a Monte Carlo protocol is used to swap in alternative rotamers one‐by‐one and accept changes that lower the calculated energy of the protein. Gradient‐based minimization involves calculating the forces on each atom (according to the Rosetta energy function) and relieving those forces by making small adjustments to atom positions.
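
The packing step described above can be sketched as a toy Monte Carlo search. This is a simplified illustration under stated assumptions: rotamers are represented as integer indices, the energy function is a made‐up quadratic, and only energy‐lowering swaps are accepted, as in the description above. The function and variable names are hypothetical, and the packer in Rosetta is considerably more elaborate.

```python
import random
from typing import Callable, List, Sequence


def pack_rotamers(
    n_rotamers: Sequence[int],
    energy: Callable[[List[int]], float],
    n_steps: int = 2000,
    seed: int = 0,
) -> List[int]:
    """Swap one rotamer at a time and keep swaps that lower the energy."""
    rng = random.Random(seed)
    state = [0] * len(n_rotamers)          # start from rotamer 0 everywhere
    current = energy(state)
    for _ in range(n_steps):
        pos = rng.randrange(len(n_rotamers))        # pick a random position
        trial = state.copy()
        trial[pos] = rng.randrange(n_rotamers[pos])  # pick a random rotamer
        trial_energy = energy(trial)
        if trial_energy < current:                   # accept only improvements
            state, current = trial, trial_energy
    return state


# Toy energy: each position has a hypothetical "ideal" rotamer index.
IDEAL = [2, 0, 3]


def toy_energy(rotamers: List[int]) -> float:
    return float(sum((r - t) ** 2 for r, t in zip(rotamers, IDEAL)))


packed = pack_rotamers([4, 4, 4], toy_energy)
print(packed, toy_energy(packed))
```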

Two minimization approaches, AtomTree and Cartesian, are used by the ssm_app and the clusters_app. ²² During AtomTree minimization the protein is represented in internal coordinates and explicit perturbations to the backbone and sidechain dihedral angles are propagated to all downstream atoms. During Cartesian minimization, the forces on each atom are considered independently and structural perturbations do not propagate along the protein chain. In general, AtomTree minimization progresses more rapidly as fewer degrees of freedom are explored; however, the full atomic flexibility of Cartesian minimization often reaches a lower energy state. A significant improvement in efficiency for reaching a low‐energy state was made within the Rosetta community by applying multiple cycles of repacking and minimization that begin with significantly reduced van der Waals repulsive forces, which are gradually increased until they approximate physical interactions. ²³ , ²⁴ This process is referred to as “FastRelax,” or “FastDesign” if the amino acid sequence is allowed to change.
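
The ramping idea behind FastRelax can be outlined as a loop over repulsive scale factors. The sketch below is schematic: the scale values, the cycle count, and the `repack` and `minimize` callables are assumptions introduced for illustration and do not reproduce the Rosetta implementation.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Model = TypeVar("Model")


def ramped_relax(
    model: Model,
    repack: Callable[[Model, float], Model],
    minimize: Callable[[Model, float], Model],
    rep_scales: Sequence[float] = (0.02, 0.25, 0.55, 1.0),  # illustrative values
    n_cycles: int = 5,                                      # illustrative value
) -> Model:
    """Alternate repacking and minimization while ramping up repulsion."""
    for _ in range(n_cycles):
        for scale in rep_scales:
            # Soft repulsion early in each cycle lets atoms pass minor
            # clashes; full repulsion at the end restores realistic sterics.
            model = repack(model, scale)
            model = minimize(model, scale)
    return model


# Stand-in callables that only record the order of operations.
call_log: List[Tuple[str, float]] = []


def fake_repack(model: str, scale: float) -> str:
    call_log.append(("repack", scale))
    return model


def fake_minimize(model: str, scale: float) -> str:
    call_log.append(("minimize", scale))
    return model


relaxed = ramped_relax("input_model", fake_repack, fake_minimize, n_cycles=2)
print(call_log[:4])
```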

Constraints restrict differences between the starting structure and the simulated model. Two forms of constraints are applied within both webtools: coordinate constraints and MoveMapFactories. A coordinate constraint penalizes atomic movement away from a starting position and scales according to a harmonic potential. MoveMapFactories prevent Rosetta from perturbing torsion angles within selected residues during minimization, thereby limiting both computational time and variations in score between replicates. Although the MoveMap prevents Rosetta from manipulating the degrees of freedom within a selection, restricted atoms can move due to lever‐arm effects that occur when the positions of connected residues are adjusted unless coordinate constraints are also applied.
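
A harmonic coordinate constraint can be sketched as a penalty that grows with the square of the displacement from the starting position. The functional form `(d / sd) ** 2` and the two `sd` values below are assumptions used for illustration; in this sketch a smaller `sd` corresponds to a stronger constraint.

```python
import math
from typing import Tuple

Coordinate = Tuple[float, float, float]


def harmonic_penalty(start: Coordinate, current: Coordinate, sd: float = 1.0) -> float:
    """Penalty for moving an atom away from its starting position.

    The penalty scales with the squared displacement; `sd` sets the width of
    the harmonic well (hypothetical parameter name for this sketch).
    """
    displacement = math.dist(start, current)
    return (displacement / sd) ** 2


start_xyz = (0.0, 0.0, 0.0)
moved_xyz = (3.0, 4.0, 0.0)   # displaced by 5 Å

strong = harmonic_penalty(start_xyz, moved_xyz, sd=1.0)  # narrow well
weak = harmonic_penalty(start_xyz, moved_xyz, sd=5.0)    # wide well
print(strong, weak)
```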

2.2. Specific protocols

The webtools presented here focus conformational sampling on the region of the protein that is being mutated (Figure 1). These constraints reduce computational time and variability in the energy calculations that arise from sampling alternative conformations distal from the site of mutation. Although a fixed backbone approach such as that utilized within the FoldX platform ²⁵ would reduce computational time further, a complete lack of flexibility prevents substitutions that are experimentally valid but clash without minor adjustments (Figure 1f). The simulations are performed with RosettaScripts (the annotated scripts are provided in Supporting Information S1 and S2). The following represents a detailed explanation of how both design protocols function, with differences described inline and summarized in Table 1.

FIGURE 1. The simulations allow local backbone flexibility around the designed residues. (a–d) Depiction of residue selections within the point mutation (a/c) and mutation cluster (b/d) protocols. Each circle on the line represents an amino acid position. The designable residue(s) are depicted in orange. Amino acids that cannot be mutated are divided into those without constraints (purple), those with weak backbone constraints (blue), and those with both strong backbone constraints and MoveMap restrictions (gray). (e) The amount of flexibility around a selected residue is depicted by aligning the best scoring model for each amino acid substitution at position 342 of Protein M. (f) A model of protein M with the A342V mutation from a simulation with constraints as described in Figure 2a (gray) is compared to a model generated with a fixed backbone (purple). In the rigid backbone simulation (purple), the mutation is incorrectly identified as unfavorable because of a steric clash with a neighboring valine. In the flexible backbone simulation, a small conformational change in the backbone relieves the clash. The experimentally derived T_m for this mutation was 2.8°C higher than WT.

TABLE 1.

Comparison of conformational sampling in the point mutant (PM) and mutation cluster (MC) protocols

	PM	MC
Minimization method	AtomTree	Cartesian
Dihedrals fixed	>10 Å from residue	>10 Å from seed
Strong backbone constraints	>10 Å from residue	>10 Å from seed
Weak backbone constraints	<10 Å from residue and not primary sequence neighbors	7–10 Å from seed
Unconstrained	Residue + primary sequence neighbors (n = 3)	<7 Å from seed
Designable residues	Residue (n = 1)	Cα < 7 Å from seed
# independent trajectories (nstruct)	5	10
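
The distance shells listed for the MC protocol in Table 1 can be expressed as a small classification function. This sketch assumes that distances are measured between Cα atoms and that the boundaries are handled as shown; both choices, and the zone labels, are illustrative assumptions rather than details taken from the RosettaScripts.

```python
import math
from typing import Tuple

Coordinate = Tuple[float, float, float]


def classify_mc_zone(
    seed_ca: Coordinate,
    residue_ca: Coordinate,
    inner: float = 7.0,    # Å, designable and unconstrained shell (Table 1)
    outer: float = 10.0,   # Å, limit of the weakly constrained shell (Table 1)
) -> str:
    """Assign a residue to a constraint zone by its distance from the seed."""
    distance = math.dist(seed_ca, residue_ca)
    if distance < inner:
        return "designable_unconstrained"
    if distance <= outer:
        return "weak_backbone_constraints"
    return "strong_constraints_fixed_dihedrals"


seed = (0.0, 0.0, 0.0)
zones = [classify_mc_zone(seed, (x, 0.0, 0.0)) for x in (5.0, 8.0, 12.0)]
print(zones)
```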

2.2.1. Step 1: Relax

Minor adjustments in the atomic positions of a crystal structure yield significant improvements in its calculated energy. Therefore, input structures are “relaxed” to a low energy state prior to mutagenesis to reduce noise during analysis that would occur between independent simulations. This process does not generally move atoms far but significantly reduces the calculated energy of the structure. Due to the computational cost of evaluating models during combinatorial mutagenesis, the Cluster (MC) protocol utilizes AtomTree minimization. In contrast, the SSM (PM) protocol begins with a round of AtomTree minimization before switching to Cartesian, which was shown to reach lower energy states than using either technique alone. ²² Backbone atoms (Cα‐only) are constrained to their position in the original crystal structure (coordinate constraints) for all relax steps.

2.2.2. Step 2: Mutate

A defining feature of the protocols is the constraints that are applied (Figure 1). Amino acids further than 10 Å away from the selected position are restricted by both a MoveMapFactory (i.e., torsion angles are fixed) and strong coordinate constraints. Amino acids closest to the target are free of constraints and residues between these two selections encounter weak coordinate constraints, although the selection of residues differs between the SSM and Cluster protocols. The Cluster protocol includes an option to apply a bonus to the energy for the native amino acid at each residue position, thereby limiting the number of mutations in the final models.

The PM protocol mutates the selected residue, applies coordinate constraints, and performs FastRelax (repacking/minimization). In contrast, mutations within the Cluster script are not explicitly defined. Instead, a 7 Å zone of designable residues around a user‐specified target is stochastically evaluated during the repacking/minimization procedure (FastDesign). In both protocols, additional rotamers are considered during repacking to increase sampling of the first and second side chain torsion angles (chi1 and chi2), and the initial rotamer from the crystal structure is also included. Due to the larger search space of the Cluster protocol compared to that of PM, the number of independent trajectories is doubled from 5 to 10. Only the best scoring model is considered during analysis.

To assess the favorability of mutant sequences it is necessary to compare to a calculated energy for the native protein. When making this comparison it is important that the native sequence/structure be energy minimized with the same constraints that were applied when performing the mutant calculation. For each mutant that is analyzed an independent simulation is performed with the native sequence in which the constraints are set to match the constraints being used in the mutant simulation.
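
The comparison described above amounts to taking the best (lowest) score among the independent trajectories for the mutant and for the matched native simulation, then subtracting. The scores below are made‐up numbers used only to show the arithmetic, and the function name is hypothetical.

```python
from typing import Sequence


def delta_e(mutant_scores: Sequence[float], native_scores: Sequence[float]) -> float:
    """Best mutant score minus best native score (negative is favorable).

    Both lists are assumed to come from simulations run with identical
    constraints, as required for a fair comparison.
    """
    return min(mutant_scores) - min(native_scores)


# Hypothetical scores (REU) from five independent trajectories each.
mutant_trajectories = [-300.2, -301.5, -299.8, -300.9, -301.1]
native_trajectories = [-300.0, -300.4, -299.9, -300.1, -300.3]

example_delta = delta_e(mutant_trajectories, native_trajectories)
print(f"delta E = {example_delta:.2f} REU")
```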

3. USER INTERFACE

The prediction of stabilizing mutations was automated by creating two webtools that mutate either individual or multiple amino acids. For users who expect to run many simulations and have access to a local computing cluster, Python scripts for generating input and analyzing results via the command line were deposited in the Rosetta repository (rosetta_scripts_scripts/scripts/public/stabilize_proteins_pm_mc/) and additional flags for job submission are described at the top of the file. This folder also contains a tutorial for both protocols.

3.1. Getting started

Each application requires only two inputs: a protein structure in PDB format and a selection of residues to mutate. To improve simulation efficiency, large structures should be reduced to the minimum size possible by editing the input PDB file. For example, stabilizing a variable domain from an antibody does not require modeling of the Fc region. By default, the protocols will process the file to remove non‐peptide residues (heteroatoms) and allow all amino acid substitutions except for cysteine. Although these settings will address most situations, advanced options are available to provide additional functionality (Appendix A.1). These options include a field for indicating ligands (heteroatoms) that should be retained during the simulation and another for designating which amino acid substitutions to consider during mutagenesis. Additionally, both protocols include an advanced option to maintain sequence symmetry across homomeric proteins and an option to upload a pre‐relaxed structure to save computational time. The webtool for mutation clusters includes two options that are not required by the point mutation application. The first is a method for limiting the number of mutations in the final model by applying a bonus to the native residue score during the design process. The second is a field to designate residues that should not mutate, which is useful for maintaining protein function during stabilization (e.g., to avoid modifying a binding site).
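
Reducing an input structure before submission can be done with a few lines of scripting. The sketch below is one possible approach, not part of the webtools: it keeps ATOM records for selected chains and drops heteroatoms unless their residue name is explicitly listed. The function names, chain identifiers, and demonstration records are hypothetical, and other record types (e.g., TER, END) are simply discarded.

```python
from typing import AbstractSet, Iterable, List


def trim_pdb(
    lines: Iterable[str],
    keep_chains: AbstractSet[str],
    keep_ligands: AbstractSet[str] = frozenset(),
) -> List[str]:
    """Keep ATOM records of selected chains and, optionally, named ligands."""
    kept = []
    for line in lines:
        record = line[:6].strip()          # PDB columns 1-6: record name
        if record not in ("ATOM", "HETATM") or len(line) < 22:
            continue
        chain = line[21]                   # PDB column 22: chain identifier
        resname = line[17:20].strip()      # PDB columns 18-20: residue name
        if chain not in keep_chains:
            continue
        if record == "ATOM" or resname in keep_ligands:
            kept.append(line)
    return kept


def _pdb_line(record: str, serial: int, atom: str, resname: str, chain: str, resseq: int) -> str:
    # Hypothetical helper: pads fields to the fixed PDB columns read above.
    return f"{record:<6}{serial:>5} {atom:<4} {resname:>3} {chain}{resseq:>4}"


demo_lines = [
    _pdb_line("ATOM", 1, "CA", "ALA", "H", 1),     # chain to keep
    _pdb_line("ATOM", 2, "CA", "GLY", "C", 301),   # chain to discard
    _pdb_line("HETATM", 3, "O", "HOH", "H", 401),  # water, removed by default
    _pdb_line("HETATM", 4, "C1", "NAG", "H", 402), # ligand, kept on request
]

protein_only = trim_pdb(demo_lines, keep_chains={"H"})
with_ligand = trim_pdb(demo_lines, keep_chains={"H"}, keep_ligands={"NAG"})
print(len(protein_only), len(with_ligand))
```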

Selecting sites for mutagenesis will be system‐dependent and require an understanding of how the protein functions. Our lab performs the point mutation scan on every position that is not expected to affect function (site saturation mutagenesis) and selects seeds for mutant clusters spread across the entire protein. Anecdotally, the most stabilizing variants fall into two categories: filling under‐packed regions within the protein core and forming interactions with residues that are distal in primary sequence (high contact order). A separate tool was previously created for identifying under‐packed regions ²⁶ ; however, a collection of seeds that sample the full structure is usually chosen by eye with a molecular viewer since improved packing will be indicated by favorable scores after the simulations are complete.

3.2. Analyzing results

Both webtools produce score tables and models of the mutant structures in CSV and PDB formats, respectively. The point mutation application also provides a heatmap depicting the change in calculated energy (ΔE) for each substitution (Figure 2a), and an additional table in the same format. The cluster application additionally produces a file containing the amino acid sequence for each model in FASTA format. The score file for mutation clusters includes a column that indicates the number of mutations and another that normalizes the score by this value, which can be useful for removing constructs that score well only because they contain many mutations that each contribute marginal improvements to stability.
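
The per‐mutation normalization can be reproduced from the FASTA output and the score table with a short calculation. The sequences and ΔE value below are invented for illustration, and the function names are hypothetical.

```python
def count_mutations(native: str, design: str) -> int:
    """Count positions at which the designed sequence differs from native."""
    if len(native) != len(design):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(native, design))


def score_per_mutation(delta_e: float, n_mutations: int) -> float:
    """Normalize a score difference by the number of mutations."""
    if n_mutations == 0:
        return 0.0   # the native sequence has no mutations to normalize by
    return delta_e / n_mutations


# Hypothetical native and designed sequences with a made-up score.
native_seq = "MKTAYIAK"
design_seq = "MKTVYIAR"

n_mut = count_mutations(native_seq, design_seq)
normalized = score_per_mutation(-3.0, n_mut)
print(n_mut, normalized)
```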

FIGURE 2. Results from the simulations. (a) The point mutation webtool produces a heatmap that depicts the difference in score between the native residue and each substitution for a given position. Only scores for favorable mutations are printed. Scores greater than +1 are assigned the same navy color. (b) Both tools provide a model for the mutated structure. This example depicts a mutation cluster that filled a void within the protein core (experimental ΔT_m = 2°C). The WT protein is colored in grayscale and the recommended substitutions are overlaid in transparent purple.

The ΔE values are reported in Rosetta Energy Units (REU), which have been parameterized against thermodynamic benchmarks to be on a similar scale as kcal/mol. ¹⁸ While the calculated ΔE values are effective for removing destabilizing mutations from consideration, they are not directly correlated with thermostability improvements when only considering the best scoring variants. Scores that differ by 0.5 REU or less per mutation are considered equivalent to one another. Models that produce the best (most negative) ΔE are visually inspected (checked for loss of a hydrogen bond partner) and selected for evaluation in vitro. When many different mutations are predicted to be more favorable than the native residue, only one or two are tested since the final goal is to combine the most stabilizing mutations into a single construct.
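
The selection logic described above (discard unfavorable substitutions, then keep only one or two per position) can be sketched as follows. The ΔE values are invented, the function name is hypothetical, and the visual inspection step is not represented.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def select_candidates(
    delta_e: Dict[Tuple[int, str], float],
    max_per_position: int = 2,
) -> List[Tuple[int, str, float]]:
    """Keep favorable substitutions, limited to a few per position."""
    by_position = defaultdict(list)
    for (position, amino_acid), value in delta_e.items():
        if value < 0:                       # favorable relative to native
            by_position[position].append((value, amino_acid))
    picks = []
    for position, entries in by_position.items():
        for value, amino_acid in sorted(entries)[:max_per_position]:
            picks.append((position, amino_acid, value))
    return sorted(picks, key=lambda pick: pick[2])   # most negative first


# Hypothetical scan results: (position, substitution) -> delta E in REU.
scan = {
    (10, "V"): -1.8,
    (10, "I"): -1.6,
    (10, "L"): -0.9,
    (25, "E"): 0.4,
    (31, "W"): -2.3,
}
candidates = select_candidates(scan)
print(candidates)
```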

Our lab performs medium‐throughput expression and purification of ~20 variants with 24‐well plates followed by rapid T_m determination with NanoDSF. After performing a functional assay, the most stabilizing mutations are then combined for maximum enhancement. The final T_m for these combinations can usually be predicted by the cumulative T_m improvements observed experimentally for each individual construct (Figure 3). However, proximal mutations may be mutually exclusive due to interactions with a common residue. In this situation, the model of the mutant from the first round of simulations can be provided as input for a second round to evaluate the suitability of a specific combination.

FIGURE 3. These protocols predicted stabilizing mutations for three distinct proteins. (a) Differences in thermal stabilities compared to WT for mutations predicted with either the point mutation (PM, black) or mutation cluster (MC, red) protocols were determined with NanoDSF for protein M (M), a T‐cell receptor (TCR), and the Dengue E Protein Dimer (DED, T_m1). A positive value represents an improvement in stability. (b) The best mutation cluster for M introduced three mutations and improved the T_m by 7.4°C; the model is shown. (c) The best mutation cluster for DED introduced two mutations per monomer (four total) and improved the T_m by 11°C; the model is shown. (d, e) Each point represents a combination of stabilizing mutations from panel a for either protein M (panel d) or DED (panel e; black and purple represent data for T_m1 and T_m2, which indicate unfolding of the dimer and monomer, respectively). The additive ΔT_m represents the expected improvement in stability from combining stabilizing mutations from panel a. The actual ΔT_m represents the experimentally observed value. The dashed line represents an idealized linear relationship between the two axes.

4. RESULTS AND DISCUSSION

Predicting stabilizing amino acid substitutions is useful in a variety of biochemical contexts (Appendix), and we have previously applied these protocols within three systems of medical importance (Figure 3). The first focused on the constant domains of recombinant T cell receptors (TCRs), which demonstrate uneven expression and poor thermostability. ¹⁷ The stabilized construct boosted expression 10‐fold, which allowed for the generation of bispecific molecules that include an anti‐CD3 component for initiating T‐cell mediated killing of cells recognized by the stabilized TCR. The second study targeted the dengue envelope protein homodimer, which is unstable at physiological temperatures and was therefore unsuitable for vaccine applications. The T_m values of the dimer and monomer were improved by 18 and 22°C, respectively, and the stabilized protein displayed an improved antibody response in a mouse model. ¹⁶ , ²⁷ The final project involved a protein from mycoplasma (Protein M) that binds to antibodies and interferes with antigen recognition. ²⁸ Stabilization reduced aggregation in vitro and facilitated in vivo applications. This therapeutic enhanced gene therapy in mice by temporarily blocking antibodies that neutralize viral delivery.

Each of these proteins was limited to shallow multiple sequence alignments and was therefore most suitable for a structure‐guided approach to mutagenesis. When the experimental melting temperature (T_m) was assessed for 50 point mutations across the three systems, 50% improved stability and 14% did not affect stability. In contrast, random mutagenesis yields stabilizing mutations at a rate of ~2%. ²⁹ Combining the most stabilizing mutations enhanced stability in a predominantly additive fashion (Figure 3d,e) and yielded final constructs that improved T_m by at least 20°C for each system. Each of these stabilized constructs has enabled new avenues for research, including therapeutic applications.

Proteins with low thermostability exhibit reduced protein yields, which severely limits applications in non‐native environments. The webtools presented here balance accuracy with speed to predict stabilizing amino acid substitutions, either as single‐point mutations or mutation clusters. They are equally useful for biochemical studies in which specific residues must be replaced (e.g., removing a glycosylation motif or eliminating a binding site). Aside from predicting novel mutations, the tools can also be used to interpret directed evolution studies and deep mutational scans by comparing them to the calculated energies. Despite their simplicity of execution, these tools are customizable for most circumstances, including the ability to specify amino acid substitutions and maintain sequence symmetry for homomeric proteins. Although the tools are intended for simulations containing only proteins, Rosetta is parameterized for many other molecules (e.g., RNA ³⁰ and carbohydrates ³¹ ), which can be retained as ligands within the advanced options. We hope that democratizing these protein engineering techniques will enable scientists outside of the field to advance their research in unforeseen ways.

4.1. Software accessibility

Both the Rosetta molecular modeling program and ROSIE webtools are freely available for academic use and can be accessed at the following links: Rosetta: https://www.rosettacommons.org/software; ROSIE2 PM: https://r2.graylab.jhu.edu/apps/submit/stabilize-pm; ROSIE2 MC: https://r2.graylab.jhu.edu/apps/submit/stabilize-cluster.

AUTHOR CONTRIBUTIONS

David F. Thieker: Conceptualization (lead); methodology (lead); software (lead); visualization (lead); writing – original draft (lead); writing – review and editing (equal). Jack B. Maguire: Conceptualization (supporting); methodology (supporting); software (supporting). Stephan T. Kudlacek: Conceptualization (supporting); methodology (supporting). Andrew Leaver‐Fay: Software (supporting). Sergey Lyskov: Software (equal). Brian Kuhlman: Conceptualization (equal); supervision (lead); writing – review and editing (equal).

FUNDING INFORMATION

This work was supported by NIH grant R35GM131923 (Brian Kuhlman). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

CONFLICT OF INTEREST

The authors declare that they have no conflicts of interest with the contents of this article.

Supporting information

Appendix S1 RosettaScript for performing mutation cluster protocol locally.

Appendix S2 RosettaScript for performing point mutation protocol locally.

ACKNOWLEDGEMENT

The authors thank Hayretin Yumerefendi for advice regarding the logic behind MoveMap constraints within the Rosetta protocols.

Examples of biochemical challenges

The following examples illustrate potential applications of the SSM (point mutation) and Cluster (mutation cluster) tools. Complete tutorials are provided in the documentation pages on the webserver, including commands for selecting residues within PyMOL.

Stabilize proteins while maintaining function: Proteins are most useful because of their underlying function, whether that is binding to signaling partners or catalyzing a reaction. Maintaining that function during the design process is critical and therefore these tools allow for the selection of residues that should not be targeted for mutagenesis. For example, Protein M directly interacts with antibody light chains for neutralization. To stabilize the protein without interfering with the binding site, only residues further than 5 Å from the binding partner were considered during mutagenesis. It is important to consider any allosteric changes that may be necessary for binding/function which would not be accounted for by this distance‐based approach for selecting designable residues.
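
A distance‐based selection of this kind can be sketched as follows. The sketch assumes that the distance is measured between the closest pair of atoms in the residue and the binding partner, which is an assumption of this example; the coordinates, residue numbers, and function name are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coordinate = Tuple[float, float, float]


def residues_outside_interface(
    target_atoms: Dict[int, Sequence[Coordinate]],
    partner_atoms: Sequence[Coordinate],
    cutoff: float = 5.0,
) -> List[int]:
    """Return residues whose closest atom is farther than `cutoff` Å
    from every atom of the binding partner."""
    selected = []
    for residue_id, coords in target_atoms.items():
        nearest = min(math.dist(a, b) for a in coords for b in partner_atoms)
        if nearest > cutoff:
            selected.append(residue_id)
    return sorted(selected)


# Toy coordinates: residue number -> atom positions (Å).
toy_target = {
    1: [(0.0, 0.0, 0.0)],
    2: [(4.0, 0.0, 0.0)],
    3: [(12.0, 0.0, 0.0)],
}
toy_partner = [(8.0, 0.0, 0.0)]

allowed = residues_outside_interface(toy_target, toy_partner)
print(allowed)
```

As noted above, a purely geometric filter does not account for allosteric effects, so the resulting list should be reviewed against what is known about the protein's function.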

Safely interrogate functional role of specific amino acids: Testing hypotheses about protein function often requires replacing specific residues, whether evaluating catalytic sites in an enzyme or the importance of hydrogen bonds across a protein interface. The SSM protocol provides information on the effects of each amino acid and can guide the choice of suitable replacements. Identification of positions at the binding site that are critical for stability according to the SSM heatmap may also suggest an evolutionary constraint for maintaining function. For example, a conserved Arg residue within protein M forms internal hydrogen bonds to the protein M backbone and an intermolecular hydrogen bond to a conserved Glu on antibody light chains. Both protocols allow residue types to be specified during design; for example, only substituting hydrophobic residues or introducing a charge‐swap by restricting substitutions to either positively or negatively charged residues.

Remove errant posttranslational modifications: Recombinant expression of bacterial proteins in mammalian cells may lead to unintended posttranslational modifications that impair function. For example, the expression of protein M in HEK293 cells produced soluble, but nonfunctional, protein due to N‐glycosylation sites. N‐glycans are predominantly limited to a well‐recognized consensus sequence of Asn‐X‐Ser/Thr for attachment and protein M contains three predicted sites. Each Asn and Ser/Thr residue within these sites was targeted for mutagenesis with the SSM protocol, and the most favorable substitutions at each glycosylation site were combined.
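
Candidate positions for this kind of scan can be located by searching a sequence for the Asn‐X‐Ser/Thr motif. The sketch below uses a made‐up sequence and a hypothetical function name. The optional `exclude_proline` flag reflects the commonly cited rule that X is not proline, which is not stated in the text above and is therefore off by default.

```python
import re
from typing import List, Tuple


def find_sequons(sequence: str, exclude_proline: bool = False) -> List[Tuple[int, str]]:
    """Return (1-based Asn position, motif) for each Asn-X-Ser/Thr match."""
    middle = "[^P]" if exclude_proline else "."
    # A lookahead is used so that overlapping motifs are all reported.
    pattern = re.compile(rf"(?=(N{middle}[ST]))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence used only to demonstrate the search.
toy_sequence = "MKNATLLNPSGANNTS"

all_sites = find_sequons(toy_sequence)
no_proline_sites = find_sequons(toy_sequence, exclude_proline=True)
print(all_sites)
print(no_proline_sites)
```

For each reported site, the Asn at position i and the Ser/Thr at position i + 2 are the residues that would be submitted to the point mutation scan.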

Stabilize homomeric proteins: Homomeric proteins are commonly found in nature and represent an additional layer of complexity due to the need for propagating mutations across matching chains during design. For example, the dengue virus contains a homodimer that represents a promising antigen for vaccine development, but the native protein is monomeric when outside of the capsid. Both design tools described here contain options for maintaining sequence symmetry during mutagenesis and successfully stabilized the dengue dimer.

Thieker DF, Maguire JB, Kudlacek ST, Leaver‐Fay A, Lyskov S, Kuhlman B. Stabilizing proteins, simplified: A Rosetta‐based webtool for predicting favorable mutations. Protein Science. 2022;31(10):e4428. 10.1002/pro.4428

Review Editor: Nir Ben‐Tal

Funding information National Institute of General Medical Sciences, Grant/Award Number: R35GM131923

DATA AVAILABILITY STATEMENT

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

REFERENCES

1. Kinch MS. An overview of FDA‐approved biologics medicines. Drug Discov Today. 2015;20:393–398.
2. Hennigan JN, Lynch MD. The past, present, and future of enzyme‐based therapies. Drug Discov Today. 2021;1:117–133.
3. Roodveldt C, Aharoni A, Tawfik DS. Directed evolution of proteins for heterologous expression and stability. Curr Opin Struct Biol. 2005;15:50–56.
4. Goldenzweig A, Fleishman SJ. Principles of protein stability and their application in computational design. Annu Rev Biochem. 2018;87:105–129.
5. Jones BJ, Kan CNE, Luo C, Kazlauskas RJ. Consensus finder web tool to predict stabilizing substitutions in proteins. Enzyme Eng Evol Gen Methods. 2020;643:129.
6. Leman JK, Weitzner BD, Lewis SM, et al. Macromolecular modeling and design in Rosetta: Recent methods and frameworks. Nat Methods. 2020;17:665–680.
7. Yin S, Ding F, Dokholyan NV. Eris: An automated estimator of protein stability. Nat Methods. 2007;4:466–467.
8. Parthiban V, Gromiha MM, Schomburg D. CUPSAT: Prediction of protein stability upon point mutations. Nucleic Acids Res. 2006;34:W239–W242.
9. Musil M, Stourac J, Bendl J, et al. FireProt: Web server for automated design of thermostable proteins. Nucleic Acids Res. 2017;45:W393–W399.
10. Weinstein JJ, Goldenzweig A, Hoch S, Fleishman SJ. PROSS 2: A new server for the design of stable and highly expressed protein variants. Bioinformatics. 2021;37:123–125.
11. Campeotto I, Goldenzweig A, Davey J, et al. One‐step design of a stable variant of the malaria invasion protein RH5 for use as a vaccine immunogen. Proc Natl Acad Sci U S A. 2017;114:998–1002.
12. Cramer P. AlphaFold2 and the future of structural biology. Nat Struct Mol Biol. 2021;28:704–705.
13. Du Z, Su H, Wang W, et al. The trRosetta server for fast and accurate protein structure prediction. Nat Protoc. 2021;16:5634–5651.
14. Barber‐Zucker S, Mindel V, Garcia‐Ruiz E, Weinstein JJ, Alcalde M, Fleishman SJ. Stable and functionally diverse versatile peroxidases designed directly from sequences. J Am Chem Soc. 2022;144:3564–3571.
15. Lyskov S, Chou F‐C, Conchúir SÓ, et al. Serverification of molecular modeling applications: The Rosetta online server that includes everyone (ROSIE). PLoS One. 2013;8:e63906.
16. Kudlacek ST, Metz S, Thiono D, et al. Designed, highly expressing, thermostable dengue virus 2 envelope protein dimers elicit quaternary epitope antibodies. Sci Adv. 2021;7:eabg4084.
17. Froning K, Maguire J, Sereno A, et al. Computational stabilization of T cell receptors allows pairing with antibodies to form bispecifics. Nat Commun. 2020;11:2330.
18. Alford RF, Leaver‐Fay A, Jeliazkov JR, et al. The Rosetta all‐atom energy function for macromolecular modeling and design. J Chem Theory Comput. 2017;13:3031–3048.
19. Park H, Bradley P, Greisen P Jr, et al. Simultaneous optimization of biomolecular energy functions on features from small molecules and macromolecules. J Chem Theory Comput. 2016;12:6201–6212.
20. O'Meara MJ, Leaver‐Fay A, Tyka MD, et al. Combined covalent‐electrostatic model of hydrogen bonding improves structure prediction with Rosetta. J Chem Theory Comput. 2015;11:609–622.
21. Kuhlman B, Baker D. Native protein sequences are close to optimal for their structures. Proc Natl Acad Sci U S A. 2000;97:10383–10388.
22. Conway P, Tyka MD, DiMaio F, Konerding DE, Baker D. Relaxation of backbone bond geometry improves protein energy landscape modeling. Protein Sci. 2014;23:47–55. [DOI] [PMC free article] [PubMed] [Google Scholar]
23. Khatib F, Cooper S, Tyka MD, et al. Algorithm discovery by protein folding game players. Proc Natl Acad Sci U S A. 2011;108:18949–18953. [DOI] [PMC free article] [PubMed] [Google Scholar]
24. Maguire JB, Haddox HK, Strickland D, et al. Perturbing the energy landscape for improved packing during computational protein design. Proteins Struct Funct Bioinformatics. 2021;89:436–449. [DOI] [PMC free article] [PubMed] [Google Scholar]
25. Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L. The FoldX web server: An online force field. Nucleic Acids Res. 2005;33:W382–W388. [DOI] [PMC free article] [PubMed] [Google Scholar]
26. Sheffler W, Baker D. RosettaHoles2: A volumetric packing measure for protein structure refinement and validation. Protein Sci. 2010;19:1991–1995. [DOI] [PMC free article] [PubMed] [Google Scholar]
27. Kudlacek ST, Premkumar L, Metz SW, et al. Physiological temperatures reduce dimerization of dengue and Zika virus recombinant envelope proteins. J Biol Chem. 2018;293:8922–8933. [DOI] [PMC free article] [PubMed] [Google Scholar]
28. Li, C. , Askew, C. , Kuhlman, B. , and Thieker, D. (2021) Compositions and methods for binding antibodies and inhibiting neutralizing antibodies.
29. Broom A, Jacobi Z, Trainor K, Meiering EM. Computational tools help improve protein stability but with a solubility tradeoff. J Biol Chem. 2017;292:14349–14361. [DOI] [PMC free article] [PubMed] [Google Scholar]
30. Watkins AM, Rangan R, Das R. FARFAR2: Improved de novo Rosetta prediction of complex global RNA folds. Structure. 2020;28:963–976. [DOI] [PMC free article] [PubMed] [Google Scholar]
31. Labonte JW, Adolf‐Bryfogle J, Schief WR, Gray JJ. Residue‐centric modeling and design of saccharide and glycoconjugate structures. J Comput Chem. 2017;38:276–287. [DOI] [PMC free article] [PubMed] [Google Scholar]

Supplementary Materials

Appendix S1. RosettaScript for performing the mutation cluster protocol locally.

Appendix S2. RosettaScript for performing the point mutation protocol locally.

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.
