Abstract
Protein unfolding is modeled as an ensemble of pathways, where each step in each pathway is the addition of one topologically possible conformational degree of freedom. Starting with a known protein structure, GeoFold hierarchically partitions (cuts) the native structure into substructures using revolute joints and translations. The energy of each cut and its activation barrier are calculated using buried solvent accessible surface area, side chain entropy, hydrogen bonding, buried cavities, and backbone degrees of freedom. A directed acyclic graph is constructed from the cuts, representing a network of simultaneous equilibria. Finite difference simulations on this graph simulate native unfolding pathways. Experimentally observed changes in the unfolding rates for disulfide mutants of barnase, T4 lysozyme, dihydrofolate reductase, and factor for inversion stimulation were qualitatively reproduced in these simulations. Detailed unfolding pathways for each case explain the effects of changes in the chain topology on the folding energy landscape. GeoFold is a useful tool for the inference of the effects of disulfide engineering on the energy landscape of protein unfolding.
Keywords: topology, pathways, kinetic stability, DHFR, barnase, lysozyme, FIS, configurational entropy
INTRODUCTION
The configurational space of a polypeptide chain is astronomically large, yet the folding of most proteins is completed within a fraction of a second. This paradoxical observation strongly suggests that well-defined folding pathways exist, dictated by energetic interactions and topological constraints. Since the folding pathway is the inverse of the unfolding pathway, a mechanistic model for the unfolding pathway would be a helpful step towards understanding folding pathways and developing a predictive model for folding.
The folding and unfolding sides of the same energy landscape present distinct modeling issues. Easily computed structural bioinformatic models can predict the folding rate, and with it the height of the energetic barrier. The relative contact order1, a simple calculation based on the native structure, is highly anti-correlated to and predictive of the folding rate. However, the unfolding rate depends on topology of the backbone, the oligomerization state, the presence of bound ligands, local sequence-structure propensities and the protein environment. Models of unfolding based on the native state statistics alone have had only modest success2.
At the other extreme of the modeling continuum are all-atom, explicit molecular dynamics simulations. These models are manageable for the unfolding process but not the folding process, because of the enormous search space of folding. For example, molecular dynamics simulations starting from the native state have produced realistic intermediate states of unfolding3-5 and approximate unfolding rates6 for small proteins. While a few small proteins have been successfully folded by explicit molecular dynamics4, 7-8, deriving folding rates would require numerous runs and is still too computationally challenging. It is possible to speed up the folding process by reducing the energy function to a simple Gō model9-10 and thereby get folding rates, but this oversimplification of the energy landscape creates its own inaccuracies9. In short, folding lends itself to statistical models but not to all-atom simulations, while unfolding lends itself to all-atom simulations but not to easily computed statistical models.
The challenge is to distill a simplified model of unfolding based on what we know about the mechanism. Along these lines, Xia et al have identified some of the structural characteristics that appear to correlate with high kinetic stability (very slow unfolding) in proteins11. They find that kinetically stable proteins tend to be of mixed secondary structure content (alpha/beta), rather than pure beta sheet or pure alpha helix. They often have a dimeric or high order assembly with the N and C termini buried in the multimer interface. In kinetically stable monomers, the termini are often tucked into the middle of a beta sheet. When not buried, chain termini are often observed wrapped around the protein like a belt or latch. The impression from structure gazing is that the situation of the chain termini somehow speaks to the unfolding rate. Therefore a simplified model must account for steric hindrance to unfolding in the native state and along the pathway.
The mechanistic model described here encodes a strictly tree-like unfolding pathway consistent with the “parsing perspective” of Dill12, the “block folding” model of Nussinov13, and Finkelstein’s “folding nuclei”14-15. Viewed in the folding direction, all steps are on-pathway, and each condensation step involves previously condensed substructures. We further restrict the pathway such that no intermediate can be formed if it requires the chain to pass through itself or if it forces an unlikely concerted motion of three or more parts, like the act of tying of a knot. Similar arguments may be found in Maity and Englander’s work on cytochrome c, describing folding events as the stepwise assembly of foldon units16. These basic principles are well established in the literature of folding theory and even taken for granted in some of the most successful algorithms for de novo structure prediction17-18.
Our program, GeoFold (Geometric unFolding), follows on the conceptual framework of our previous model, UNFOLD19. We model all steps in a pathway as two-body condensations. An ordered set of two-body condensations is a tree, and a given protein may have many such trees, comprising an ensemble of pathways. In UNFOLD, protein structures were reduced to weighted secondary structure element graphs. Contact energies were assigned to pairs of secondary structure elements using solvent accessible surface area and other terms. The graph was then hierarchically partitioned at each step, without regard for chain crossing. In nine case studies, the simple UNFOLD pathways were found to be predictive of phi-values and other experimental data on folding pathways. The new program GeoFold now accounts for chain crossing and contains energy terms that account for most of the known energetic components of protein stability. It carries out unfolding on a graph using a finite difference approach, producing simulated experimental data, whereas the previous method only produced a hierarchy of intermediate states.
Because of the absence of any off-pathway intermediates in our model, and because off-pathway states dominate the unfolded side of the energy landscape, the current model cannot predict folding rates, only unfolding rates.
In this paper, we demonstrate that a mechanistic model which accounts for steric interactions is sufficient to explain previously unexplained differences in stability and unfolding rate for four proteins with engineered disulfide linkages.
The new program and server will be useful to explore and anticipate the energetic consequences of protein engineering, and for deciphering the genetic roots of protein aggregation and amyloid formation20. As we will show, mutations affect the kinetic stability of a protein and the accumulation of intermediate states depending on how they perturb the protein folding pathway. A better understanding of the structural determinants of kinetic stability may lead to ways to improve the shelf life of proteins by design.
RESULTS
GeoFold first determines all geometrically possible pivot, hinge and break points (Figure 1) in a protein structure, splits that structure into non-overlapping substructures, then proceeds recursively until the substructures are fully unfolded. The series of splits comprises the unfolding pathways. Unfolding pathways are structured as directed acyclic graphs (DAGs) with bifurcating edges (one node connecting to two nodes) as shown in Figure 1(c). Each bifurcating edge represents an elemental unfolding step, or “cut”, that splits one substructure into two, using a pivot, hinge or break move as shown in Figure 1(a-b). Each node in the graph represents a native substructure, which is a spatially contiguous subset of the native protein structure. The forward and reverse rates of reaction of each elemental subsystem are calculated using the energy function defined in Methods. The combined set of all cuts, along with the associated energy functions, represents a system of simultaneous equilibria whose solution is a set of equilibrium concentrations. The equilibrium state can be found numerically using finite difference methods. The equilibrium state and the rate at which it is obtained are the values we compare to equilibrium stability and unfolding rate in this study.
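The graph structure described above can be illustrated with a toy sketch. It is not the GeoFold implementation: it assumes a single chain, treats every sequence position as an allowed pivot, and ignores geometry, hinges, breaks and energies. The function name `build_cut_graph` and the representation of a substructure as a half-open residue range are hypothetical choices made only for illustration.

```python
def build_cut_graph(n_residues, min_size=1):
    """Toy cut graph: map each substructure to its list of (u1, u2) cuts.

    A substructure is a half-open residue range (start, end).
    Assumption: every position is a legal pivot (no geometric test).
    """
    graph = {}

    def expand(node):
        if node in graph:          # already visited: pathways share this node
            return
        start, end = node
        cuts = [((start, i), (i, end))
                for i in range(start + min_size, end - min_size + 1)]
        graph[node] = cuts         # each cut is one bifurcating edge
        for u1, u2 in cuts:
            expand(u1)
            expand(u2)

    expand((0, n_residues))
    return graph


graph = build_cut_graph(4)
root_cuts = graph[(0, 4)]          # the three ways to split a 4-residue chain
```

Because a substructure such as (2, 4) is reached from more than one parent, the result is a directed acyclic graph rather than a tree, which is the property the text relies on when it describes an ensemble of pathways.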
To carry out a virtual unfolding experiment, the energy landscape is tilted from the folded state to the unfolded state by changing either the temperature (T) or the desolvation energy (ω). Finite difference calculations produce a time course of all substructure concentrations. For the purposes of easy analysis, we have grouped substructures together into three classes -- folded, unfolded and intermediate -- based on buried surface area. The time course of folded state concentrations from the finite difference simulations were fit to a simple exponential decay to give empirical unfolding rates (ku). It should be noted that complex, multi-phasic unfolding has been observed in these simulations, and this is noted where appropriate, but for most purposes we used a simple half-life analysis.
GeoFold unfolding pathways can be broadly characterized as follows. At high solvation energy (ω) the folded state is always the dominant state at equilibrium, while at low ω the unfolded state always dominates. The transition in ω from folded to unfolded is generally sigmoidal, characteristic of a cooperative process, but the degree of cooperativity varies. Both two-state and three state unfolding behaviors have been observed. In three-state systems, there are values of ω where intermediate states exist at equilibrium. The unfolding rate is estimated from the half-life of unfolding and is often roughly log-linear with ω, but sometimes shows a “roll-over” at very low ω as the kinetics approaches diffusion control. We observe curvature throughout the range of ω when more than two states predominate at equilibrium. The energy landscape of the unfolding pathway generally has a maximum in the middle, characteristic of a two-state system.
Initial studies of small monomeric proteins
Ideally, GeoFold should be able to mimic the unfolding mechanism well enough to reproduce kinetics experiments and predict experimentally determined unfolding rates. To test this, simulated unfolding rates (ku) were determined for a set of well-studied, small, monomeric proteins21. Kinetic simulations were carried out over a range of desolvation free energy values ω. Rates were determined by finding the half-life t1/2 at each ω, converting it to an unfolding rate ku = ln(2)/t1/2, and then finding the rate in “pure water” by log-linear extrapolation of the ln(ku) versus ω plot. Unfortunately, the simulated unfolding rates were found to be insignificantly correlated with experimentally determined ku values (data not shown).
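The rate calculation described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, the fit is an ordinary least-squares line through ln(ku) versus ω, and the value of ω taken to represent “pure water” (`omega_water`) is an input that the text does not specify.

```python
import math


def unfolding_rate(t_half):
    """First-order rate constant from a half-life: ku = ln(2) / t_half."""
    return math.log(2) / t_half


def extrapolate_ku(omegas, half_lives, omega_water):
    """Fit ln(ku) = intercept + slope * omega and evaluate at omega_water.

    Returns (slope, ku at omega_water). omega_water is an assumed input.
    """
    ys = [math.log(unfolding_rate(t)) for t in half_lives]
    n = len(omegas)
    mean_x = sum(omegas) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in omegas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(omegas, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept + slope * omega_water)
```

The extrapolation is only meaningful where the ln(ku) versus ω plot is close to linear; the curvature reported for three-state systems would need a different treatment.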
This meant that one or more of the assumptions built into the model were wrong. Since we could not tell whether the failure stemmed from inaccuracies in the folding pathways or from inaccuracies in the energy calculations, we turned to case studies intended to isolate the effects of topology on the folding pathway. As shown below, the results suggest that the folding pathways are accurate and serve to explain the experimental results, leaving inaccuracies in the energy function as the cause of failure, although precisely where remains to be determined.
Studies of topological perturbations versus unfolding rate
We asked whether the changes in topology created by disulfide linkages could explain experimentally determined changes in stability or unfolding rate. Four case studies are presented here: three mutants of barnase22-24, five mutants of T4 lysozyme25, two of factor for inversion stimulation (FIS)26, and one of E. coli dihydrofolate reductase (DHFR)27. In each case, biophysical studies were done to determine the stability or unfolding rate.
Because the task of optimizing the energy parameters to fit the kinetics of unfolding proved to be an impossibly difficult one, the energy function was necessarily left in an unrefined state, with each component set to a physically reasonable value as described in Methods. No attempt has been made with the current case studies to identify parts of the energy function that are responsible for higher or lower rates. Fortunately, the topological effects of disulfides on the unfolding pathway dominate other energetic components, affecting the order of events rather than the energy of each event. For the most part in these case studies, the addition of a disulfide linkage does not change the non-covalent terms of the energy function. For that reason, we can proceed to validate the mechanistic aspect of the program and its prediction of the order of events, without further empirical proof of the validity of the energy function.
Disulfide mutants of barnase
In work by Clarke et al22-24, barnase was mutated to add disulfide linkages at three positions: 43-80, 70-92, and 85-102. The crystal structures of each have been solved, and the equilibrium stabilities were determined for each mutant, both in the oxidized and reduced forms. The 85-102 mutant (PDB ID: 1bng) was stabilized against urea denaturation due to disulfide formation, unfolding at >7.5M urea as compared to 4.5M for the reduced form. The oxidized 43-80 mutant (PDB ID: 1bne) was somewhat more resistant to urea denaturation, unfolding at 5.8M versus 4.0M for the dithiol form. The 70-92 mutant (PDB ID: 1bnf) was destabilized by the disulfide bond, unfolding at 3.0M urea versus 3.4M for the dithiol form. The loss of stability was attributed to disulfide-induced distortion of local structure and disruption of a salt-bridge near the site of mutation. Further inspection on our part finds that 1bnf in the oxidized state contains local structure that is less favorable than the wild type local structure. Specifically, the Type II beta turn at high propensity sequence KSGR is replaced with a Type I turn at low turn propensity sequence SGRC in the oxidized double mutant. Turn propensity was measured using HMMSTR28.
Clarke et al attempted to explain the disulfide stabilization on the basis of decreased entropy of the denatured state, using Flory’s formula ΔS = −2.1 – 3/2(ln n); where n is the number of residues encompassed by the disulfide29. But this theory could not explain how a shorter encompassed loop, 85-102, was more stabilizing than a longer one, 43-80. The differences can be explained by the unfolding pathway predicted by GeoFold, which finds the wild type 85-102 contact to be broken early in unfolding, followed by the 70-92 contact, followed by 43-80. The C-terminal region 50-110 is a 4-stranded beta-meander with an exposed C-terminus, which provides a series of pivot points starting from the C-terminus and working inward (Figure 2(a)). The predicted pathway shows the C-terminal strand unfolding first, exposing the neighboring strand (strand 3), which unfolds, leaving residues 1-90. At this point, the rate limiting step is the pivoting of the N-terminal helix away from the (now partially unfolded) beta sheet. In 1bng, the 85-102 disulfide prevents the unfolding of strand 3, forcing the N-terminal helix to expose a greater surface area when it separates from the beta sheet. The higher solvation energy cost of this step explains the greater stability of the 85-102 mutant.
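To make the comparison concrete, the formula as quoted above can be evaluated for the two loops. This is only an arithmetic illustration: taking n as the difference between the linked residue numbers (17 for 85-102, 37 for 43-80) is an assumption, the function name is hypothetical, and the result is in whatever units the quoted formula implies.

```python
import math


def loop_entropy_change(n):
    """Entropy change from the formula as quoted: dS = -2.1 - (3/2) ln(n)."""
    return -2.1 - 1.5 * math.log(n)


# Assumption: n is the difference between the linked residue numbers.
ds_85_102 = loop_entropy_change(102 - 85)
ds_43_80 = loop_entropy_change(80 - 43)
```

Under these assumptions the values are about -6.3 for the 85-102 loop and about -7.5 for the 43-80 loop. The formula therefore assigns the larger entropy loss, and hence the larger predicted stabilization, to the longer 43-80 loop, which is the opposite of the experimental ordering.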
GeoFold predicts no difference in the unfolding rate of 43-80 as compared to wild type. It also predicts only a minor difference between the 85-102 mutant and the 70-92 mutant, and both are predicted to unfold much slower than wild type. The 70-92 mutant is predicted to be much more stable than it actually is, but this is believed to be the result of loop distortions observed in the 70-92 mutant. Distortion and local structure propensity are not part of the current GeoFold model.
Figure 2(b) shows the locations of the engineered disulfides relative to the unfolding order of the secondary structure elements. Figure 2(c-d) shows plots of equilibrium unfolding, using desolvation energy ω, and unfolding rate ku with respect to ω. The log-linear relationship is steeper for the oxidized 85-102 mutant than for the wild type and 43-80 mutant, consistent with a greater amount of surface area exposed in the transition state of the former.
Disulfide mutants of lysozyme
Matsumura25 used a rational design algorithm to insert disulfide bonds into phage T4 lysozyme in places where they were predicted to minimally disturb the backbone conformation. Five double mutants were created and then subjected to thermal denaturation under reduced and oxidizing conditions. A cysteine-free mutant (C54T-C97A) served as the wild type (called WT*) and as the base from which the five disulfide mutants were made. All of the reduced mutants were slightly less stable than WT*, while three of the five mutants were significantly more stable when oxidized (Figure 3(a)).
Models for the five disulfide mutants were made using the molecular modeling software MOE (Chemical Computing Group, Montreal), using its Rotamer Explorer function plus energy minimization. All models have good disulfide bond geometry and no significant changes in the backbone coordinates.
The wild type unfolding pathway starts with the separation of the N and C-terminal domains with a pivot position located in the middle of a long, domain-crossing helix, at residue 69 (Figure 3(c), insets). According to the program, cleavage at this particular location exposes a minimum amount of buried surface area, allows a maximum number of pivoting directions as compared to all other positions in the chain, and maximally exposes new flexible pivot and hinge locations along the pathway of unfolding, as compared with all other cleavage locations. In two of the mutants, 127-154 and 90-122, the initial step of the unfolding pathway was the same as that of WT*, and the equilibrium melting point in ω was also the same, while in the other three mutants the melting point in the oxidized state was increased (Figure 3(b)), similar to the experimental results.
In the three stabilized mutants, 3-97, 21-142, and 9-164, a pivot at any position between the two cysteines was disallowed by the algorithm, and the unfolding pathway began instead with small N and C-terminal segments. But these small pivot moves did not unlock domain opening steps. Hinge motions, revolute joints with two fixed points, were required for unfolding to proceed. Figure 3(d) (insets) shows the first significant unfolding step in the stabilized mutants. Unfolding in the simulation is slower because the program awards hinges a lower configurational entropy gain than pivots, given that a single-axis revolute joint adds only one rotational degree of freedom, while a pivot motion adds two or three rotational degrees of freedom. Exposing the same amount of buried surface area while gaining less configurational entropy leads to a higher transition state energy and therefore a lower unfolding rate.
Figure 3(c-d) also show a summary of the two pathways as “age plots” where contacts are colored in the order they are lost, illustrating the inside-out unfolding order of the stabilized mutants, versus the outside-in unfolding pathway of the WT*. Flory’s equation and our mechanistic approach predict the same overall result in this case, although for barnase they do not.
A disulfide mutant of dihydrofolate reductase
A single mutation in E. coli dihydrofolate reductase (DHFR), P39C, allows a disulfide to form with wild type cysteine 8527. The wild type and the mutant in both the oxidized and reduced state were characterized by GndHCl and urea equilibrium unfolding experiments, showing that the reduced form was identical in stability to the wild type, and that the oxidized form unfolded at higher urea (or GndHCl) concentrations. Furthermore, the wild type enzyme has a sharp, two-state equilibrium unfolding curve while the oxidized mutant has an extended transition region, suggestive of one or more intermediate states (Figure 4(a)).
In remarkable agreement, the simulated equilibrium unfolding curve also shows the extended transition for the oxidized state, and the initial unfolding begins at the same urea concentration (desolvation energy ω in the simulations) for both oxidized and reduced states (Figure 4(b)).
At high ω (low urea) the predicted pathway of DHFR unfolding begins with a series of hinge motions within the loosely-packed “adenosine binding domain”, residues 37 to 91 (ABD), unfolding generally from the middle of the chain outward to the termini (Figure 4(c)). This “inside-out” pathway makes sense energetically, given that the terminal segments have extensive contacts and are more topologically tangled than the ABD, and experimentally, given that the ABD is somewhat flexible, rotating between crystal structures30. The 39-85 disulfide blocks the inside-out pathway, forcing the unfolding of the remaining protein to proceed from the termini, or “outside-in”, which requires a much lower desolvation energy ω. At low ω (high urea), the pathway is outside-in, unfolding from the C-terminus and without the use of hinge motions. Two distinct unfolding pathways lead to the broad transition seen in Figure 4(a-b).
The existence of multiple pathways in DHFR folding (two or four channels) was proposed by Matthews31-32. Their descriptions of mutually exclusive channels without equilibrium intermediate states are perfectly consistent with the mutually exclusive inside-out and outside-in pathways observed here. A similar outside-in unfolding scenario was developed based on the kinetics of methotrexate binding and tryptophan fluorescence31,33, and was later supported by Gō simulations34 which showed the C-terminus folding last and the adenosine binding domain folding first. On the other hand, hydrogen/deuterium exchange NMR experiments support a pathway in which the termini fold first and unfold last35; specifically, a burst phase intermediate of folding contained protected backbone H-bonds in the C-terminal strand. Iwakura36, using circular permutants, has suggested that DHFR folding depends only on the presence of early folding units and not on their order along the chain. The NMR experiments and the indifference of DHFR to circular permutation agree with the inside-out pathway. In retrospect, it makes perfect sense that the outside-in pathway that exposes more surface area early would be favored at high denaturant versus the inside-out pathway, which is more sterically hindered, because steric hindrance does not depend on denaturant. Note that the NMR experiments were carried out, and the inside-out unfolding pathway was observed, at low denaturant, whereas the Trp fluorescence experiments and the outside-in pathway correspond to high denaturant.
Symmetric disulfides in the factor for inversion stimulation dimer
Factor for inversion stimulation (FIS, PDB ID: 3jrh) is an intertwined, homodimeric DNA-binding protein. Two single site cysteine mutations were engineered into the dimer, and both mutants formed a disulfide at the two-fold symmetry interface26. This created proteins with branched, non-cyclic topology, in contrast to the other disulfide linkage mutants presented here, all of which produced a cyclic topology. For this reason, the increased stability of these mutants cannot be explained by Flory’s formula, which models the loss of entropy in the unfolded state due to cycle formation29.
The rate limiting step in wild type FIS unfolding has been shown to occur before dissociation of the monomers37, so that any increase in stability must be due to slower unfolding, not to a faster association of monomers or decreased entropy of the unfolded state. Both disulfide bridges were shown to stabilize FIS, but the S30C mutant was more stabilized and denatured more cooperatively than the V58C mutant (Figure 5(a)). Equilibrium unfolding curves for wild type and S30C both fit a dimeric two-state model, whereas V58C best fits a 3-state model. In previous studies, a mutation of proline 61 in helix B to alanine increased stability by 4 kcal/mol and changed the folding pathway from 2-state to three-state38, and the C-terminal helices C and D were shown by limited trypsin proteolysis to unfold first in this mutant. The equilibrium intermediate was determined to be a trypsin-resistant dimeric fragment consisting of intertwined helices A and B.
In the simulations, the transition state of the wild type protein was dimeric. The slow step in wild type unfolding was the initial pivoting of helix A (either one) away from the rest of the dimer (Figure 5(a)). Both mutants were more stable than the wild type, and S30C was more stable and more cooperative than V58C, agreeing with the experimental results (Figure 5(b)). The kinetics of wild type and S30C unfolding show a log-linear relationship with ω but V58C has a distinctly curved relationship, suggesting different pathways at high and low denaturant, again similar to the experimental results.
Ignoring the floppy N-terminal hairpin, which is disordered in most crystal structures, FIS is an all alpha helical dimeric protein, and unfolding can proceed only at the three junctures between the four helices, A/B, B/C, and C/D (Figure 5(a)). The intertwined dimer cannot dissociate before the A/B pivot. We observed an A/B pivot in the wild type versus a B/C pivot in the S30C mutant, where the A helices are linked and not free to pivot. In the V58C mutant, ambiguous pathways, both A/B and B/C, were observed in the simulation.
The simulated pathways agree with experimental data wherever possible. Inasmuch as the P61A and V58C mutations both serve to strengthen the dimeric interaction between helices B, the similarity between the experimental pathway of P61A and the simulated pathway of V58C is supportive of the accuracy of the program. Both mutations serve to block the propagation of the wild type unfolding pathway, forcing B/C in lieu of dissociation.
DISCUSSION
A recent study has found that it is possible to predict the unfolding rates of single-domain proteins using only information about the structural class of the protein and its size39. Indeed, kinetically stable proteins have structural class preferences11.
A model for protein flexibility has been previously described as a network glass40, where unfolding is done by a stochastic simulation using multi-jointed tethers for hydrophobic interactions and a template-based hydrogen bond potential. This method has been used to identify a transition state cluster in barnase unfolding, and consistently identified the regions most protected from H/D exchange. But H/D exchange only identifies broken hydrogen bonds, not necessarily capturing non-local side chain contacts, and in principle, hydrogen bonds in late folding helical regions could be protected from exchange early in folding. The pathway proposed by Rader40 has the helical domain unfolding first, which disagrees with our prediction (Figure 2). But predictions of phi-values in barnase by Garbuzynskiy15 using dynamic programming agree with our results, finding the high phi-values in the N-terminal helix. GeoFold differs in many ways from Rader’s method. It is deterministic and exhaustively samples alternative pathways, like Garbuzynskiy’s method, and it treats pivot and hinge energies differently. This last feature accounts for our barnase pathway, which agrees more with the experimental data. Admittedly, we placed more weight on entropic terms than on hydrogen bonds, but that is because these terms, not hydrogen bonds, explain the kinetic effects of topological changes in the chain. In barnase, unfolding internal helices requires hinge motions whose barrier heights depend on chain stiffness and the orientation of the hinge axis, and this may be a more energetically unfavorable step than would be expected from the breaking of hydrogen bonds alone.
Our model uses rigid body motions to unfold proteins. This is clearly a convenient simplification of a more detailed process. An actual pivot most likely involves micro-steps in which single hydrogen bonds or hydrophobic contacts are broken, much like the model of Rader. Our simplification is justified because it models the way a set of contacts is often broken in a concerted and cooperative way with one large-scale motion, effectively separating relatively rigid sub-structures. The relative simplicity of the model and the fact that it is deterministic, not stochastic, have the advantage of allowing the pathways to be explored essentially exhaustively.
The effects of disulfide linkages on folding and unfolding have been previously explored using lattice simulations and theory. Shakhnovich41 showed that even in a very simple model, the kinetic effect of tying together two sequence positions is a function of the topology of the native state and can either speed up or slow folding, depending on whether the linkage occurs in the folding nucleus or not. Although not discussed in that paper, the implication is that the unfolding rate would be slowed if the linkage occurs outside the folding nucleus, not inside. This would place the energy perturbation on the unfolding side of the energy landscape, increasing the height of the barrier to unfolding. Indeed this is what we find.
Compared to the subtle energetic perturbations of a point mutation, the basis of phi-value analysis of folding pathways42, the addition of a disulfide bond is a relatively blunt instrument, probing the pathway by changing its course. We do not expect the current method to be able to reproduce the results of phi-value experiments unless the finer points of the energy function are extensively refined and trained first. Nonetheless, a clearer understanding of the effects of topology on kinetic stability is immensely valuable. In combination with simulations such as those presented here, disulfide mutations can experimentally elucidate the first steps in the unfolding pathway, and conversely, predictions of the first steps in unfolding could help us to engineer stability by inserting disulfide linkages.
CONCLUSIONS
The mechanism of protein unfolding has been hypothesized in this work to be a directed acyclic graph of native substructures, in accordance with theoretical studies and views12. An element of this tree is a revolute joint or a translation, splitting a substructure into two. We show that experimentally determined energetic and kinetic effects of engineered disulfides in four different proteins are captured in the energy landscapes produced for these proteins by GeoFold, based on their respective crystal structures. The unfolding pathways explain variable stabilization in barnase, lysozyme, DHFR and FIS that could not be explained by Flory’s equation for entropy loss in the unfolded state29. Disulfide links stabilize the protein relative to wild type if the linked positions dissociate early in the pathway of the wild type molecule.
Simulated disulfide mutations in DHFR and FIS both reproduced the experimentally observed increases in stability and the decreases in the cooperativity of folding. Simulated disulfide mutations in barnase and lysozyme reproduced the relative changes in the unfolding rate and in stability.
METHODS
A kinetic model
A kinetic simulation for a system of chemical equations simulates the changes in concentration of each chemical species with time. For example, given a system of two coupled equilibria, A⇆B⇆C, and starting concentrations [A], [B], and [C], the change in [B] over time is given as
$$\frac{d[B]}{dt} = k_{AB}[A] + k_{CB}[C] - (k_{BA} + k_{BC})[B] \tag{1}$$
where the subscripts indicate the directions of the reactions. Eq. 1 is multiplied by a time-step to obtain the change in [B], the concentrations are updated, and the process is repeated. The simulation eventually reaches equilibrium, in this example when kAB[A]+kCB[C] = (kBA+kBC)[B]. An accurate time-course of concentrations is obtained if the rates are correct and the time-step is sufficiently small. Protein unfolding can be viewed as a system of coupled elemental unfolding steps (Figure 1(a)).
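The update rule described above can be sketched in a few lines of Python. This is only an illustrative explicit finite-difference loop for the A⇆B⇆C example, not the GeoFold code; the function name `simulate_abc`, the rate constants, the time-step and the number of steps are all assumptions chosen for the demonstration.

```python
def simulate_abc(a, b, c, k_ab, k_ba, k_bc, k_cb, dt=1e-3, steps=20000):
    """Explicit finite-difference integration of A <-> B <-> C (illustrative only)."""
    for _ in range(steps):
        # Net flux along each reversible step at the current concentrations.
        flux_ab = k_ab * a - k_ba * b   # A -> B minus B -> A
        flux_bc = k_bc * b - k_cb * c   # B -> C minus C -> B
        a -= flux_ab * dt
        b += (flux_ab - flux_bc) * dt   # rate of change of [B] times the time-step
        c += flux_bc * dt
    return a, b, c


# Hypothetical rate constants; all material starts as A.
k_ab, k_ba, k_bc, k_cb = 2.0, 1.0, 1.0, 0.5
a, b, c = simulate_abc(1.0, 0.0, 0.0, k_ab, k_ba, k_bc, k_cb)

# At equilibrium the gain and loss terms for B balance.
residual = k_ab * a + k_cb * c - (k_ba + k_bc) * b
```

With these assumed constants the residual shrinks toward zero as the loop proceeds, and the total a + b + c stays at its starting value, which is the behaviour the text describes for a sufficiently small time-step.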
Unfolding operators
Protein topology defines the allowable unfolding motions. Three geometric operators can be defined to describe all two-part structural partitions on a chain (Figure 1(b)). As a rule, covalent linkages cannot be broken or stretched in an unfolding operation, and atoms cannot penetrate each other. If the chain crosses only once from u1 to u2, then the allowable motion is a pivot, or a single-point revolute joint. Pivot rotations can be in any direction, regardless of the direction of the backbone. If the chain crosses twice, rotation around the two crossing points defines a hinge, or two-point revolute joint. If the chain crosses more than twice, then a simple non-distorting motion is impossible unless all of the points lie in a line, in which case it is still a hinge. If the chain does not cross from u1 to u2, then the model consists of multiple chains or disjoint segments of one chain. The motion in this case is a simple translation, called a break in this study. A break is assigned the highest entropy change, followed by pivots, followed by hinges.
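The rule relating chain crossings to operator type can be written as a small lookup. The sketch below is a toy illustration under stated assumptions: each chain segment is represented as a string of labels ("1" for residues assigned to u1, "2" for u2), and the names `count_crossings` and `classify_operator` are hypothetical, not taken from GeoFold. The collision tests that decide whether a motion is actually allowed are not modeled here.

```python
def count_crossings(chains):
    """Count how many times the backbone passes between u1 and u2.

    `chains` is a list of label strings, one per chain segment, where each
    character marks the partition ("1" or "2") that a residue belongs to.
    """
    return sum(
        1
        for labels in chains
        for left, right in zip(labels, labels[1:])
        if left != right
    )


def classify_operator(n_crossings, crossings_collinear=False):
    """Map the number of chain crossings to the operator named in the text."""
    if n_crossings == 0:
        return "break"      # disjoint segments: simple translation
    if n_crossings == 1:
        return "pivot"      # single-point revolute joint
    if n_crossings == 2 or crossings_collinear:
        return "hinge"      # two-point revolute joint (or collinear crossings)
    return None             # no simple non-distorting motion exists


pivot_case = classify_operator(count_crossings(["1112222"]))
hinge_case = classify_operator(count_crossings(["2211122"]))
break_case = classify_operator(count_crossings(["111", "222"]))
```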
The elemental unfolding subsystem (cut)
The directed graph consists of linked cuts. Starting from the native structure as the root of the graph, each folded species, f, is partitioned using pivots, hinges, and breaks into two smaller species, u1 and u2, at all possible locations as defined by the following conditions.
Pivots
(Figure 6, GetPivot) Residue i of f is the location of a pivot if the substructure N-terminal to i (u1=f[:i]) can rotate around i at least pivotcut=30° in any direction without colliding with the substructure C-terminal to i (u2= f[i+1:]). If f is composed of multiple chains, then the coordinates of the additional chains are grouped with u1 or u2, in all combinations. If several adjacent positions qualify as pivots, a central representative position is chosen.
Hinges
(Figure 6, GetHinge) A single-axis rotation exists around an axis defined by residues i and j if the substructure represented by u1=f[i:j] can rotate at least hingecut=30° in either direction about the axis i->j without colliding with the subset u2=f[:i-1] ∪ f[j+1:]. If multiple chains are involved, then they are grouped with u1 or u2 in all possible combinations. If several consecutive hinge positions are possible, a central representative ij pair is chosen. The two parts of u2 are labeled as different chain segments, allowing break moves.
Breaks
(Figure 6, GetBreak) If a substructure contains two chain segments, either because the protein is oligomeric or because a hinge operation has created two chain segments, and these segments can be separated by a simple translation without collisions in at least breakcut=0.05 of all possible directions, then a break exists and the two chain segments are labeled u1 and u2. If more than two segments are present in f, then all combinations of segments are tried.
The unfolding graph
(Figure 6, GetCuts) Starting with the native structure as the substructure f of the first cut, we find all geometrically possible cuts, giving preference to breaks, then pivots, then hinges. Each cut generates two substructures, u1 and u2. We then apply the same method to each of the substructures u1 and u2, recursively until the substructures are unfolded (defined below). The result is a directed acyclic graph (DAG), where the nodes are substructures and the bifurcating edges represent transition states of binary partitionings. Figure 1(c) shows a partial DAG for DHFR, showing geometrically possible unfolding steps with the energetically favored unfolding pathway highlighted in green.
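The recursion described above can be sketched as a memoized graph traversal. This is a minimal illustration, not the GetCuts procedure itself: `find_cuts` and `is_unfolded` are hypothetical callbacks standing in for the geometric tests, the toy versions below treat a substructure as a residue interval that can be pivoted at any interior position, and the preference order of breaks, pivots and hinges is not modeled.

```python
def build_unfolding_graph(native, find_cuts, is_unfolded):
    """Return {substructure: [(cut_type, u1, u2), ...]} for all reachable nodes.

    Each substructure is expanded once, so substructures reached by
    several routes are shared and the result is a DAG rather than a tree.
    """
    graph = {}
    stack = [native]
    while stack:
        f = stack.pop()
        if f in graph:
            continue                      # already expanded via another pathway
        graph[f] = [] if is_unfolded(f) else list(find_cuts(f))
        for _cut_type, u1, u2 in graph[f]:
            stack.extend((u1, u2))
    return graph


# Toy stand-ins (assumptions): a substructure is a half-open residue interval.
def toy_cuts(f):
    start, end = f
    for i in range(start + 1, end):
        yield ("pivot", (start, i), (i, end))


def toy_unfolded(f):
    return f[1] - f[0] <= 1


graph = build_unfolding_graph((0, 4), toy_cuts, toy_unfolded)
```

In this toy case the four-residue interval yields ten distinct nodes, several of which are reachable along more than one pathway, which is why the nodes are stored in a dictionary keyed by substructure.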
Kinetic simulations
(Figure 6, UnfoldSim) A simulation of concentration changes over time can be produced by considering a single cut, of type pivot, hinge or break, containing a folding intermediate f and its products u1 and u2. The amount of f lost is proportional to its concentration [f] and the unfolding rate, which can be calculated using transition state theory43:
$$\kappa_u = \gamma \exp\left(-\frac{\Delta G^{\ddagger}_u}{RT}\right) \tag{2}$$
κu may be called the elemental unfolding rate. The barrier to unfolding for one cut, ΔG‡u, is a function of the energies of f, u1 and u2, and of the cut type. The subscript ‘u’ indicates that the barrier height is measured in the unfolding direction. The transmission coefficient γ is equal to the rate of decomposition of the transition state. For a normal chemical reaction, this is γ = kBT/h, or about 10^13 s−1. But a diffusion-controlled folding reaction is much slower, estimated by Fersht43 to be about 10^6 s−1, since the energy landscape of a protein folding reaction is longer and flatter than that of a bond vibration.
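As a numerical illustration, the elemental rate can be evaluated assuming the usual transition-state form κ = γ exp(−ΔG‡/RT). The function name `elemental_rate` is hypothetical; R and T are taken from Table 1, γ = 10^6 s−1 follows the estimate quoted above, and the barrier heights are arbitrary example values.

```python
import math

R = 8.314472  # gas constant, J K^-1 mol^-1 (Table 1)


def elemental_rate(barrier_j_per_mol, temperature=300.0, gamma=1.0e6):
    """Rate in s^-1 for a barrier given in J mol^-1, assuming k = gamma * exp(-dG/RT)."""
    return gamma * math.exp(-barrier_j_per_mol / (R * temperature))


# Example barriers (assumed values): a barrier of RT*ln(10) slows the step tenfold.
no_barrier = elemental_rate(0.0)
tenfold_slower = elemental_rate(R * 300.0 * math.log(10.0))
```

The same function serves for the folding direction when it is given the folding barrier instead of the unfolding barrier.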
The amount of f gained is proportional to [u1] and [u2] and to the folding rate term
$$\kappa_f = \gamma \exp\left(-\frac{\Delta G^{\ddagger}_f}{RT}\right) \tag{3}$$
κf may be called the elemental folding rate. The finite difference simulations described in Figure 6, UnfoldSim, are carried out on the set of all κf and κu to produce the concentrations of all species at each timestep dt. By summing over all cuts q that involve f, and then multiplying by the timestep, we obtain the concentration change.
$$\Delta[f] = dt \sum_{q} \left( \kappa_{f,q}\,[u_{1,q}][u_{2,q}] - \kappa_{u,q}\,[f] \right) \tag{4}$$
To simulate unfolding, we initialize the concentration of the native state to F0. Concentrations of all nodes are recalculated until equilibrium is established. To simulate folding, the leaf nodes are initialized to F0. Equilibrium is assumed when there is no net change in concentration across the whole system. Note that the sum of the concentrations of all intermediates f that contain a given residue i is equal to a constant, F0, throughout the simulation. That is, the total concentration of residue i is conserved.
$$\sum_{f \ni i} [f] = F_0 \tag{5}$$
In other words, mass is conserved.
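The conservation property can be checked on the smallest possible system, a single cut f ⇆ u1 + u2. The sketch below is an assumption-laden toy, not UnfoldSim: the name `simulate_cut`, the elemental rates, the time-step and the step count are invented for illustration, and only one cut is simulated rather than the full graph.

```python
def simulate_cut(kappa_u, kappa_f, f0=1.0, dt=1e-3, steps=20000):
    """Finite-difference simulation of one cut, f <-> u1 + u2 (illustrative)."""
    f, u1, u2 = f0, 0.0, 0.0
    for _ in range(steps):
        # Loss of f by unfolding minus gain of f by refolding, times the time-step.
        net_unfolding = (kappa_u * f - kappa_f * u1 * u2) * dt
        f -= net_unfolding
        u1 += net_unfolding
        u2 += net_unfolding
    return f, u1, u2


F0 = 1.0
f, u1, u2 = simulate_cut(kappa_u=1.0, kappa_f=1.0, f0=F0)

# A residue located in u1 is present either in f or in u1, so [f] + [u1] stays at F0.
residue_total = f + u1
```

Because every unit of f that is lost appears in both u1 and u2, the per-residue total is preserved at every step, independent of the rates chosen.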
Components of the energy function
The folding and unfolding rates for a cut, κu and κf, are calculated directly from the substructures f, u1 and u2. Free energies are composed of two parts: the dissociation energy ΔEd, and the backbone configurational entropy ΔSq. The dissociation energy is composed of four terms: solvation energy ΔEω, hydrogen bonds ΔHh, side chain entropy ΔSλ, and buried void entropy ΔSv. Disulfide linkages are treated as constraints rather than energies.
Dissociation Energy, ΔEd
The energy of dissociation of two substructures is modeled using the increased solvation, increased sidechain entropy, loss of hydrogen bonds, and loss of buried void spaces.
(6) |
where each term is defined below. Throughout this discussion, ΔE is used for free energies with unspecified entropic and enthalpic components, ΔG for free energies with specified enthalpic and entropic parts, ΔH for purely enthalpic terms and ΔS for purely entropic terms.
Solvation free energy, ΔEω
For simplicity we assume that the hydrophobic effect, coulombic interactions and the van der Waals attractive force are all roughly proportional to the change in solvent exposed surface area, and we therefore combine them in one term, called the solvation free energy. Note that the van der Waals repulsive term is assumed to play no part in unfolding since native structures are assumed to have no collisions. Changes in solvent accessible surface area are computed using MASKER44. The buried solvent accessible surface (SAS) exposed upon splitting one substructure, f, into two, u1 and u2, is approximated as the sum of pairwise residue SAS terms, where each term SASjk is the burial of SAS upon contact of residues j and k. Thus for a given cut, summing over residues separated by the cut, we get
$$\Delta SAS = \sum_{j \in u_1} \sum_{k \in u_2} SAS_{jk} \tag{7}$$
The buried surface is a good measure of the amount of water displaced by the folding step, and also is a rough estimate of the scale of the VDW attractive force. Desolvation of hydrophobic groups and hydrogen bonding groups is the primary force driving protein folding.
The solvation free energy in units of kJ mol−1 is simply
$$\Delta E_{\omega} = \omega \, \Delta SAS \tag{8}$$
where ω is the surface tension in J mol−1 Å−2, a value that may be thought of as modeling the effect of urea at different concentrations. A negative or low value favors solvation and unfolding (high urea), while a high value favors desolvation and folding (low urea). The value of ω corresponding to pure water may be chosen empirically. Theoretical values for hydration of buried protein surfaces45-46 range from 30 to 80 J mol−1 Å−2. In this study, we did not attempt to break down ΔEω into its component parts.
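The pairwise bookkeeping for a cut can be sketched as follows. This is a toy example under explicit assumptions: the pairwise burial values are invented rather than computed with MASKER, the names `delta_sas` and `solvation_energy` are hypothetical, the solvation term is assumed to be the product of ω and the exposed area, and ω = 50 J mol−1 Å−2 is simply a value inside the quoted theoretical range.

```python
def delta_sas(pair_sas, u1, u2):
    """Sum the pairwise buried SAS (in A^2) over residue pairs separated by the cut."""
    u1, u2 = set(u1), set(u2)
    total = 0.0
    for (j, k), area in pair_sas.items():
        if (j in u1 and k in u2) or (j in u2 and k in u1):
            total += area
    return total


def solvation_energy(exposed_area, omega):
    """Solvation term in J mol^-1, assuming energy = omega * exposed area."""
    return omega * exposed_area


# Invented pairwise burial values for a four-residue toy structure.
pair_sas = {(1, 2): 40.0, (2, 3): 35.0, (1, 4): 20.0, (3, 4): 50.0}
exposed = delta_sas(pair_sas, u1=[1, 2], u2=[3, 4])
energy = solvation_energy(exposed, omega=50.0)
```

Only the contacts (2, 3) and (1, 4) span the cut in this example; contacts within u1 or within u2 remain buried and do not contribute.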
Side-chain entropy increase, ΔSλ
Upon unfolding, buried sidechains are exposed to the solvent and gain flexibility, each to a different extent. To calculate the change in sidechain entropy ΔSλ, we multiply the relative change in the sidechain exposure with published values47-48 for intrinsic sidechain entropy, ω, summing over all residues in f.
(9) |
SAS0i is the total surface area of residue i in the unfolded state, and ωi is its intrinsic sidechain entropy.
Buried void entropy, ΔSv
All internal spaces large enough to hold one spherical probe of radius 1.2 Å, but not large enough to hold a water molecule (radius 1.4 Å), were found using MASKER44. A typical high resolution crystal structure contains dozens of such cavities, which are entropically unfavorable49. Surrounding each void are neighbor residues with atoms less than 7 Å from the void center. The void v is said to exist in substructure f if f contains all of v’s neighbors. The void entropy change,
$$\Delta S_v = U_v \left[ N_v(f) - N_v(u_1) - N_v(u_2) \right] \tag{10}$$
is the difference in the number of voids Nv, times the void cost Uv, an entropic term.
Hydrogen bonds, ΔHh
Backbone hydrogen bonds were identified by adding hydrogens onto backbone amide nitrogens and finding backbone oxygens within a distance of 2.5Å. An H-bond exists within f if both donor nitrogen and acceptor oxygen are present in f. Each H-bond was assigned an enthalpic value Hh, yielding,
$$\Delta H_h = H_h \left[ N_h(f) - N_h(u_1) - N_h(u_2) \right] \tag{11}$$
where Nh(f) is the number of H-bonds present in substructure f. For simplicity, all H-bonds were assigned the same energy. Sidechain H-bonds were ignored.
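The H-bond term amounts to counting bonds that exist in f but in neither product. The sketch below assumes that the enthalpy lost is Hh times that count; the donor and acceptor pairs are invented, the function names are hypothetical, and Hh = 100 J mol−1 is the Table 1 setting.

```python
def count_hbonds(hbonds, residues):
    """Number of H-bonds whose donor and acceptor residues are both present."""
    residues = set(residues)
    return sum(1 for donor, acceptor in hbonds if donor in residues and acceptor in residues)


def hbonds_lost(hbonds, f, u1, u2):
    """H-bonds present in f that survive in neither u1 nor u2."""
    return count_hbonds(hbonds, f) - count_hbonds(hbonds, u1) - count_hbonds(hbonds, u2)


H_H = 100.0  # J mol^-1 per H-bond (Table 1)

# Invented (donor, acceptor) residue pairs for a ten-residue toy structure.
hbonds = [(1, 5), (2, 6), (5, 9), (6, 10)]
f = range(1, 11)
u1, u2 = range(1, 6), range(6, 11)

lost = hbonds_lost(hbonds, f, u1, u2)
delta_h = H_H * lost
```

The same counting pattern would apply to buried voids, with "all neighbor residues present" replacing "donor and acceptor present".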
Disulfide linkages
Disulfide bonds are treated as inseparable residues, but otherwise contribute nothing to the interaction energy. Any unfolding motion that would separate two disulfide-linked cysteines is disallowed.
Configurational entropy, ΔSq
Our model assumes that configurational entropy depends only on the number of degrees of conformational freedom gained, and is independent of the size of the substructure. For example, partitioning a large subset of the protein is rewarded with the same entropy gain as partitioning a small piece. Rough entropic values were assigned to each of the three cut types, break, pivot and hinge, reflecting the approximate number of added degrees of freedom. A hinge adds a single angular degree of freedom, a pivot adds two, and a break adds all three plus some degree of translational freedom. For the two entropies,
$$\Delta S_{break} > \Delta S_{hinge} \tag{12}$$
is enforced, and specific values were set empirically. ΔSpivot was necessarily set to the average of the two so that alternative pathways to the same state would always have the same entropy change, a requirement of any state function.
Transition state free energy
Interactions must be broken before full configurational entropy increase is possible during an unfolding step, therefore we spread the configurational entropy, ΔSq, unevenly along the reaction coordinate, apportioning more than half of the entropy to the products side, after the transition state of unfolding. A term, 0.5 ≥ σ‡ ≥ 0.0, is used to set the fraction of ΔSq expressed before the transition state. σ ‡ was set to 0.25 for this work.
Another term, 0.2 ≤ θ ‡≤ 0.8, sets the fraction of ΔEd expressed before the transition state. θ ‡ is calculated using the Hammond postulate, which states that the transition state most resembles the higher energy ground state. To quantify the Hammond behavior we adopted a reasonable simplifying assumption, that the slope of the energy with respect to the reaction coordinate is the same on both sides of the transition state. Using only the ground state energies and this assumption, the solution for the position of the transition state θ ‡ is found using similar triangles.
(13) |
Note that θ‡ goes to zero as ΔSq approaches twice the value ΔEd, which means that there would be no barrier (diffusion controlled) for weakly connected substructures. To maintain physical realism, θ‡ is constrained to be in the range 0.2 ≤ θ‡≤ 0.8.
Cavitation, ΔGc‡
Theoretical studies done independently by Scheraga50 and Baker51 have shown a barrier to hydrophobic collapse (or its inverse) due to the atomic size of solvent. The free energy of cavitation, ΔGc‡, is expressed only in the transition state of the cut, reflecting the cavity formation that must precede the inward diffusion of water. Based on the cavitation studies of Hummer et al.49 we assume a quadratic relationship between ΔSAS and ΔGc‡.
$$\Delta G^{\ddagger}_c = m^{\ddagger}_c \, (\Delta SAS)^2 \tag{14}$$
Configurational entropy barriers, ΔGq‡
A hinge motion may require the concerted motion of several backbone torsion angles, a pivot motion only one or two angles, and a break motion no angular shifts. Strain due to steric interactions may be greater in a hinge motion than in a pivot motion. ΔGq‡ serves to model the barriers to rotation that occur only in the transition state and are dependent on the type of motion. Allowed empirical settings are
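$$\Delta G^{\ddagger}_{hinge} \geq \Delta G^{\ddagger}_{pivot} \geq \Delta G^{\ddagger}_{break} \tag{15}$$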
Equilibrium and transition state free energies, ΔGu-f, ΔG‡f, ΔG‡u
Using the transition state placement variables σ‡ and θ‡, the free energy barriers for folding and unfolding are,
(16) |
(17) |
(18) |
Note that ΔGu-f = ΔG‡u − ΔG‡f, as required. Values from Eqs 17 and 18 for each elemental subsystem are used in Eqs 2 and 3 to define the elemental rates, and the whole system is simulated using the finite difference method (Figure 6, UnfoldSim).
Folded/Unfolded states
For purposes of calculating the unfolding rate from a simulated unfolding trajectory, the folded state is defined as the set of all intermediate substructures in the folding pathway that retain 90% or more of the buried SAS of the native state. In unfolding trajectories, the concentration of the folded state [F] is the sum over all folded states.
The unfolded state is defined as all intermediate substructures that have less than 1000 Å² of buried SAS. This corresponds to an extended 10-residue fragment or smaller. The concentration of the unfolded state [U] is the average, over all sequence positions i, of the sum of the concentrations of all unfolded states that contain residue i.
Simulated unfolding kinetics
The empirical unfolding rate ku was defined as ln(2)/t1/2, where t1/2 is the time at which [F] first reaches 1/2 of its initial value.
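The half-life definition translates directly into a scan of the simulated trajectory. The sketch below is illustrative only: `unfolding_rate` is a hypothetical helper, and the trajectory is a synthetic single exponential with an assumed rate of 2 s−1 rather than GeoFold output.

```python
import math


def unfolding_rate(times, folded):
    """Return ln(2)/t_half, where t_half is the first sampled time at which
    the folded concentration is at or below half of its initial value."""
    half = 0.5 * folded[0]
    for t, conc in zip(times, folded):
        if t > 0 and conc <= half:
            return math.log(2) / t
    return None  # the trajectory never reached half of the initial value


# Synthetic trajectory (assumption): single-exponential decay with k = 2 s^-1.
times = [i * 1e-3 for i in range(2000)]
folded = [math.exp(-2.0 * t) for t in times]
ku = unfolding_rate(times, folded)
```

Because the half-life is read from discrete samples, the recovered rate is slightly below the true value; a finer time grid or interpolation between samples reduces that error.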
Supplementary Material
Table 1.
Term | Value | Equation |
---|---|---|
R | 8.314472 J K−1 mol−1 | 2, 3 |
dt | 10^−7 to 0.01 s | 4 |
F0 | 1.0 M | 5 |
Uv | 0 J mol−1 K−1 | 8 |
ω | 0 to 100 J mol−1 Å−2 | 8 |
mλ | 1.0 | 9 |
Hh | 100 J mol−1 | 11 |
ΔSbreak | 90 J mol−1 K−1 | 12 |
ΔShinge | 30 J mol−1 K−1 | 12 |
ΔSpivot | 60 J mol−1 K−1 | 12 |
mc‡ | 10^−6 J mol−1 | 14 |
ΔGhinge‡ | 50 J mol−1 | 15, 17, 18 |
ΔGpivot‡ | 10 J mol−1 | 15, 17, 18 |
ΔGbreak‡ | 0 J mol−1 | 15, 17, 18 |
σ‡ | 0.5 | 17, 18 |
T | 300 K | 17, 18 |
Acknowledgements
This work was funded by National Science Foundation grants CCF-0432098 to M.Z. and C.B., DBI-0448072 to C.B., and MCB-0519507 to W.C. and C.B. We thank Patrick Buck, Ke Xia and Yao-ming Huang for useful discussions.
Footnotes
Availability: GeoFold source code and server are available at http://www.bioinfo.rpi.edu/bystrc/geofold/server.php
REFERENCES
- 1.Plaxco KW, Simons KT, Baker D. Contact Order, Transition State Placement and the Refolding Rates of Single Domain Proteins. J Mol Biol. 1998;277(4):985–994. doi: 10.1006/jmbi.1998.1645.
- 2.Gromiha MM, Thangakani AM, Selvaraj S. FOLD-RATE: prediction of protein folding rates from amino acid sequence. Nucleic Acids Research. 2006;34:W70–W74. doi: 10.1093/nar/gkl043.
- 3.Beck DA, Daggett V. Methods for molecular dynamics simulations of protein folding/unfolding in solution. Methods (San Diego, Calif.). 2004;34(1):112–20. doi: 10.1016/j.ymeth.2004.03.008.
- 4.Duan Y, Kollman PA. Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science. 1998;282(5389):740–4. doi: 10.1126/science.282.5389.740.
- 5.Dill KA, Chan HS. From Levinthal to pathways to funnels. Nat Struct Biol. 1997;4(1):10–9. doi: 10.1038/nsb0197-10.
- 6.Snow CD, Sorin EJ, Rhee YM, Pande VS. How well can simulation predict protein folding kinetics and thermodynamics? Annual review of biophysics and biomolecular structure. 2005;34:43–69. doi: 10.1146/annurev.biophys.34.040204.144447.
- 7.Paschek D, Garcia AE. Reversible temperature and pressure denaturation of a protein fragment: a replica exchange molecular dynamics simulation study. Physical review letters. 2004;93(23):238105. doi: 10.1103/PhysRevLett.93.238105.
- 8.Yang WY, Gruebele M. Folding λ-repressor at its speed limit. Biophysical journal. 2004;87(1):596–608. doi: 10.1529/biophysj.103.039040.
- 9.Chavez LL, Onuchic JN, Clementi C. Quantifying the Roughness on the Free Energy Landscape: Entropic Bottlenecks and Protein Folding Rates. Chem. Phys. 1999;111:10375–10380. doi: 10.1021/ja049510+.
- 10.Buck PM, Bystroff C. Constraining local structure can speed up folding by promoting structural polarization of the folding pathway. Protein Sci. 2011;20(6):959–69. doi: 10.1002/pro.619.
- 11.Xia K, Manning M, Hesham H, Lin Q, Bystroff C, Colon W. Identifying the subproteome of kinetically stable proteins via diagonal 2D SDS/PAGE. Proc Natl Acad Sci U S A. 2007;104(44):17329–34. doi: 10.1073/pnas.0705417104.
- 12.Hockenmaier J, Joshi AK, Dill KA. Routes are trees: The Parsing Perspective on Protein Folding. Proteins. 2007;66:1–15. doi: 10.1002/prot.21195.
- 13.Tsai CJ, Nussinov R. The building block folding model and the kinetics of protein folding. Protein Eng. 2001;14(10):723–33. doi: 10.1093/protein/14.10.723.
- 14.Galzitskaya OV, Finkelstein AV. A theoretical search for folding/unfolding nuclei in three-dimensional protein structures. Proc Natl Acad Sci U S A. 1999;96(20):11299–304. doi: 10.1073/pnas.96.20.11299.
- 15.Garbuzynskiy SO, Finkelstein AV, Galzitskaya OV. Outlining folding nuclei in globular proteins. J Mol Biol. 2004;336(2):509–25. doi: 10.1016/j.jmb.2003.12.018.
- 16.Maity H, Maity M, Krishna MM, Mayne L, Englander SW. Protein folding: the stepwise assembly of foldon units. Proc Natl Acad Sci U S A. 2005;102(13):4741–6. doi: 10.1073/pnas.0501043102.
- 17.Simons KT, Kooperberg C, Huang E, Baker D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J Mol Biol. 1997;268(1):209–25. doi: 10.1006/jmbi.1997.0959.
- 18.Zhang Y, Arakaki AK, Skolnick J. TASSER: an automated method for the prediction of protein tertiary structures in CASP6. Proteins. 2005;61(Suppl 7):91–8. doi: 10.1002/prot.20724.
- 19.Zaki MJ, Nadimpally V, Bardhan D, Bystroff C. Predicting protein folding pathways. Bioinformatics. 2004;20:i386–393. doi: 10.1093/bioinformatics/bth935.
- 20.Goldberg AL. Protein degradation and protection against misfolded or damaged proteins. Nature. 2003;426(6968):895–9. doi: 10.1038/nature02263.
- 21.Bogatyreva NS, Osypov AA, Ivankov DN. KineticDB: a database of protein folding kinetics. Nucleic Acids Research. 2009;37(suppl 1):D342–D346. doi: 10.1093/nar/gkn696.
- 22.Clarke J, Fersht AR. Engineered disulfide bonds as probes of the folding pathway of barnase: Increasing the stability of proteins against the rate of denaturation. Biochemistry. 1993;32(16):4322–4329. doi: 10.1021/bi00067a022.
- 23.Clarke J, Henrick K, Fersht AR. Disulfide Mutants of Barnase I: Changes in Stability and Structure Assessed by Biophysical Methods and X-ray Crystallography. J Mol Biol. 1995;253(3):493–504. doi: 10.1006/jmbi.1995.0568.
- 24.Clarke J, Hounslow AM, Fersht AR. Disulfide Mutants of Barnase II: Changes in Structure and Local Stability Identified by Hydrogen Exchange. J Mol Biol. 1995;253(3):505–513. doi: 10.1006/jmbi.1995.0569.
- 25.Matsumura M, Becktel WJ, Levitt M, Matthews BW. Stabilization of phage T4 lysozyme by engineered disulfide bonds. Proceedings of the National Academy of Sciences of the United States of America. 1989;86(17):6562. doi: 10.1073/pnas.86.17.6562.
- 26.Meinhold D, Beach M, Shao Y, Osuna R, Colón W. The location of an engineered inter-subunit disulfide bond in factor for inversion stimulation (FIS) affects the denaturation pathway and cooperativity. Biochemistry. 2006;45(32):9767–9777. doi: 10.1021/bi060672n.
- 27.Villafranca JE, Howell EE, Oatley SJ, Xuong NH, Kraut J. An engineered disulfide bond in dihydrofolate reductase. Biochemistry. 1987;26(8):2182–2189. doi: 10.1021/bi00382a017.
- 28.Bystroff C, Thorsson V, Baker D. HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins. J Mol Biol. 2000;301(1):173–90. doi: 10.1006/jmbi.2000.3837.
- 29.Flory P. Statistical mechanics of chain molecules. Carl Hanser Verlag. 1989;1989:432.
- 30.Bystroff C, Kraut J. Crystal structure of unliganded Escherichia coli dihydrofolate reductase. Ligand-induced conformational changes and cooperativity in binding. Biochemistry. 1991;30(8):2227–2239. doi: 10.1021/bi00222a028.
- 31.Jennings PA, Finn BE, Jones BE, Matthews CR. A reexamination of the folding mechanism of dihydrofolate reductase from Escherichia coli: Verification and refinement of a four-channel model. Biochemistry. 1993;32(14):3783–3789. doi: 10.1021/bi00065a034.
- 32.Kuwajima K, Garvey EP, Finn BE, Matthews CR, Sugai S. Transient intermediates in the folding of dihydrofolate reductase as detected by far-ultraviolet circular dichroism spectroscopy. Biochemistry. 1991;30(31):7693–7703. doi: 10.1021/bi00245a005.
- 33.Touchette NA, Perry KM, Matthews CR. Folding of dihydrofolate reductase from Escherichia coli. Biochemistry. 1986;25(19):5445–5452. doi: 10.1021/bi00367a015.
- 34.Clementi C, Jennings PA, Onuchic JN. How native-state topology affects the folding of dihydrofolate reductase and interleukin-1. Proceedings of the National Academy of Sciences of the United States of America. 2000;97(11):5871. doi: 10.1073/pnas.100547897.
- 35.Jones BE, Robert Matthews C. Early intermediates in the folding of dihydrofolate reductase from Escherichia coli detected by hydrogen exchange and NMR. Protein Science. 1995;4(2):167–177. doi: 10.1002/pro.5560040204.
- 36.Arai M, Maki K, Takahashi H, Iwakura M. Testing the relationship between foldability and the early folding events of dihydrofolate reductase from Escherichia coli. J Mol Biol. 2003;328(1):273–288. doi: 10.1016/s0022-2836(03)00212-2.
- 37.Topping TB, Hoch DA, Gloss LM. Folding mechanism of FIS, the intertwined, dimeric factor for inversion stimulation. J Mol Biol. 2004;335(4):1065–1081. doi: 10.1016/j.jmb.2003.11.013.
- 38.Hobart SA, Ilin S, Moriarty DF, Osuna R, Colón W. Equilibrium denaturation studies of the Escherichia coli factor for inversion stimulation: Implications for in vivo function. Protein Science. 2002;11(7):1671–1680. doi: 10.1110/ps.5050102.
- 39.De Sancho D, Munoz V. Integrated prediction of protein folding and unfolding rates from only size and structural class. Phys Chem Chem Phys. 2011;13(38):17030–43. doi: 10.1039/c1cp20402e.
- 40.Rader A, Hespenheide BM, Kuhn LA, Thorpe MF. Protein unfolding: rigidity lost. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(6):3540. doi: 10.1073/pnas.062492699.
- 41.Abkevich VI, Shakhnovich EI. What can Disulfide Bonds Tell Us about Protein Energetics, Function and Folding: Simulations and Bioinformatics Analysis. J Mol Biol. 2000;300(4):975–985. doi: 10.1006/jmbi.2000.3893.
- 42.Matouschek A, Kellis JT, Jr., Serrano L, Bycroft M, Fersht AR. Transient folding intermediates characterized by protein engineering. Nature. 1990;346(6283):440–5. doi: 10.1038/346440a0.
- 43.Fersht A. Structure and mechanism in protein science. W.H. Freeman; New York: 1999. p. 614.
- 44.Bystroff C. MASKER: improved solvent-excluded molecular surface area estimations using Boolean masks. Protein Eng. 2002;15(12):959–966. doi: 10.1093/protein/15.12.959.
- 45.Wang Y, Zhang H, Scott RA. A new computational model for protein folding based on atomic solvation. Protein Sci. 1995;4(7):1402–11. doi: 10.1002/pro.5560040714.
- 46.Still W, Tempczyk A, Hawley R, Hendrickson T. Semianalytical treatment of solvation for molecular mechanics and dynamics. J. Am. Chem. Soc. 1990;112:6127–6129.
- 47.Doig AJ, Sternberg MJE. Side chain conformational entropy in protein folding. Protein Sci. 1995;4(11):2247–2251. doi: 10.1002/pro.5560041101.
- 48.Pickett SD, Sternberg MJE. Empirical Scale of Side-chain Conformational Entropy in Protein Folding. J Mol Biol. 1993;231(3):825–839. doi: 10.1006/jmbi.1993.1329.
- 49.Hummer G, Garde S, Garcia AE, Pohorille A, Pratt LR. An information theory model of hydrophobic interactions. Proc Natl Acad Sci U S A. 1996;93:8951–8955. doi: 10.1073/pnas.93.17.8951.
- 50.Czaplewski C, Rodziewicz Motowidlo S, Liwo A, Ripoll DR, Wawak RJ, Scheraga HA. Molecular simulation study of cooperativity in hydrophobic association. Protein Science. 2000;9(6):1235–1245. doi: 10.1110/ps.9.6.1235.
- 51.Rank JA, Baker D. A desolvation barrier to hydrophobic cluster formation may contribute to the rate limiting step in protein folding. Protein Science. 1997;6(2):347–354. doi: 10.1002/pro.5560060210.