Significance
Single-domain proteins with symmetrical arrangement of secondary structural elements in the native state are expected to fold by diverse pathways. However, understanding the origins of pathway diversity, and the experimental signatures for identifying it, are major challenges, especially for small proteins with no obvious symmetry in the folded states. We show rigorously that upward curvature in the logarithm of unfolding rates as a function of force (or denaturants) implies that the folding occurs by diverse pathways. The theoretical concepts are illustrated using simulations of src SH3 domain, which explain the emergence of parallel pathways in single-molecule pulling experiments and provide structural description of the routes to the unfolded state. We make testable predictions illustrating the generality of the theory.
Keywords: protein folding, parallel pathways, single-molecule force spectroscopy
Abstract
Although single-domain proteins are expected to fold and unfold by parallel pathways, this expectation has been difficult to demonstrate in experiments. The unfolding rate, $k_u(f)$, as a function of force f, obtained in single-molecule pulling experiments on src SH3 domain, exhibits upward curvature on a $\log k_u(f)$ plot. Similar observations were reported for other proteins for the unfolding rate $k_u([C])$ as a function of denaturant concentration $[C]$. These findings imply that unfolding in these single-domain proteins involves a switch in the pathway as f or $[C]$ is increased from a low to a high value. We provide a unified theory demonstrating that if $\log k_u$ as a function of a perturbation (f or $[C]$) exhibits upward curvature then the underlying energy landscape must be strongly multidimensional. Using molecular simulations we provide a structural basis for the switch in the pathways and dramatic shifts in the transition-state ensemble (TSE) in src SH3 domain as f is increased. We show that a single-point mutation shifts the upward curvature in $\log k_u(f)$ to a lower force, thus establishing the malleability of the underlying folding landscape. Our theory, applicable to any perturbation that affects the free energy of the protein linearly, readily explains movement in the TSE in a β-sandwich (I27) protein and single-chain monellin as the denaturant concentration is varied. We predict that in the force range accessible in laser optical tweezer experiments there should be a switch in the unfolding pathways in I27 or its mutants.
That single-domain proteins should fold by parallel or multiple pathways emerges naturally from theoretical considerations (1–3) and computational studies (4–7). The conclusions reached in the theoretical studies are sufficiently general and compelling that it is surprising that parallel pathways are not routinely demonstrated in typical ensemble folding experiments. The reasons for the difficulties in directly observing parallel folding or unfolding pathways in monomeric proteins can be appreciated based on the following arguments. Consider a protein that reaches the folded state by two different pathways. The ratio of flux through these pathways is proportional to $\exp[-(\Delta G^{\ddagger}_H - \Delta G^{\ddagger}_L)/k_BT]$, where $\Delta G^{\ddagger}_H$ and $\Delta G^{\ddagger}_L$ are, respectively, the free energy barriers separating the folded and unfolded states along the two pathways, $k_B$ is the Boltzmann constant, and T is the temperature. If $\Delta G^{\ddagger}_H - \Delta G^{\ddagger}_L$ is large compared with $k_BT$ then the experimental detection of flux through the high free energy barrier pathway (H) is unlikely. External perturbations (such as mechanical force f or denaturants $[C]$) could reduce $\Delta G^{\ddagger}_H - \Delta G^{\ddagger}_L$. However, the values of f (or $[C]$) should fall in an experimentally accessible range for detecting a potential switch between pathways. Despite these inherent limitations, Clarke and coworkers showed in a most insightful study that unfolding of Ig domain (I27) induced by denaturants occurs by parallel routes (8). Subsequently, additional experiments on single-chain monellin (9), using denaturants and spectroscopic probes, have firmly shown the existence of multiple paths. Thus, it seems that multiple folding routes can be detected in standard folding experiments (10, 11) provided the flux through the higher free energy barriers is not so small that it escapes detection. In addition, parallel folding pathways have been observed in repeat proteins, where inherent symmetry in the connectivity of the individual domains (12) results in parallel assembly.
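The detection argument above can be made concrete with a small numerical sketch. The function name, the barrier values, and the temperature below are illustrative assumptions, not values from this work; the only ingredient taken from the text is that the flux ratio scales as the Boltzmann factor of the barrier difference.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def flux_ratio(barrier_high, barrier_low, temperature=300.0):
    """Relative flux through the high-barrier pathway (barriers in kcal/mol).

    Assumes both pathways share the same prefactor, so only the
    barrier difference matters.
    """
    kt = K_B * temperature
    return math.exp(-(barrier_high - barrier_low) / kt)

# Hypothetical barrier gaps: the minor pathway fades quickly as the gap grows.
for gap in (1.0, 3.0, 5.0):
    print(f"gap = {gap:.1f} kcal/mol -> flux ratio = {flux_ratio(10.0 + gap, 10.0):.2e}")
```

A gap of only a few $k_BT$ already pushes the minor pathway below the percent level, which is why a perturbation that narrows the gap is needed to expose it.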
Single-molecule pulling experiments in which f is applied to specific locations on the protein have demonstrated that unfolding of many proteins follows complex multiple routes. Mechanical force, unlike denaturants, does not alter the effective microscopic interactions between the residues, thus allowing for a cleaner interpretation. More importantly, by following the fate of many individual proteins the underlying heterogeneity in the routes explored by the protein can be revealed. Indeed, pulling experiments and simulations on a variety of single-domain proteins (13–15) show clear signatures of many routes for f-induced unfolding. It could be argued that in many of these studies the network of states connecting the folded and unfolded states is a consequence of complex topology, although they are all single-domain proteins. However, the src SH3 domain is a small protein with no apparent symmetry in the arrangement of secondary structure elements that folds in an apparent two-state manner. Thus, the discovery that it unfolds using parallel pathways (16, 17) is unexpected and requires a firm theoretical explanation and structural interpretation.
In single-molecule pulling experiments, performed at constant force or constant loading rate, only a 1D coordinate, the molecular extension (x), is readily measurable. When performed at constant f, it is possible to generate folding trajectories (x as a function of time), from which equilibrium 1D free energy profiles, $G(x)$, can be extracted using rigorous theory (18). The utility of $G(x)$ hinges on the crucial assumption that all other degrees of freedom in the system, including the solvent, come to equilibrium on time scales much faster than the dynamics in x, so that x itself may be considered to be an accurate reaction coordinate.
A straightforward way to assess whether a 1D picture is adequate is to analyze the f dependence of the unfolding rate $k_u(f)$, which can be experimentally obtained at constant f or computed from unfolding rates measured at different loading rates (18, 19). The observed upward curvature in the [$\log k_u$, $f$] plot in src SH3 (16) was shown to be a consequence of unfolding by two pathways, one dominant at low forces and the other at high forces. It was succinctly argued that the measured [$\log k_u$, $f$] data cannot be explained by multiple barriers in a 1D $G(x)$ or a 1D profile with a single barrier in which the unfolding rate is usually fit using the Bell model $k_u(f) = k_0 \exp(f x^{\ddagger}/k_BT)$, where $k_0$ is the unfolding rate at $f = 0$, and $x^{\ddagger}$ is the location of the barrier in $G(x)$ at zero force with respect to the folded state. (Throughout this paper, by location of the barrier, or the transition state, we mean the location with respect to the folded state.) The upward curvatures in the monotonic [$\log k_u$, $f$] as well as [$\log k_u$, $[C]$] plots, observed experimentally, necessarily imply that parallel routes are involved in the unfolding process. (A nonmonotonic [$\log k_u$, $f$] plot suggests catch bond behavior.)
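As a minimal sketch of the distinction drawn here, the snippet below compares a single Bell term with a sum of two Bell terms (two pathways). All parameter values are hypothetical and chosen only to make the crossover visible; `KT` assumes room temperature.

```python
import math

KT = 4.114  # k_B*T in pN nm near 298 K (assumed)

def bell_rate(f, k0, x_ts):
    """Bell model: k(f) = k0 * exp(f * x_ts / kT)."""
    return k0 * math.exp(f * x_ts / KT)

def two_pathway_rate(f, k1, x1, k2, x2):
    """Total rate when unfolding can proceed through two independent pathways."""
    return bell_rate(f, k1, x1) + bell_rate(f, k2, x2)

def crossover_force(k1, x1, k2, x2):
    """Force at which both pathways carry equal flux."""
    return KT * math.log(k1 / k2) / (x2 - x1)

def log_curvature(rate, f, h=0.5):
    """Central finite-difference estimate of d^2 log k / df^2."""
    return (math.log(rate(f + h)) - 2.0 * math.log(rate(f)) + math.log(rate(f - h))) / h**2

# Hypothetical parameters: pathway 1 dominates at low force, pathway 2 at high force.
params = dict(k1=1.0, x1=0.1, k2=0.01, x2=1.2)
f_switch = crossover_force(**params)
print("crossover force (pN):", round(f_switch, 2))
print("Bell curvature:", log_curvature(lambda f: bell_rate(f, 1.0, 0.5), 20.0))
print("two-pathway curvature:", log_curvature(lambda f: two_pathway_rate(f, **params), 20.0))
```

The single Bell term gives a straight line on a log plot (zero curvature up to rounding), whereas the two-term sum curves upward, most strongly near the crossover force.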
To provide a general framework for a quantitative explanation of a broad class of experiments, we first present a rigorous theoretical proof that upward curvature in [$\log k_u$, $f$] (or [$\log k_u$, $[C]$]) plots implies that the folding landscape is strongly multidimensional (SMD). Hence, such SMD landscapes cannot be reduced to a 1D landscape, or to a superposition of physically meaningful 1D landscapes, that could rationalize the observed convex [$\log k_u$, $f$] plot. We note en passant that the shape of the measured [$\log k_u$, $f$] plot cannot be justified using the Bell model even if $x^{\ddagger}$ were allowed to depend on f, moving toward the folded state as f increases. The theory hinges only on the assumption that the perturbation (f or $[C]$) is linearly coupled to the effective energy function of the protein. To illustrate the structural origin of the upward curvature in the [$\log k_u$, $f$] plot we also performed simulations of f-induced unfolding of src SH3 domain, to identify the structural details of the unfolding pathways including the movement of transition states as the force is increased. The results of the simulations semiquantitatively reproduce the experimental [$\log k_u$, $f$] plots for both the WT and the V61A mutant. More importantly, we also provide the structural basis of the switch in the unfolding pathways as f is varied, which cannot be obtained using pulling experiments. We obtain structures of the transition-state ensembles (TSEs), demonstrating the change in the TSE structures as f is increased from a low to a high value.
Results
Nature of the Energy Landscape from [$\log k_u$, $f$] Plots.
Let us assume that the unfolding rate $k_u(\alpha)$, as a function of a controllable external perturbation α, can be measured. We assume that α decreases the stability of the folded state linearly, as is the case in the pulling experiment with $\alpha = f$, the force applied at two points of the protein. However, the discussion below is quite general and applies to any external parameter with a linear, additive contribution to the effective protein energy function. For a protein under force, the total free energy has the general form $G(x, \mathbf{r}) = G_0(x, \mathbf{r}) - fx$, with a force contribution $-fx$, where x is the end-to-end extension of the protein. Here, $G_0(x, \mathbf{r})$ is the free energy in the absence of applied tension and the vector $\mathbf{r}$ represents all of the additional conformational degrees of freedom besides x.
In the derivation below, we model the dynamics of the protein as diffusion of a single particle on the multidimensional landscape $G(x, \mathbf{r})$. The unfolding of the protein would correspond to the particle starting in the reference protein conformation in the folded state energy basin F and diffusing to any other conformation, with a given extension $x = b$, representing the unfolded basin U (Fig. 1). The unfolding time for a particular trajectory is the time when the particle reaches the target conformation for the first time (known as first passage time). Averaging this time over all trajectories yields the mean first passage time (MFPT) from the folded to the unfolded state, which we denote as $\tau$, or the average unfolding time. The unfolding rate is the inverse of the unfolding time, $k_u = 1/\tau$.
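The relation between first passage times, the MFPT, and the rate can be illustrated with a toy model. The sketch below is not the protein simulation: it is a biased 1D random walk (a hypothetical stand-in, with the step bias playing the role of the perturbation) that starts at a reflecting origin and is absorbed at a target site.

```python
import random

def first_passage_time(p_right, target, rng):
    """Number of steps for a walker starting at site 0 to first reach `target`.

    A left step attempted at site 0 is rejected (reflecting boundary).
    """
    pos, steps = 0, 0
    while pos < target:
        step = 1 if rng.random() < p_right else -1
        pos = max(0, pos + step)
        steps += 1
    return steps

def unfolding_rate(p_right, target, n_traj, seed=0):
    """Return (rate, MFPT), with rate = 1 / MFPT averaged over trajectories."""
    rng = random.Random(seed)
    times = [first_passage_time(p_right, target, rng) for _ in range(n_traj)]
    mfpt = sum(times) / n_traj
    return 1.0 / mfpt, mfpt

rate_unbiased, mfpt_unbiased = unfolding_rate(0.5, 5, 2000)
rate_biased, mfpt_biased = unfolding_rate(0.7, 5, 2000)
print("unbiased walk: MFPT =", mfpt_unbiased, "rate =", rate_unbiased)
print("biased walk:   MFPT =", mfpt_biased, "rate =", rate_biased)
```

For the unbiased walk with this boundary rule the exact MFPT to site L is L(L + 1) steps, so the estimate for L = 5 should scatter around 30; biasing the walk toward the target shortens the MFPT and raises the rate.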
We are interested in finding the curvature of $\log k_u$ as a function of f, and in particular the sign of $d^2 \log k_u/df^2$. Starting from the diffusion equation, we find expressions for the MFPT from any conformation with extension $x < b$, $\tau(x, \mathbf{r})$, and then for $\tau$ and its first two derivatives. It turns out that if we use the assumptions of a single unfolding pathway, the second derivative is nonpositive and the curvature of $\log k_u$ cannot be upward.
The summary of the subsequent derivation is as follows: (i) We start from the equation for $\tau(x, \mathbf{r})$, which can be obtained from the diffusion equation (20), (ii) integrate it over the $\mathbf{r}$ degrees of freedom, (iii) use two assumptions for evaluating the integral with $\tau$ inside, (iv) solve the ordinary differential equation for the unfolding time, and (v) establish that the solution implies certain constraints on the shape of the [$\log k_u$, $f$] plot. Following this derivation in detail is not necessary for understanding the other parts of the paper.
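The equation for the MFPT can be obtained from the diffusion equation (in Fokker–Planck form) by integration over x, $\mathbf{r}$, and t, followed by some rearrangements (20). The result is called the backward Kolmogorov equation:
$$e^{\beta G(x,\mathbf{r})}\, \nabla \cdot \left[ D(x,\mathbf{r})\, e^{-\beta G(x,\mathbf{r})}\, \nabla \tau(x,\mathbf{r}) \right] = -1, \tag{1}$$
with the boundary condition $\tau(b, \mathbf{r}) = 0$, with $\beta = 1/k_BT$ and $D(x,\mathbf{r})$ being the diffusion constant, which for generality is allowed to depend on the conformation. Here $\nabla$ acts on all of the coordinates $(x, \mathbf{r})$. By dividing both sides of Eq. 1 by $e^{\beta G(x,\mathbf{r})}$ and integrating over the conformational coordinates $\mathbf{r}$, we obtain
$$\frac{\partial}{\partial x} \int d\mathbf{r}\, D(x,\mathbf{r})\, e^{-\beta G(x,\mathbf{r})}\, \frac{\partial \tau(x,\mathbf{r})}{\partial x} = -\int d\mathbf{r}\, e^{-\beta G(x,\mathbf{r})}. \tag{2}$$
To get the result in Eq. 2, we have assumed that the flux term $D e^{-\beta G} \nabla_{\mathbf{r}} \tau$ vanishes at the integration limits of the coordinate space of $\mathbf{r}$ (i.e., the diffusion process is bounded). We rewrite Eq. 2 as
$$\frac{\partial}{\partial x} \int d\mathbf{r}\, e^{-\beta G(x,\mathbf{r})}\, D(x,\mathbf{r})\, \frac{\partial \tau(x,\mathbf{r})}{\partial x} = -e^{-\beta \tilde{G}(x)}, \tag{3}$$
where $e^{-\beta \tilde{G}(x)} \equiv \int d\mathbf{r}\, e^{-\beta G(x,\mathbf{r})}$.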
Further simplification of the MFPT expression depends on the nature of the multidimensional free energy $G(x, \mathbf{r})$. In particular, we define a class of free energies that satisfy the following two conditions:
i) $G(x, \mathbf{r})$ has a single minimum with respect to $\mathbf{r}$ at each point x in the range $x_{\min}$ to $b$, where $x_{\min}$ is the smallest accessible extension. We denote the location of this minimum as $\mathbf{r}_m(x)$.
ii) The Boltzmann factor $e^{-\beta G(x,\mathbf{r})}$ for $\mathbf{r}$ near $\mathbf{r}_m(x)$ is sharply peaked, so the thermodynamic contribution from conformations with coordinates $\mathbf{r}$ far from $\mathbf{r}_m(x)$ is negligible. In other words, we assume fast equilibration along the $\mathbf{r}$ coordinates at each x, compared with the timescale of first passage between N and U.
A schematic illustration of a $G(x, \mathbf{r})$ satisfying these requirements is shown in Fig. 1A. Diffusion is essentially confined to a single, narrow reaction pathway in the multidimensional space. We will call any $G(x, \mathbf{r})$ in this category weakly multidimensional (WMD) with respect to x, because the diffusion process is quasi-1D in terms of the reaction coordinate x. In contrast, any $G(x, \mathbf{r})$ that violates either one of the above conditions will be called SMD, because it has characteristics that qualitatively distinguish it from any 1D diffusion process. Note that this categorization makes no other assumptions about the shape of $G(x, \mathbf{r})$ except for those specified above: For example, there could be one or many free energy barriers separating N and U, or none at all. Fig. 1 B and C show two examples of $G(x, \mathbf{r})$ that are SMD. In both cases, condition i is violated, because over part of the range of x between N and U there is no unique minimum in $G$ along $\mathbf{r}$. For Fig. 1B, there are two possible reaction pathways between N and U, whereas for Fig. 1C there is a single pathway, but it is nonmonotonic in x.
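For an energy landscape that is WMD, there are rigorous bounds on the first and second derivatives of $\log k_u$ with respect to f. To derive these bounds, note that the WMD assumptions allow us to make a saddle-point approximation to the integral over $\mathbf{r}$ on the left-hand side of Eq. 3, setting the value of $\mathbf{r}$ in $D\, \partial\tau/\partial x$ to $\mathbf{r}_m(x)$. Because this will be the dominant contribution, we approximate Eq. 3:
$$\frac{d}{dx}\left[ e^{-\beta \tilde{G}(x)}\, D(x, \mathbf{r}_m(x))\, \frac{\partial \tau}{\partial x}(x, \mathbf{r}_m(x)) \right] \approx -e^{-\beta \tilde{G}(x)}, \tag{4}$$
in which the free energy profile along x is $\tilde{G}(x) = \tilde{G}_0(x) - fx$, because the force term $-fx$ is independent of $\mathbf{r}$. By simplifying the notation by defining $\tau(x) \equiv \tau(x, \mathbf{r}_m(x))$ and $D(x) \equiv D(x, \mathbf{r}_m(x))$, Eq. 4 becomes
$$\frac{d}{dx}\left[ D(x)\, e^{-\beta[\tilde{G}_0(x) - fx]}\, \frac{d\tau(x)}{dx} \right] = -e^{-\beta[\tilde{G}_0(x) - fx]}. \tag{5}$$
The solution for the unfolding time $\tau \equiv \tau(a)$ from Eq. 5, where $a$ is the extension of the folded reference conformation, with boundary condition $\tau(b) = 0$ and a reflecting boundary at the smallest accessible extension $x_{\min}$, can be written as a Laplace transform of a function $I(s)$,
$$\tau(f) = \int_0^{\infty} ds\, I(s)\, e^{-\beta f s}, \qquad I(s) = \int_a^b dx\, \frac{\Theta(x - s - x_{\min})}{D(x)}\, e^{\beta[\tilde{G}_0(x) - \tilde{G}_0(x - s)]}, \tag{6}$$
where $\Theta$ is the Heaviside step function. Both $1/D(x)$ and the exponential factor in $I(s)$ are nonnegative functions [because $D(x) > 0$ and $e^{\beta[\tilde{G}_0(x) - \tilde{G}_0(x-s)]} > 0$ for all x and s], so the function $I(s)$ is likewise nonnegative, for $s \geq 0$. From this property, it follows that $\tau \geq 0$, and
$$\frac{d\tau}{df} = -\beta \int_0^{\infty} ds\, s\, I(s)\, e^{-\beta f s} \leq 0, \qquad \frac{d^2\tau}{df^2} = \beta^2 \int_0^{\infty} ds\, s^2\, I(s)\, e^{-\beta f s} \geq 0, \tag{7}$$
for $f \geq 0$. Because the experimental data are typically plotted in terms of $\log k_u$ with respect to f, we are specifically interested in the corresponding derivatives of $\log k_u = -\log \tau$,
$$\frac{d \log k_u}{df} = -\frac{1}{\tau}\frac{d\tau}{df}, \qquad \frac{d^2 \log k_u}{df^2} = \frac{1}{\tau^2}\left[ \left(\frac{d\tau}{df}\right)^2 - \tau\, \frac{d^2\tau}{df^2} \right]. \tag{8}$$
From Eq. 7 we see that $d \log k_u/df \geq 0$. The sign of $d^2 \log k_u/df^2$ requires establishing the sign of the term in the square brackets in Eq. 8, which can be done by using the Cauchy–Schwarz inequality. Let us define two functions, $g(s) = \Theta(s)\sqrt{I(s)}\, e^{-\beta f s/2}$ and $h(s) = \beta s\, g(s)$, where $\Theta(s) = 1$ for $s \geq 0$ and 0 for $s < 0$. Then, from Eqs. 6 and 7 we obtain $\tau = \int_{-\infty}^{\infty} ds\, g^2(s)$, $d\tau/df = -\int_{-\infty}^{\infty} ds\, g(s) h(s)$, and $d^2\tau/df^2 = \int_{-\infty}^{\infty} ds\, h^2(s)$. Using the Cauchy–Schwarz inequality
$$\left[ \int_{-\infty}^{\infty} ds\, g(s) h(s) \right]^2 \leq \int_{-\infty}^{\infty} ds\, g^2(s) \int_{-\infty}^{\infty} ds\, h^2(s), \tag{9}$$
we find that $(d\tau/df)^2 \leq \tau\, d^2\tau/df^2$. Hence, from Eq. 8 we see that $d^2 \log k_u/df^2 \leq 0$. In summary, we can now state the full criterion for the validity of WMD for describing force-induced unfolding.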
Criteria for WMD Unfolding Landscape.
The unfolding rate $k_u(f)$ on a WMD free energy landscape under applied force f must satisfy
$$\frac{d \log k_u(f)}{df} \geq 0, \qquad \frac{d^2 \log k_u(f)}{df^2} \leq 0. \tag{10}$$
If $k_u(f)$ fails to satisfy either of the conditions in Eq. 10, the underlying free energy landscape must be SMD, and analyses of the measured data using the end-to-end distance, x, as a reaction coordinate are incorrect.
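One way to apply this criterion to tabulated rates is sketched below. The helper name, the tolerance, and the synthetic rates are assumptions made for illustration; real data carry statistical errors, so a practical test would have to compare slope changes against their uncertainties rather than against a fixed tolerance.

```python
import math

KT = 4.114  # k_B*T in pN nm near 298 K (assumed)

def wmd_criteria(forces, rates, tol=1e-9):
    """Finite-difference check of the two WMD conditions on log k(f).

    forces: strictly increasing forces; rates: positive unfolding rates.
    Returns (slope_ok, curvature_ok): slope_ok means log k never decreases,
    curvature_ok means successive slopes never increase (no upward curvature).
    """
    logk = [math.log(k) for k in rates]
    slopes = [
        (logk[i + 1] - logk[i]) / (forces[i + 1] - forces[i])
        for i in range(len(forces) - 1)
    ]
    slope_ok = all(s >= -tol for s in slopes)
    curvature_ok = all(slopes[i + 1] - slopes[i] <= tol for i in range(len(slopes) - 1))
    return slope_ok, curvature_ok

forces = [10.0, 20.0, 30.0, 40.0]
# Hypothetical single-pathway (Bell) rates and two-pathway rates.
bell = [1.0 * math.exp(f * 0.5 / KT) for f in forces]
double = [1.0 * math.exp(f * 0.1 / KT) + 0.01 * math.exp(f * 1.2 / KT) for f in forces]
print("Bell:", wmd_criteria(forces, bell))
print("two pathways:", wmd_criteria(forces, double))
```

Comparing successive slopes, rather than a three-point second difference, keeps the check valid on unevenly spaced forces.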
Scenarios for [$\log k_u$, $f$] Plots in WMD and SMD.
A range of behaviors in [$\log k_u$, $f$] plots can be obtained depending on the nature of the energy landscape. Stochastic simulations in a WMD landscape (Fig. 1 A and D) show that [$\log k_u$, $f$] has a minor downward curvature, which is readily explained by a generalized Bell model in which the transition-state location is allowed to move toward the folded state in accord with the Hammond effect (21). In contrast, in the SMD landscapes (Fig. 1 B, C, E, and F) the [$\log k_u$, $f$] plot shows upward curvature. The upward curvature in Fig. 1E indicates loss of flux from the folded state through two channels in Fig. 1B, similar to parallel pathways in protein unfolding experiments. Interestingly, the upward curvature in Fig. 1F from the SMD landscape in Fig. 1C does not come from parallel pathways. Instead, the lifetime of the folded state first increases followed by the usual decrease as f increases. Such a counterintuitive “catch bond” behavior is well documented in a number of protein complexes (22–24). The results in Fig. 1 E and F illustrate that a violation of Eq. 10 implies that the underlying energy landscape must be SMD.
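The WMD behavior can be checked numerically for a purely 1D landscape. The sketch below does not reproduce the simulations in Fig. 1: it evaluates the standard double-integral MFPT formula for 1D diffusion with a reflecting lower boundary and an absorbing upper boundary, on a hypothetical cubic profile `g0` with a constant diffusion coefficient; all numbers are illustrative assumptions.

```python
import math

KT = 4.114  # k_B*T in pN nm near 298 K (assumed)

def g0(x, barrier=40.0, x_ts=1.5):
    """Hypothetical zero-force profile (pN nm): minimum at x = 0, barrier top at x_ts."""
    u = x / x_ts
    return barrier * (3.0 * u**2 - 2.0 * u**3)

def mfpt_1d(f, x_min=-0.5, a=0.0, b=2.0, n=2000, diff=1.0):
    """MFPT from x = a to x = b for diffusion on G(x) = g0(x) - f*x.

    tau = (1/D) * int_a^b dx exp(G(x)/kT) * int_{x_min}^x dx' exp(-G(x')/kT),
    evaluated with a midpoint rule (reflecting wall at x_min, absorbing at b).
    """
    h = (b - x_min) / n
    inner, tau = 0.0, 0.0
    for i in range(n):
        x = x_min + (i + 0.5) * h
        g = (g0(x) - f * x) / KT
        inner += math.exp(-g) * h          # running inner integral up to x
        if x >= a:
            tau += math.exp(g) * inner * h / diff
    return tau

forces = [5.0 * i for i in range(9)]        # 0 to 40 pN
logk = [-math.log(mfpt_1d(f)) for f in forces]
second_diffs = [logk[i + 1] - 2.0 * logk[i] + logk[i - 1] for i in range(1, len(logk) - 1)]
print("log k:", [round(v, 3) for v in logk])
print("second differences:", [round(v, 4) for v in second_diffs])
```

In this 1D setting log k rises with force while its second differences stay at or below zero, the downward-curving behavior expected for a WMD landscape.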
Naive Analyses of the f Dependence of $k_u(f)$.
Using the data generated by molecular dynamics simulations of force unfolding the src SH3 domain, with force applied to residues 9 and 59, for a set of forces $\{f_i\}$, we calculated $k_u(f_i)$ for each force as the inverse of the MFPT from the folded state to the unfolded state, by averaging the set of first passage times $\{t_{ij}\}$ to unfolding over the trajectory index j (Materials and Methods). The [$\log k_u$, $f$] plot for f-induced unfolding is nonlinear with upward curvature, implying that the free energy landscape is SMD (Fig. 2B). We note parenthetically that the inadequacy of the Bell model cannot be fixed using movement of the transition state with f or using a 1D free energy profile with two (or more) barriers. Remarkably, the slope change in the simulations qualitatively coincides with measurements on the same protein (16, 17), where constant force was applied to the residues 7 and 59. Thus, both simulations and experiment show that the condition in Eq. 10 is violated, implying that the free energy landscape for SH3 is SMD.
The observed $k_u(f)$ dependence can be fit using a sum of two exponential functions (16),
$$k_u(f) = k_1^0\, e^{f x_1/k_BT} + k_2^0\, e^{f x_2/k_BT}. \tag{11}$$
The parameters $k_1^0$ and $k_2^0$ (unfolding rates at $f = 0$) and $x_1$ and $x_2$ (putative locations of the transition states) can be extracted using maximum likelihood estimation (MLE) (Materials and Methods). According to the Akaike information criterion (25), the double-exponential model is significantly more probable than the single-exponential model, for both simulations (relative likelihood of the models ) and experiments (17) (). The extracted values of $x_1$ and $x_2$, shown in Table 1, are $x_1 = 0.40$ nm, $x_2 = 1.16$ nm for the simulation data and $x_1 = 0.08$ nm, $x_2 = 1.42$ nm for the experimental data from ref. 17 with the MLE procedure, which differ somewhat from the values reported in ref. 17 ( nm, nm). Given that the error in $x_1$ estimated for experimental data using MLE is large, we surmise that the simulations and experiments are in good agreement. The switch in the forced unfolding behavior [estimated as the point where the third derivative of $\log k_u(f)$ in Eq. 11 with parameters given by MLE changes sign] occurs around 25 pN for the experimental data and around 35 pN for the simulation data. These comparisons show that the simulations based on the self-organized polymer with side-chains (SOP-SC) model reproduce quantitatively the shape of the [$\log k_u$, $f$] plot. Because simulations are done by coarse-graining the degrees of freedom, involving both solvent and proteins, the $k_u(f)$ from simulations are expected to be larger than the measured values, with the discrepancy being greater at higher forces. Our previous work (26) showed that the unfolding rate in denaturants is larger by a factor of ≈150, which is similar to the difference between experiment and simulations in Fig. 2. However, because the inference about parallel pathways relies solely on the shape of the [$\log k_u$, $f$] plot, the inability to quantitatively reproduce the precise value of $k_u(f)$ is irrelevant.
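The model comparison can be sketched as follows. The details of the MLE are given in Materials and Methods and are not reproduced here; the log-likelihood below assumes, as a simplification, that first passage times at a fixed force are exponentially distributed, and the numerical log-likelihood values in the example are made up for illustration.

```python
import math

def exponential_log_likelihood(rate, times):
    """Log-likelihood of first passage times under an exponential law with `rate`.

    Assumption for this sketch: waiting times at fixed force are exponential.
    """
    return sum(math.log(rate) - rate * t for t in times)

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood

def relative_likelihood(aic_model, aic_best):
    """Relative likelihood of a model with respect to the lowest-AIC model."""
    return math.exp((aic_best - aic_model) / 2.0)

# Hypothetical maximized log-likelihoods: double exponential (4 parameters)
# versus single exponential (2 parameters).
aic_double = aic(-100.0, 4)
aic_single = aic(-110.0, 2)
print("relative likelihood of single vs double:", relative_likelihood(aic_single, aic_double))

# The exponential likelihood peaks at rate = 1 / (mean waiting time).
times = [0.5, 1.5, 1.0, 2.0, 1.0]
best_rate = len(times) / sum(times)
print("MLE rate:", best_rate, "lnL:", exponential_log_likelihood(best_rate, times))
```

The AIC penalizes the two extra parameters of the double-exponential model, so a small relative likelihood for the single-exponential model indicates that the improvement in fit is not merely due to added flexibility.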
Table 1.
Parameter | Simulations: value | Simulations: error | Experiment: value | Experiment: error | Units
$k_1^0$ | 1.5 | 0.4 | 2.0 | 1.1 | s−1
$k_2^0$ | 3.9 | 10.0 | 6.1 | 6.4 | s−1
$x_1$ | 0.40 | 0.1 | 0.08 | 0.1 | nm
$x_2$ | 1.16 | 0.3 | 1.42 | 0.12 | nm
The definition of is given in Materials and Methods. The errors are obtained from the covariance matrix of the log-likelihood with respect to the parameters.
Despite the good fits to Eq. 11 neither $x_1$ nor $x_2$ can be associated with transition-state location as is traditionally assumed. We show below that such projections onto a 1D coordinate cease to have physical meaning when the underlying folding landscape is SMD. The apparent barriers to unfolding at $f = 0$ along the pathways can be estimated using $k_1^0$ and $k_2^0$. Using the accepted estimates for the prefactor () (27–29), and the values of $k_1^0$ and $k_2^0$ from the fits of experimental data (Table 1), we obtain , depending on the value of and . If these values are reasonable then the ratio of fluxes through the two pathways at $f = 0$ is , which is much smaller than those obtained in the simulations by direct calculation of the flux through the two pathways. In addition, the finding that $x_2 > x_1$ also makes no physical sense, because we expect the molecule under higher tension to be more brittle (19). These are the first indications that the fits using Eq. 11 do not provide meaningful parameters.
Structural Basis of f-Dependent Switch in Pathways.
To provide a structural interpretation of the SMD nature of f-induced unfolding of src SH3, we followed the changes in several variables describing the conformations of SH3 as the force applied to residues 9 and 59 is varied in the SOP-SC simulations. Most of these are derived from measures assessing the extent to which structures of various parts of the protein overlap with the conformation in the native state. The structural overlap for two parts of the protein A and B is the fraction of broken native contacts between A and B (30),
$$\chi_{AB} = \frac{1}{N_{AB}} \sum_{i \in A,\, j \in B} \Theta\left( \left| |\mathbf{R}_i - \mathbf{R}_j| - |\mathbf{R}_i^0 - \mathbf{R}_j^0| \right| - \delta \right), \tag{12}$$
where the summation is over the coarse-grained beads belonging to the parts A and B that are in contact in the native state, $N_{AB}$ is the number of contacts between A and B in the native state, $\Theta$ is the Heaviside function, $\delta$ is the tolerance in the definition of a contact, and $\mathbf{R}_i$ and $\mathbf{R}_i^0$, respectively, are the coordinates of the beads in a given conformation and the native state. Two of the most relevant sets of contacts in the forced rupture of SH3 are the ones between the N-terminal and C-terminal β-strands (Fig. 2A), quantified by the structural overlap of the two terminal strands, and contacts between the RT loop (residues 15–31) and the protein core (residues 42–57), quantified by the structural overlap of the RT loop with the core. When these structural elements unravel the structural overlap values become close to unity, signaling the global unfolding of the SH3 domain.
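A minimal sketch of such a structural overlap calculation is given below. The function name, the use of an absolute distance tolerance, and the three-bead example are assumptions for illustration; the exact contact definition used in the SOP-SC analysis is the one in ref. 30.

```python
import math

def structural_overlap(coords, native_coords, contacts, tol):
    """Fraction of native contacts (i, j) whose bead-bead distance deviates
    from its native value by more than `tol`.

    coords, native_coords: lists of (x, y, z) bead positions.
    contacts: list of index pairs (i, j), i in part A and j in part B,
    that are in contact in the native state.
    """
    broken = 0
    for i, j in contacts:
        d = math.dist(coords[i], coords[j])
        d0 = math.dist(native_coords[i], native_coords[j])
        if abs(d - d0) > tol:
            broken += 1
    return broken / len(contacts)

# Hypothetical three-bead example: bead 0 contacts beads 1 and 2 in the native state.
native = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0)]
stretched = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 5.0, 0.0)]
contacts = [(0, 1), (0, 2)]
print(structural_overlap(native, native, contacts, tol=0.2))
print(structural_overlap(stretched, native, contacts, tol=0.2))
```

Values near 0 indicate native-like packing of the two parts, and values near 1 indicate that essentially all native contacts between them are broken.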
Depending on f, in some trajectories the RT loop ruptures from the protein first (its structural overlap with the core sharply approaches 1), followed by the break between the N- and C-terminal strands. In other trajectories, the order is opposite, with the terminal-strand sheet melting first, without the RT-loop rupture (Figs. 3 and 4 and Fig. S1). The calculated fraction of trajectories that unravel through the RT-loop pathway depends strongly on force, suggesting that these are the two major pathways responsible for the change in the slope of the [$\log k_u$, $f$] plot (Fig. 5C). At low forces (15 pN) , implying that of the trajectories unfold through the RT-loop pathway, and this fraction decreases monotonically to at 45 pN. Movies S1 and S2 illustrate the two unfolding scenarios.
Effect of Cysteine Cross-Linking.
To further illustrate that the slope change in Fig. 2 is due to the switching of the unfolding routes between the particular pathways discussed above, we created an in silico mutant by adding a disulfide bond between the RT loop and the protein core (mimicking a potential experiment with the L24C/G54C mutant). In the cross-link mutant, the enhanced attachment of the RT loop to the protein core blocks the RTL unfolding pathway. We generated six 1,500-ms unfolding trajectories at 15 pN and did not observe unfolding in any of them, thus obtaining an estimate for the upper bound of unfolding rate of s−1 for this mutant. Comparing this unfolding rate to the rate at 15 pN for the WT (without the disulfide bridge) of 5.2 s−1 shows that blocking of the RTL pathway decreases the average unfolding rate at 15 pN. The mutant simulations with the disulfide bridge suggest that the RTL pathway plays a major role at low forces, and that unfolding through the terminal-strand pathway is much slower at low force. Furthermore, these simulations also show that rupture of the protein through the terminal-strand pathway occurs at a very slow rate at low forces even when the unfolding flux along the RTL pathway is muted. Taken together, these simulations explain the structural basis of rupture in the two major unfolding pathways.
Pathway Switch Occurs at a Lower Force in V61A Mutant.
To examine the effect of point mutations, we calculated $k_u(f)$ as a function of f for the V61A mutant. In the laser optical trapping (LOT) experiments, the V61A mutant does not show upward curvature in the same force range, and the [$\log k_u$, $f$] plot in that range is linear. However, the curvature can be seen at lower forces. In simulations, we observe the same qualitative change with respect to the WT upon mutation (Fig. 6 and Table S1). If only data for forces above 15 pN are taken into account, the single-exponential model becomes slightly more likely than the double-exponential model, but inclusion of the lower-force data favors the double-exponential model, with pathway switching occurring at a lower force than for WT. The fraction of trajectories going through the RT-loop pathway decreases compared with the WT [i.e., for all f] (Fig. 6C). The loss of upward curvature in the force range above 15 pN can be explained by the more prominent role of the terminal-strand pathway at low forces, leading to a lesser degree of switching between the pathways. The V61A mutation is in the C-terminal strand, making interactions between the N- and C-terminal strands weaker, thus enabling the sheet to rupture more readily. Parenthetically we note that this is a remarkable result, considering that the change in the SOP-SC force field is only minimal, which further illustrates that our model also captures the effect of point mutations.
Table S1.
Name | $k_1^0$ | $k_2^0$ | $x_1$ | $x_2$ | $k^0$ | $x$ | Relative likelihood
Simulation WT | 1.47 | | 0.40 | 1.16 | 0.76 | 0.58 |
Simulation V61A | 2.45 | | 0.02 | 0.67 | 1.29 | 0.52 |
Experiment WT | | | 0.07 | 1.42 | | 0.82 |
Experiment V61A | | | 0.0 | 1.15 | | 0.94 |
Units | s−1 | s−1 | nm | nm | s−1 | nm | 1
Two types of fits were made: single-exponential fits, $k_u(f) = k^0 e^{f x/k_BT}$, and double-exponential fits (Eq. 11). The last column shows the relative likelihood of the double-exponential model with respect to the single-exponential model, as assessed by Akaike information criterion.
Free Energy Profiles and Transition States.
Let us assume that the free energy landscape projected onto extension as the reaction coordinate accurately captures the f-dependent unfolding kinetics. In this case, we expect the Bell model or its variation would hold, and x (assumed to be f-independent) obtained from the fitting to that model would be the distance to the transition state with respect to the folded state. If the underlying free energy landscape were SMD it is still possible to formally construct a 1D free energy profile using experimental (18) or simulation data. It is tempting to associate the distances in the projected 1D profiles with transition-state locations with respect to the folded state, as is customarily done in analyzing single-molecule force spectroscopy (SMFS) data. Such an interpretation suggests that $x_1$ and $x_2$ should correspond to the distances to the two transition states in the two pathways, with the transition-state distance $x^{\ddagger}$ increasing with force in an apparent anti-Hammond behavior. To assess whether this is realized, we constructed 1D free energy profiles (of the WT protein) at forces 15, 30, and 45 pN to determine $x^{\ddagger}$. It turns out that $x^{\ddagger}$ decreases rather than increases with force, demonstrating the normally expected Hammond behavior (Fig. 7), as force destabilizes the native state (21, 31) (Discussion).
We now demonstrate that $x_1$ and $x_2$ cannot be identified with transition-state locations by calculating the committor probability, $P_{\mathrm{fold}}$ (32), the fraction of trajectories that reach the folded state before the unfolded state starting from $x = x_1$ or $x = x_2$. If $x_1$ and $x_2$ truly correspond to distances to transition states then $P_{\mathrm{fold}} \approx 0.5$ (32), that is, the TSE should correspond to structures that have equal probability of reaching the folded or unfolded state, starting from $x_1$ or $x_2$. In sharp contrast to this expectation, the states with $x = x_1$ are visited hundreds of times before unfolding (Fig. S2), which means $P_{\mathrm{fold}} \approx 1$. Thus, the usual interpretation of $x_1$ or $x_2$ ceases to have physical meaning, which is a consequence of the strong multidimensionality of the unfolding landscape of SH3.
Force-Dependent Movement of the TSE.
The results in Fig. S2 show that the extracted values of $x_1$ and $x_2$ cannot represent the TSE. Because the underlying reaction coordinates for the inherently SMD nature of folding landscapes are difficult to guess, the TSE can only be ascertained with a method that does not use a predetermined form of the reaction coordinate. We use the committor probability $P_{\mathrm{fold}}$, based on the theory that the TSE should correspond to structures that have equal probability ($P_{\mathrm{fold}} \approx 0.5$) of reaching the folded or unfolded state. To determine the TSEs in our simulations, we picked the putative transition-state structures from the saddle point of the 2D histogram of the unfolding trajectories [; (kcal/mol) for pN and ; (kcal/mol) for pN], where E is the total energy of the protein. We ran multiple trajectories from each of the candidate TS structures, noting whether the trajectory reaches the folded or the unfolded state first, to determine the $P_{\mathrm{fold}}$. The set of structures with $P_{\mathrm{fold}} \approx 0.5$ is identified with the TSE. The $P_{\mathrm{fold}}$ value for the whole ensemble is the total number of trajectories (starting from all of the candidate structures) that reach the folded state first, divided by the total number of trajectories (or, the average of the individual $P_{\mathrm{fold}}$ values).
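The shooting procedure described above can be sketched on a toy system. The code below is not the SOP-SC protocol: it uses an unbiased 1D random walk between an absorbing "folded" site 0 and an absorbing "unfolded" site `n_sites` (a hypothetical stand-in for which the exact committor from site i is 1 - i/n_sites), and the selection window around 0.5 is an assumed threshold.

```python
import random

def reaches_folded_first(start, n_sites, rng):
    """Run one trajectory from `start`; True if site 0 (folded) is hit before n_sites."""
    pos = start
    while 0 < pos < n_sites:
        pos += 1 if rng.random() < 0.5 else -1
    return pos == 0

def p_fold(start, n_sites, n_shots, rng):
    """Committor estimate: fraction of shots that reach the folded state first."""
    hits = sum(reaches_folded_first(start, n_sites, rng) for _ in range(n_shots))
    return hits / n_shots

rng = random.Random(0)
candidates = [2, 4, 5, 6, 8]
estimates = {c: p_fold(c, 10, 4000, rng) for c in candidates}
# Keep candidates whose committor is close to one half (assumed window of 0.1).
tse = [c for c, p in estimates.items() if abs(p - 0.5) < 0.1]
print(estimates)
print("transition-state candidates:", tse)
```

A candidate that returns to the folded basin almost every time has a committor near 1 and is therefore not a transition-state structure, however it was selected.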
The TSEs for 15 pN and 45 pN are given in Fig. 8. For both sets the . The low-force TSE shows that the RT loop is disconnected from the core ( state) and the 45 pN TSE has structures where the loop interacts with the core, but the contacts between N- and C-terminal β-strands are broken. The explicit TSE calculations confirm that the TSEs are similar to those found in unfolding trajectories with and .
The experimental analysis of transition states of SH3 using mechanical Φ-values (17) suggests that in the high-force pathway the important residues are Phe-10 and Val-61 (which are in the N- and C-terminal strands, respectively), along with a core residue Leu-44. For the bulk (low/zero force) pathway, Phe-10, Ile-56, and Val-61 are also apparently important in the TSE, as is the RT-loop residue Leu-24, which interacts with the protein core. Our simulation results, which provide a complete structural description of the TSEs, support the experimental interpretation, namely, loss of interaction between the RT loop and the core at low forces and rupture of the terminal-strand sheet at high forces.
It is interesting to compute the mean extensions of the two major TSEs. The average distance between force application points for these structures is nm for 15 pN and nm for 45 pN, which (given the distances in the folded state of and nm) translates to the transition states of and nm, respectively. These values have no relation to $x_1$ and $x_2$, further underscoring the inadequacy of using Eq. 11 to interpret [$\log k_u$, $f$] plots in SMD.
Discussion
Hammond Behavior.
Protein folding could be viewed using a chemical reaction framework. Just like in a chemical reaction, transitions occur from a minimum on a free energy landscape (corresponding to reactant or unfolded state) to another minimum (corresponding to a product representing the folded state, or an intermediate) by crossing a free energy barrier. The top of the free energy barrier corresponds to a transition state.
Besides determining the structures of the unfolded and folded states, one of the main goals in protein folding is to identify the TSE and characterize the extent of its heterogeneity. When viewed within the chemical reaction framework, the Hammond postulate provides a qualitative description of the structure of the transition state if it is unique. The Hammond postulate states that “if two states, as, for example, a transition state and an unstable intermediate, occur consecutively during a reaction process and have nearly the same energy content, their interconversion will involve only a small reorganization of the molecular structure” (33). A corollary of the Hammond postulate is that the TS structure likely resembles the least stable species in the folding reaction.
To apply the Hammond postulate to a protein free energy landscape, perturbed by f, let us assume that at a midpoint force $f = f_m$ the states F and U, with equal free energy, are separated by a transition state. Increasing f will generally destabilize F, and lower the free energy of U. According to Hammond’s postulate, the transition state should be more similar to F than U as f increases. If $f < f_m$, then the free energy of F will be lower than U, and consequently the transition state will be more U-like. As a consequence of this argument, in unfolding induced by force, the transition state should move toward the state that is destabilized by f (31), in accord with Hammond behavior. If the opposite were to happen it could be an indication of anti-Hammond behavior.
In a 1D energy landscape, the distance between a minimum and a barrier is reflected in the slope of the [$\log k_u$, $f$] plot (m value for [$\log k_u$, $[C]$]), which follows from the Arrhenius law and linear coupling in the free energy. Hammond behavior for the unfolding rate would mean movement of the transition state toward the folded state, resulting in a decrease of the slope of the [$\log k_u$, $f$] plot with f. Hence, the temptation to refer to the opposite change of slope (i.e., increasing with f) as anti-Hammond behavior is natural. However, because the increase of the slope of the [$\log k_u$, $f$] plot necessarily means that the energy landscape is SMD, referring to movement of the transition state along a single reaction coordinate is not meaningful. Hence, the term “anti-Hammond” behavior in this case does not reflect the opposite of the Hammond postulate in either the original formulation or movement of the transition state. Moreover, even if the energy landscape is formally projected onto the reaction coordinate to which the parameter (f or $[C]$) is coupled (which is possible even in the SMD case albeit without much physical sense), the movement of the transition state on this formal 1D landscape will still obey the Hammond postulate. Such a conclusion follows from a Taylor expansion of the first derivative of the perturbed (by f or $[C]$) free-energy profile around the barrier top ($x = x_0^{\ddagger}$),
[13] |
where and are the free energy profile and transition-state position at . Because and , we find , or , establishing that the transition state moves toward the native state, in accord with the Hammond behavior. Our conclusion holds for any perturbation f that is linearly coupled to the energy function, and that monotonically destabilizes the folded state. Thus, we surmise that upward curvature in [] or [] plots is not equivalent to anti-Hammond behavior. We note here, though, that the linear coupling of f to the protein Hamiltonian is exact, whereas the perturbation by denaturant is approximate, although the leading order in is linear.
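The first-order argument above can be written out explicitly. What follows is a sketch under assumed notation, not a verbatim copy of Eq. 13: $G_0(x)$, $x_0^{\ddagger}$, and $\Delta x^{\ddagger}$ are our labels for the unperturbed profile, the barrier-top position at zero perturbation, and its shift, and the perturbation is assumed to enter as $-fx$.

```latex
\begin{align}
G(x; f) &= G_0(x) - f x, \\
0 = \left.\frac{\partial G}{\partial x}\right|_{x^{\ddagger}(f)}
  &\approx G_0'(x_0^{\ddagger}) + G_0''(x_0^{\ddagger})\,\Delta x^{\ddagger} - f,
  \qquad \Delta x^{\ddagger} \equiv x^{\ddagger}(f) - x_0^{\ddagger}, \\
\Delta x^{\ddagger} &\approx \frac{f}{G_0''(x_0^{\ddagger})} < 0
  \quad \text{for } f > 0 .
\end{align}
```

The last line uses $G_0'(x_0^{\ddagger}) = 0$ (stationary point) and $G_0''(x_0^{\ddagger}) < 0$ (barrier top). If x increases with extension, a negative shift places the barrier closer to the folded state, which is the Hammond behavior described in the text.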
A similar conclusion, that is, a connection between upward curvature and multidimensionality, has been drawn analytically before, in the context of mechanochemistry of small molecules, based on the Taylor expansion of Bell’s model, similar to Eq. 13 (34, 35). In our work, we started from the most general description rather than from the solution of the Kramers problem. The WMD conditions are similar to the 1D assumptions made when obtaining Bell’s model, but we do not make any assumptions about the barriers. We also solved directly for the quantity we are interested in, that is, the sign of , rather than movements of the transition state. Connecting the latter to the curvature of the rate requires some additional steps, which might require more assumptions.
SMD in Denaturant-Induced Unfolding.
The criterion in Eq. 10 to assess whether experiments can be analyzed using a 1D free energy profile applies to any external perturbation with a linear, additive contribution to the free energy. If we consider the unfolding rate as a function of denaturant concentration , a criterion analogous to Eq. 10 would hold if we assume that the energetic contribution due to is linear, proportional to a reaction coordinate related to the solvent-exposed surface area:
[14] |
where is the solvent-accessible surface area (SASA)-related monotone function of reaction coordinate x. Thus, for any perturbation (f or ) coupling to the Hamiltonian, the theory and applications also hold when upward curvature in the [] plot is observed.
Typically, the observed nonlinearities in the [] plots are analyzed using a double exponential fit, (8), just as is done to analyze [] plots. Here, and are the analogs of and representing the unfolding m values. It has been shown for a protein with an Ig fold (8) (see Fig. S3 for the fits for several mutants of I27 using the double-exponential model) and for monellin (9) that there is upward curvature in the [] plots, which violates Eq. 10, implying that the underlying landscape is SMD. If [] plots were linear then the unfolding m value would likely be proportional to the solvent-accessible surface area in the transition state (even if the latter is heterogeneous), and the ensemble of conformations corresponding to the m value may be associated with the TSE (). However, for the [] plots with upward curvature and may not correspond to the SASAs of the respective transition states of the pathways, just as we have shown that the extracted and should not be interpreted as TSE locations at low and high f, respectively. In addition, although the for the WT is consistent with the expected value for β-sheet proteins of the size of I27, the seems unphysical. This observation combined with fairly high ratios of from a double-exponential fit (8) (Table S2 and Fig. S3) suggests that although the double-exponential model above fits the data, inferring the nature of the TSE requires an entirely new set of experiments along the lines reported by Clarke and coworkers (8).
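To make the link between a double-exponential rate and upward curvature concrete, the following minimal sketch evaluates the second derivative of the logarithm of a two-pathway rate. The functional form k(c) = k1·exp(m1·c) + k2·exp(m2·c), the function names, and the parameter values are illustrative assumptions and are not taken from Table S2.

```python
import math

# Hypothetical parameters for illustration only (not fitted values).
PARAMS = dict(k1=1e-4, m1=0.3, k2=1e-6, m2=1.4)

def pathway_fraction(c, k1, m1, k2, m2):
    """Fraction of the total rate carried by pathway 1 at perturbation c."""
    a = k1 * math.exp(m1 * c)
    b = k2 * math.exp(m2 * c)
    return a / (a + b)

def log_rate(c, k1, m1, k2, m2):
    """ln k(c) for the assumed double-exponential rate."""
    return math.log(k1 * math.exp(m1 * c) + k2 * math.exp(m2 * c))

def curvature(c, k1, m1, k2, m2):
    """Analytic d^2 ln k / dc^2 = p (1 - p) (m1 - m2)^2, which is never negative."""
    p = pathway_fraction(c, k1, m1, k2, m2)
    return p * (1.0 - p) * (m1 - m2) ** 2

def switch_point(k1, m1, k2, m2):
    """Perturbation at which both pathways carry equal flux."""
    return math.log(k1 / k2) / (m2 - m1)
```

Because the curvature is the variance of the two slopes weighted by the pathway fractions, it vanishes only when one pathway carries all the flux or when the two slopes coincide, and it is largest near the switch point. This holds for the assumed sum-of-exponentials form and says nothing by itself about what the fitted parameters mean structurally.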
Table S2.
Name | |||||||
A75G | 6.3 | 4.7 | 0.28 | 1.35 | 2.2 | 0.54 | |
C47A | 21.4 | 2.9 | 0.21 | 1.38 | 16.3 | 0.29 | |
F21L | 14.8 | 0.005 | 0.41 | 2.40 | 6.6 | 0.58 | |
G32A | 6.0 | 0.3 | 0.38 | 1.91 | 2.3 | 0.63 | |
I23A | 3.2 | 4.3 | 0.26 | 1.36 | 0.8 | 0.64 | |
I49V | 14.4 | 26.1 | 0.44 | 1.31 | 9.4 | 0.57 | |
L36A | 44.4 | 6.0 | 0.33 | 1.60 | 25.0 | 0.49 | |
L58A | 9.6 | 4.3 | 0.28 | 1.35 | 4.9 | 0.45 | |
L60A | 43.8 | 4.1 | 0.29 | 1.58 | 26.0 | 0.43 | |
L8A | 23.2 | 235.2 | 0.44 | 1.19 | 11.6 | 0.68 | |
V30A | 9.0 | 4.3 | 0.30 | 1.35 | 4.4 | 0.47 | |
V71A | 7.3 | 2.5 | 0.29 | 1.33 | 3.7 | 0.44 | |
WT | 5.8 | 0.6 | 0.25 | 1.36 | 4.6 | 0.31 | |
Units | mol/L−1 | mol/L−1 | mol/L−1 |
Two types of fits were made: exponential fits and double exponential fits . shows relative likelihood of the double-exponential model with respect to the single-exponential model, as assessed by Akaike information criterion.
Pathway Switch and Propensity to Aggregate.
In our previous work (36) we showed that an excited state in the spectrum of monomeric src SH3 domain has a propensity to aggregate. The structure of the , which is remarkably close to the sparsely populated structure for Fyn SH3 domain determined using NMR (37), has a ruptured interaction between and . In other words, the value of is large. Interestingly, in our simulations unfolding of src SH3 domain occurs by weakening of these interactions at high forces (Fig. 3 and Fig. S1). Thus, the structures are dominant at high forces. Because the probability of populating is low at low forces [ has 1–2% probability of forming at (36, 37)] it follows that SH3 aggregation is unlikely at low forces but can be promoted at high forces. Thus, SH3 domains have evolved to be aggregation-resistant, and only under unusual external conditions can they form fibrils.
Prediction for a Switch in the Force Unfolding of I27.
Based on our theory and simulations we can make a testable prediction for forced unfolding of I27. Because there is upward curvature in the denaturant-induced unfolding of I27 (8), we predict that a similar behavior should be observed for force-induced unfolding as well. In other words, there should be a switch in the pathway as the force used to unfold I27 is changed from a low to a high value. It is likely that this prediction has not been investigated because mechanical unfolding of I27 has so far been probed using only atomic force microscopy (38), where high forces are used. It would be most welcome to study the unfolding behavior of I27 using LOT experiments to test our prediction.
Conclusions
We have proven that upward curvature in the unfolding rates as a function of a perturbation, which is linearly coupled to the energy function describing a protein in a solvent, implies that the underlying energy landscape is strongly multidimensional. The observation of upward curvature in the [] plots also implies that unfolding occurs by multiple pathways. In the case of f-induced unfolding of SH3 domain this implies that there is a continuous decrease in the flux of molecules that reach the unfolded state through the low-force pathway as f increases. The numerical results using model 2D free energy profiles allow us to conjecture that if a protein folds by parallel routes then the unfolding rate as a function of the linear perturbation must exhibit upward curvature. Only downward curvature in the [] plots can be interpreted using a single-barrier 1D free energy profile with a moving transition state or one with two sequential barriers (39, 40).
Our study leads to experimental proposals. For example, Förster resonance energy transfer (FRET) experiments especially when combined with force would be most welcome to measure the flux through the two paths identified for src SH3 domain. Our simulations suggest that the FRET labels between the RT loop and the protein core should capture the pathway switch, provided there is sufficient temporal resolution to observe the state with the RT loop unfolded. A more direct way is to block the RT-loop pathway with a disulfide bridge between the RT loop and the core, as we demonstrated using simulations, and assess if the unfolding rate decreases dramatically. Our work shows that the richness of data obtained in pulling experiments can only be fully explained by integrating theory and computations done under conditions that are used in these experiments.
Materials and Methods
Force-Dependent Rates for SH3 Domain Using Molecular Simulations.
The 56-residue Gallus gallus src SH3 domain from Tyr kinase consists of five β strands (PDB ID code 1SRL), which form β-sheets comprising the tertiary structure of the protein (Fig. 2A). Residues are numbered from 9 to 64. The details of the SOP-SC model are described elsewhere (36). A constant force is applied to the N-terminal end (residue 9) and residue 59 (Fig. 2A). We used Langevin dynamics, in the limit of high friction, to compute the f-dependent unfolding rates. We covered a range of forces from 12.5 pN to 45 pN, generating between 50 and 100 trajectories at each force. From these unfolding trajectories, we calculated the first time the protein unfolded ( nm), thus obtaining a set of times , for trajectory j at force . We used umbrella sampling with the weighted histogram analysis method (41) and low-friction Langevin dynamics (42) to calculate free energy profiles.
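The extraction of unfolding times from trajectories can be sketched as follows. This is a minimal illustration, assuming each trajectory is stored as a sequence of end-to-end extensions sampled at a fixed time step; the function names are hypothetical, and the unfolding threshold is left as a parameter because its value is not reproduced here.

```python
def first_passage_time(extension, dt, threshold):
    """Return the first time at which the extension exceeds threshold.

    extension: sequence of extensions sampled every dt (assumed layout).
    Returns None if the trajectory never crosses the threshold.
    """
    for step, value in enumerate(extension):
        if value > threshold:
            return step * dt
    return None

def unfolding_times(trajectories, dt, threshold):
    """Collect first-passage times from all trajectories that unfolded."""
    times = (first_passage_time(tr, dt, threshold) for tr in trajectories)
    return [t for t in times if t is not None]

def mean_unfolding_time(trajectories, dt, threshold):
    """Average unfolding time and the number of trajectories that unfolded."""
    times = unfolding_times(trajectories, dt, threshold)
    if not times:
        raise ValueError("no trajectory crossed the threshold")
    return sum(times) / len(times), len(times)
```

Dropping trajectories that never unfold biases the mean toward short times, so this sketch is only appropriate when essentially all trajectories at a given force reach the threshold.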
Maximum Likelihood Estimation.
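For the set of M constant forces $\{f_i\}$, with $N_i$ measurements of the unfolding time at each force, assuming an exponential distribution of unfolding times, $P(t) = k e^{-kt}$, where the rate k depends on the force, the log-likelihood function is
$$L = \sum_{i=1}^{M} \sum_{j=1}^{N_i} \ln\left[k(f_i)\, e^{-k(f_i)\, t_{ij}}\right]. \qquad [15]$$
In the above equation $t_{ij}$ is the unfolding time measured in the j-th trajectory at force $f_i$. The exponential distribution allows us to take the sum over j and use the average unfolding time $\langle t_i \rangle = N_i^{-1} \sum_{j} t_{ij}$ for each force,
$$L = \sum_{i=1}^{M} N_i \left[\ln k(f_i) - k(f_i)\, \langle t_i \rangle\right]. \qquad [16]$$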
For each of the models (single- and double-exponential) the log-likelihood function L was numerically maximized with the set of data (from simulations or experiment). The two maximal values of L (for each model) were plugged into the Akaike information criterion (25) to calculate relative likelihood of the models, that is, the ratio of probabilities that the data are described by each of the models. The parameters that maximize L are used for fitting the [] plots.
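A minimal sketch of this maximization for the single-exponential model is given below. It assumes the log-likelihood of exponentially distributed times, L = Σ N_i [ln k_i − k_i ⟨t_i⟩], and the parameterization ln k(f) = a + b·f (for Bell's model, a would be the logarithm of the zero-force rate and b the transition-state distance divided by the thermal energy). The function names and the choice of Newton's method are ours and do not describe the authors' numerical procedure.

```python
import math

def log_likelihood(rates, mean_times, counts):
    """L = sum_i N_i [ln k_i - k_i <t_i>] for exponentially distributed times."""
    return sum(n * (math.log(k) - k * t)
               for k, t, n in zip(rates, mean_times, counts))

def fit_single_exponential(forces, mean_times, counts, iterations=50):
    """Maximize L for the assumed model ln k(f) = a + b*f; returns (a, b)."""
    # Initial guess: least-squares line through ln(1/<t_i>) versus f_i.
    y = [-math.log(t) for t in mean_times]
    n = len(forces)
    fbar, ybar = sum(forces) / n, sum(y) / n
    sff = sum((f - fbar) ** 2 for f in forces)
    b = sum((f - fbar) * (v - ybar) for f, v in zip(forces, y)) / sff
    a = ybar - b * fbar
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for f, t, w in zip(forces, mean_times, counts):
            kt = math.exp(a + b * f) * t
            g_a += w * (1.0 - kt)        # dL/da
            g_b += w * f * (1.0 - kt)    # dL/db
            h_aa += w * kt               # entries of the negative Hessian
            h_ab += w * f * kt
            h_bb += w * f * f * kt
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step for a concave objective: (a, b) += H^{-1} g.
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b
```

The single-exponential log-likelihood is concave in (a, b), so Newton's method started from the least-squares line is adequate here. The double-exponential model has four parameters and a non-concave likelihood, for which a general-purpose optimizer with several starting points would be a safer choice.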
Akaike Information Criterion.
The lower value of $\mathrm{AIC} = 2n - 2L$ (where L is the log-likelihood function and n is the number of parameters in the model) indicates a more probable model, with the relative likelihood of models with $\mathrm{AIC}_1$ and $\mathrm{AIC}_2$ given by $\exp[(\mathrm{AIC}_1 - \mathrm{AIC}_2)/2]$. Thus, for the comparison of Bell’s model ($n = 2$) and double-exponential model ($n = 4$), the double exponential is more probable by a factor of $\exp(L_2 - L_1 - 2)$, where $L_1$ and $L_2$ are found by maximizing L in Eq. 16.
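The model comparison can be sketched in a few lines. The sketch uses the standard definition AIC = 2n − 2L, with L the maximized log-likelihood, and assumes two parameters for the single-exponential (Bell) model and four for the double-exponential model; the function names are ours.

```python
import math

def aic(max_log_likelihood, n_params):
    """Akaike information criterion, AIC = 2n - 2L."""
    return 2.0 * n_params - 2.0 * max_log_likelihood

def relative_likelihood(l_a, n_a, l_b, n_b):
    """Relative likelihood of model B with respect to model A."""
    return math.exp((aic(l_a, n_a) - aic(l_b, n_b)) / 2.0)

def double_vs_single(l_single, l_double):
    """Relative likelihood of the 4-parameter versus the 2-parameter model."""
    return relative_likelihood(l_single, 2, l_double, 4)
```

With these parameter counts the ratio reduces to exp(L_double − L_single − 2), so the double-exponential model must raise the maximized log-likelihood by more than 2 before it is preferred.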
Fits for the Titin I27 Bulk Experiment in Denaturant
We performed maximum likelihood fitting for the data in ref. 8 with single- and double-exponential models and compared the models using the Akaike information criterion. The results are given in Table S2. Note the dramatic difference in the prefactors ( and ), obtained using a double-exponential fit, which is hard to explain. The difference in m values, if they correspond to the SASA in the transition state, does not seem to be meaningful. These observations suggest that, just as in the case of force-induced unfolding, a de facto 1D fit does not yield physically meaningful results.
[] Plots for 2D Landscapes
To better illustrate the connection between the curvature of the plot and existence of parallel pathways, we performed Brownian dynamics simulations of force-dependent rate of escape of a particle from the bound state for the landscapes given in Fig. 1 of the main text. The resulting curves are given in Fig. 1 D–F. For each data point, we generated 8,192 trajectories. The Fig. 1A landscape is weakly multidimensional, so the plot does not exhibit upward curvature. For the landscape in Fig. 1B, two parallel pathways exist, and flux through the states depends on f as in experiments. The resulting curve has upward curvature. A double exponential fit is shown in Fig. 1D. The landscape in Fig. 1C gives rise to a more complex behavior.
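The kind of calculation described above can be sketched with a simple Euler–Maruyama integrator. The landscape used here, U(x, y) = h(x² − 1)² + (κ/2)(y − cx)² with the force coupled to x, is a hypothetical single-pathway (weakly multidimensional) example chosen for brevity; it is not one of the landscapes of Fig. 1, and the parameter values, units (thermal energy and diffusion constant set to 1), and the small number of trajectories are assumptions for illustration.

```python
import math
import random

def escape_time(force, rng, h=3.0, kappa=10.0, c=0.5, dt=1e-3, max_steps=200_000):
    """First time the particle reaches x >= 1 on U(x, y) - force * x.

    Overdamped (Brownian) dynamics in reduced units, kT = D = 1.
    Returns max_steps * dt if the particle has not escaped by then.
    """
    x, y = -1.0, -c                      # zero-force minimum of the bound state
    noise = math.sqrt(2.0 * dt)          # sqrt(2 D dt) with D = 1
    for step in range(1, max_steps + 1):
        dudx = 4.0 * h * x * (x * x - 1.0) - kappa * c * (y - c * x) - force
        dudy = kappa * (y - c * x)
        x += -dudx * dt + noise * rng.gauss(0.0, 1.0)
        y += -dudy * dt + noise * rng.gauss(0.0, 1.0)
        if x >= 1.0:
            return step * dt
    return max_steps * dt

def mean_escape_time(force, n_traj=20, seed=0):
    """Average escape time over n_traj independent trajectories."""
    rng = random.Random(seed)
    return sum(escape_time(force, rng) for _ in range(n_traj)) / n_traj

def escape_rate(force, n_traj=20, seed=0):
    """Rate estimated as the inverse of the mean escape time."""
    return 1.0 / mean_escape_time(force, n_traj, seed)
```

Evaluating `escape_rate` on a grid of forces and plotting its logarithm gives the analog of the curves discussed in the text. Resolving curvature reliably needs far more trajectories per point than the default of 20 used here, and a landscape with two competing exit channels would be required to reproduce a pathway switch.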
Acknowledgments
This work was supported by National Science Foundation (NSF) Grant CHE 13-61946 and National Institutes of Health (NIH) Grant GM 089685. S.M. thanks the NIH and NSF for support.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1515730113/-/DCSupplemental.
References
- 1. Harrison SC, Durbin R. Is there a single pathway for the folding of a polypeptide chain? Proc Natl Acad Sci USA. 1985;82(12):4028–4030. doi: 10.1073/pnas.82.12.4028.
- 2. Wolynes PG, Onuchic JN, Thirumalai D. Navigating the folding routes. Science. 1995;267(5204):1619–1620. doi: 10.1126/science.7886447.
- 3. Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG. Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins. 1995;21(3):167–195. doi: 10.1002/prot.340210302.
- 4. Guo Z, Thirumalai D, Honeycutt JD. Folding kinetics of proteins: A model study. J Chem Phys. 1992;97:525–535.
- 5. Leopold PE, Montal M, Onuchic JN. Protein folding funnels: A kinetic approach to the sequence-structure relationship. Proc Natl Acad Sci USA. 1992;89(18):8721–8725. doi: 10.1073/pnas.89.18.8721.
- 6. Klimov DK, Thirumalai D. Symmetric connectivity of secondary structure elements enhances the diversity of folding pathways. J Mol Biol. 2005;353(5):1171–1186. doi: 10.1016/j.jmb.2005.09.029.
- 7. Noé F, Schütte C, Vanden-Eijnden E, Reich L, Weikl TR. Constructing the equilibrium ensemble of folding pathways from short off-equilibrium simulations. Proc Natl Acad Sci USA. 2009;106(45):19011–19016. doi: 10.1073/pnas.0905466106.
- 8. Wright CF, Lindorff-Larsen K, Randles LG, Clarke J. Parallel protein-unfolding pathways revealed and mapped. Nat Struct Biol. 2003;10(8):658–662. doi: 10.1038/nsb947.
- 9. Aghera N, Udgaonkar JB. Kinetic studies of the folding of heterodimeric monellin: Evidence for switching between alternative parallel pathways. J Mol Biol. 2012;420(3):235–250. doi: 10.1016/j.jmb.2012.04.019.
- 10. Sosnick TR, Barrick D. The folding of single domain proteins--Have we reached a consensus? Curr Opin Struct Biol. 2011;21(1):12–24. doi: 10.1016/j.sbi.2010.11.002.
- 11. Lindberg MO, Oliveberg M. Malleability of protein folding pathways: A simple reason for complex behaviour. Curr Opin Struct Biol. 2007;17(1):21–29. doi: 10.1016/j.sbi.2007.01.008.
- 12. Aksel T, Barrick D. Direct observation of parallel folding pathways revealed using a symmetric repeat protein system. Biophys J. 2014;107(1):220–232. doi: 10.1016/j.bpj.2014.04.058.
- 13. Mickler M, et al. Revealing the bifurcation in the unfolding pathways of GFP by using single-molecule experiments and simulations. Proc Natl Acad Sci USA. 2007;104(51):20268–20273. doi: 10.1073/pnas.0705458104.
- 14. Stigler J, Ziegler F, Gieseke A, Gebhardt JCM, Rief M. The complex folding network of single calmodulin molecules. Science. 2011;334(6055):512–516. doi: 10.1126/science.1207598.
- 15. Kotamarthi HC, Sharma R, Narayan S, Ray S, Ainavarapu SRK. Multiple unfolding pathways of leucine binding protein (LBP) probed by single-molecule force spectroscopy (SMFS). J Am Chem Soc. 2013;135(39):14768–14774. doi: 10.1021/ja406238q.
- 16. Jagannathan B, Elms PJ, Bustamante C, Marqusee S. Direct observation of a force-induced switch in the anisotropic mechanical unfolding pathway of a protein. Proc Natl Acad Sci USA. 2012;109(44):17820–17825. doi: 10.1073/pnas.1201800109.
- 17. Guinn EJ, Jagannathan B, Marqusee S. Single-molecule chemo-mechanical unfolding reveals multiple transition state barriers in a small single-domain protein. Nat Commun. 2015;6:6861. doi: 10.1038/ncomms7861.
- 18. Hinczewski M, Gebhardt JCM, Rief M, Thirumalai D. From mechanical folding trajectories to intrinsic energy landscapes of biopolymers. Proc Natl Acad Sci USA. 2013;110(12):4500–4505. doi: 10.1073/pnas.1214051110.
- 19. Hyeon C, Thirumalai D. Measuring the energy landscape roughness and the transition state location of biomolecules using single molecule mechanical unfolding experiments. J Phys Condens Matter. 2007;19:113101.
- 20. van Kampen N. Stochastic Processes in Physics and Chemistry. North-Holland; Amsterdam: 1992.
- 21. Hyeon C, Thirumalai D. Forced-unfolding and force-quench refolding of RNA hairpins. Biophys J. 2006;90(10):3410–3427. doi: 10.1529/biophysj.105.078030.
- 22. Marshall BT, et al. Direct observation of catch bonds involving cell-adhesion molecules. Nature. 2003;423(6936):190–193. doi: 10.1038/nature01605.
- 23. Buckley CD, et al. Cell adhesion. The minimal cadherin-catenin complex binds to actin filaments under force. Science. 2014;346(6209):1254211. doi: 10.1126/science.1254211.
- 24. Chakrabarti S, Hinczewski M, Thirumalai D. Plasticity of hydrogen bond networks regulates mechanochemistry of cell adhesion complexes. Proc Natl Acad Sci USA. 2014;111(25):9048–9053. doi: 10.1073/pnas.1405384111.
- 25. Akaike H. A new look at the statistical model identification. IEEE Trans Automat Contr. 1974;19:716–723.
- 26. Liu Z, Reddy G, O’Brien EP, Thirumalai D. Collapse kinetics and chevron plots from simulations of denaturant-dependent folding of globular proteins. Proc Natl Acad Sci USA. 2011;108(19):7787–7792. doi: 10.1073/pnas.1019500108.
- 27. Li MS, Klimov DK, Thirumalai D. Thermal denaturation and folding rates of single domain proteins: Size matters. Polymer (Guildf). 2004;45:573–579.
- 28. Yang WY, Gruebele M. Folding at the speed limit. Nature. 2003;423(6936):193–197. doi: 10.1038/nature01609.
- 29. Kubelka J, Hofrichter J, Eaton WA. The protein folding ‘speed limit’. Curr Opin Struct Biol. 2004;14(1):76–88. doi: 10.1016/j.sbi.2004.01.013.
- 30. Guo Z, Thirumalai D. Kinetics of protein folding: Nucleation mechanism, time scales, and pathways. Biopolymers. 1995;36(1):83–102.
- 31. Klimov DK, Thirumalai D. Stretching single-domain proteins: Phase diagram and kinetics of force-induced unfolding. Proc Natl Acad Sci USA. 1999;96(11):6166–6170. doi: 10.1073/pnas.96.11.6166.
- 32. Du R, Pande VS, Grosberg AY, Tanaka T, Shakhnovich EI. On the transition coordinate for protein folding. J Chem Phys. 1998;108:334.
- 33. Hammond GS. A correlation of reaction rates. J Am Chem Soc. 1955;77(2):334–338.
- 34. Konda SSM, Brantley JN, Bielawski CW, Makarov DE. Chemical reactions modulated by mechanical stress: Extended Bell theory. J Chem Phys. 2011;135(16):164103. doi: 10.1063/1.3656367.
- 35. Konda SSM, et al. Molecular catch bonds and the anti-Hammond effect in polymer mechanochemistry. J Am Chem Soc. 2013;135(34):12722–12729. doi: 10.1021/ja4051108.
- 36. Zhuravlev PI, Reddy G, Straub JE, Thirumalai D. Propensity to form amyloid fibrils is encoded as excitations in the free energy landscape of monomeric proteins. J Mol Biol. 2014;426(14):2653–2666. doi: 10.1016/j.jmb.2014.05.007.
- 37. Neudecker P, et al. Structure of an intermediate state in protein folding and aggregation. Science. 2012;336(6079):362–366. doi: 10.1126/science.1214203.
- 38. Rief M, Gautel M, Oesterhelt F, Fernandez JM, Gaub HE. Reversible unfolding of individual titin immunoglobulin domains by AFM. Science. 1997;276(5315):1109–1112. doi: 10.1126/science.276.5315.1109.
- 39. Merkel R, Nassoy P, Leung A, Ritchie K, Evans E. Energy landscapes of receptor-ligand bonds explored with dynamic force spectroscopy. Nature. 1999;397(6714):50–53. doi: 10.1038/16219.
- 40. Hyeon C, Thirumalai D. Multiple barriers in forced rupture of protein complexes. J Chem Phys. 2012;137(5):055103. doi: 10.1063/1.4739747.
- 41. Kumar S, Rosenberg JM, Bouzida D, Swendsen RH, Kollman PA. The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J Comput Chem. 1992;13:1011–1021.
- 42. Honeycutt JD, Thirumalai D. The nature of folded states of globular proteins. Biopolymers. 1992;32(6):695–709. doi: 10.1002/bip.360320610.