Abstract
The origin of protein folding thermodynamics is examined in terms of the energy landscape, employing an off-lattice protein model with scaled non-native attractions, which is continuously tunable between a Go-like model and a highly frustrated system. Extensive statistical temperature molecular dynamics simulations, combined with inherent structure analysis, reveal the intimate connection between the global geometric properties of the energy landscape and the statistical temperature. The basin depth of the energy landscape is shown to play a key role in the first-order-like characteristics of the statistical temperature, which are easily identified by the squared modulus of the potential energy gradient in the microcanonical ensemble.
PACS number(s): 87.15.Cc, 05.70.–a, 87.15.A–
Protein folding—i.e., how a polypeptide chain reaches a unique native state for a given sequence of amino acids [1]—is one of the fundamental problems in biological science. While significant advances have been made [2–9], fundamental open questions still remain. For example, one would like to understand the remarkable observation that most single-domain proteins exhibit a cooperative, two-state folding transition with no detectable intermediates [2].
Energy landscape theory [3–6] provides a theoretical framework for protein folding, in which the connection between a proteinlike sequence and a minimally frustrated energy landscape is rationalized via a folding funnel [4]. A good folding sequence is characterized by a smooth potential energy landscape (PEL), with an overall funnel shape biased to the native state, while a poorly designed sequence like a random heteropolymer has a rugged PEL susceptible to long-lived kinetic trapping [7]. Many simulation studies have been devoted to elucidating the determinants of a proteinlike system, such as the ratio of the folding and glass transition temperatures [3], the stability gap between a native state and low-lying excited states [8], and the collapse cooperativity [9].
Recently, Mazzoni and Casetti [10] proposed the fluctuations of the energy landscape curvature, $K = \nabla^2 U$, $U$ being the potential energy, as a determinant. A dramatic enhancement of the fluctuations, $\langle K^2 \rangle - \langle K \rangle^2$, in the vicinity of the folding temperature was observed in the proteinlike folding of well-designed sequences, where ⟨⋅⟩ denotes an equilibrium average. The enhancement in the fluctuations is attributed to effective two-state dynamics between the native and denatured states with different curvatures [10].
Two-state protein folding is analogous to a first-order phase transition in finite-size systems [11–13]. Indeed, a recent study [14] reveals that the “backbending” ($S$ bend) in the statistical temperature, or microcanonical caloric curve, characteristic of a first-order-like phase transition, fully characterizes protein folding and clearly distinguishes good folders from poor ones. The question naturally arises: how does the global geometry of the PEL translate into the behavior of the statistical temperature? This paper addresses the issue by establishing a rigorous connection between the statistical temperature and various energy landscape measures, leading to the identification of the principal landscape determinants of two-state folding thermodynamics.
The key quantity in our study is the statistical temperature in the microcanonical ensemble, $T_S(U) = [\partial S(U)/\partial U]^{-1}$, $S(U)$ being the microcanonical entropy. It is well known that in a finite-size system such as a protein, the microcanonical ensemble is well suited to the analysis of phase transitions [11–14]. Around a first-order phase transition, $S(U)$ shows a convex dip, with $\partial^2 S/\partial U^2 > 0$, corresponding to backbending in $T_S(U)$ or a negative specific heat. Backbending is associated with an extremum condition in the Helmholtz free energy $F(U) = U - T S(U)$ at temperature $T$, which has three roots satisfying $T_S(U) = T$. The multiple roots are associated with two-state transition behavior between free-energy minima at the lowest and highest roots, involving the crossing of a transition barrier at the intermediate root, while the transition becomes downhill-like without a barrier for a monotonic $T_S(U)$. Thus, all essential features of the transitions occurring in good and poor folders are completely characterized by the behavior of $T_S(U)$.
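The root-counting argument can be illustrated numerically. The following is a minimal Python sketch, not part of the original analysis: it tabulates two hypothetical caloric curves (one monotonic, one with an S-shaped bend built from a cubic) and counts how many energies satisfy the condition that the statistical temperature equals a chosen temperature. The function names, the toy curves, and the target temperature 0.6 are all illustrative assumptions.

```python
def count_roots(temperatures, target):
    """Count crossings of a tabulated caloric curve through `target`."""
    roots = 0
    for t_lo, t_hi in zip(temperatures, temperatures[1:]):
        # A root lies between neighbouring samples that bracket the target.
        if (t_lo - target) * (t_hi - target) < 0:
            roots += 1
    return roots


def has_backbending(temperatures):
    """True if the curve decreases anywhere as the energy increases."""
    return any(t_hi < t_lo for t_lo, t_hi in zip(temperatures, temperatures[1:]))


# Hypothetical energy grid (arbitrary units), offset so no sample sits on a root.
energies = [-2.0 + 4.0 * (i + 0.5) / 40 for i in range(40)]

# Toy curves: a monotonic caloric curve and one with an S-shaped bend.
monotonic = [0.6 + 0.05 * u for u in energies]
bent = [0.6 + 0.05 * (u ** 3 - u) for u in energies]

roots_monotonic = count_roots(monotonic, 0.6)
roots_bent = count_roots(bent, 0.6)
```

In this toy setting the bent curve yields three roots (two free-energy minima and a barrier) and the monotonic curve yields one, which mirrors the distinction between two-state and downhill-like behavior.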
Our analysis linking to the PEL is based on the thermodynamic identity between the statistical temperature and the configurational temperature originally introduced by Rugh [15] and later generalized by Jepps et al. [16]:
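$$T_S(U) = \frac{\langle |\nabla U|^2 \rangle_U}{\langle \nabla^2 U \rangle_U} \equiv T_{\mathrm{conF}}(U), \qquad (1)$$
where the subscript “F” implies that the configurational temperature is evaluated as the fraction of two separate microcanonical averages, $\langle |\nabla U|^2 \rangle_U$ and $\langle \nabla^2 U \rangle_U$ [16]. The significance of Eq. (1) is that it allows a rigorous mathematical connection between the purely statistical quantity $T_S(U)$ and the geometric structure of the energy landscape via the force, $-\nabla U$, and the curvature, $\nabla^2 U$.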
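As a hedged illustration of how such a configurational temperature is estimated, the sketch below evaluates the ratio of the averaged squared gradient to the averaged Laplacian of the potential, the form given by Refs. [15,16]. It is an assumption-laden toy, not the authors' procedure: the potential is an isotropic harmonic well with a hypothetical force constant `k`, and the configurations are drawn from the canonical distribution at a known temperature (with the Boltzmann constant set to 1) rather than from a microcanonical shell, because the same ratio then recovers that temperature and is easy to check.

```python
import random


def harmonic_terms(x, k):
    """Return (|grad U|^2, Laplacian of U) for U = 0.5 * k * sum(x_i^2)."""
    grad_sq = sum((k * xi) ** 2 for xi in x)
    laplacian = k * len(x)
    return grad_sq, laplacian


def configurational_temperature(samples, k):
    """Ratio of the summed squared gradients to the summed Laplacians."""
    grad_sq_sum = 0.0
    laplacian_sum = 0.0
    for x in samples:
        grad_sq, laplacian = harmonic_terms(x, k)
        grad_sq_sum += grad_sq
        laplacian_sum += laplacian
    return grad_sq_sum / laplacian_sum


# Hypothetical parameters: temperature, force constant, degrees of freedom.
rng = random.Random(0)
T, k, n_dof = 0.6, 2.0, 30
sigma = (T / k) ** 0.5  # canonical width of each coordinate in a harmonic well
samples = [[rng.gauss(0.0, sigma) for _ in range(n_dof)] for _ in range(4000)]

t_conf = configurational_temperature(samples, k)
```

The estimate `t_conf` should lie close to the input temperature, up to sampling noise.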
All-atom simulations that include the protein and explicit solvent are the most accurate, but remain computationally demanding. Thus a simplified description, retaining the main features of real proteins but reducing the number of degrees of freedom through coarse graining, is especially appealing for the characterization of protein-folding mechanisms over long time scales. The newly suggested folding determinants [10,14] have been examined in coarse-grained models, with varied sequences exhibiting different foldabilities.
Here we employ the scaled Honeycutt-Thirumalai (SHT) off-lattice model [17] to mimic the effect of sequence variations and to examine how two-state folding results from the global shape of the PEL and the emerging backbending behavior of the statistical temperature. The original HT model [18] has a β-barrel native state and consists of 46 beads of three different types: hydrophobic (B), hydrophilic (L), and neutral (N). We modified the model to incorporate scaled, Go-like interactions by modulating the attractions between non-native contacts, which allows, for the original HT sequence, a continuous tuning from a highly frustrated PEL to a weakly rugged funnel. The potential energy is a sum of bond-stretching, bending, torsion, and nonbonded interactions:
(2) |
where , and . The stretching and bending constants are and , respectively, and being energy and length units, with and . Torsion parameters are for a quartet of successive beads with zero or one bead, while otherwise and .
The nonbonded term in Eq. (2) contains the scaling factor to modulate the non-native attractions as , where for pairs, and for and pairs, and and for all other pairs involving . Here if and bead belongs to the non-native contact set, and otherwise . As approaches zero, the second term in introduces a strong native bias in denatured states by prohibiting close contacts between non-native pairs. All 57 native contact pairs are identified with the cutoff , based on the global minimum structure of the original HT model .
Due to the high degree of frustration in the original HT model [18,19], an enhanced sampling technique is indispensable to obtain a correct conformational sampling for both the transition and low-temperature regions, overcoming ergodicity breaking. We applied the statistical temperature molecular dynamics (STMD) algorithm [19], which generates a flat energy sampling via the dynamic refinement of the statistical temperature (for details see Ref. [19]), with four different values of the scaling factor: 1.0, 0.9, 0.5, and 0. All STMD simulations were performed to sample the temperature range from 0.07 to 1.3, with the average kinetic energy fixed at . A total of $8 \times 10^4$ configurations for each simulation were collected to determine the canonical and microcanonical thermodynamics, and minimized using the conjugate-gradient algorithm to map inherent structures (ISs) [20].
As summarized in Table I, the folding and collapse temperatures, identified by peaks in the temperature-dependent heat capacities and order parameter fluctuations in Figs. 1(a) and 1(b), respectively, illustrate that the original HT model (SHT with a scaling factor of 1.0) is a poor folder and the SHT model with a scaling factor of 0.9 is an intermediate folder, while scaling factors of 0.5 and 0 correspond to good folders according to the classification based on the collapse cooperativity, $\sigma = 1 - T_F/T_\theta$ [9], where $T_F$ and $T_\theta$ are the folding and collapse temperatures. The validity of this classification is also seen in the profiles of the curvature fluctuations in Fig. 1(c). A marked peak shows up close to $T_F$ for good folders, while the plots are featureless in poor and intermediate folders [10]. Note that the temperature of maximum curvature fluctuation in Fig. 1(c) coincides with $T_F$ for good folders, implying that it is also an effective folding indicator.
TABLE I.
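Scaling parameter | Folding temperature $T_F$ | Collapse temperature $T_\theta$ | Temperature of maximum curvature fluctuation | Collapse cooperativity $\sigma$ |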
---|---|---|---|---|
1.0 | 0.19 | 0.67 | 0.68 | 0.71 |
0.9 | 0.49 | 0.65 | 0.64 | 0.25 |
0.5 | 0.61 | 0.67 | 0.62 | 0.09 |
0.0 | 0.59 | 0.62 | 0.60 | 0.05 |
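The tabulated values can be cross-checked with a short calculation. The sketch below assumes that the second and third columns of Table I are the folding and collapse temperatures and that the last column is the collapse cooperativity, taken here in the form of one minus the ratio of the folding to the collapse temperature, following Ref. [9]. The variable and function names are illustrative.

```python
# Rows of Table I: (scaling parameter, folding temperature,
# collapse temperature, tabulated collapse cooperativity).
table_i = [
    (1.0, 0.19, 0.67, 0.71),
    (0.9, 0.49, 0.65, 0.25),
    (0.5, 0.61, 0.67, 0.09),
    (0.0, 0.59, 0.62, 0.05),
]


def collapse_cooperativity(t_fold, t_collapse):
    """Assumed definition: sigma = 1 - T_F / T_theta."""
    return 1.0 - t_fold / t_collapse


# Largest deviation between the recomputed and the tabulated sigma values.
max_deviation = max(
    abs(collapse_cooperativity(t_fold, t_collapse) - sigma_tab)
    for _, t_fold, t_collapse, sigma_tab in table_i
)
```

Under these assumptions the recomputed values agree with the last column to within the two-decimal rounding of the tabulated temperatures.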
The statistical temperature , determined by STMD in Fig. 2, shows a dramatic change as the energy landscape changes with decreasing . The smooth, monotonically varying in the original HT model first starts to show a plateau, or weak backbending, and a substantial backbending appears as falls below 0.5. Remarkably, the temperatures corresponding to the plateau and backbending regions in with exactly match and , respectively. The alteration of the folding mechanism from second-order-like collapse into first-order-like folding is clearly seen through the behavior of .
We find that the folding in Go-like models occurs through the transient formation of a three-strand core near and the immediate attachment of the terminal strand to attain the native state at . A significant slowing down is observed in the folding of the original HT model with due to trapping in misfolded states. Since is the external control parameter in the microcanonical ensemble (playing the role of the temperature in the canonical ensemble), we also identify the characteristic energies to describe the folding transition as a function of , in the case of . In Fig. 2, both and approximately define the plateau region in , and both and bound the collapse region associated with the backbending of , . The good folder with starts to collapse as crosses , and completes its folding once passes through .
Assuming that STMD samples all configurations with a given , the microcanonical average in Eq. (1) is directly computed by binning in energy and collecting configurations corresponding to each bin. The configurational temperature with unit bin size shows good agreement with in both and 0, illustrating that the thermodynamic identity, Eq. (1), is robust even with a smaller system size.
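The binning procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each sampled configuration is assumed to come with its potential energy, its squared gradient, and its Laplacian already computed, and the function name and the tiny data set are hypothetical. Within each energy bin the two quantities are accumulated separately and only then divided, which is the "fraction of two separate averages" form of the estimator.

```python
import math
from collections import defaultdict


def binned_configurational_temperature(energies, grad_sq, laplacians, bin_size=1.0):
    """Per-bin ratio of summed squared gradients to summed Laplacians.

    Returns a dict mapping the lower edge of each energy bin to the ratio.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for u, g, lap in zip(energies, grad_sq, laplacians):
        index = math.floor(u / bin_size)
        sums[index][0] += g
        sums[index][1] += lap
    return {
        index * bin_size: acc[0] / acc[1]
        for index, acc in sorted(sums.items())
        if acc[1] != 0.0
    }


# Hypothetical per-configuration data: energy, |grad U|^2, Laplacian of U.
energies = [0.2, 0.7, 1.1, 1.9]
grad_sq = [2.0, 4.0, 9.0, 3.0]
laplacians = [4.0, 8.0, 6.0, 6.0]

t_by_bin = binned_configurational_temperature(energies, grad_sq, laplacians)
```

Dividing the bin sums is equivalent to dividing the bin averages, because both averages share the same number of configurations per bin.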
To illustrate how the PEL controls , and hence the folding, via Eq. (1), both and are plotted in Figs. 3(a) and 3(b), respectively. For all scalings, the curvature monotonically increases with decreasing , showing that folded (denatured) states are associated with steeper (shallower) potential wells. The primary feature differentiating good from poor folders in Fig. 3(a) is that good folders have nearly constant curvature for , suggesting that their denatured states are structurally homogeneous. This finding is in accordance with our expectations for Go-like interactions. The main frustration of the original HT model arises from the conformational diversity of the collapsed states [18]. The native-centric in Go-like models significantly suppresses the frustration from hydrophobic mismatches of strands during the collapse by reducing non-native interstrand contacts and also makes the unfolded landscape less structured. With decreasing in good folders, the flat begins to increase steeply between and , the backbending and plateau regions of , contributing to the large curvature fluctuation and to the bimodal energy distributions [10]. By contrast, in poor folders the slope of is nearly constant, causing no structure in and corresponding to the fact that all systems studied have nearly identical at the highest and lowest .
Turning to , the most remarkable aspect of Fig. 3(b) is that retains all essential features of . Despite the discussion in the last paragraph, the dynamic range of is small and the variation of is mostly encoded into . What is the underlying physics relating to ? Some insight is gained by noting that is a measure of the geometric distance between the sampled configurations at and their associated ISs [20]. Indeed, the microcanonical basin depth of the PEL, quantified as being the average IS energy, shows a remarkable resemblance to in Fig. 4(a). More quantitatively, in a harmonic approximation, the canonical basin depth obeys being the number of degrees of freedom. Replacing the temperature by gives , which unveils a direct link translating the global geometry of the PEL into the proteinlike folding markers of . We checked the validity of the harmonic approximation by comparing and its harmonic counterparts, and [21]. As shown in Fig. 4(b), both independently derived and display the essential characteristics of .
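The harmonic link between basin depth and temperature can be checked on a toy basin. The sketch below assumes the standard equipartition form, in which the mean potential energy above the basin minimum equals half the number of degrees of freedom times the temperature (Boltzmann constant set to 1). The isotropic harmonic basin, its force constant, and all names are hypothetical stand-ins for a real inherent-structure basin.

```python
import random


def harmonic_temperature_estimate(basin_depths, n_dof):
    """Equipartition estimate: T is about 2 * <U - U_IS> / n_dof (k_B = 1)."""
    mean_depth = sum(basin_depths) / len(basin_depths)
    return 2.0 * mean_depth / n_dof


# Hypothetical parameters: temperature, force constant, degrees of freedom.
rng = random.Random(1)
T, k, n_dof = 0.6, 2.0, 30
sigma = (T / k) ** 0.5  # canonical width of each coordinate in the basin

# Energy above the minimum, U - U_IS, for canonically sampled displacements.
depths = []
for _ in range(4000):
    x = [rng.gauss(0.0, sigma) for _ in range(n_dof)]
    depths.append(0.5 * k * sum(xi * xi for xi in x))

t_est = harmonic_temperature_estimate(depths, n_dof)
```

For a real landscape the basins are anharmonic and differ in curvature, so the estimate is only expected to track the temperature approximately.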
The basin depth of the PEL as a function of is most directly seen in the profile of in Fig. 4(c). A sharp drop in , to a constant value, occurs for good folders between and . This means that the system quickly reaches the native state basin as crosses , while shows a slow decay due to residual frustrations among collapsed states in poor and intermediate folders. The plateau regions in for reflect the homogeneous inherent structures in characteristic of the denatured states of good folders as observed in in Fig. 3(a). Our analysis suggests that the Go-like model has a near-Levinthal-like landscape [22], in which the denatured state landscape is gently slanted, while the folded state landscape has a steep funnel to the native state, resulting from the native-centric interactions stabilizing the native state while homogenizing the denatured states.
In summary, we have shown that a sharp variation in the basin depth, , from the denatured states to the native state, gives rise to the first-order-like folding in good folders. The basin depth is reflected in the statistical temperature through the relations . Two-state protein folding is associated with marked transitions of the potential energy gradient , the average IS energy , and the curvature .
Acknowledgments
We are grateful to the National Science Foundation (Grant No. CHE-0750309) and the National Institutes of Health (Grant No. RO1 GM076688) for the generous support of our research.
References
- [1] Anfinsen C, Science 181, 223 (1973).
- [2] Fersht AR, Curr. Opin. Struct. Biol. 7, 3 (1997); Jackson SE, Folding Des. 3, R81 (1998).
- [3] Bryngelson JD and Wolynes PG, Proc. Natl. Acad. Sci. U.S.A. 84, 7524 (1987).
- [4] Leopold PE, Montal M, and Onuchic JN, Proc. Natl. Acad. Sci. U.S.A. 89, 8721 (1992).
- [5] Wales DJ, Energy Landscapes (Cambridge University Press, Cambridge, England, 2003).
- [6] Dill KA and Chan HS, Nat. Struct. Biol. 4, 10 (1997).
- [7] Nymeyer H, Garcia AE, and Onuchic JN, Proc. Natl. Acad. Sci. U.S.A. 95, 5921 (1998).
- [8] Sali A, Shakhnovich EI, and Karplus M, Nature (London) 369, 248 (1994).
- [9] Klimov DK and Thirumalai D, Phys. Rev. Lett. 76, 4070 (1996).
- [10] Mazzoni LN and Casetti L, Phys. Rev. Lett. 97, 218104 (2006); Phys. Rev. E 77, 051917 (2008).
- [11] Gross DHE and Kenney JF, J. Chem. Phys. 122, 224111 (2005).
- [12] Wales DJ and Berry RS, Phys. Rev. Lett. 73, 2875 (1994).
- [13] Junghans C, Bachmann M, and Janke W, Phys. Rev. Lett. 97, 218103 (2006).
- [14] Hernandez-Rojas J and Gomez Llorente JM, Phys. Rev. Lett. 100, 258104 (2008).
- [15] Rugh HH, Phys. Rev. Lett. 78, 772 (1997).
- [16] Jepps OG, Ayton G, and Evans DJ, Phys. Rev. E 62, 4757 (2000).
- [17] Kim J and Keyes T, J. Phys. Chem. B 112, 954 (2008).
- [18] Honeycutt JD and Thirumalai D, Proc. Natl. Acad. Sci. U.S.A. 87, 3526 (1990).
- [19] Kim J, Straub JE, and Keyes T, Phys. Rev. Lett. 97, 050601 (2006); J. Chem. Phys. 126, 135101 (2007).
- [20] Stillinger FH and Weber TA, Phys. Rev. A 28, 2408 (1983); Science 225, 983 (1984).
- [21] In a harmonic approximation to the PEL around each IS configuration and is the Hessian at the IS. Denoting .
- [22] Kaya H and Chan HS, Proteins: Struct., Funct., Genet. 52, 510 (2003).