Abstract
We study the competition between topological effects and sequence inhomogeneities in determining the thermodynamics and the un/folding kinetics of a β-hairpin. Our work utilizes a new exactly solvable model that allows for arbitrary configurations of native contacts. In general, the competition between heterogeneity and topology results in a crossover of the dominant transition state. Interestingly, near this crossover, the single reaction coordinate picture can be seriously misleading. Our results also suggest that inferring the folding pathway from unfolding simulations is not always justified.
In their important work (1), Muñoz et al. discovered a sharp un/folding transition between two dominant thermodynamic ensembles during hairpin formation in a small peptide, the 16-residue C-terminal fragment of streptococcal protein G (sequence GEWTYDDATKTFTVTE). In such a system, it is often assumed that the formation of a folding droplet (2, 3) starting from the β-turn (4, 5) is the determining factor in the transition kinetics. However, given that peptides are inherently inhomogeneous, nucleation sites might develop preferentially at strong heterogeneities and thereby change the transition kinetics. This was predicted to be the case for this specific system by a recent unfolding simulation (6). Thus, we are motivated to study the factors that govern the competition between topological effects, such as turns, and heterogeneity.
To accomplish this, we need a model that allows for all of these partially folded configurations to occur and compete. The popular “single sequence stretch approximation” (4) does not suffice in this respect. As introduced by Schellman (7) and used, for example, in refs. 8–10, this approach assumes that one need keep only configurations with a contiguous native peptide conformation and furthermore use only a discrete degree of freedom for native vs. nonnative backbone states. It is clear, however, that this approach rules out the possibility of heterogeneity dominance. For the aforementioned fragment, requiring contiguous native conformations disallows early formation of the hydrophobic cluster with contacts W-V and Y-F.
One could update this model as suggested in refs. 3 and 8–10 and include multiple sequences. But we prefer instead to use an exactly solvable hairpin model, which we introduce below. Aside from its capability of allowing nucleation at the heterogeneous “hot spots,” our model has the added advantage of correctly treating the fact that the backbone lives in three-dimensional continuous space. Within this model, we study the influence of heterogeneity on the overall cooperativity of the transition as well as on the aforementioned transition kinetics. We also comment on the validity of the assumption that folding and unfolding use related pathways and on the validity of determining kinetic barriers from a landscape based on a single reaction coordinate.
The Model
We consider a polymer composed of two interacting Gaussian strands connected by a β-turn. Our basic assumption (11) is that the effective Hamiltonian can be written in terms of the spatial coordinates of the interacting residues that form native interstrand H-bonds, x⃗i(s); here i is a residue index counted from the β-turn (i = 0), and the superscript s = N, C stands for the N- or C-terminal directed polymer. An exact model would, of course, require including nonnative interactions as well as loop self-avoidance effects; we expect these to be relatively unimportant for properly designed sequences (that favor native terms), which fold into a linear hairpin with no need for packing together multiple pieces of the peptide (limiting the role of self-avoidance). We can decompose the coordinates into the (irrelevant) mean coordinate X⃗i ≡ (1/2)(x⃗i(N) + x⃗i(C)) and the interstrand separation x⃗i ≡ (x⃗i(N) − x⃗i(C)). Also, we use the notation Δi = 1 or 0 to indicate whether the ith residue pair is in contact. Specifically, we define Δi = 1 if the local interstrand distance |x⃗i| falls into an effective attraction window |x⃗i| ≤ r0,i and Δi = 0 otherwise. This “box” approximation has also been used elsewhere (11, 12).
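The coordinate decomposition and the box criterion for Δi can be illustrated with a minimal sketch. The function name `contact_indicators`, the array layout (one row per residue pair), and the sample coordinates are assumptions made for illustration only; they are not part of the model definition.

```python
import numpy as np

def contact_indicators(x_N, x_C, r0):
    """Split strand coordinates into mean and separation, then apply the box rule.

    x_N, x_C : arrays of shape (n, 3), positions of paired residues on the
               N- and C-terminal strands (hypothetical layout).
    r0       : scalar or array of shape (n,), attraction window per pair.
    """
    x_N = np.asarray(x_N, dtype=float)
    x_C = np.asarray(x_C, dtype=float)
    X_mean = 0.5 * (x_N + x_C)            # mean coordinate X_i
    x_sep = x_N - x_C                     # interstrand separation x_i
    dist = np.linalg.norm(x_sep, axis=1)  # |x_i|
    delta = (dist <= np.asarray(r0)).astype(int)  # Delta_i = 1 inside the window
    return X_mean, x_sep, delta

# Illustrative coordinates: separations of 0.1, 0.5 and 2.0 with r0 = 0.5
x_N = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
x_C = np.array([[0.0, 0.1, 0.0], [1.0, 0.5, 0.0], [2.0, 2.0, 0.0]])
X_mean, x_sep, delta = contact_indicators(x_N, x_C, r0=0.5)
print(delta)  # [1 1 0]
```

Only the separation enters the contact rule, which is why the mean coordinate can be treated as irrelevant.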
The first component of our Hamiltonian accounts for the entropy of unfolded segments. Specifically, for a completely unfolded hairpin from residue i = n (distal end) to i = 0, the loop entropy is governed by a Gaussian Hamiltonian (1/2) Σ_{i=1}^{n} κi |∇x⃗i|² + (κ0/2) |x⃗0(N) − x⃗0(C)|², where ∇x⃗i = x⃗i − x⃗_{i−1} is the vector connecting nearest-neighbor residue pairs, and κi is the unfolded local backbone stiffness. On the other hand, for any residue i that is in contact, we approximate r0,i ∼ 0 (as compared to (βκi)^(−1/2)) and replace the Gaussian term of contiguous unfolded residues by the term λi |x⃗_{i±1}|². Finally, we assume that possible H-bonds start from i = 1 and integrate out x⃗0(N,C). Thus, we obtain the loop Hamiltonian
1 |
where Jij = 1 if i, j are nearest neighbors and 0 otherwise, σ = κ0/(κ0 + κ1), and γij = λi/κj.
The second ingredient of the Hamiltonian is the interstrand interaction Hint. First, there is a pairwise interaction to mimic the native H-bond formation and side-chain packing between the two strands. This has the form −V1,iΔi with V1,i ≥ 0. Second, there is a term representing the possible formation of hydrophobic side-chain clusters. Because the side-chain orientations alternate along the hairpin structure (see figure 1 in ref. 1), the typical situation leading to clusters is the presence of two hydrophobic residues at locations h and h + 2, with one of them carrying a large aromatic ring (e.g., tryptophan). From the unfolding simulation (6), the compactness and hence the solvation of the hydrophobic cluster are improved if both residues are in the contact position. This leads to a many-body term φ ≡ −V2ΔhΔh+2. In general, we may have more complicated high-order terms because of the presence of multiple hydrophobic residues. Our entire Hamiltonian is then
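\[
H = H_{\mathrm{loop}} + H_{\mathrm{int}} = H_{\mathrm{loop}} - \sum_{i=1}^{n} V_{1,i}\,\Delta_i - V_2\,\Delta_h\,\Delta_{h+2} \qquad [2]
\]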
The Partition Function
The partition function of our model is given by
3 |
where d = 3 is the dimensionality of interest. It will be of interest to decompose this sum into terms with a fixed number of contacts Q (6, 13). Specifically, we define Zn,Q as the thermodynamic probabilities of the subensembles specified by the number of contacts Q. We will now discuss briefly how to develop a recursion relation that allows for the computation of these objects. A more detailed discussion, albeit for a homogeneous model, appears elsewhere (11).
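To make the decomposition by contact number concrete, the following is a brute-force sketch that groups all 2^n contact patterns by Q and sums their Boltzmann weights. It uses only the interaction terms described in the text (the pairwise energies V1,i and the cluster term V2ΔhΔh+2); the loop entropy is represented by a placeholder `loop_weight` callback that defaults to 1, which is an assumption for illustration and not the Gaussian loop factor of the model. The function names are likewise hypothetical.

```python
import itertools
import math

def interaction_energy(delta, V1, V2, h):
    """H_int = -sum_i V1[i]*Delta_i - V2*Delta_h*Delta_{h+2}; residues are 1-based."""
    energy = -sum(v * d for v, d in zip(V1, delta))
    energy -= V2 * delta[h - 1] * delta[h + 1]  # residues h and h+2
    return energy

def z_by_contacts(n, V1, V2, h, beta, loop_weight=lambda delta: 1.0):
    """Return [Z_{n,0}, ..., Z_{n,n}] by enumerating all contact patterns.

    loop_weight is a stand-in for the entropic weight of the unfolded
    segments (assumed to be 1 here; the model uses Gaussian path integrals).
    """
    Z = [0.0] * (n + 1)
    for delta in itertools.product((0, 1), repeat=n):
        boltzmann = math.exp(-beta * interaction_energy(delta, V1, V2, h))
        Z[sum(delta)] += loop_weight(delta) * boltzmann
    return Z

# Illustrative parameters (not taken from the paper)
n, h, beta = 6, 2, 1.0
Z = z_by_contacts(n, V1=[1.0] * n, V2=2.0, h=h, beta=beta)
F = [-math.log(z) / beta for z in Z]  # free energy profile F(Q) = -kT ln Z_{n,Q}
```

Enumeration scales as 2^n and is useful only as a check for short hairpins; the recursion relations discussed next avoid this cost.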
Zn,Q is determined by the energy gain of Q local contacts and the entropy of the intervening unfolded segments. To compute this entropy, we define a functional Mij as the path integral of a completely unfolded segment connecting residue j and i, weighted by additional terms on the two ends,
4 |
with the sums running from j to i in Eq. 1 for Hloop. It is easy to derive the recursive relationship
5 |
supplemented by the boundary condition Mj,j (μ1, μ2) = [μ1 μ2/(μ1 + μ2)]^(3/2) − qj, with qi = r0,i³ []³. By definition, Mi,j (μ1, μ2) = 0 if j > i. For consistency with the approximations utilized above, we should set q = 0 in these formulas.
Next, we define Wn,Q as the ensemble specified by contact number Q with the further constraint that the polymer distal end i = n is in the contact position; we have Ws,a = 0 if a ≤ 0. Then, as in refs. 1, 3, 8, 10, and 11, we can use these as building blocks to construct Zn,Q. Specifically, in the absence of any clustering effect, we have for 1 ≤ s ≤ n and 1 ≤ Q ≤ s,
6 |
7 |
with ξk = (γk+1,k)⁻¹, ξ′k = (γk,k+1)⁻¹ for 1 ≤ k ≤ n, ξ0 = 1/σ, and Zs,0 = Ms,1 (∞, ξ0) (the coil state with Q = 0). Here, the strength of the native contacts is determined by αs ≡ qs e^(βV1,s), which is assumed to remain finite even as we take the limit of small q.
Finally, we generalize this formula to include the clustering effect. We assume one hydrophobic cluster involving the two side-chain groups at residues h and h + 2. The recursion relations Eq. 7 will be unchanged for s ≠ (h + 2), but for s = h + 2, we have
8 |
We next turn to an evaluation of the equilibrium phase diagram of our model.
Phase Diagrams
It was shown in ref. 11 that one can obtain two-state behavior in homogeneous models of this class for large enough values of the folding-induced stiffness γ. Here, our first concern is the effect that the heterogeneities can have on system cooperativity. To address this issue, we consider the simple situation of constant κ, γ, V1, r0, and one hydrophobic cluster. We assume this cluster gives rise to a many-body term as detailed above and also, via side-chain packing, to an increase δV1 of the (otherwise homogeneous) contact energy for the two hydrophobic residues.
To present our results, we rescale all the energies by V1 and introduce the dimensionless parameter ν0 via q = ν0 [V1/kBT]^(3/2). In Fig. 1a, we plot for the case of ν0 = 10⁻⁴ the variation of the folding temperature Tf as we change the polymer length n. In general, there is a maximal polymer length nmax that allows for a two-state transition between the completely folded hairpin Zn,n and the coil state Zn,0; above nmax, the folding transition is defined as between the coil state and a dominant partially folded ensemble. We should mention that such a dominant ensemble is not fixed; as the temperature is reduced, it moves to the completely folded state. Also, a larger γ corresponds to a bigger nmax, because more entropy will be lost in configurations that contain interleaved folded and unfolded segments (11).
We now focus on the effect of heterogeneity. For moderate V2 and δV1, Tf is slightly shifted, but nmax is more or less unchanged. As V2 is increased further, however, the partially folded state with an intact cluster becomes more significant, and the system loses two-state behavior between Zn,n and Zn,0 at smaller n (Fig. 1 b and c). The extent to which one would be able to detect the presence of important partially folded states in the thermodynamics depends of course on the sensitivity of the probes to whether the hairpin is fully folded vs. their sensitivity to more local features, such as whether the hydrophobic cluster has been formed.
Simulations
To get some idea of how folding and unfolding occur in our model, we performed Langevin simulations on 50 independent hairpins, using the Verlet algorithm proposed in ref. 14. Specifically, we have the time step Δt = 10⁻⁴ with the mass m and V1 set to unity. The box potential Δi (|x⃗i|) is approximated by exp[−(|x⃗i|/r0)^s] with s = 100. The damping coefficient is taken from ref. 14. Also, we used a stiffness κ = 0.1 to allow for good visualization of the individual residue kinetics; too large a κ would force all residues to un/fold almost simultaneously.
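The smoothed contact function used in the simulations can be sketched as follows, to show how closely it tracks the hard box for s = 100. The function name `smooth_contact` and the sample distances are assumptions for illustration; the integrator and damping of ref. 14 are not reproduced here.

```python
import numpy as np

def smooth_contact(dist, r0, s=100):
    """Differentiable stand-in for the box function: exp[-(|x|/r0)^s].

    Close to 1 for |x| < r0 and close to 0 for |x| > r0 when s is large.
    For distances far beyond r0 the power can overflow to inf (NumPy may
    emit a warning); the result is then exactly 0.
    """
    ratio = np.asarray(dist, dtype=float) / r0
    return np.exp(-ratio ** s)

# Illustrative distances in units of r0
inside = smooth_contact(0.9, 1.0)   # just inside the window
edge = smooth_contact(1.0, 1.0)     # exactly at the window edge, equals exp(-1)
outside = smooth_contact(1.1, 1.0)  # just outside the window
```

Because the smoothed function is differentiable, it yields a finite (if sharply peaked) force near |x⃗i| = r0, which a hard box cannot provide in a Langevin integration.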
For small inhomogeneity, we found, as expected, that the topology determined the folding pathway, i.e., that folding took place from the β-turn and unfolding from the distal end (data not shown). In the case of strong heterogeneity, however, we observed dramatic changes to the un/folding patterns (Fig. 2). Now, the hairpin unfolds from both the β-turn and distal end (similar to the observations in ref. 6), leaving the decomposition of the hydrophobic cluster for last. Folding occurs in the reverse manner, with the cluster forming first. Thus, there is a crossover in the kinetics from being topology dominated to being heterogeneity controlled. Next, we analyze this crossover by studying the free energy landscape of our model.
The Transition Kinetics
We begin by assuming that the transition rate has an Arrhenius form ktx ∼ e^(−βΔF), where ΔF is the free energy difference between the transition and metastable states. In our system, the factor e^(−βΔF) is just the ratio of the thermodynamic probabilities of the transition and metastable ensembles. In the absence of heterogeneity, the metastable states are simply the coil state for folding (i.e., if T < Tf) or the folded state for unfolding (if T > Tf). In the presence of heterogeneity, however, there might be additional metastable states (6). As in the simulations, we focus our discussion on the case of having a “hot spot” near the hairpin center. Specifically, two cases are considered: (i) two hydrophobic residues at locations h and h + 2 with δV1,i = δV1 (δi,h + δi,h+2) plus an additional clustering energy V2, and (ii) a single hydrophobic residue at the location h + 2 with δV1,i = δV1 δi,h+2.
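The link between the Arrhenius form and a probability ratio can be written out in one short step. In this sketch, Y_ts and Y_ms are shorthand introduced here (not the paper's notation) for the thermodynamic probabilities of the transition and metastable ensembles:

```latex
\begin{align}
F_{\mathrm{ts}} &= -k_B T \ln Y_{\mathrm{ts}}, \qquad
F_{\mathrm{ms}} = -k_B T \ln Y_{\mathrm{ms}}, \\
\Delta F &= F_{\mathrm{ts}} - F_{\mathrm{ms}}
          = -k_B T \ln \frac{Y_{\mathrm{ts}}}{Y_{\mathrm{ms}}}, \\
k_{tx} &\sim e^{-\beta \Delta F} = \frac{Y_{\mathrm{ts}}}{Y_{\mathrm{ms}}}.
\end{align}
```

Any common normalization of the probabilities cancels in the ratio, so unnormalized weights can be used directly.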
To clarify the transition rate and hence the rate-limiting structure, we use the droplet concept (2). A droplet is a configuration containing one or more sets of contiguous folded residues, and our basic method involves computing the free energy of differing droplets. For the homogeneous case, it is easy to see that we need consider only two possible droplets: one folded from the β-turn (the β-turn droplet) and one from the distal end (the distal-end droplet). The thermodynamic probabilities of these two droplets with Q contacts, denoted Yn,Q(β) and Yn,Q(d), can be obtained simply by using the M functionals of Eq. 4 (11).
The situation is more complicated in the presence of heterogeneity. Now, we should also consider droplets surrounding the heterogeneity (a “hetero-droplet”) with a thermodynamic probability Yn,Q(h). These configurations contain contiguous unfolded segments between the heterogeneity and topological constraints. In the single heterogeneity case, a hetero-droplet with Q contacts might contain one (or two) unfolded “bubbles” between either the β-turn or the distal end and the heterogeneity (or both); for the cluster case, the situation can be even more complicated, as the intervening residue i = h + 1 might also be in or out of contact. Obviously, there can be many configurations with the same number of contacts. Hence, we define the hetero-droplet as the configuration with the maximal thermodynamic probability. Fig. 3 a1,2 and b1,2 show the hetero-droplets in the single and double heterogeneity cases.
We determine the favored transition pathway by determining the barrier heights separating differing Q values for a set of droplets. Typical results from calculating these free energies are shown in Figs. 4a and 5a for the two different heterogeneity models. For unfolding, the model exhibits a metastable intermediate state at Q = h + 2 = 6. The rate-limiting step then involves starting from this state and breaking additional H-bonds between the heterogeneity and the β-turn. For the case of the more complex heterogeneity model in Fig. 5, there is an additional intermediate state at smaller Q. For this state to unfold fully, one needs to traverse an additional barrier, which for the case shown is smaller than the previous one. Hence for unfolding, both models yield similar pathways.
For folding, however, the presence of the new intermediate state has a large effect; we note in passing that the maximal number of intermediates is correlated with the number of heterogeneities. In the case of a single heterogeneity, folding proceeds exactly as the reverse of unfolding. For the clustered inhomogeneity, on the other hand, the largest barrier is the first one encountered from the coil state, namely the formation of the cluster by traversing the Q = 1 transition state.
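The asymmetry between folding and unfolding can be illustrated with a small sketch that walks along a free energy profile F(Q), records each climb from a metastable minimum to the next maximum, and reports the largest one in each direction. The profile values and the function names `step_barriers` and `rate_limiting_step` are hypothetical and chosen only to show the bookkeeping; they are not results from the model.

```python
def step_barriers(F):
    """List (Q_min, Q_top, height) for each uphill climb met from left to right.

    A profile that ends while still climbing reports its end point as the top,
    which is not a true transition state.
    """
    barriers = []
    i, last = 0, len(F) - 1
    while i < last:
        # slide downhill to the next local minimum (metastable state)
        while i < last and F[i + 1] <= F[i]:
            i += 1
        start = i
        # climb uphill to the next local maximum (transition state)
        while i < last and F[i + 1] > F[i]:
            i += 1
        if i > start:
            barriers.append((start, i, F[i] - F[start]))
    return barriers

def rate_limiting_step(F, direction="fold"):
    """Largest single climb for folding (Q increasing) or unfolding (Q decreasing)."""
    n = len(F) - 1
    if direction == "fold":
        steps = step_barriers(F)
    else:
        # walk the reversed profile, then map indices back to contact numbers
        steps = [(n - a, n - b, height) for a, b, height in step_barriers(F[::-1])]
    return max(steps, key=lambda step: step[2])

# Hypothetical profile with one intermediate at Q = 2 (units of kT)
F = [0.0, 3.0, 1.0, 2.5, 0.0]
fold = rate_limiting_step(F, "fold")      # (0, 1, 3.0): first barrier from the coil
unfold = rate_limiting_step(F, "unfold")  # (4, 3, 2.5): first barrier from the folded state
```

In this toy profile the largest folding barrier sits at Q = 1 while the largest unfolding barrier sits at Q = 3, so the rate-limiting transition state is not shared between the two directions.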
Overall, there is only a very narrow range of parameters for which two-state behavior coexists with heterogeneity-dominated kinetics; we therefore expect that the experimental system lies in the zipping kinetics regime. Note that at higher temperatures, the range of heterogeneous domination broadens; this might account for the disagreement between the high T results of ref. 6 and other treatments to date (1). Two other points are worth mentioning. We found that the crossover at a smaller β-turn rigidity (σ) requires less heterogeneity, consistent with a recent 33-residue GCN4 peptide experiment (15). Also, we examined the validity of directly using the free energy profile with Q as the single reaction coordinate (Zn,Q in Eq. 1) to predict the kinetic barrier. Clearly, this approach lumps together all states at a given Q as opposed to the droplet configurations that have a fixed pattern of the Q native contacts. Near the crossover, we observed a systematic underestimate of the free energy barrier via the single reaction coordinate landscape method (Fig. 5c). A possible reason for this is that near the crossover, the system has two dominant transition ensembles that are kinetically unconnected, as also noticed in ref. 8.
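Why lumping states by Q can underestimate the barrier is seen in a toy calculation with two droplet families that are assumed to be kinetically unconnected. The free energies below are hypothetical numbers (in units of kT, relative to the coil state), and the function names are illustrative.

```python
import math

def lumped_profile(paths, beta=1.0):
    """Single-coordinate profile: F(Q) = -(1/beta) ln sum_p exp(-beta F_p(Q))."""
    n_points = len(paths[0])
    return [
        -math.log(sum(math.exp(-beta * path[q]) for path in paths)) / beta
        for q in range(n_points)
    ]

def barrier(profile, f_start=0.0):
    """Height of the highest point of a profile above the starting state."""
    return max(profile) - f_start

# Hypothetical free energies of intermediates Q = 1, 2, 3 for two droplet types
turn_droplet = [3.0, 2.0, 1.0]    # transition state at Q = 1
hetero_droplet = [1.5, 3.0, 2.0]  # transition state at Q = 2

lumped = lumped_profile([turn_droplet, hetero_droplet])
lumped_barrier = barrier(lumped)
# Without interconversion, the system must cross the full barrier of one family
true_barrier = min(barrier(turn_droplet), barrier(hetero_droplet))
```

At every Q the lumped free energy lies below the lower of the two droplet free energies, so when the two transition states occur at different Q the lumped barrier (about 1.7 kT here) falls below the barrier along either connected pathway (3 kT here).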
Discussion
In this paper, we utilized a new exactly solvable β-hairpin model to investigate a competing droplet picture of folding and unfolding. In this model, folding energies occur only if the residues are very close to each other, and this allows us to give a discretized labeling of configurations without losing the ability to get reliable estimates of the loop entropies. Our approach does not suffer from the difficulties of the single sequence stretch approximation, in that it includes all possible arrangements of native structure, not merely contiguous ones (1, 3, 8, 10). Our results were obtained by detailed analysis of the free energy landscape as well as by direct simulation. The most important finding is that there is a direct competition between pathways dominated by topological considerations and those dominated by heterogeneous free energy terms. This leads to a crossover in the structure of the transition state, which could be seen experimentally as one varies the strength and location of any hydrophobic clusters and/or the rigidity of the β-turn. Our results suggest that if we demand that the system have real two-state behavior, it is likely that the folding/unfolding will be topology dominated.
Interestingly, the single reaction coordinate approach becomes rather inaccurate near this crossover as it lumps together parts of configuration space that share the same number of contacts but are in fact not connected by the model kinetics. Also, we show that it can be incorrect to assume that the most significant free energy barriers are the same for folding and unfolding. Because it is common practice to simulate unfolding and to infer thereby the computationally inaccessible folding process, this caveat will be important to keep in mind.
Of course, no model (including all-atom potentials) can be perfect. We have used a simplified Gaussian approximation for the backbone and have also neglected all nonnative contacts. Our study was for a single heterogeneous region (either one residue or one cluster), but one can extend our methodology to the multiple heterogeneity case. We are hopeful that this modeling approach will serve as a tractable way of putting in real biological complexity without sacrificing the ability to do a full analysis of the folding kinetics.
Acknowledgments
H.L. and C.G. acknowledge the support of the National Science Foundation under Grant DMR98-5735. D.A.K. acknowledges the support of the Israel Science Foundation.
Footnotes
This paper was submitted directly (Track II) to the PNAS office.
Article published online before print: Proc. Natl. Acad. Sci. USA, 10.1073/pnas.190103297.
Article and publication date are at www.pnas.org/cgi/doi/10.1073/pnas.190103297
References
- 1. Muñoz V, Thompson P A, Hofrichter J, Eaton W A. Nature (London) 1997;390:196–199. doi: 10.1038/36626.
- 2. Wolynes P G. Proc Natl Acad Sci USA. 1997;94:6170–6175. doi: 10.1073/pnas.94.12.6170.
- 3. Galzitskaya O V, Finkelstein A V. Proc Natl Acad Sci USA. 1999;96:11299–11304. doi: 10.1073/pnas.96.20.11299.
- 4. Muñoz V, Henry E R, Hofrichter J, Eaton W A. Proc Natl Acad Sci USA. 1998;95:5872–5879. doi: 10.1073/pnas.95.11.5872.
- 5. Kilonski B, Ilkowski W, Skolnick J. Biophys J. 1999;77:2942–2953. doi: 10.1016/S0006-3495(99)77127-4.
- 6. Pande V S, Rokhsar D S. Proc Natl Acad Sci USA. 1999;96:9062–9067. doi: 10.1073/pnas.96.16.9062.
- 7. Schellman J A. J Phys Chem. 1958;62:1485–1494.
- 8. Alm E, Baker D. Proc Natl Acad Sci USA. 1999;96:11305–11310. doi: 10.1073/pnas.96.20.11305.
- 9. Riddle D S, Grantcharova V P, Santiago J V, Alm E, Ruczinski I, Baker D. Nat Struct Biol. 1999;6:1016–1024. doi: 10.1038/14901.
- 10. Muñoz V, Eaton W A. Proc Natl Acad Sci USA. 1999;96:11311–11316. doi: 10.1073/pnas.96.20.11311.
- 11. Guo C, Levine H, Kessler D A. Phys Rev Lett. 2000;84:3490–3493. doi: 10.1103/PhysRevLett.84.3490.
- 12. Takada S, Wolynes P G. J Chem Phys. 1997;107:9585–9598.
- 13. Hao M-H, Scheraga H A. Acc Chem Res. 1998;31:433–440.
- 14. Veitshans T, Klimov D, Thirumalai D. Folding Des. 1997;2:1–22. doi: 10.1016/S1359-0278(97)00002-3.
- 15. Morna L B, Schneider J P, Kentsis A, Reddy G A, Sosnick T R. Proc Natl Acad Sci USA. 1999;96:10699–10704. doi: 10.1073/pnas.96.19.10699.