Abstract
We study the folding thermodynamics and kinetics of the Pin1 WW domain, a three-stranded β-sheet protein, using all-atom (except nonpolar hydrogens) discontinuous molecular dynamics simulations of a Gō model at various temperatures. The protein exhibits two-state folding kinetics near the folding transition temperature. The simulations agree well with the experimental measurements of the Gruebele group and provide new insight into the structure of the transition state, which is difficult to capture directly in experiments. The simulations also reveal that the folding pathways near the transition temperature and at low temperatures differ markedly, and an intermediate state is predicted at low temperature. The transition state of this small β-protein at its folding transition temperature has a well-established hairpin 1 made of the β1 and β2 strands, whereas its low-temperature kinetic intermediate has a formed hairpin 2 composed of the β2 and β3 strands. Theoretical results are compared with other simulation results as well as available experimental data. This study confirms that specific side-chain packing in an all-atom Gō model can yield a reasonable prediction of the specific folding kinetics of a given protein. The different folding behaviors at different temperatures are interpreted in terms of the interplay of entropy and enthalpy in the folding process.
INTRODUCTION
Slow folding rates of β-proteins make it computationally challenging to study their folding kinetics. As a result, most simulation studies on folding of β-proteins are limited to small β-sheet domains (1–6) with a few exceptions (7–10). WW domains constitute a family of three-stranded antiparallel β-sheet proteins comprised of ∼34–44 amino acids (11–14). They are named after two highly conserved tryptophans in the strands. Some WW domains are suitable for theoretical folding studies (15–22) because they unfold and refold reversibly and cooperatively (two-state kinetics) (23,24). Moreover, they are amenable to well-defined mutagenesis without significant changes in their structures (16,25–27). Protein Pin1 is a two-domain protein that regulates early mitotic events. Its WW domain binds proline-rich peptides, and another domain of Pin1 catalyzes the proline cis/trans-isomerization. The high-resolution x-ray structure of the Pin1 WW domain has been determined (1.35 Å) and only residues 6–39 are visible (28). The domain contains two hydrophobic clusters at the opposite side of strands. Core 1 contains Leu7, Trp11, Tyr24, and Pro37; Core 2 has Tyr23, Phe25, and the n-propyl portion of Arg14, as shown in Fig. 1 A. The NMR structure of this domain adopts a more flexible loop 1 (residues Ser16 to Arg21) in solution compared to the x-ray structure (29).
FIGURE 1.
(A) A cartoon display of the global-minimum-energy structure of an all-atom off-lattice model of Pin1 WW domain (Lys6 to Gly39). Three β-strands, two loop regions and coil regions, the hydrophobic Core 1 (Leu7, Trp11, Tyr24, Pro37), and Core 2 (Arg14, Tyr23, Phe25) are specifically indicated. Drawn with VMD (65). (B) Two-dimensional representation of the Pin1 WW domain. In addition to the secondary-structure elements, the two conserved Trp residues are highlighted by squares. Ten main-chain H-bonds and four side-chain H-bonds are shown by the dark dashed lines and the light dashed lines, respectively, and each H-bond is indicated by an arrow from a hydrogen donor to the associated hydrogen acceptor. Core 1 is highlighted with a dashed-line border and Core 2 with a solid-line border. (C) The contact map of the native structure. A residue is in contact with another residue if there is at least one square-well atomic contact between them. Square symbols denote β1-β2 contacts; diamonds, β2-β3 contacts; triangles, β1-β3 contacts; crosses, contacts among Core 1 residues; pluses, contacts among Core 2 residues; solid squares, contacts between loop 1 residues and other residues; solid diamonds, contacts between loop 2 residues and other residues; and circles, all other contacts. The total number of native residue-residue contacts (|i−j| ≥ 2) is 91. The number of nonlocal native contacts (|i−j| > 2) is 65.
The folding transition of the Pin1 WW domain has been studied intensively by experimental techniques. A series of side-chain mutagenesis experiments by the Gruebele group (26) suggested that the locally interacting loop 1 controls the folding rate of the Pin1 WW domain, with nonclassical φ-values >1 at the physiological temperature, while the formation of the hydrophobic Core 1 contributes significantly to thermal stability. The main-chain amide-to-ester (A-to-E) mutation studies of the Pin1 WW domain also demonstrated that, at 50°C, which is close to the midpoint temperature of thermal denaturation (Tm = 59°C), its folding transition state (TS) has nativelike main-chain hydrogen bonds (H-bonds) in loop 1 and less developed H-bonds in loop 2 and other regions (27).
The Pin1 WW domain has also been studied theoretically. Karanicolas et al. compared the folding kinetics and thermodynamics of three WW domains and found that differences in hydrophobic packing lead to different folding behaviors (biphasic kinetics of the FBP WW domain versus two-state kinetics of the Pin1 and hYAP WW domains) (17,18). Brown et al. (30) refolded proteins L and G with a three-letter code sequence (hydrophobic, hydrophilic, and neutral) and found that this model could also lead to correct folding of the Pin1 WW domain. Different folding pathways, however, have been suggested by different theoretical studies. Using an improved Cα model that includes nonnative interactions, Karanicolas et al. proposed that the hairpin β2-β3 (hairpin 2) of the Pin1 WW domain folds before hairpin β1-β2 (hairpin 1) and that the hydrophobic Core 1 forms last (19). More recently, Cecconi et al. studied the transition state ensemble (TSE) of the Pin1 WW domain with a Cα-based model with three-letter codes and found that folding was initiated from the hydrophobic contacts in β1-β2 and β2-β3 and completed by zipping hairpin β1-β2 and hairpin β2-β3 toward loop 1 and loop 2, respectively (20).
In this article, we make an in-depth study of the folding pathway of the Pin1 WW domain based on an all-atom Gō model (31). The constant-temperature discontinuous molecular dynamics (MD) simulation was employed to examine the thermodynamic and kinetic behaviors of the Pin1 WW domain. The simulation technique and the associated Gō model have been used successfully to simulate folding thermodynamics and kinetics of a three-helix bundle and a β-hairpin (32–35). Here, we use the discontinuous MD technique because a discontinuous MD simulation with a square-well potential is computationally more efficient than a normal MD simulation for the following reasons. First, only short-range interactions are involved; second, force calculations are not required in the discontinuous MD simulations; and third, particles move at constant velocities between collisions and there is an efficient search algorithm for the collision time.
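The event-driven character of the method can be illustrated with a minimal Python sketch. It computes the time at which two atoms moving at constant velocity first reach a given center-to-center separation, which is the basic quantity a discontinuous MD code searches for. The function name `collision_time` and its arguments are illustrative assumptions, not the authors' implementation, and only a hard-core approach from outside is handled here.

```python
import math

def collision_time(r1, v1, r2, v2, d):
    """Return the time until two particles first reach separation d.

    r1, r2: positions; v1, v2: constant velocities (3-component sequences).
    Returns None when the pair never reaches separation d.
    """
    dr = [a - b for a, b in zip(r2, r1)]          # relative position
    dv = [a - b for a, b in zip(v2, v1)]          # relative velocity
    b = sum(x * y for x, y in zip(dr, dv))        # dr . dv
    if b >= 0.0:                                  # not approaching each other
        return None
    vv = sum(x * x for x in dv)
    rr = sum(x * x for x in dr)
    disc = b * b - vv * (rr - d * d)              # discriminant of |dr + dv t| = d
    if disc < 0.0:                                # closest approach is larger than d
        return None
    return (-b - math.sqrt(disc)) / vv            # earliest root

# Head-on approach: the gap of 5 closes at relative speed 2 until separation 1.
t_hit = collision_time((0, 0, 0), (1, 0, 0), (5, 0, 0), (-1, 0, 0), 1.0)
```

No forces are evaluated: between events the positions are advanced linearly, and the simulation jumps directly to the earliest event time found over all pairs.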
Here, we found that there are two distinct folding kinetic pathways at different temperatures for the Pin1 WW domain. At the transition temperature, Pin1 WW domain folds without any intermediates, and its transition state is characterized by a mostly formed hairpin 1. At a lower temperature, however, Pin1 WW domain folds via a kinetic intermediate with a mostly established hairpin 2. Theoretical results are compared with available experimental data.
METHODS AND SIMULATIONS
Model
A detailed description of the discontinuous MD method can be found elsewhere (32–35). Here, we give a brief summary of the model used in the simulations. First, all the heavy-atom positions of the WW domain were taken from the x-ray structure (residues 6–39 of Protein DataBank 1pin) (28). The N-terminus and n-methyl C-terminus of the polypeptide were capped with acetyl and amine groups, respectively. The CHARMM program (36) was used to create the initial positions of polar hydrogen atoms. Steric clashes were removed by 100 minimization steps with the CHARMM polar hydrogen parameter set 19 and distance-dependent dielectric constants (37). Thus, all heavy atoms and polar hydrogens are included in the model and the total number of atoms is 361.
We constrain two bonded atoms, as well as any 1,3 angle-constrained atomic pair and 1,4 aromatic carbon pair, i and j, to a center-to-center distance between 0.9σij and 1.1σij, where σij is the atomic distance obtained by the CHARMM energy minimization as mentioned above. This constraint is denoted by an infinitely deep square-well potential
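$$
u_{ij}^{\mathrm{bond}}(r) =
\begin{cases}
\infty, & r < 0.9\,\sigma_{ij} \\
0, & 0.9\,\sigma_{ij} < r < 1.1\,\sigma_{ij} \\
\infty, & r > 1.1\,\sigma_{ij}
\end{cases}
\tag{1}
$$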
We apply a bondlike potential for improper dihedral angles to maintain the L-form chirality about a tetrahedral extended heavy atom (ω0 = 35.26439°) and to keep certain groups of atoms planar (ω0 = 0°). A 20° fluctuation is used to increase the flexibility and folding rate. That is,
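$$
u^{\mathrm{improper}}(\omega) =
\begin{cases}
\infty, & \omega < \omega_0 - 20^\circ \\
0, & \omega_0 - 20^\circ < \omega < \omega_0 + 20^\circ \\
\infty, & \omega > \omega_0 + 20^\circ
\end{cases}
\tag{2}
$$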
The nonbonded i, j pair interactions are represented by a hard-core and square-well potential
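$$
u_{ij}^{\mathrm{nb}}(r) =
\begin{cases}
\infty, & r < 0.8\,\sigma_{ij}^{\mathrm{vdW}} \\
B_{ij}\,\varepsilon, & 0.8\,\sigma_{ij}^{\mathrm{vdW}} < r < 1.2\,\sigma_{ij}^{\mathrm{vdW}} \\
0, & r > 1.2\,\sigma_{ij}^{\mathrm{vdW}}
\end{cases}
\tag{3}
$$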
where $\sigma_{ij}^{\mathrm{vdW}}$ are the van der Waals parameters from the CHARMM polar hydrogen parameter set 19 (37) and Bij are the nonbonded bead interaction strengths. The factor of 0.8 is the typical ratio between the hard-core diameters and the van der Waals parameters in the Weeks-Chandler-Andersen perturbation theory (38), while a ratio of 1.5 (= 1.2:0.8) between the square-well and hard-core diameters is typical for small molecular systems (39).
We remove all initial hard-core overlaps in the CHARMM minimized structure by a short discontinuous MD simulation. To ensure that the original native contacts found in the x-ray structure are maintained during the process, the square-well interactions for these native contacts are set to be a large negative energy (−100). The initial hard-core diameter between any two overlapped atoms is regarded as the distance found in the energy-minimized structure. We modify the hard-core diameters at each time step until the true hard-core diameters are reached and the simulation continues until all initial overlaps are removed. The resulting structure has a main-chain root-mean-squared deviation (RMSD) from the x-ray structure of only 0.69 Å and is regarded as the global minimum structure for the protein (Fig. 1 A). In all, there are 2398 square-well atom-atom overlaps (including both backbone and side-chain contacts) in the global minimum structure. Henceforth, these shall be regarded as native atomic contacts.
To ensure that the native structure (shown in Fig. 1 A) is the global energy minimum, a Gō interaction (31) is adopted. In a Gō model, the square-well interaction energy between two atoms is −ɛ (Bij = −1) for a native contact and 0 (Bij = 0) otherwise. Henceforth, the internal energy, E* (= E/ɛ), and the reduced temperature, T* (= kBT/ɛ, in which kB is the Boltzmann constant), are scaled in units of ɛ. A reduced time unit t* is also used [$t^* = t\sqrt{\varepsilon/(M\sigma_L^2)}$, where M is the average mass of the atoms and σL is 1 Å]. The total number of native residue-residue contacts is 91 (the neighbor contacts between i and i ± 1 are omitted, shown in Fig. 1 C), and the number of nonlocal native contacts (|i−j| > 2) is 65.
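As an illustration of the hard-core and square-well pair interaction of Eq. 3 combined with the Gō rule, the following Python sketch returns the reduced pair energy (in units of ɛ) for a given separation. The function and parameter names are hypothetical. The factors 0.8 and 1.2 and the values Bij = −1 or 0 are taken from the text.

```python
def go_pair_energy(r, sigma_vdw, is_native):
    """Reduced nonbonded energy u/eps for one atom pair at distance r.

    sigma_vdw: van der Waals parameter of the pair (same length unit as r).
    is_native: True if the pair is a native atomic contact (B_ij = -1),
               False otherwise (B_ij = 0).
    """
    hard_core = 0.8 * sigma_vdw      # hard-core diameter
    well_edge = 1.2 * sigma_vdw      # outer edge of the square well
    if r < hard_core:
        return float("inf")          # overlapping hard cores are forbidden
    if r < well_edge:
        return -1.0 if is_native else 0.0
    return 0.0                       # beyond the well there is no interaction

# A native pair inside the well contributes -1 (in units of eps).
u = go_pair_energy(3.0, 3.0, True)
```

Because nonnative pairs feel only the hard core, they contribute excluded-volume packing but no attraction, which is the sense in which the model is a Gō model.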
The formation of an H-bond is defined by a cutoff distance of 2.88 Å between a donor H atom and an acceptor O atom. Ten backbone H-bonds of the native structure are labeled in Fig. 1 B, which are the same as those shown in Fig. 1 of Jäger et al. (26) and in Fig. 1 of Deechongkit et al. (27); other native side-chain H-bonds are mainly located in two loop regions (26).
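The distance criterion can be turned into an H-bond formation probability over a set of saved configurations, as in the hedged Python sketch below. The frame layout (a list of per-atom coordinate tuples) and the function name are assumptions made for illustration. Only the 2.88 Å cutoff comes from the text, and whether the boundary itself counts as formed is an arbitrary choice here.

```python
import math

HBOND_CUTOFF = 2.88  # Angstrom, donor H to acceptor O, as stated in the text

def hbond_probability(frames, h_index, o_index, cutoff=HBOND_CUTOFF):
    """Fraction of frames in which the H...O distance is within the cutoff.

    frames: list of configurations, each a list of (x, y, z) tuples.
    h_index, o_index: indices of the donor hydrogen and acceptor oxygen.
    """
    formed = 0
    for coords in frames:
        if math.dist(coords[h_index], coords[o_index]) <= cutoff:
            formed += 1
    return formed / len(frames)

# Toy ensemble of four two-atom frames; the bond is formed in two of them.
toy_frames = [
    [(0, 0, 0), (2.0, 0, 0)],
    [(0, 0, 0), (3.0, 0, 0)],
    [(0, 0, 0), (0, 2.5, 0)],
    [(0, 0, 0), (0, 0, 4.0)],
]
p = hbond_probability(toy_frames, 0, 1)
```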
Simulations
The discontinuous MD simulations of a single Pin1 WW domain were carried out in the canonical ensemble. Ghost-solvent particles were employed to maintain a constant temperature through their collisions with protein atoms (40–42). Simulations were conducted at 18 temperatures from T* = 2.5 to T* = 4.8 for the thermodynamic analysis. The thermodynamic properties at temperatures that were not simulated can be obtained through the weighted histogram method (43), which is a least-squares optimization method for extracting the degeneracy factors of the energy levels and the partition function Z from multiple simulations at different temperatures. The energy and the heat capacity can be calculated from the partition function Z as
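$$
\langle E \rangle = k_B T^2 \, \frac{\partial \ln Z}{\partial T}
\tag{4}
$$
$$
C_v = \frac{\partial \langle E \rangle}{\partial T} = \frac{\langle E^2 \rangle - \langle E \rangle^2}{k_B T^2}
\tag{5}
$$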
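Once the degeneracy factors of the energy levels are known, the average energy and heat capacity at any temperature follow by reweighting. The Python sketch below assumes that the weighted histogram step has already produced a list of reduced energies and the logarithms of their degeneracies. The function name and input layout are hypothetical, and the heat capacity is obtained from the energy fluctuations, a standard canonical-ensemble identity.

```python
import math

def thermo_from_dos(energies, log_g, t_star):
    """Mean reduced energy, reduced heat capacity, and P(E) at T*.

    energies: reduced energy levels E* = E/eps.
    log_g:    ln g(E*) for each level (degeneracy factors).
    t_star:   reduced temperature kB*T/eps.
    """
    # Boltzmann log-weights, shifted by their maximum to avoid overflow.
    log_w = [lg - e / t_star for e, lg in zip(energies, log_g)]
    shift = max(log_w)
    w = [math.exp(x - shift) for x in log_w]
    z = sum(w)                                   # partition function up to a constant
    p = [x / z for x in w]                       # canonical probability of each level
    e_mean = sum(pi * e for pi, e in zip(p, energies))
    e2_mean = sum(pi * e * e for pi, e in zip(p, energies))
    cv = (e2_mean - e_mean ** 2) / t_star ** 2   # Cv/kB from energy fluctuations
    return e_mean, cv, p

# Toy two-level system with equal degeneracies, evaluated at T* = 1.
e_mean, cv, p = thermo_from_dos([-1.0, 0.0], [0.0, 0.0], 1.0)
```

Scanning `t_star` over a fine grid and locating the maximum of `cv` is one way to estimate a transition temperature from such data, and the returned `p` gives the energy distribution at that temperature.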
We investigated the folding kinetics at T* = 2.7, 3.0, and 3.6. For folding kinetics at T* = 2.7 and 3.0, the initial coil-like configurations and velocities were obtained from equilibrium simulations at T* = 4.0. Seventy-seven and 135 kinetic simulations were performed for t* = 200,000 at T* = 2.7 and 3.0, respectively. The coordinates were recorded every 100 reduced time units for later analysis. At T* = 3.6, both folding and unfolding events were observed in equilibrium simulations, and the coordinates were recorded every 50 reduced time units. A progress variable Q, the fraction of nonlocal native contacts (|i – j| > 2) between residues (total number is 65, shown in Fig. 1 C), is used to monitor the folding process. Two residues are regarded as in contact if there are any square-well atomic overlaps between them.
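The progress variable Q can be computed from residue-level contact sets, as in the following Python sketch. It assumes that contacts are stored as (i, j) residue-index pairs with i < j and that the residue contacts have already been derived from the square-well atomic overlaps. The function name and data layout are illustrative only.

```python
def fraction_native_contacts(current_contacts, native_contacts, min_sep=3):
    """Fraction Q of nonlocal native residue contacts present in a frame.

    current_contacts: set of (i, j) residue pairs in contact in this frame.
    native_contacts:  set of (i, j) residue pairs in contact in the native state.
    min_sep:          smallest |i - j| counted as nonlocal (|i - j| > 2).
    """
    nonlocal_native = {(i, j) for i, j in native_contacts if abs(i - j) >= min_sep}
    formed = sum(1 for pair in nonlocal_native if pair in current_contacts)
    return formed / len(nonlocal_native)

# Toy example: three nonlocal native contacts, two of them formed.
native = {(1, 2), (1, 4), (2, 6), (3, 9)}
frame = {(1, 4), (3, 9), (5, 8)}       # (5, 8) is nonnative and is ignored
q = fraction_native_contacts(frame, native)
```

For the model described here the denominator would be the 65 nonlocal native contacts of Fig. 1 C.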
φ-value
The experimental data used to analyze the folding transition state are φ-values obtained from protein engineering experiments (44). These experiments measure the change in the activation energy of the transition state relative to the change in the free energy of the native state upon a perturbative mutation (substitution). The φ-values are often interpreted as the fraction of native structure, or the so-called nativeness, of each residue in the folding transition state. Thus, one way to predict φ-values is to calculate the fraction of native contacts in the TSE (45–47). That is,
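$$
\phi_i^{\mathrm{sim}} = \frac{\langle N_i^{\mathrm{nat}} \rangle_{\mathrm{TS}}}{\langle N_i^{\mathrm{nat}} \rangle_{\mathrm{N}}}
\tag{6}
$$
where $\langle N_i^{\mathrm{nat}} \rangle_{\mathrm{N}}$ and $\langle N_i^{\mathrm{nat}} \rangle_{\mathrm{TS}}$ are the average numbers of native heavy-atom contacts between residue i and other residues in the native condition and at the transition state, respectively. A contact is considered to exist when the distance between any heavy atoms of two nonadjacent residues is less than the square-well width (pairs i, i ± 1 are excluded). Our Gō model does not have attractive interactions for nonnative contacts; however, it does have atomic packing interactions from nonnative contacts. Considering the possibility that nonnative packing might also contribute to φ-values, we also calculate
$$
\phi_i^{\mathrm{tc}} = \frac{\langle N_i^{\mathrm{tot}} \rangle_{\mathrm{TS}}}{\langle N_i^{\mathrm{tot}} \rangle_{\mathrm{N}}}
\tag{7}
$$
where $\langle N_i^{\mathrm{tot}} \rangle_{\mathrm{N}}$ and $\langle N_i^{\mathrm{tot}} \rangle_{\mathrm{TS}}$ are the average numbers of all heavy-atom contacts between residue i and other residues in the native condition and at the transition state, respectively. In this article, $\langle N_i^{\mathrm{nat}} \rangle_{\mathrm{N}}$ and $\langle N_i^{\mathrm{tot}} \rangle_{\mathrm{N}}$ were calculated from the configurations of equilibrium simulations at T* = 2.7.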
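The contact-ratio definition of the simulated φ-values can be sketched in a few lines of Python. The input layout (one list of per-residue contact counts per saved configuration) and the function name are assumptions for illustration. The same function applies to native-only counts (Eq. 6) or to total counts (Eq. 7), depending on what is passed in.

```python
def phi_values(ts_counts, native_counts):
    """Per-residue phi as the ratio of mean contact counts (TS over native).

    ts_counts:     list of frames from the transition-state ensemble, each a
                   list with the number of contacts made by every residue.
    native_counts: same layout for frames sampled in the native condition.
    Residues with no contacts in the native condition yield NaN.
    """
    def mean_per_residue(frames):
        n_frames = len(frames)
        return [sum(column) / n_frames for column in zip(*frames)]

    ts_mean = mean_per_residue(ts_counts)
    native_mean = mean_per_residue(native_counts)
    return [t / n if n > 0 else float("nan")
            for t, n in zip(ts_mean, native_mean)]

# Toy data for three residues: two TS frames and two native frames.
phi = phi_values([[2, 0, 1], [4, 0, 1]], [[6, 0, 4], [6, 0, 4]])
```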
RESULTS
Thermodynamics
The heat capacity as a function of temperature for the Pin1 WW domain is shown in Fig. 2 A. A sharp peak is observed at T* = 3.60, which indicates a thermodynamic transition. According to the weighted histogram analysis, this transition is accompanied by a bimodal distribution of energies (Fig. 2 B). Thus, the observed transition is a first-order-like two-state transition. Analysis of the structures further demonstrates that the underlying transition is a folding transition between coil-like structures (the average RMSD of main-chain atoms at T* ≥ 4.0 is >14 Å) and folded structures (the average RMSD of main-chain atoms at T* ≤ 3.4 is <3.0 Å).
FIGURE 2.
(A) Reduced heat capacity (Cv* = Cv/kB) as a function of reduced temperature (T* = kBT/ɛ). The folding transition temperature is near T* = 3.60. (B) Probability distribution of energy at the transition temperature T* = 3.59. The results were obtained from the weighted histogram method.
The two-state behavior observed above is also reflected in the free energy profiles as a function of reduced energy at different temperatures (Fig. 3). At the transition temperature (T* = 3.6), there are two stable states corresponding to an unfolded state (with a reduced energy E* ∼ −1110) and a folded state (E* ∼ −1530), respectively. The transition state has an energy E* of ∼−1290 ± 30. When the temperature increases, the unfolded state becomes more stable than the native state, as expected. Fig. 3, A and C, also illustrate that only one thermodynamically stable state is observed at a very high or a very low temperature.
FIGURE 3.
Free energy profiles as a function of reduced energy (A) at low temperatures; (B) at approximately the folding transition temperature; (C) at high temperatures. The positions of the transition states and native states change with temperature, as indicated by the dashed lines. U, unfolded state; TS, transition state; F, folded state.
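A free energy profile along the reduced energy can be estimated from a sampled energy distribution as F*(E*) = −T* ln P(E*). The Python sketch below shows this for a plain histogram of energies from a single simulation. It is an assumption-laden illustration (the function name, the binning, and the use of one trajectory rather than weighted-histogram data are all choices made here), not the authors' procedure.

```python
import math
from collections import Counter

def free_energy_profile(energies, t_star, bin_width):
    """Reduced free energy F/eps per energy bin, shifted so the minimum is 0.

    energies:  sampled reduced energies E* = E/eps.
    t_star:    reduced temperature kB*T/eps.
    bin_width: width of the histogram bins in reduced energy units.
    Returns a dict mapping bin centers to F*; empty bins are omitted.
    """
    counts = Counter(math.floor(e / bin_width) for e in energies)
    total = len(energies)
    profile = {}
    for bin_index, count in counts.items():
        center = (bin_index + 0.5) * bin_width
        profile[center] = -t_star * math.log(count / total)
    f_min = min(profile.values())
    return {c: f - f_min for c, f in sorted(profile.items())}

# Toy sample: equally populated folded and unfolded basins, sparse barrier region.
sample = [-1525.0] * 4 + [-1295.0] * 1 + [-1105.0] * 4
profile = free_energy_profile(sample, t_star=3.6, bin_width=10.0)
```

In this toy case the barrier bin sits T* ln 4 above the two basins, because it is populated four times less often.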
If the reduced energy is a correct order parameter, a transition state can be defined based on the peak between the coil and folded states. Fig. 3 further indicates that the transition state becomes more nativelike as temperature increases (see, for instance, from E* ∼ −1250 at T* = 3.45 to E* ∼ −1350 at T* = 3.7), exhibiting the Hammond behavior (48), while the folded state becomes less stable (E* from −1570 at T* = 3.45 to −1500 at T* = 3.7), which is consistent with the ground state effects for proteins (49). (Energy can be used very well to determine the location of transition states at the transition temperature. However, it cannot be used for determining transition states at low temperatures because a kinetic intermediate exists; see below.)
Folding kinetics at the transition temperature
Fig. 4 plots the probability distribution of the number of native contacts between β1 and β2 residues and that between β2 and β3 residues at the transition temperature T* = 3.6. At this temperature, no detectable kinetic intermediate is observed (Fig. 4). The same holds when the distribution is plotted against other progress variables.
FIGURE 4.
Probability distribution of the number of native contacts between β1 and β2 residues and that between β2 and β3 residues. A total of 5,130,000 configurations of eight trajectories at T* = 3.6 were collected for statistics.
A typical folding and unfolding trajectory is shown in Fig. 5 A, where the reduced energy E*, the fraction of nonlocal native contacts Q, and the RMSD in various regions are plotted as a function of time. A series of snapshots in the folding process are shown in Fig. 5 B. The folding process starts from the local contacts of loop 1, which initiate the formation of the β1-β2 hairpin. This is followed by the formation of three native hydrophobic contacts of Core 2 and of the stable hairpin β1-β2. Then, local contacts of loop 2 bring β2 and β3 together to yield the final folded state.
FIGURE 5.
A folding trajectory at T* = 3.6. (A) The reduced energy (E/ɛ), the fraction of native contacts between residues (Q), and the RMSD of Pin1 WW domain are plotted as a function of reduced time (t*). (B) A series of snapshots at the marked reduced times which are in a folding regime indicated by the dashed lines in panel A. The snapshots are displayed with the same symbols as those in Fig. 1 A.
Transition state at the transition temperature
In this article, a snapshot or a conformation is defined as a transition state if its reduced energy is within −1290 ± 30 (Figs. 2 B and 3) and it also belongs to the region of an abrupt structural change in a folding or unfolding event in Fig. 5 A (50). This method of determining TSE is equivalent to the criterion of Q (51–53). We have also tried several other methods including the use of multiple progress variables and the folding probability along a trajectory (Pfold) (54). The resulting ensembles are essentially the same. Details will be discussed in a separate article. Here, we present transition state structures generated by the energy criterion only.
The TSE consists of 883 configurations collected from the folding/unfolding transitions. (We use structures from both folding and unfolding transitions because a careful analysis indicates that the structures are the same regardless of the direction of the event.) The 883 configurations have an average fraction of nonlocal native contacts Q = 0.27 ± 0.05 and an average radius of gyration Rg = 13.7 ± 1.4 Å. The average main-chain-heavy-atom RMSD and all-heavy-atom RMSD values are 9.33 ± 1.65 Å and 10.24 ± 1.62 Å, respectively. The RMSDs of main-chain heavy atoms and all heavy atoms from the global minimum structure as a function of residue index number are shown in Fig. 6. Hairpin 1 and loop 1 have much smaller deviations from the global minimum structure than loop 2 and β3. Both the N- and C-termini are considerably flexible. Some typical snapshots of the TSE are shown in Fig. 7. All of them have a well-established hairpin 1 but a still-flexible loop 1, loop 2, and β3.
FIGURE 6.
Average main-chain-heavy-atom RMSD (solid line) and all-heavy-atom RMSD (dashed line) of the transition-state structures from the native structure for each residue.
FIGURE 7.
Ten snapshots in the transition-state ensemble at T* = 3.6.
Table 1 shows the probabilities of the native H-bonds that are formed in the TSE. The probabilities of the four backbone H-bonds between the β1 and β2 strands (H1 to H4) are much higher than those of the two backbone H-bonds between the β2 and β3 strands (H13 and H14). This confirms that hairpin 2 is not formed in the TSE whereas hairpin 1 is well established. The H-bonds in the loop 1 region have higher values: 0.25 for H6 (H-bond between the N-H of Ser19 and the C=O of Ser16), 0.06 for H7 (H-bond between the O-H of the Ser19 side chain and the N-H of the Arg21 main chain); and the probability of H5 between the N-H of Ser16 and the C=O of Arg21, the two residues that define loop 1, is 0.27. In contrast, the probability in loop 2 is only 0.02 for H10 (H-bond between the N-H of Asn30 and the C=O of Asn26) and 0.05 for H12 (H-bond between the N-H of Asn26 and the C=O of Ala31). The probabilities of the three local side-chain H-bonds H8, H9, and H11 in the loop 2 region are 0.41, 0.25, and 0.05, respectively. Thus, loop 2 has only local native contacts. The probabilities of the four backbone H-bonds between the β1 and β2 strands are higher than those of the H-bonds in the loop 1 region, which means that the H-bonds of loop 1 are only moderately stable compared with those in hairpin 1.
TABLE 1.
Probability of H-bond formation in TSE at T* = 3.60 (near transition temperature)
Serial* | Donor† | Acceptor† | Probability | Location |
---|---|---|---|---|
H1 | Glu12 | Phe25 | 0.69 | β1-β2 |
H2 | Phe25 | Glu12 | 0.51 | |
H3 | Arg14 | Tyr23 | 0.37 | |
H4 | Tyr23 | Arg14 | 0.46 | |
H5 | Ser16 | Arg21 | 0.27 | Loop 1 |
H6 | Ser19 | Ser16 | 0.25 | |
H7 | Ser19 (O-H) | Arg21 (N-H)‡ | 0.06 | |
H8 | Ile28 | Asn26 (O=CNH2) | 0.41 | Loop 2 |
H9 | Thr29 | Asn26 (O=CNH2) | 0.25 | |
H10 | Asn30 | Asn26 | 0.02 | |
H11 | Ala31 | Thr29 (O-H) | 0.05 | |
H12 | Asn26 | Ala31 | 0.05 | |
H13 | Tyr24 | Gln33 | 0.00 | β2-β3 |
H14 | Gln33 | Tyr24 | 0.00 |
φ-value analysis
The φ-values obtained from our simulations are shown in both Fig. 8 and Table 2. The data were calculated according to the fraction of the number of native heavy atomic contacts (Eq. 6). The correlation coefficient between φsim and φexp (side-chain mutation) is 0.39. There is one outlier caused by the negative experimental φ-value for Ile28. Removing this residue improves the correlation coefficient between φsim and φexp from 0.39 to 0.50. It is of interest to note that the theoretical φ-values have a better agreement with experimental φ-values based on main-chain A-to-E mutation. The correlation coefficient between φsim and φexp of A-to-E results is 0.74 for eight experimental points.
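The comparison between simulated and experimental φ-values rests on a linear (Pearson) correlation coefficient, which can be computed as in the Python sketch below. The function name is hypothetical and the numbers in the usage lines are made-up toy values, not data from Table 2. Removing an outlier amounts to dropping that residue from both lists before the call.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Toy values only: correlation with and without the last (outlying) point.
phi_sim_toy = [0.1, 0.4, 0.6, 0.5]
phi_exp_toy = [0.2, 0.5, 0.7, -0.1]
r_all = pearson_r(phi_sim_toy, phi_exp_toy)
r_trimmed = pearson_r(phi_sim_toy[:-1], phi_exp_toy[:-1])
```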
FIGURE 8.
The simulated φ-values φsim (native contacts only) and φtc (total contacts) of each residue. The experimental φexp from side-chain mutation and φexp from main-chain mutation (A-to-E mutation), both measured by the Gruebele group (26,27), are also shown. The secondary-structure elements are indicated above the diagram.
TABLE 2.
Contact residues in TSE and φ-values from simulations (φsim) and experiments (φexp)
Residue* | Contact residues† | φsim | φexp (side-chain)‡ | φexp (A-to-E)§ | Location |
---|---|---|---|---|---|
L7 | P9, W11 | 0.14 | 0.18 | Nonsheet | |
P8 | G10, W11 | 0.18 | |||
P9 | L7, W11, N26, I28 | 0.48 | |||
G10 | P8, E12, F25, N26, H27, I28 | 0.49 | |||
W11 | L7, P8, P9, Y24, F25, N26 | 0.45 | β1 | ||
E12 | G10, R14, Y24, F25, H27 | 0.56 | 0.2 | ||
K13 | M15, V22, Y23, Y24 | 0.57 | 0.50 | ||
R14 | E12, S16, V22, Y23, Y24, F25 | 0.47 | 0.68 | ||
M15 | K13, R17, S19, G20, R21, V22, Y23 | 0.40 | 0.63 | ||
S16 | R14, S18, S19, R21, Y23 | 0.42 | 1.7 | 0.70 | Loop 1 |
R17 | M15, S19 | 0.75 | 0.78 | ||
S18 | S16 | 0.71 | 1.8 | ||
S19 | M15, S16, R17, R21 | 0.52 | 1.4 | 0.83 | |
G20 | M15, V22 | 0.47 | |||
R21 | M15, S16, S19, Y23 | 0.38 | |||
V22 | K13, R14, M15, G20 | 0.38 | 1.1 | 0.42 | β2 |
Y23 | K13, R14, M15, S16, R21, F25 | 0.35 | 0.64 | ||
Y24 | W11, E12, K13, R14, N26 | 0.24 | 0.52 | ||
F25 | G10, W11, E12, R14, Y23, H27, N30, A31, S32 | 0.43 | 0.49 | ||
N26 | P9, G10, W11, Y24, I28, T29, N30, A31 | 0.54 | 0.33 | ||
H27 | G10, E12, F25, T29, N30 | 0.36 | 0.28 | Loop 2 | |
I28 | P9, G10, N26, N30 | 0.50 | −0.09 | ||
T29 | N26, H27, A31 | 0.40 | 0.3 | ||
N30 | F25, N26, H27, I28, S32 | 0.20 | |||
A31 | F25, N26, T29, Q33 | 0.16 | 0.44 | β3 | |
S32 | F25, N30 | 0.09 | 0.51 | ||
Q33 | A31, E35, R36, P37 | 0.31 | |||
W34 | R36 | 0.02 | 0.36 | −0.12 | |
E35 | Q33, P37 | 0.56 | Nonsheet | ||
R36 | Q33, W34, S38, G39 | 0.26 | |||
P37 | Q33, E35 | 0.06 | |||
S38 | R36 | 0.50 |
The boldface type indicates that the associated φ-value(s) is (are) experimentally available.
* The name of each residue is abbreviated with a single letter and its sequence number.
† Native contacts in TSE with probabilities >0.1. A residue is in contact with another one if there is at least one square-well heavy atomic contact between them.
‡ Results from side-chain mutation (26).
§ Results from A-to-E mutation (27).
It is of interest to note that some nonnative contacts (due to nonnative side-chain packing in this article) are found in TSE. The φtc values based on total contacts (including both native and nonnative contacts) have a somewhat better agreement with φexp values. The correlation coefficient between φtc and φexp is improved to 0.53 (for experimental side-chain mutation) and 0.75 (for experimental A-to-E mutation), compared to 0.39 and 0.74 between φsim and φexp, respectively. The main difference between φsim and φtc comes from two residues (Arg17 and Ser18). It is interesting that both φtc and φexp of side-chain mutation have nonclassical φ-values (φ > 1) at loop 1. Although our study implies that nonnative contacts in the transition states might play a role in folding, the relation between nonclassical φ-values and nonnative attractive interactions is still an open question, because in a Gō model, only native contacts are attractive. Further studies are needed for the effect of nonnative attractive interactions.
Folding kinetics at T* = 2.7
Fig. 9 plots the probability distribution of the number of native β1-β2 residue contacts and the number of native β2-β3 residue contacts at a low temperature, T* = 2.7. At this temperature, we found a kinetic intermediate with a mostly formed hairpin β2-β3 (with 9–10 nonlocal native β2-β3 contacts). At a slightly higher temperature, T* = 3.0, a hairpin β1-β2 intermediate was observed in addition to the hairpin β2-β3 intermediate. Twenty-two out of 102 folded trajectories folded via the formation of hairpin β1-β2. Thus, the folding pathway of the Pin1 WW domain is altered as temperature changes.
FIGURE 9.
Probability distribution of the numbers of native β1-β2 and β2-β3 contacts at T* = 2.7. Seventy-seven trajectories were run, and 2000 configurations for each trajectory were collected for statistics.
DISCUSSION
In this article, an all-atom Gō model of Pin1 WW domain was simulated with the discontinuous MD technique. The model exhibited a two-state folding thermodynamics. The folding also exhibited a two-state folding kinetics at the transition temperature with a transition state of a mostly formed hairpin β1-β2. At a low temperature, a drastically different folding behavior was, however, observed. There was a kinetic intermediate with a mostly formed hairpin 2 made of β2 and β3 strands. In between, two kinetic intermediates with either hairpin 1 or hairpin 2 were observed.
The drastically different folding kinetics and pathways (Fig. 10) can be understood as follows. The most entropically favorable way to form hairpin 1 (or hairpin 2) is the formation of loop 1 (or loop 2), which brings β1 and β2 (or β2 and β3) together. Local contacts of loop 2 are more likely to form than those of loop 1 because loop 2 is shorter (32). However, the formation of hairpin 2 is unlikely to complete folding by recruiting β1, because at high temperature the interactions within the hairpin are not strong enough to survive for the relatively long time needed for the formation of the long loop 1. Moreover, loop 2 requires long-range interactions to be stable (Fig. 1 C). Thus, the transition state is made of the more stable hairpin β1-β2 instead of hairpin β2-β3, although a longer time is required for the formation of local contacts in loop 1. Once hairpin 1 forms, it is stabilized by the hydrophobic contacts and H-bonds between strands β1 and β2, and it is relatively easy to bring β3 into place because loop 2 is short. The situation is, however, different at low temperatures. The hairpin β2-β3 is, once formed, stable against thermal fluctuation. It becomes a kinetic intermediate while waiting for the formation of the long loop 1 to bring β1 into place (Figs. 9 and 10), although thermodynamically this protein is a two-state protein (Figs. 2 and 3). The above explanation shows that folding results from an interplay between entropy (local contacts form first) and enthalpy (interaction). Near the transition temperature, interaction energy guides the folding because only the most stable region can survive thermal fluctuation to yield a productive folding event. At a low temperature, some early local contacts produce a kinetic trap.
FIGURE 10.
A schematic presentation of the two folding pathways at approximately the transition temperature and at a low temperature.
Recent studies have devoted significant effort to finding the relationship between folding rates and various physical parameters of the structures of native proteins (55–59). These studies suggest that topology and the number of folded residues, rather than detailed energetic interactions, play dominant roles in determining the folding rates of small proteins (60–63). Bai et al. have found that the folding rate is correlated more strongly with the critical nucleation size of the transition state than with that of the native structure (64). Here, we find that the TSE of the Pin1 WW domain has a critical nucleation size of 18 if the criterion of is used to define the critical nucleation residues. This size from our simulation is in reasonable agreement with that from a previous study, i.e., 16, based on a simple analytical model (64). Our simulations reveal that the nucleation residues include residues 9–20 and residues 25–29. These residues are mainly located in loop 1, β1, and β2.
The simulation results described above are consistent with several experimental observations. It is known that the Pin1 WW domain folds in a two-state manner both thermodynamically and kinetically (26). At approximately 45°C, which is slightly lower than the transition temperature (59°C), the φexp values in strands β1 and β2 are higher than those in the β3 strand. This suggests that hairpin 1 is more nativelike than hairpin 2 in the TSE (26). The A-to-E mutation also shows that the transition state has a mostly formed backbone structure for hairpin 1 but a still-flexible β3 strand (27). Our simulation also predicted different folding pathways of the Pin1 WW domain at different temperatures. This temperature dependence, however, is yet to be tested experimentally. Experimental studies of temperature dependence have so far been conducted at temperatures near or higher than the transition temperature (26,27). Simulations in this region, however, are very difficult, because folding rarely occurs at temperatures higher than the transition temperature.
To the best of our knowledge, this work is the first detailed study of the folding kinetics of the Pin1 WW domain at the all-atom level. Using Cα models, Karanicolas et al. (19) predicted that hairpin 2 of the Pin1 WW domain folds before hairpin 1, while Cecconi et al. (20) suggested a zipping mechanism after the formation of a collapsed hydrophobic core. Our study, however, shows that hairpin 1 is formed before hairpin 2 near the transition temperature. This result is consistent with the experimental observations of Gruebele et al. (26,27). Previously, we argued that it is difficult to reproduce the specific folding behavior of a β-hairpin with a Cα model, while it is relatively easy with an all-atom model (34). This study confirms that finding: packing and native atomic contact interactions play an important role in the folding mechanism of proteins. (Certainly, it is possible that a sophisticated Cα-based model may be parameterized to reproduce the behavior of an all-atom model.) A similar conclusion was obtained in a separate folding kinetic study of a three-helix bundle, fragment B of Staphylococcal protein A (33).
Interestingly, some theoretical studies of other WW domains agree with our findings on the Pin1 WW domain. For example, simulations of YAP 65 (15,16) and FBP 28 (22) indicate that the β2 and β3 strands of hairpin 2 dissociate first during unfolding. This can be considered the reverse of the folding process of the Pin1 WW domain observed here. The transition state of FBP 28 also has a well-established hairpin 1 and a flexible strand 3 (21,22). Moreover, loop 1 is more flexible than the contacts between the β1 and β2 strands (21). Some misfolded intermediate states exist at low temperatures (22). These results are consistent with our results on the Pin1 WW domain.
Acknowledgments
The authors are grateful for financial support from the Natural Science Foundation of China (grants No. 50533010 and No. 20574013, and a Two-Base grant); the Chinese Ministry of Education (Key grant No. 305004); the 973 Project (grant No. 2005CB522700); Science and Technology Developing Foundation of Shanghai (grant No. 055207082); and the Senior Visiting Scholar Foundation of the Key Laboratory. Computer support from National High Performance Computing Center of Shanghai is acknowledged. Y.Z. also thanks the National Institutes of Health, Bethesda, MD (grants No. R01 GM 966049 and No. R01 GM 068530), and the Howard Hughes Medical Institute, SUNY Buffalo and the Center for Computational Research and the Keck Center for Computational Biology, SUNY Buffalo for their grants.
Editor: Jose Onuchic.
References
- 1. Ferrara, P., and A. Caflisch. 2000. Folding simulations of a three-stranded antiparallel β-sheet peptide. Proc. Natl. Acad. Sci. USA. 97:10780–10785.
- 2. Pande, V. S., and D. S. Rokhsar. 1999. Molecular dynamics simulations of unfolding and refolding of a β-hairpin fragment of protein G. Proc. Natl. Acad. Sci. USA. 96:9062–9067.
- 3. Soto, P., and G. Colombo. 2004. Characterization of the conformational space of a triple-stranded β-sheet forming peptide with molecular dynamics simulations. Proteins. 57:734–746.
- 4. García, A. E., and K. Y. Sanbonmatsu. 2001. Exploring the energy landscape of a β-hairpin in explicit solvent. Proteins. 42:345–354.
- 5. Seibert, M. M., A. Patriksson, B. Hess, and D. van der Spoel. 2005. Reproducible polypeptide folding and structure prediction using molecular dynamics simulations. J. Mol. Biol. 354:173–183.
- 6. Lwin, T. Z., and R. Luo. 2006. Force field influences in β-hairpin folding simulations. Protein Sci. 15:2642–2655.
- 7. Guo, W. H., S. Lampoudi, and J. E. Shea. 2003. Posttransition state desolvation of the hydrophobic core of the src-SH3 protein domain. Biophys. J. 85:61–69.
- 8. Marianayagam, N. J., and S. E. Jackson. 2004. The folding pathway of ubiquitin from all-atom molecular dynamics simulations. Biophys. Chem. 111:159–171.
- 9. Cheung, M. S., A. E. García, and J. N. Onuchic. 2002. Protein folding mediated by solvation: water expulsion and formation of the hydrophobic core occur after the structural collapse. Proc. Natl. Acad. Sci. USA. 99:685–690.
- 10. Gosavi, S., L. L. Chavez, P. A. Jennings, and J. N. Onuchic. 2006. Topological frustration and the folding of interleukin-1β. J. Mol. Biol. 357:986–996.
- 11. Sudol, M., K. Sliwa, and T. Russo. 2001. Functions of WW domains in the nucleus. FEBS Lett. 490:190–195.
- 12. Sudol, M., and T. Hunter. 2000. New wrinkles for an old domain. Cell. 103:1001–1004.
- 13. Zarrinpar, A., and W. A. Lim. 2000. Converging on proline: the mechanism of WW domain peptide recognition. Nat. Struct. Biol. 7:611–613.
- 14. Macias, M. J., M. Hyvonen, E. Baraldi, J. Schultz, M. Sudol, M. Saraste, and H. Oschkinat. 1996. Structure of the WW domain of a kinase-associated protein complexed with a proline-rich peptide. Nature. 382:646–649.
- 15. Ibragimova, G. T., and R. C. Wade. 1999. Stability of the β-sheet of the WW domain: a molecular dynamics simulation study. Biophys. J. 77:2191–2198.
- 16. Ferguson, N., J. R. Pires, F. Toepert, C. M. Johnson, Y. P. Pan, R. Volkmer-Engert, J. Schneider-Mergener, V. Daggett, H. Oschkinat, and A. Fersht. 2001. Using flexible loop mimetics to extend φ-value analysis to secondary structure interactions. Proc. Natl. Acad. Sci. USA. 98:13008–13013.
- 17. Karanicolas, J., and C. L. Brooks. 2003. The structural basis for biphasic kinetics in the folding of the WW domain from a formin-binding protein: lessons for protein design? Proc. Natl. Acad. Sci. USA. 100:3954–3959.
- 18. Karanicolas, J., and C. L. Brooks. 2004. Integrating folding kinetics and protein function: biphasic kinetics and dual binding specificity in a WW domain. Proc. Natl. Acad. Sci. USA. 101:3432–3437.
- 19. Karanicolas, J., and C. L. Brooks. 2003. Improved Gō-like models demonstrate the robustness of protein folding mechanisms towards non-native interactions. J. Mol. Biol. 334:309–325.
- 20. Cecconi, F., C. Guardiani, and R. Livi. 2006. Testing simplified proteins models of the hPin1 WW domain. Biophys. J. 91:694–704.
- 21. Petrovich, M., A. L. Jonsson, N. Ferguson, V. Daggett, and A. R. Fersht. 2006. φ-Analysis at the experimental limits: mechanism of β-hairpin formation. J. Mol. Biol. 360:865–881.
- 22. Mu, Y. G., L. Nordenskiöld, and J. P. Tam. 2006. Folding, misfolding, and amyloid protofibril formation of WW domain FBP28. Biophys. J. 90:3983–3992.
- 23. Ferguson, N., C. M. Johnson, M. Macias, H. Oschkinat, and A. Fersht. 2001. Ultrafast folding of WW domains without structured aromatic clusters in the denatured state. Proc. Natl. Acad. Sci. USA. 98:13002–13007.
- 24. Kaul, R., A. R. Angeles, M. Jäger, E. T. Powers, and J. W. Kelly. 2001. Incorporating β-turns and a turn mimetic out of context in loop 1 of the WW domain affords cooperatively folded β-sheets. J. Am. Chem. Soc. 123:5206–5212.
- 25. Macias, M. J., V. Gervais, C. Civera, and H. Oschkinat. 2000. Structural analysis of WW domains and design of a WW prototype. Nat. Struct. Biol. 7:375–379.
- 26. Jäger, M., H. Nguyen, J. C. Crane, J. W. Kelly, and M. Gruebele. 2001. The folding mechanism of a β-sheet: the WW domain. J. Mol. Biol. 311:373–393.
- 27. Deechongkit, S., H. Nguyen, E. T. Powers, P. E. Dawson, M. Gruebele, and J. W. Kelly. 2004. Context-dependent contributions of backbone hydrogen bonding to β-sheet folding energetics. Nature. 430:101–105.
- 28. Ranganathan, R., K. P. Lu, T. Hunter, and J. P. Noel. 1997. Structural and functional analysis of the mitotic rotamase Pin1 suggests substrate recognition is phosphorylation dependent. Cell. 89:875–886.
- 29. Kowalski, J. A., K. Liu, and J. W. Kelly. 2002. NMR solution structure of the isolated Apo Pin1 WW domain: comparison to the x-ray crystal structures of Pin1. Biopolymers. 63:111–121.
- 30. Brown, S., N. J. Fawzi, and T. Head-Gordon. 2003. Coarse-grained sequences for protein folding and design. Proc. Natl. Acad. Sci. USA. 100:10712–10717.
- 31. Taketomi, H., Y. Ueda, and N. Gō. 1975. Studies on protein folding, unfolding and fluctuations by computer-simulation. 1. Effect of specific amino-acid sequence represented by specific inter-unit interactions. Int. J. Pept. Protein Res. 7:445–459.
- 32. Linhananta, A., H. Y. Zhou, and Y. Q. Zhou. 2002. The dual role of a loop with low loop contact distance in folding and domain swapping. Protein Sci. 11:1695–1701.
- 33. Linhananta, A., and Y. Q. Zhou. 2002. The role of side chain packing and native contact interactions in folding: discontinuous molecular dynamics folding simulations of an all-atom Gō model of fragment B of Staphylococcal protein A. J. Chem. Phys. 117:8983–8995.
- 34. Zhou, Y. Q., and A. Linhananta. 2002. Role of hydrophilic and hydrophobic contacts in folding of the second β-hairpin fragment of protein G: molecular dynamics simulation studies of an all-atom model. Proteins. 47:154–162.
- 35. Zhou, Y. Q., and A. Linhananta. 2002. Thermodynamics of an all-atom off-lattice model of the fragment B of Staphylococcal protein A: implication for the origin of the cooperativity of protein folding. J. Phys. Chem. B. 106:1481–1485.
- 36. Brooks, B. R., R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus. 1983. CHARMM—a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4:187–217.
- 37. Neria, E., S. Fischer, and M. Karplus. 1996. Simulation of activation free energies in molecular systems. J. Chem. Phys. 105:1902–1921.
- 38. Weeks, J. D., D. Chandler, and H. C. Andersen. 1971. Role of repulsive forces in determining the equilibrium structure of simple liquids. J. Chem. Phys. 54:5237–5247.
- 39. Sherwood, A. E., and J. M. Prausnitz. 1964. Intermolecular potential functions and the second and third virial coefficients. J. Chem. Phys. 41:429–437.
- 40. Andersen, H. C. 1980. Molecular dynamics simulations at constant pressure and/or temperature. J. Chem. Phys. 72:2384–2393.
- 41. Zhou, Y. Q., M. Karplus, J. M. Wichert, and C. K. Hall. 1997. Equilibrium thermodynamics of homopolymers and clusters: molecular dynamics and Monte Carlo simulations of systems with square-well interactions. J. Chem. Phys. 107:10691–10708.
- 42. Zhou, Y. Q., and M. Karplus. 1999. Folding of a model three-helix bundle protein: a thermodynamic and kinetic analysis. J. Mol. Biol. 293:917–951.
- 43. Ferrenberg, A. M., and R. H. Swendsen. 1989. Optimized Monte Carlo data analysis. Phys. Rev. Lett. 63:1195–1198.
- 44. Fersht, A. R., A. Matouschek, and L. Serrano. 1992. The folding of an enzyme. 1. Theory of protein engineering analysis of stability and pathway of protein folding. J. Mol. Biol. 224:771–782.
- 45. Paci, E., M. Vendruscolo, C. M. Dobson, and M. Karplus. 2002. Determination of a transition state at atomic resolution from protein engineering data. J. Mol. Biol. 324:151–163.
- 46. Paci, E., C. T. Friel, K. Lindorff-Larsen, S. E. Radford, M. Karplus, and M. Vendruscolo. 2004. Comparison of the transition state ensembles for folding of Im7 and Im9 determined using all-atom molecular dynamics simulations with φ-value restraints. Proteins. 54:513–525.
- 47. Geierhaas, C. D., R. B. Best, E. Paci, M. Vendruscolo, and J. Clarke. 2006. Structural comparison of the two alternative transition states for folding of TI I27. Biophys. J. 91:263–275.
- 48. Hammond, G. S. 1955. A correlation of reaction rates. J. Am. Chem. Soc. 77:334–338.
- 49. Sánchez, I. E., and T. Kiefhaber. 2003. Hammond behavior versus ground state effects in protein folding: evidence for narrow free energy barriers and residual structure in unfolded states. J. Mol. Biol. 327:867–884.
- 50. Ding, F., N. V. Dokholyan, S. V. Buldyrev, H. E. Stanley, and E. I. Shakhnovich. 2002. Direct molecular dynamics observation of protein folding transition state ensemble. Biophys. J. 83:3525–3532.
- 51. Clementi, C., H. Nymeyer, and J. N. Onuchic. 2000. Topological and energetic factors: what determines the structural details of the transition state ensemble and “en-route” intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol. 298:937–953.
- 52. Cho, S. S., Y. Levy, and P. G. Wolynes. 2006. P versus Q: structural reaction coordinates capture protein folding on smooth landscapes. Proc. Natl. Acad. Sci. USA. 103:586–591.
- 53. Wang, J., K. Zhang, H. Y. Lu, and E. K. Wang. 2005. Quantifying kinetic paths of protein folding. Biophys. J. 89:1612–1620.
- 54. Du, R., V. S. Pande, A. Y. Grosberg, T. Tanaka, and E. S. Shakhnovich. 1998. On the transition coordinate for protein folding. J. Chem. Phys. 108:334–350.
- 55. Plaxco, K. W., K. T. Simons, and D. Baker. 1998. Contact order, transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 277:985–994.
- 56. Kuznetsov, I. B., and S. Rackovsky. 2004. Class-specific correlations between protein folding rate, structure-derived, and sequence-derived descriptors. Proteins. 54:333–341.
- 57. Pintar, A., and S. Pongor. 2005. The “first in-last out” hypothesis on protein folding revisited. Proteins. 60:584–590.
- 58. Gromiha, M. M., and S. Selvaraj. 2001. Comparison between long-range interactions and contact order in determining the folding rate of two-state proteins: application of long-range order to folding rate prediction. J. Mol. Biol. 310:27–32.
- 59. Zhou, H. Y., and Y. Q. Zhou. 2002. Folding rate prediction using total contact distance. Biophys. J. 82:458–463.
- 60. Baker, D. 2000. A surprising simplicity to protein folding. Nature. 405:39–42.
- 61. Plaxco, K. W., K. T. Simons, I. Ruczinski, and D. Baker. 2000. Topology, stability, sequence, and length: defining the determinants of two-state protein folding kinetics. Biochemistry. 39:11177–11183.
- 62. Makarov, D. E., C. A. Keller, K. W. Plaxco, and H. Metiu. 2002. How the folding rate constant of simple, single-domain proteins depends on the number of native contacts. Proc. Natl. Acad. Sci. USA. 99:3535–3539.
- 63. Paci, E., K. Lindorff-Larsen, C. M. Dobson, M. Karplus, and M. Vendruscolo. 2005. Transition state contact orders correlate with protein folding rates. J. Mol. Biol. 352:495–500.
- 64. Bai, Y. W., H. Y. Zhou, and Y. Q. Zhou. 2004. Critical nucleation size in the folding of small apparently two-state proteins. Protein Sci. 13:1173–1181.
- 65. Humphrey, W., A. Dalke, and K. Schulten. 1996. VMD—visual molecular dynamics. J. Mol. Graph. 14:33–38.