Abstract
The 20-residue Trp-cage mini-protein is one of the smallest proteins that adopt a stable folded structure with well-defined secondary structure elements. The hydrophobic core is arranged around a single central Trp residue. Despite several experimental and simulation studies, the detailed folding mechanism of the Trp-cage protein is still not completely understood. Starting from fully extended as well as from partially folded Trp-cage structures, a series of molecular dynamics simulations in explicit solvent using four different force fields was performed. All simulations resulted in rapid collapse of the protein to, on average, relatively compact states. The simulations indicate a significant dependence of the speed of folding to near-native states on the side chain rotamer state of the central Trp residue. Whereas the majority of intermediate start structures with the central Trp side chain in a near-native rotameric state folded successfully within less than 100 ns, only a fraction of the start structures with an initially non-native Trp side chain rotamer state reached near-native folded states. Weak restraining of the Trp side chain dihedral angles to the state in the folded protein resulted in significant acceleration of folding, starting from either fully extended or intermediate conformations. The results indicate that the side chain conformation of the central Trp residue can create a significant barrier that controls transitions to a near-native folded structure. Similar mechanisms might be of importance for the folding of other protein structures.
Introduction
Understanding the molecular details of the structure formation process of biomolecules is still a challenge in molecular biophysics and structural biology. The Trp-cage protein is one of the smallest model proteins: with just 20 amino acid residues it adopts a well-defined fold with secondary structure elements and a hydrophobic core formed around a central Trp6 residue [1]. It can fold spontaneously into a stable 3D structure within ∼4 µs [2]. Structures of this protein were determined by NMR spectroscopy [1] and recently also by X-ray crystallography [3]. The Trp-cage structure contains an N-terminal α-helix, followed by a short 3₁₀-helix and a C-terminal Poly-ProII (PPII) helix. The central buried Trp6 residue is surrounded by residues Tyr3, Leu7, Pro12, Pro18 and Pro19 (Figure 1A). The folded structure is further stabilized by a salt bridge between Asp9 and Arg16. The folding mechanism of the Trp-cage protein has been extensively investigated by various experimental [2], [4]–[15] and computational [16]–[37] methods, and different folding mechanisms have been proposed. A consensus folding mechanism of the Trp-cage mini-protein has not yet been obtained. A two-state folding mechanism was initially suggested by Qiu et al. [2] based on laser temperature-jump spectroscopy and further supported by a thermodynamic study by Streicher et al. [4] using differential scanning calorimetry (DSC) and circular dichroism (CD) spectroscopy. However, other experimental studies suggest that the folding of the Trp-cage mini-protein does not follow a simple two-state model but rather a more complex process involving at least one stable intermediate along the folding pathway. Neuweiler et al. [5] proposed the formation of a molten globule-like intermediate along the folding pathway based on fluorescence correlation spectroscopy experiments. 
Using UV-resonance Raman spectroscopy, Ahmed et al. [6] also indicated a more complicated folding mechanism through an intermediate molten globule state and provided evidence for α-helical structure even in the denatured state of the Trp-cage protein. Extensive hydrophobic contacts were found even in the unfolded state of the protein by Mok et al. [7], who employed photochemically induced dynamic nuclear polarization (CIDNP) NMR pulse-labeling experiments. In addition, recent experimental studies using NMR and infrared (IR) T-jump experiments [8] showed that Trp-cage folding indeed involves intermediates along the pathway, in which specific native and non-native contacts already pre-exist in a hydrophobically collapsed unfolded ensemble, and also suggested that the N-terminal α-helix is already formed at the intermediate stage. The latter view is also supported by a recent combined T-jump/IR-spectroscopy and molecular dynamics (MD) simulation study [9] that indicated an intermediate state with a preformed N-terminal α-helical structure but an orientation of the C-terminal PPII helix that differs from the native geometry.
Conventional MD simulations [16]–[20], [24], [28], [35] and advanced sampling simulations such as temperature replica-exchange MD (T-REMD) [21], [23], [25]–[27], [34] have already been used successfully in folding simulation studies of Trp-cage using both implicit and explicit solvation models. Even more sophisticated techniques such as Hamiltonian replica-exchange MD (H-REMD) [22], [29]–[31] and transition path sampling [32], [33] have been used in theoretical folding studies. This also includes metadynamics-inspired advanced sampling techniques to cover multi-dimensional energy landscapes [36], [37]. Starting from either unfolded, folded or fully extended structures, simulations were carried out for tens to hundreds of nanoseconds, and some were even extended to microseconds [34], [35]; they successfully reached near-native states with a root mean square deviation of the backbone (RMSDCα) as low as ∼1 Å. Reversible folding and unfolding was observed in simulation studies based on a coarse-grained (CG) representation of the protein and surrounding CG water [38]. Reversible folding and unfolding was also achieved in atomistic explicit solvent MD simulations extending to several hundred µs close to the melting temperature of the protein [35]. Several folding mechanisms have been suggested, including for instance an early hydrophobic collapse followed by either a partially folded intermediate state or just a compact denatured intermediate state stabilized by long-range hydrophobic contacts with early formation of the salt bridge [23], and the concurrent formation of the α-helix and the hydrophobic core [26]–[30]. Recent analysis of long MD simulations and advanced sampling simulations of the Trp-cage protein using Markov state models allowed the detailed characterization of the folding kinetics [39]. 
This simulation study identified a dominant pathway in which the N-terminal α-helix formed before or during the hydrophobic collapse in agreement with results from T-jump experiments coupled with IR spectroscopy [9].
Most simulation studies so far have focused on the pathway of folding, the variety of intermediate states, the topology and secondary structure of sampled backbone conformations and the occurrence of native contacts along the folding process. In the present work, all-atom explicit solvent simulations have been employed to investigate the role of key contacts, in particular those of the central Trp6 residue, in the Trp-cage folding mechanism. Multiple classical MD simulations starting from either fully extended or various collapsed Trp-cage conformations have been performed using four different force fields. The collapsed non-native starting states showed various degrees of similarity in secondary structure and side chain conformations with the native folded structure. MD simulations starting from the native structure resulted in spontaneous partial unfolding and refolding events. The simulation studies indicate that the conformation of the Trp6 residue has a critical influence on the folding process. Although the intermediate collapsed conformations had a similar average initial deviation from the native structure, only those with the Trp6 residue in a near-native rotameric state folded rapidly, within <100 ns, to the native structure. Stable folding of start structures with Trp6 in the non-native rotameric state required an unfolding step followed by a transition of Trp6 to the native rotameric state, after which the protein rapidly folded to the native structure. Interestingly, fully extended start structures with the Trp6 residue restrained to the native rotameric side chain conformation folded rapidly in <250 ns, whereas the same starting structures without restraints did not reach a near-native folded state within 1000 ns (1 µs). The results indicate that the conformation and interaction of a single side chain creates a barrier for folding and has a significant influence on the transition to near-native folded states of the Trp-cage protein. Such effects might be of general importance for other protein folding processes.
Materials and Methods
Simulation Setup
Three different sets of unrestrained continuous molecular dynamics (cMD) simulations were carried out starting from native, extended and intermediate structures of the Trp-cage protein. Native simulations were started from the folded conformation of the protein taken from the PDB entry 1Y2R (first structure of the NMR ensemble) [1]. For simulations starting from fully extended conformations, an initial extended Trp-cage structure was generated using the xleap module of the Amber package [40]. Intermediate simulations were started from a set of 15 intermediate structures that were obtained from our previous [30] BP-REMD [41] simulation (see below). The xleap module was used to add missing hydrogen atoms and to add a Cl⁻ ion to neutralize the systems. All systems were solvated with TIP3P [42] water molecules in a truncated octahedral box with at least 10 Å between solute atoms and the borders of the box. The systems were prepared for use with the Amber force fields ff03 [43], ff99SB [44], ff99SB_ILDN [45] and ff99SB_NMR [46]. All simulation systems were subjected to energy minimization (5000 steps) using the sander module of Amber11. During MD simulation the protein was initially harmonically restrained to the energy-minimized start coordinates and the system was heated to 300 K in three steps, followed by gradual removal of the positional restraints and 0.5 ns of unrestrained equilibration at 300 K. All MD simulations were carried out in explicit solvent at 300 K. Long-range electrostatic interactions were treated with the particle mesh Ewald method [47] using a real-space cutoff distance of 9 Å. The Settle [48] algorithm was used to constrain bond vibrations involving hydrogen atoms, which allowed a time step of 2 fs. Simulation trajectories were visualized using VMD [49] and figures were generated using PyMOL [50].
Unrestrained Production Simulations
The resulting systems were used as starting structures for the respective production runs. In the case of “native” simulations, cMD simulations were carried out for 100 ns starting from the experimental folded structure. In the second set five independent simulations (with different initial velocities) under each force field were started from completely extended structures and cMD simulations were carried out for 500 ns. In the third set 15 different simulations were carried out for 100 ns starting from 15 different intermediate structures that were identified from a previous Trp-cage folding simulation study [30]. Details of various simulations are summarized in Table 1.
Table 1. Simulation details.
Simulation time (ns), given as number of runs × length of each run.
Force field | Unrestrained, native start structure | Unrestrained, extended start structure | Unrestrained, intermediate start structure | Restrained (a), extended start structure | Restrained (a), intermediate start structure |
ff03 | 1×100; 1×500 (b) | 5×500; 1×1000 | 15×100 | 1×250 | 1×100 |
ff99SB | 1×100 | 5×500; 1×1000 | 15×100 | 1×250 | 1×100 |
ff99SB_ILDN | 1×100 | 5×500 | 15×100 | – | 1×100 |
ff99SB_NMR | 1×100 | 5×500 | 15×100 | – | 3×100 |
(a) Side chain dihedral angles of the Trp6 residue were weakly restrained towards the conformation in the native folded structure (see Methods).
(b) Simulation started from the native structure but with the side chain dihedrals of the Trp6 residue in a non-native conformation (flipped by 180° from the native rotameric state).
Intermediate Starting Structures
All the intermediate structures used in this study were identified in our previous study [30] on the folding of the Trp-cage protein, which employed a Hamiltonian replica-exchange MD approach based on a backbone biasing potential along the replica runs (BP-REMD) [41]. The purpose of the biasing potential in these simulations was to reduce the energy barrier associated with the peptide backbone dihedral transitions. With this approach it was possible to successfully fold the Trp-cage protein using a small set of replicas. Folding simulations typically resulted in a non-uniform distribution of conformations vs. selected collective variables such as the RMSD or the fraction of native contacts. For the Trp-cage protein, two highly populated regimes corresponding to folded and unfolded states could be identified in a plot of total RMSD with respect to the native structure vs. fraction of native contacts. Intermediate structures were randomly picked from the regime between the highly populated parts of the distribution, corresponding to RMSD values of 2–6 Å and a fraction of native contacts of 40–80%. The conformations were chosen to span the relevant conformational regime (RMSD: 2–6 Å) as well as possible, with a maximum pairwise RMSD equivalent to partitioning into a set of 15 clusters with a pairwise RMSD of ∼2–3 Å. Most of the structures also included a partially or completely formed near-native N-terminal α-helix.
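The window-based part of this selection can be illustrated with a minimal sketch. It assumes that each sampled frame is available as a hypothetical tuple `(frame_id, rmsd, native_contact_fraction)`; the function names, the tuple layout and the random seed are illustrative assumptions and are not taken from the original protocol. The additional requirement of a large pairwise RMSD between the chosen structures is not reproduced here.

```python
import random


def in_intermediate_regime(rmsd, contact_fraction,
                           rmsd_range=(2.0, 6.0),
                           contact_range=(0.40, 0.80)):
    """True if a frame lies in the RMSD / native-contact window quoted in the text."""
    return (rmsd_range[0] <= rmsd <= rmsd_range[1]
            and contact_range[0] <= contact_fraction <= contact_range[1])


def pick_intermediates(frames, n_pick=15, seed=0):
    """Randomly pick n_pick frames from the intermediate regime.

    frames: list of (frame_id, rmsd_in_angstrom, native_contact_fraction).
    The seed is a hypothetical choice that only makes the pick reproducible.
    """
    candidates = [f for f in frames if in_intermediate_regime(f[1], f[2])]
    if len(candidates) < n_pick:
        raise ValueError("not enough frames in the intermediate regime")
    return random.Random(seed).sample(candidates, n_pick)
```

A diversity criterion such as the pairwise RMSD of ∼2–3 Å mentioned above would have to be applied on top of this filter, for example by clustering the candidates first.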
Simulations Including Dihedral Angle Restraints on the Trp6 Side Chain Conformation
To investigate the influence of the Trp6 side chain conformation on the folding process, harmonic restraints were included to keep the side chain dihedrals χ1 and χ2 within a window of ±10° around 180° and 270°, respectively, the values observed in the native Trp-cage structure (the force constant for deviations >10° was 2 kcal mol⁻¹ deg⁻²). Restrained simulations were carried out starting from fully extended structures and from intermediate collapsed structures as indicated in Table 1 and in the Results section.
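A restraint of this kind can be sketched as a flat-bottom penalty on each dihedral angle. The sketch below assumes an energy of the form k·δ², where δ is the deviation beyond the ±10° window; the exact functional form used by the simulation package (for example, whether a factor of 1/2 is included) is not stated in the text, so this form and all function names are assumptions made for illustration.

```python
def angular_deviation(angle_deg, target_deg):
    """Smallest signed difference between two angles, in degrees, in (-180, 180]."""
    diff = (angle_deg - target_deg) % 360.0
    if diff > 180.0:
        diff -= 360.0
    return diff


def flat_bottom_energy(angle_deg, target_deg, half_width_deg=10.0, k=2.0):
    """Restraint energy in kcal/mol: zero inside the window, harmonic outside.

    k is in kcal mol^-1 deg^-2; the k * excess**2 form is an assumption.
    """
    excess = abs(angular_deviation(angle_deg, target_deg)) - half_width_deg
    if excess <= 0.0:
        return 0.0
    return k * excess ** 2


def trp6_restraint_energy(chi1_deg, chi2_deg):
    """Sum of the chi1 (target 180 deg) and chi2 (target 270 deg) penalties."""
    return (flat_bottom_energy(chi1_deg, 180.0)
            + flat_bottom_energy(chi2_deg, 270.0))
```

Because the angular difference is wrapped, χ2 = −90° is treated as identical to 270°, and no penalty is applied anywhere inside the ±10° window.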
Results and Discussion
Simulations Starting from the Native Trp-cage Structure
Starting from the experimental Trp-cage structure (PDB entry 1Y2R), cMD simulations with four different force fields were carried out for 100 ns at 300 K (Table 1). In all simulations and with all force fields the sampled structures remained overall close to the experimental start conformation (Figure 1). However, in several cases occasional large root mean square deviations (RMSDCα) of up to ∼3 Å from the native structure were observed (Figure 1B). These fluctuations corresponded largely to a single type of partial unfolding and refolding in which the C-terminal Poly-ProII (PPII) segment (illustrated in Figure 1D) dissociated from the hydrophobic core (transient disruption of the Trp6-Pro18 contact in the folded state). Conformational variability of the C-terminal segment has also been observed in the published ensemble of NMR structures [1]. Interestingly, a different orientation of the C-terminal segment relative to the N-terminal α-helix has been indicated as the main difference between an intermediate and the folded structure in a T-jump/IR study [9]. In addition, in the simulations with all four force fields the salt bridge between Asp9 and Arg16 showed significant fluctuations and frequent disruption and reformation (Figure 1C). This has also been reported in previous MD simulation studies [14], [21], [30].
Simulations Starting from Fully Extended States
Continuous MD simulations were started from a fully extended Trp-cage protein structure generated using the Amber leap module. Five independent runs for each force field (resulting in 20 independent simulations, Table 1) using different initial velocity assignments were carried out for 500 ns at 300 K, and conformations were stored every 4 ps. After an initial rapid reduction in RMSDCα (from ∼12 Å to ∼6 Å, and even down to 3.5 Å) the sampled conformations remained on average in a collapsed state but did not reach near-native structures with RMSDCα <3 Å (Figure 2A). This was found for all force fields (Amber ff03 is shown in Figure 2; results for ff99SB, ff99SB_NMR and ff99SB_ILDN are not shown). The observation of a largely collapsed unfolded state agrees with both experimental and other computational studies [5]–[15], [26], [29]–[31]. Interestingly, however, the sampled structures in these simulations occasionally approached quite closely the backbone conformation of the partially unfolded states observed during the MD simulations starting from the native Trp-cage protein (discussed in the previous section) (Figure 2B,C). In the simulations starting from the native structure these backbone states folded rapidly back into the native conformation (within a few ns, Figure 1B), but this was not the case in simulations started from the extended structures. Instead, the conformations were trapped in the intermediate state for significant simulation times. Hence, there must be another important conformational difference with respect to the native structure that prevents the transition to the native conformation (with a final RMSDCα <2 Å). As indicated below, one main component of this difference corresponds to the side chain conformation of the Trp6 residue. 
The Trp6 dihedral side chain conformation adopted only infrequently the native rotameric state during the simulations that resulted in unfolded collapsed states (∼10%, see Figure S1) and remained in one rotameric state for long periods of simulation time (hundreds of ns, Figure S1). Besides Trp6, there is only one other large aromatic side chain, Tyr3, that participates in the hydrophobic core of the Trp-cage protein. In contrast to Trp6, this side chain adopted mainly the native rotameric state (>70%) during the 500 ns simulation time starting from extended Trp-cage conformations (and also undergoes frequent transitions).
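Occupancy figures such as the ∼10% and >70% quoted above can be obtained by counting the stored frames in which both side chain dihedrals lie near their native values. The following is a minimal sketch of such a count; the ±30° tolerance and the function names are hypothetical choices, since the classification window used for the reported numbers is not given in the text, and the default native values are the Trp6 values (180° and 270°) quoted in Methods.

```python
def angular_distance(a_deg, b_deg):
    """Unsigned angular distance in degrees, in the range [0, 180]."""
    d = abs(a_deg - b_deg) % 360.0
    return 360.0 - d if d > 180.0 else d


def native_rotamer_fraction(chi1_series, chi2_series,
                            native_chi1=180.0, native_chi2=270.0,
                            tolerance=30.0):
    """Fraction of frames with chi1 and chi2 both within `tolerance` of native.

    The tolerance of 30 degrees is an illustrative assumption.
    """
    if len(chi1_series) != len(chi2_series):
        raise ValueError("chi1 and chi2 series must have the same length")
    if not chi1_series:
        raise ValueError("empty dihedral series")
    hits = sum(
        1 for c1, c2 in zip(chi1_series, chi2_series)
        if angular_distance(c1, native_chi1) <= tolerance
        and angular_distance(c2, native_chi2) <= tolerance
    )
    return hits / len(chi1_series)
```

For another residue such as Tyr3 the native χ1 and χ2 values would have to be supplied explicitly, as they are not listed in the text.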
With each of the four force fields, part of the native secondary structure was already observed in most of the cMD simulations, although the folded state was not reached. In particular, the N-terminal α-helix formed by residues 2–8 in the folded Trp-cage protein was partially formed in most of the cMD simulations starting from extended states (Figure S2). This agrees with several experimental results [5]–[15] which provide evidence for the existence of the N-terminal α-helix already in the collapsed intermediate state. Extension of the simulations up to 1 µs did not result in the folded conformation (black lines in Figure 3). In long cMD simulations of the Trp-cage protein by Lindorff-Larsen et al. [35], performed close to the melting temperature of the protein, a mean folding time of 12 µs was observed. This time, although obtained with a different force field, is considerably larger than the simulation times in the present study and is qualitatively consistent with the absence of folding events on the present time scales of ∼1 µs.
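The qualitative consistency argument can be made slightly more concrete with a rough estimate. The sketch below assumes single-exponential (first-order) folding kinetics and assumes that the 12 µs mean folding time carries over to the present conditions; both are simplifying assumptions, since that value was obtained with a different force field and close to the melting temperature.

```python
import math


def prob_no_folding(run_length_us, mean_folding_time_us, n_runs=1):
    """Probability that none of n_runs independent runs shows a folding event.

    Assumes first-order kinetics with rate 1 / mean_folding_time_us,
    which is an illustrative simplification.
    """
    if mean_folding_time_us <= 0 or run_length_us < 0 or n_runs < 1:
        raise ValueError("invalid run length, folding time or run count")
    return math.exp(-n_runs * run_length_us / mean_folding_time_us)


if __name__ == "__main__":
    # One 1 µs run, and five independent 0.5 µs runs, against a 12 µs folding time.
    print(round(prob_no_folding(1.0, 12.0), 2))
    print(round(prob_no_folding(0.5, 12.0, n_runs=5), 2))
```

Under these assumptions a single 1 µs trajectory would show no folding event with a probability of roughly 0.9, and five independent 0.5 µs trajectories with a probability of roughly 0.8, so the absence of folding in the unrestrained runs is not surprising.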
As already discussed above, the collapsed but non-native states of Trp-cage with an RMSDCα of 3–4 Å must differ from the native structure in some important aspect that prevents rapid folding (one may call this a pre-transition state ensemble). In order to identify these key elements we performed simulations starting from several intermediate structures, which indicate that the orientation of the side chain of Trp6 is an important factor for rapid folding. To systematically investigate the importance of the conformation and orientation of the Trp6 side chain we carried out simulations with restraints on the side chain dihedrals of the Trp6 residue. Starting again from a fully extended structure, two independent simulations (using the ff03 and ff99SB force fields, respectively) were carried out for 250 ns with the Trp6 χ1 and χ2 side chain dihedrals restrained to stay close to the native rotameric state in the folded structure (see Methods section for details, Table 1). Interestingly, both simulations reached the folded state of the protein with an RMSDCα of ∼1.5 Å already after ∼200 ns (red lines in Figure 3A, B). The short folding time suggests folding with a significantly reduced barrier. As indicated already, unrestrained simulations started from the same initial conditions and extended to 1000 ns did not reach the folded state (Figure 3A, B).
Simulations Starting from Various Intermediate Conformations
MD simulations were also started from 15 “intermediate” Trp-cage structures that were obtained in previous BP-REMD folding simulation studies [30] (Tables 1 and 2). Each structure was re-equilibrated before starting 100 ns MD simulations employing the four different force fields (see Methods for details). The intermediate structures were all in collapsed states with an Rg and RMSD close to the average Rg and RMSD of the conformations obtained in the simulations starting from fully extended structures. Each starting structure differed, however, in the degree of initial near-native structure and topology. In one set of intermediate structures (termed set 1), in addition to the N-terminal α-helix, other important native secondary structural elements such as the 3₁₀-helix and the C-terminal PPII helix were partially formed (Figure 4A). Some of these intermediate structures are overall similar in topology to the partially unfolded structures observed during simulations of the native Trp-cage structure (Figure 1), with an RMSDCα <3 Å from the native conformation and, importantly, with the Trp6 residue in the near-native rotameric state (Table 2). For these start structures rapid folding within 100 ns down to RMSDCα <∼1.5 Å was observed with all force fields (Figure S3, Table 2), and most of the important structural features seen in the experimental Trp-cage structure were formed. This also includes the salt bridge between Asp9 and Arg16, which was formed but with frequent fluctuations. One can conclude that the set 1 starting conformations are within an “attractive basin” of the folded state and that there is no significant barrier left with respect to the folded conformation (post-transition state ensemble).
Table 2. Results of simulations starting from different intermediate states of Trp-cage.
S. No | Starting RMSDCα (Å) | Final RMSDCα (Å), ff03 | Final RMSDCα (Å), ff99SB | Final RMSDCα (Å), ff99SB_ILDN | Final RMSDCα (Å), ff99SB_NMR | Final state |
Set1 | ||||||
1 | 2.8 | 0.9 | 1.0 | 0.8 | 0.9 | Folded |
2 | 3.1 | 0.8 | 1.0 | 0.9 | 0.8 | Folded |
3 | 1.8 | 0.8 | 0.9 | 1.0 | 0.85 | Folded |
4 | 3.2 | 1.0 | 1.1 | 1.2 | 1.1 | Folded |
5 | 2.9 | 0.9 | 0.9 | 1.0 | 0.9 | Folded |
Set2 | ||||||
6 | 3.9 | 0.8 | 0.8 | 1.0 | 1.1 | Folded |
7 | 4.0 | 1.0 | 1.2 | 1.2 | 1.1 | Folded |
8 | 4.9 | 0.9 | 1.2 | 1.1 | 1.0 | Folded |
9 | 5.5 | 1.1 | 1.3 | 1.2 | 0.9 | Folded |
10 | 4.5 | 1.0 | 0.9 | 1.0 | 1.0 | Folded |
11 | 3.8 | 2.9 | 3.9 | 3.4 | 3.5 | Misfolded |
12 | 4.1 | 4.5 | 4.4 | 4.3 | 4.5 | Collapsed |
13 | 5.0 | 6.2(1.0) | 6.0(0.8) | 6.5(1.9) | 6(0.9) | Unfolded (folded) |
14 | 4.0 | 7.0 | 6.5 | 5.5 | 6(1.2) | Unfolded (folded) |
15 | 3.5 | 4.0 | 5.0 | 4.5 | 3.5(1.0) | Collapsed (folded) |
Values in brackets correspond to the results of simulations with restraints on the side chain dihedral angles of the Trp6 residue to keep it close to the conformation in the native folded structure (see Methods for details). Start conditions for unrestrained and restrained simulations were the same. The Trp6 residue was in its native rotameric state in all the Set1 intermediate start structures and in a non-native rotameric state in all the Set2 intermediate start structures.
In another set of intermediate structures (set 2) only the N-terminal α-helix was already partially formed, but the rest of the protein differed significantly from the native structure, missing the 3₁₀-helix, part of the PPII helix and proper native tertiary contacts (Figure 4B, Table 2). These intermediate structures had an RMSDCα of 3.5–6.0 Å from the experimental structure and in most cases Trp6 was in a non-native rotameric state (Figure 4B, Table 2). In contrast to the first set, only some of the start structures from the second set of intermediate conformations reached the folded state of the Trp-cage protein within 100 ns of MD simulation time (Figure 5, Table 2). In some of the simulations the Trp-cage conformation remained close to the start structure, and unfolding of the protein was also observed (Figure S4, Table 2). Detailed analysis of the simulation trajectories that reached the folded state of the protein revealed some interesting key events along the folding pathway. Interestingly, in most of the simulations that resulted in folding to near-native conformations the protein eventually started to move away from its compact start state (Figures 5 and 6). The final transition to the folded conformation was typically associated with a sudden drop of the RMSDCα from ∼3–7 Å down to <∼1.5 Å. A prerequisite for successful collapse to the native structure was either the presence of a near-native rotameric state of the Trp6 side chain or the transition of the Trp6 side chain to the native rotameric state, typically several nanoseconds before the successful collapse to the near-native conformation (indicated for one case per force field in Figure 6 and in Figures S5, S6, S7). In the majority of successful folding events, Trp6 first establishes an interaction with the Pro12 residue (in the middle 3₁₀-helix segment). 
Subsequently, contacts of Trp6 with Pro18 or Pro17 from the PPII motif at the C-terminus are established to complete the process (Figure 6E and Figure 7). Successful folding to a low final RMSDCα is especially tightly coupled to contacts between Trp6 and Pro12 as well as Pro18 (there are no near-native Trp-cage structures without close Trp6-Pro12 and Trp6-Pro18 contacts, Figure 7). There is no such tight coupling between the Tyr3-Pro19 distance and the occurrence of near-native Trp-cage structures. Even for sampled conformations with RMSDCα <2 Å from the native structure the Tyr3-Pro19 distance can vary between 5 and 12 Å (Figure 7A).
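The temporal ordering described in this paragraph, with the Trp6 rotamer transition preceding the final collapse, can be checked on a trajectory with a simple analysis of the kind sketched below. The inputs are assumed to be per-frame RMSDCα values and per-frame flags for the native Trp6 rotamer; the 1.5 Å cutoff follows the text, while the minimum run length used to call a folding event "stable" and all function names are hypothetical choices.

```python
def first_sustained_index(flags, min_length):
    """Start index of the first run of >= min_length consecutive True values."""
    run_start = None
    for i, flag in enumerate(flags):
        if flag:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_length:
                return run_start
        else:
            run_start = None
    return None


def rotamer_lead_time(rmsd, native_rotamer, frame_spacing_ns,
                      rmsd_cutoff=1.5, min_length=5):
    """Time (ns) by which the native Trp6 rotamer preceded stable folding.

    Returns None if no sustained folding event is found or if Trp6 is not
    in the native rotamer at the folding frame. min_length is an assumption.
    """
    if len(rmsd) != len(native_rotamer):
        raise ValueError("series must have the same length")
    fold = first_sustained_index([r < rmsd_cutoff for r in rmsd], min_length)
    if fold is None or not native_rotamer[fold]:
        return None
    # Walk backwards to the start of the uninterrupted native-rotamer run.
    start = fold
    while start > 0 and native_rotamer[start - 1]:
        start -= 1
    return (fold - start) * frame_spacing_ns
```

A positive return value means that the native rotamer was adopted before the sustained drop in RMSDCα, which is the pattern reported for the successful folding events.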
In order to further check the importance of the Trp6 rotameric state for the speed of the final Trp-cage folding, simulations with side chain dihedral restraints keeping Trp6 close to the native rotameric state were performed starting from set 2 start structures that did not reach the folded state within 100 ns (similar to the simulations starting from extended structures, see above). With the dihedral angle restraints, these set 2 starting structures folded successfully to near-native structures within 100 ns, while the same structures did not reach the correctly folded state during 100 ns of unrestrained simulation (compare red and black RMSDCα curves in Figure 8 and the summarized results in Table 2).
One interesting non-native compact structure was sampled with Trp6 partially buried in the hydrophobic core but with the χ2 dihedral angle in the incorrect orientation (∼60° instead of ∼270°) (Figure 9). To investigate the stability of this non-native compact structure with the incorrect conformation of the Trp6 residue, simulations were carried out starting from the native structure but with the incorrect side chain dihedrals of the Trp6 residue (Figure 9D). The RMSDCα of this structure reached 2.5–3 Å with respect to the native Trp-cage conformation already at a very early stage of the simulation, and the simulation was extended to 500 ns. However, the conformation was only transiently stable and eventually underwent a transition to the natively folded conformation (Figure 9). Similar to other collapsed intermediate start structures, this transition required a partial unfolding to allow the side chain to flip into the native conformation (increased RMSDCα from the native structure of ∼4–5 Å in the time range of 50–200 ns in Figure 9A), followed by a collapse to a stable near-native structure at 300 ns (with an RMSDCα <2 Å). This collapse was preceded by a flip to the correct Trp6 rotameric state (Figure 9B, C) ∼70 ns before the final folding.
Analysis of Salt-bridge Formation
The salt bridge between Asp9 and Arg16 observed in the native folded Trp-cage structure was also formed transiently in several simulations starting from extended or collapsed intermediate states. However, overall no significant correlation between salt bridge formation and backbone deviation of the sampled states from the native fold was observed (Figure 10A). When the RMSDCα of the sampled conformations reached ∼1.5 Å, the salt bridge distance between Asp9 and Arg16 was still larger than 10.0 Å. However, transient formation of the salt bridge was seen once the simulated system had reached the near-native state (with RMSDCα ∼1.5–2 Å). This agrees with the rapid fluctuations of the salt bridge seen in the simulations that started from the native structure (Figure 1C). In contrast to several previous simulation studies [17], [18], [20], [23], [26], [34], which often employed implicit solvent models, our results indicate that the formation of the salt bridge is not a prerequisite for the proper folding of the mini-protein. In agreement with experimental findings [8], [11], [12], [14], the present explicit solvent folding simulations indicate that the salt bridge could form at a later stage of the folding process. However, in the absence of the salt bridge an alternatively folded conformation was sampled as the second most populated state (∼10% folded state) during the simulations. This state is stabilized by Pro17 stacking on Trp6 instead of the Pro18 stacking on Trp6 seen in the native Trp-cage structure (Figure 10B, C). A similar structure was also observed previously in folding simulations of the Trp-cage protein using BP-REMD [22], [30]. It was also sampled in the simulations starting from the native Trp-cage structure (not shown).
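One simple way to quantify the absence of a correlation between the Asp9-Arg16 distance and the backbone deviation is a Pearson correlation coefficient over the sampled frames. The sketch below is only an illustration of such a measure; the text does not state which statistic, if any, was used for Figure 10A, and the function name is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)
```

Applied to per-frame salt bridge distances and RMSDCα values, a coefficient close to zero would correspond to the lack of coupling described above, although a scatter plot remains more informative for strongly non-linear relationships.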
Conclusions
Trp-cage, being one of the smallest and fastest folding proteins, is also one of the most extensively studied proteins both experimentally and computationally. Although the folding pathway of the Trp-cage mini-protein has been much debated, it is now widely accepted, with increasing experimental evidence, that the folding of Trp-cage does not follow a simple two-state mechanism but rather involves semi-stable intermediates along the folding pathway [5]–[9], [14], [15]. Early formation of near-native tertiary contacts (long-range hydrophobic contacts) is a major driving force for the collapse of this extended mini-protein to a compact intermediate state, and such early contact formation is often believed to be a major reason for the fast folding of proteins. Although the initial hydrophobic collapse event was already seen in various successful folding and unfolding studies of the Trp-cage mini-protein, the key events that result in final folding are not completely understood. Experimental mutagenesis studies indicate the importance of the central Trp6 residue for the folding and stability of the protein [51]. Substitution by phenylalanine or histidine results in strong destabilization (W6F) or complete unfolding (W6H) of the protein [51]. Previous simulation studies on the Trp-cage protein have employed a range of different force field models to study the folding process. These included coarse-grained or hybrid-resolution representations [38] and atomistic protein representations combined with either implicit solvent models [16]–[22] or explicit solvent models [23]–[35]. It is likely that the results obtained with these force field models differ due to differences in the protein-solvent and side chain-backbone interactions. 
All these studies have largely focused on the different pathways to the folded topology and have resulted in the characterization of backbone intermediate states, the critical role of secondary structure formation in different pathways of folding and the occurrence of native contacts along the folding process. Also, the important role of the Trp6 residue has already been recognized in simulation studies [19], [20] and has been included in the clustering of intermediate states [36], [37]. Experimentally, it is also not possible to capture the movement of key individual residues with high time resolution. To shed more light on the folding mechanism of the Trp-cage protein and to characterize the key role of the central Trp6 residue we employed all-atom molecular dynamics simulations. The comparison of four different force fields indicated in all cases the formation of intermediate collapsed and semi-compact states after simulation times of less than 500 ns starting from extended conformations. Furthermore, comparative simulations starting from fully extended states indicated a significant acceleration of the folding process if the central Trp6 residue was restrained to its native rotameric state. In addition, it was possible to define a set of intermediate starting structures, with most secondary structures at least partially formed and the Trp6 residue in the near-native rotameric conformation, that all folded rapidly (with all force fields) within <100 ns. In contrast, only a few simulations reached the native state during 100 ns in the case of a second set of collapsed structures with Trp6 in a non-native rotameric state. Rapid folding could, however, be induced even for this set if Trp6 was restrained to its near-native rotamer during the MD simulations. The simulation studies provide qualitative evidence that the rotameric state of a single side chain creates a significant barrier for folding and can act as a control element for the folding process. 
Many of the rapid collapse events are unsuccessful because of an incorrect Trp6 side chain rotameric state. This rotameric state determines, at least in part, the transition from intermediate collapsed states towards the native folded protein structure. It is likely that similar side chain effects also play a key role in the folding of other protein molecules. Previous simulation studies have given partially contradictory results on the role of the Asp9-Arg16 salt bridge in Trp-cage folding. In particular, simulations in implicit solvent emphasized the role of the salt bridge for the formation and stability of the folded structure [17], [18], [20]. In agreement with experimental findings, the Asp9-Arg16 salt bridge was neither well preserved during the simulations (in all force fields) nor formed in all the folded conformations that resemble the near-native structure. However, the absence of the salt bridge appeared to be partially coupled to a local backbone conformational change that corresponded to a shift of the C-terminus such that Pro17 instead of Pro18 stacked on Trp6.
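How well a salt bridge is "preserved" along a trajectory can be quantified in several ways. The following is a minimal sketch of one common approach, assuming a purely geometric criterion: the bridge counts as formed in a frame when the shortest distance between the Asp carboxylate oxygens (OD1, OD2) and the Arg guanidinium nitrogens (NE, NH1, NH2) is below a cutoff. The 4.0 Å cutoff, the function names and the toy coordinates are assumptions for illustration and do not reproduce the analysis performed in this study.

```python
import math

def min_distance(group_a, group_b):
    """Smallest Euclidean distance between any point in group_a and any in group_b."""
    return min(math.dist(a, b) for a in group_a for b in group_b)

def salt_bridge_occupancy(frames, cutoff=4.0):
    """Fraction of frames in which the minimum O...N distance is within the cutoff.

    frames: sequence of (asp_oxygens, arg_nitrogens) pairs, each a list of
    (x, y, z) coordinates in Angstrom. The cutoff value is an assumption.
    """
    if not frames:
        raise ValueError("no frames given")
    formed = sum(1 for asp_o, arg_n in frames
                 if min_distance(asp_o, arg_n) <= cutoff)
    return formed / len(frames)

# Toy data (not from any simulation): two frames with a close contact, two without
asp = [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]
frames = [
    (asp, [(0.0, 3.0, 0.0), (0.0, 9.0, 0.0)]),   # 3.0 A contact: formed
    (asp, [(0.0, 8.0, 0.0), (0.0, 9.0, 0.0)]),   # broken
    (asp, [(2.2, 0.0, 3.5), (9.0, 9.0, 9.0)]),   # 3.5 A contact: formed
    (asp, [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]), # broken
]
print(salt_bridge_occupancy(frames))
```

Computing this occupancy separately for near-native frames and for the full trajectory would distinguish a bridge that is rarely formed at all from one that is absent only in particular folded conformations.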
Supporting Information
Acknowledgments
This work was performed using supercomputer resources provided by the European Union PRACE-0 project and computing facilities at the Agency for Science, Technology, and Research (A*STAR) Computing Resource Centre (A*CRC).
Funding Statement
We thank the DFG (Center of Integrated Protein Science Munich: CIPSM), European Union (PRACE-0 project pr89tu) and Joint Council Office (JCO) (11/03/FG/07/05), Agency for Science, Technology and Research (A*STAR), Singapore, for financial support. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Neidigh JW, Fesinmeyer MR, Andersen NH (2002) Designing a 20-residue protein. Nat. Struct. Biol. 9: 425–430.
- 2. Qiu L, Pabit SA, Roitberg AE, Hagen SJ (2002) Smaller and faster: the 20-residue Trp-cage protein folds in 4 µs. J. Am. Chem. Soc. 124: 12952–12953.
- 3. Scian M, Lin JC, Le Trong I, Makhatadze GI, Stenkamp RE, et al. (2012) Crystal and NMR structures of a Trp-cage mini-protein benchmark for computational fold prediction. Proc. Natl. Acad. Sci. USA. 109: 12521–12525.
- 4. Streicher WW, Makhatadze GI (2007) Unfolding thermodynamics of Trp-cage, a 20 residue miniprotein, studied by differential scanning calorimetry and circular dichroism spectroscopy. Biochemistry. 46: 2876–2880.
- 5. Neuweiler H, Doose S, Sauer M (2005) A microscopic view of miniprotein folding: enhanced folding efficiency through formation of an intermediate. Proc. Natl. Acad. Sci. USA. 102: 16650–16655.
- 6. Ahmed Z, Beta IA, Mikhonin AV, Asher SA (2005) UV-resonance Raman thermal unfolding study of Trp-cage shows that it is not a simple two-state miniprotein. J. Am. Chem. Soc. 127: 10943–10950.
- 7. Mok KH, Kuhn LT, Goez M, Day IJ, Lin JC, et al. (2007) A pre-existing hydrophobic collapse in the unfolded state of an ultrafast folding protein. Nature. 447: 106–109.
- 8. Culik RM, Serrano AL, Bunagan MR, Gai F (2011) Achieving secondary structural resolution in kinetic measurements of protein folding: a case study of the folding mechanism of Trp-cage. Angew. Chem. Int. Ed. Engl. 50: 10884–10887.
- 9. Meuzelaar H, Marino KA, Huerta-Viga A, Panman MR, Smeenk LEJ, et al. (2013) Folding Dynamics of the Trp-cage Miniprotein: Evidence for a Native-like Intermediate from combined Time-Resolved Vibrational Spectroscopy and Molecular Dynamics Simulations. J. Phys. Chem. B. 117: 11490–11501.
- 10. Rovo P, Farkas V, Hegyi O, Szolomájer-Csikós O, Tóth GK, et al. (2011) Cooperativity network of Trp-cage miniproteins: probing salt-bridges. J. Pept. Sci. 17: 610–619.
- 11. Williams DV, Byrne A, Stewart J, Andersen NH (2011) Optimal salt bridge for Trp-cage stabilization. Biochemistry. 50: 1143–1152.
- 12. Barua B, Lin JC, Williams WD, Kummler P, Neidigh JW, et al. (2008) The Trp-cage: optimizing the stability of a globular miniprotein. Protein Eng. Des. Sel. 21: 171–185.
- 13. Hudáky P, Stráner P, Farkas V, Váradi G, Tóth G, et al. (2008) Cooperation between a Salt Bridge and the Hydrophobic Core Triggers Fold Stabilization in a Trp-Cage Miniprotein. Biochemistry. 47: 1007–1016.
- 14. Halabis A, Zmudzinska W, Liwo A, Oldziej S (2012) Conformational dynamics of the trp-cage miniprotein at its folding temperature. J. Phys. Chem. B. 116: 6898–6907.
- 15. Lai Z, Preketes NK, Mukamel S, Wang J (2013) Monitoring the folding of Trp-cage peptide by two-dimensional infrared (2DIR) spectroscopy. J. Phys. Chem. B. 117: 4661–4669.
- 16. Simmerling C, Strockbine B, Roitberg AE (2002) All-atom structure prediction and folding simulations of a stable protein. J. Am. Chem. Soc. 124: 11258–11259.
- 17. Snow CD, Zagrovic B, Pande VS (2002) The Trp cage: folding kinetics and unfolded state topology via molecular dynamics simulations. J. Am. Chem. Soc. 124: 14548–14549.
- 18. Chowdhury S, Lee MC, Xiong G, Duan Y (2003) Ab initio folding simulation of the Trp-cage mini-protein approaches NMR resolution. J. Mol. Biol. 327: 711–717.
- 19. Chowdhury S, Lee MC, Duan Y (2004) Characterizing the rate-limiting step of Trp-Cage folding by all-atom molecular dynamics simulations. J. Phys. Chem. B. 108: 13855–13865.
- 20. Hu Z, Tang Y, Wang H, Zhang X, Lei M (2008) Dynamics and cooperativity of Trp-cage folding. Arch. Biochem. Biophys. 475: 140–147.
- 21. Pitera JW, Swope W (2003) Understanding folding and design: replica-exchange simulations of “Trp-cage” miniproteins. Proc. Natl. Acad. Sci. USA. 100: 7587–7592.
- 22. Kannan S, Zacharias M (2009) Folding of Trp-cage mini protein using temperature and biasing potential replica-exchange molecular dynamics simulations. Int. J. Mol. Sci. 10: 1121–1137.
- 23. Zhou R (2003) Trp-cage: folding free energy landscape in explicit water. Proc. Natl. Acad. Sci. USA. 100: 13280–13285.
- 24. Ota M, Ikeguchi M, Kidera A (2004) Phylogeny of protein-folding trajectories reveals a unique pathway to native structure. Proc. Natl. Acad. Sci. USA. 101: 17658–17663.
- 25. Beck DA, White GW, Daggett V (2007) Exploring the energy landscape of protein folding using replica-exchange and conventional molecular dynamics simulations. J. Struct. Biol. 157: 514–523.
- 26. Paschek D, Nymeyer H, Garcia AE (2007) Replica exchange simulation of reversible folding/unfolding of the Trp-cage miniprotein in explicit solvent: on the structure and possible role of internal water. J. Struct. Biol. 157: 524–533.
- 27. Paschek D, Hempel S, Garcia AE (2008) Computing the stability diagram of the Trp-cage miniprotein. Proc. Natl. Acad. Sci. USA. 105: 17754–17759.
- 28. Yao XQ, She ZS (2008) Key residue-dominated protein folding dynamics. Biochem. Biophys. Res. Commun. 373: 64–68.
- 29. Xu W, Mu Y (2008) Ab initio folding simulation of Trpcage by replica exchange with hybrid Hamiltonian. Biophys. Chem. 137: 116–125.
- 30. Kannan S, Zacharias M (2009) Folding simulations of Trp-cage mini protein in explicit solvent using biasing potential replica-exchange molecular dynamics simulations. Proteins. 76: 448–460.
- 31. Marinelli F, Pietrucci F, Laio A, Piana S (2009) A kinetic model of Trp-cage folding from multiple biased molecular dynamics simulations. PLoS Comput. Biol. 5(8): e1000452. doi:10.1371/journal.pcbi.1000452.
- 32. Marinelli F (2013) Following easy slope paths on a free energy landscape: The case study of the Trp-cage folding mechanism. Biophys. J. 105: 1236–1247.
- 33. Shao Q, Shi J, Zhu W (2012) Enhanced sampling molecular dynamics simulation captures experimentally suggested intermediate and unfolded states in the folding pathway of Trp-cage miniprotein. J. Chem. Phys. 137: 125103.
- 34. Juraszek J, Bolhuis PG (2006) Sampling the multiple folding mechanisms of Trp-cage in explicit solvent. Proc. Natl. Acad. Sci. USA. 103: 15859–15864.
- 35. Zheng W, Gallicchio E, Deng M, Andrec M, Levy RM (2011) Kinetic network study of the diversity and temperature dependence of Trp-Cage folding pathways: combining transition path theory with stochastic simulations. J. Phys. Chem. B. 115: 1512–1523.
- 36. Day R, Paschek D, Garcia AE (2010) Microsecond simulations of the folding/unfolding thermodynamics of the Trp-cage miniprotein. Proteins. 78: 1889–1899.
- 37. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE (2011) How fast-folding proteins fold. Science. 334: 517–520.
- 38. Han W, Schulten K (2013) Characterization of Folding Mechanisms of Trp-Cage and WW-Domain by Network Analysis of Simulations with a Hybrid-Resolution Model. J. Phys. Chem. B. doi:10.1021/jp404331d.
- 39. Deng NJ, Dai W, Levy RM (2013) How Kinetics within the Unfolded State Affects Protein Folding: An Analysis Based on Markov State Models and an Ultra-Long MD Trajectory. J. Phys. Chem. B. doi:10.1021/jp401962k.
- 40. Case DA, Darden TA, Cheatham TE III, Simmerling CL, Wang J, et al. (2010) Amber 11. University of California, San Francisco.
- 41. Kannan S, Zacharias M (2007) Enhanced sampling of peptide and protein conformations using replica exchange simulations with a peptide backbone biasing-potential. Proteins. 66: 697–706.
- 42. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML (1983) Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79: 926–935.
- 43. Duan Y, Wu A, Chowdhury CS, Lee MC, Xiong G, et al. (2003) A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations. J. Comput. Chem. 24: 1999–2012.
- 44. Hornak V, Abel R, Okur A, Strockbine B, Roitberg A, et al. (2006) Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins. 65: 712–725.
- 45. Lindorff-Larsen K, Piana S, Palmo K, Maragakis P, Klepeis JL, et al. (2010) Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins. 78: 1950–1958.
- 46. Li DW, Bruschweiler R (2010) NMR-based protein potentials. Angew. Chem. Int. Ed. 49: 6778–6780.
- 47. Darden T, York D, Pedersen L (1993) Particle mesh Ewald: an N·log(N) method for Ewald sums in large systems. J. Chem. Phys. 98: 10089–10092.
- 48. Miyamoto S, Kollman PA (1992) Settle: an analytical version of the SHAKE and RATTLE algorithm for rigid water models. J. Comput. Chem. 13: 952–962.
- 49. Humphrey W, Dalke A, Schulten K (1996) VMD–visual molecular dynamics. J. Mol. Graph. 14: 33–38.
- 50. De Lano W (2002) The PyMOL molecular graphics system. San Carlos CA, USA: De Lano Scientific.
- 51. Barua B, Andersen NH (2002) Determinants of Miniprotein Stability: Can anything replace a buried H-bonded Trp sidechain. Lett. Pept. Sci. 8: 221–226.