Author manuscript; available in PMC: 2018 Aug 7.
Published in final edited form as: Phys Rev E. 2018 Mar;97(3-1):032402. doi: 10.1103/PhysRevE.97.032402

Impact of hydrodynamic interactions on protein folding rates depends on temperature

Fabio C Zegarra 1,2, Dirar Homouz 1,2,3, Yossi Eliaz 1,2, Andrei G Gasic 1,2, Margaret S Cheung 1,2,*
PMCID: PMC6080349  NIHMSID: NIHMS981761  PMID: 29776093

Abstract

We investigated the impact of hydrodynamic interactions (HI) on protein folding using a coarse-grained model. The extent of the impact of hydrodynamic interactions, whether it accelerates, retards, or has no effect on protein folding, has been controversial. Together with a theoretical framework of the energy landscape theory (ELT) for protein folding that describes the dynamics of the collective motion with a single reaction coordinate across a folding barrier, we compared the kinetic effects of HI on the folding rates of two protein models that use a chain of single beads with distinctive topologies: a 64-residue α/β chymotrypsin inhibitor 2 (CI2) protein, and a 57-residue β-barrel α-spectrin Src-homology 3 domain (SH3) protein. When comparing the protein folding kinetics simulated with Brownian dynamics in the presence of HI to that in the absence of HI, we find that the effect of HI on protein folding appears to have a “crossover” behavior about the folding temperature. This means that at a temperature greater than the folding temperature, the enhanced friction from the hydrodynamic solvents between the beads in an unfolded configuration results in lowered folding rate; conversely, at a temperature lower than the folding temperature, HI accelerates folding by the backflow of solvent toward the folded configuration of a protein. Additionally, the extent of acceleration depends on the topology of a protein: for a protein like CI2, where its folding nucleus is rather diffuse in a transition state, HI channels the formation of contacts by favoring a major folding pathway in a complex free energy landscape, thus accelerating folding. For a protein like SH3, where its folding nucleus is already specific and less diffuse, HI matters less at a temperature lower than the folding temperature. Our findings provide further theoretical insight to protein folding kinetic experiments and simulations.

I. INTRODUCTION

Aqueous solvent plays an essential role in the dynamics of proteins by inducing the hydrophobic collapse of the chain and helping in the search for the specific three-dimensional structure needed to perform their biological function [1]. However, the motions of the solute particles of the protein are not independent; they are intimately coupled by the solvent. As solute particles move, they induce a flow in the solvent, which, in turn, affects the motion of neighboring solute particles. These long-range interactions between solute particles through solvent, known as hydrodynamic interactions (HI), have been shown to play an essential role in protein dynamics [2–4]. HI has been studied extensively in polymers both analytically [5,6] and numerically [7]. HI generally accelerates the speed of collapse [8–11] when a polymer is quenched from good to poor solvent at θ temperature. Unlike a homopolymer, a heteropolymeric protein is made up of 20 different amino acids that interact through electrostatics or van der Waals forces to various extents. These interactions are long range in nature, which complicates the analysis of HI in protein folding. Several groups have employed computer simulations [12–15] to investigate HI effects on protein folding with Langevin dynamics.

Up to now, the outcome of coarse-grained protein folding simulations on whether HI accelerates or deters protein folding has varied among research groups. The Cieplak group and the Elcock group showed that HI moderately accelerates the folding kinetic rates by a factor of 1.2 to 3.6 [12,13]. A recent study by the Scheraga group argued that HI reduces the folding kinetic rates [14]. Furthermore, Kikuchi et al. [15] claimed that HI accelerates kinetic rates, albeit with a small effect. Notably, little work has been done on the temperature dependence of these findings. The discussion about temperature is necessary because protein folding from an unfolded configuration to a natively compact one requires imperfect cancellation of configurational entropy loss and enthalpy gain during the course of collapse, which gives rise to a temperature-dependent activation barrier [16,17]. Without a comprehensive investigation over a wide range of temperature, it is challenging to delineate the real impact of HI on protein folding rates. Despite the apparently conflicting results from the groups mentioned above, they all might be correct within their own specific temperature range.

Our motivation is to reconcile the differences in reported influences of HI on protein folding over a wide range of temperature from the viewpoint of the folding energy landscape theory [17,18], particularly with a funnel-shaped energy landscape [19]. We used a computer protein model that is guaranteed to fold into the native state from any unfolded conformation [20]. We tracked its collective motion on a single reaction coordinate, the fraction of the native contact formation Q, either on a thermodynamic free energy barrier or by kinetic trajectories. We studied the effects of HI on folding of two well-studied model proteins with distinctive topologies: one is the 64-residue α/β protein chymotrypsin inhibitor 2 (CI2) [21] shown in Fig. 1(a), and the other is the 57-residue β-barrel α-spectrin Src-homology 3 (SH3) domain [22] shown in Fig. 1(b). The two proteins fold and unfold in a two-state manner and have been used for studying folding mechanisms in other computational studies [23–27]. We simulated the Brownian dynamics of particles including HI by implementing the algorithm developed by Ermak and McCammon [28]. The effects of HI are approximated through a configuration-dependent diffusion tensor D used in the Brownian equation of motion.

FIG. 1.

Representations of protein models of chymotrypsin inhibitor 2 (CI2) (top row) and the α-spectrin Src-homology 3 (SH3) domain (bottom row). The protein models are in [(a), (b)] a cartoon representation, [(c), (d)] a Cα-only representation, and [(e), (f)] a protein topology cartoon created with Pro-origami [32]. Arrows are β strands and cylinders are helices. Structures from [(a), (b), (c), (d)] were created with VMD [33]. The secondary structures were assigned using DSSP [34]. The key residues for the hydrophobic core (A16, L49, and I57) and the minicore (L32, V38, and F50) are represented with green and orange beads, respectively, for CI2 in (a). Residues of the diverging turn (M20, K21, K22, G23, and D24) and distal loop (N42 and D43) are represented with green and orange for SH3 in (b).

Our study shows that HI can both accelerate protein folding at a temperature lower than the folding temperature and retard protein folding at a temperature higher than the folding temperature, in comparison with the folding dynamics without HI. Since HI affects the kinetic ordering of contact formation, for a protein with multiple viable folding pathways like CI2, HI will favor a particular folding route in a complex folding energy landscape. In that sense, energy landscape theory (ELT) falls short of fully predicting folding rates. In Secs. III B to III E, we investigate the cause of this temperature dependence of the effect of HI on folding rates and the implications for energy landscape theory. We also suggest a possible experimental design to probe the impact of HI on folding based on a temperature-dependent ϕ-value analysis.

II. MODELS AND METHODS

A. Coarse-grained protein model

We used a coarse-grained, structure-based model [20] for two well-studied model proteins: the 64-residue α/β protein chymotrypsin inhibitor 2 (CI2) (PDB ID: 1YPA) [21] in Fig. 1(a), and the 57-residue β-barrel α-spectrin Src-homology 3 (SH3) domain (PDB ID: 1SHG) [22] in Fig. 1(b). A structure-based model is a toy model that provides a single global basin of attraction that corresponds to an experimentally determined configuration and smooths out the ruggedness on the funneled energy landscape [29]. This allows us to study the ideal energy landscape of a protein. In this coarse-grained model, each residue is represented by one bead placed at its α-carbon position, creating a string of beads that represents the entire protein [see Figs. 1(c) and 1(d) for CI2 and SH3, respectively]. The Hamiltonian of our system depends on the experimentally determined configuration (also known as the native state). It consists of backbone terms, attractive interactions between beads in close proximity to each other in the native state, and excluded volume, and has the following form taken from the model developed by Clementi et al. [20]:

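$$H(\Gamma,\Gamma_0)=\sum_{i<j} k_r\,(r_{ij}-r_{ij}^0)^2\,\delta_{j,i+1}+\sum_{i\in\mathrm{angles}} k_\theta\,(\theta_i-\theta_i^0)^2+\sum_{i\in\mathrm{dihedral}} k_\phi\left(\{1-\cos[\phi_i-\phi_i^0]\}+\frac{1}{2}\{1-\cos[3(\phi_i-\phi_i^0)]\}\right)+\sum_{\substack{|i-j|>3\\ i,j\in\mathrm{native}}}\varepsilon\left[5\left(\frac{r_{ij}^0}{r_{ij}}\right)^{12}-6\left(\frac{r_{ij}^0}{r_{ij}}\right)^{10}\right]+\sum_{\substack{|i-j|>3\\ i,j\notin\mathrm{native}}}\varepsilon\left(\frac{\sigma}{r_{ij}}\right)^{12}, \tag{1}$$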

where Γ is a configuration of the set r, θ, ϕ. The rij term is the distance between ith and jth residues, θ is the angle defined by three consecutive beads, and ϕ is the dihedral angle between four consecutive beads. We define ε = 0.6 kcal/mol as the solvent-mediated interaction, and kr = 100ε, kθ = 20ε, kϕ = ε [20]. δ is the Kronecker delta function. The native state values of r0, θ0, and ϕ0 for both proteins were obtained from their crystal structures Γ0 [21,22] where Γ0 = {{r0}, {θ0}, {ϕ0}}. The nonbonded terms consist of a Lennard-Jones interaction between native pairs, and excluded volume interaction between non-native pairs. For the native pairs, we used a 10–12 potential permitting a shorter-range interaction than a 6–12 one. The former potential increases the population of the unfolded ensemble in a protein model that folds and unfolds in two states. The native contact pairs were chosen using the CSU program [30]. σ for non-native pairs is 4 Å [20,31].
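As a minimal sketch of the two nonbonded terms of Eq. (1), the functions below evaluate the 10–12 native-pair potential and the repulsive excluded-volume term for a single pair. The function names and the example distance of 6.0 Å are illustrative assumptions; only ε = 0.6 kcal/mol and σ = 4 Å come from the text.

```python
def native_contact_energy(r, r0, eps=0.6):
    """10-12 native-pair term of Eq. (1): eps*[5*(r0/r)**12 - 6*(r0/r)**10].

    r, r0 in Angstrom; eps in kcal/mol (0.6 in the text).
    """
    x = r0 / r
    return eps * (5.0 * x**12 - 6.0 * x**10)


def excluded_volume_energy(r, sigma=4.0, eps=0.6):
    """Repulsive non-native term of Eq. (1): eps*(sigma/r)**12."""
    return eps * (sigma / r) ** 12


# Hypothetical pair with a native distance of 6.0 Angstrom (illustrative value).
# The 10-12 well has its minimum at r = r0, where the energy equals -eps.
e_min = native_contact_energy(6.0, 6.0)
e_stretched = native_contact_energy(7.0, 6.0)
print(e_min, e_stretched)
```

Because the attractive part decays as r⁻¹⁰ rather than r⁻⁶, the well is narrower than a 6–12 well of the same depth, which is the shorter-range behavior referred to above.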

A short description of both proteins is necessary for later results. CI2 has one α helix packed against three β strands (from β1 to β3), and a 3₁₀ helix (3₁₀) as shown in Fig. 1(e). The two key cores are the hydrophobic core and the minicore [the key residues are shown in Fig. 1(a)] [35,36]. SH3 has five β strands (from β1 to β5) and a 3₁₀ helix (3₁₀) as shown in Fig. 1(f). The β strands are arranged antiparallel with respect to each other. The diverging turn and the distal loop are key to the formation of the transition state configurations [see Fig. 1(b)] [37,38].

B. Brownian dynamics with or without HI

Our protein folding simulations utilized a Brownian dynamics with HI (BDHI) method developed by Ermak and McCammon [28]. We used the software HIBD developed by the Skolnick group [39], using only the far-field hydrodynamic interactions without a periodic boundary condition. The equation of motion is given by

$$\mathbf{x}_i(t+dt)=\mathbf{x}_i(t)+\sum_j\frac{\mathbf{D}_{ij}\mathbf{F}_j}{k_BT}\,dt+\mathbf{G}_i(dt), \tag{2}$$

where xi(t + dt) is the position vector of the ith Cα bead at time t + dt. Fj is the total force acting on the jth Cα bead. The diffusion tensor D is a supermatrix of 3N × 3N, where N is the number of beads. Dij is the 3 × 3 submatrix in the ith row and jth column of the diffusion tensor. The hydrodynamic coupling for all possible pairs of beads was computed for the diffusion tensor D. In the absence of HI, the off-diagonal submatrices are zero. The diagonal terms were calculated from the Stokes-Einstein relation shown in Eq. (3), where η is the viscosity of the aqueous solvent at temperature T [39], kB is the Boltzmann constant, a represents the hydrodynamic radius of the beads, and I3 is a 3 × 3 identity matrix. In the presence of HI, the elements of submatrices Dij were obtained from the equations developed by Rotne and Prager [40] from the solutions of the Navier-Stokes equation at low Reynolds number; Yamakawa [41] extended the expression to a pair whose separation is less than the size of a bead. The complete set of formulas to compute the Dij terms is shown below in Eqs. (3)–(5), where the hydrodynamic radius a is 5.3 Å for each bead (obtained from [13]). For an equation of motion with HI, then,

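$$\mathbf{D}_{ii}=\frac{k_BT}{6\pi\eta a}\,\mathbf{I}_3, \tag{3}$$
$$\mathbf{D}_{ij}=\frac{k_BT}{8\pi\eta r_{ij}}\left[\left(1+\frac{2}{3}\frac{a^2}{r_{ij}^2}\right)\mathbf{I}_3+\left(1-\frac{2a^2}{r_{ij}^2}\right)\frac{\mathbf{r}_{ij}\otimes\mathbf{r}_{ij}}{r_{ij}^2}\right],\quad r_{ij}\ge 2a, \tag{4}$$
$$\mathbf{D}_{ij}=\frac{k_BT}{6\pi\eta a}\left[\left(1-\frac{9}{32}\frac{r_{ij}}{a}\right)\mathbf{I}_3+\frac{3}{32}\frac{\mathbf{r}_{ij}\otimes\mathbf{r}_{ij}}{a\,r_{ij}}\right],\quad r_{ij}<2a, \tag{5}$$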

where ⊗ represents a tensor product between two vectors. For the simulations using Brownian dynamics in the absence of HI (BD), the diffusion matrix reduces to Eqs. (6) and (7):

$$\mathbf{D}_{ii}=\frac{k_BT}{6\pi\eta a}\,\mathbf{I}_3, \tag{6}$$
$$\mathbf{D}_{ij}=\mathbf{0}_3,\quad i\neq j. \tag{7}$$
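The construction of the diffusion tensor with and without HI can be sketched as follows. This is an illustrative, standard-library implementation and not the HIBD code; the function name `diffusion_tensor` and the choice of expressing all entries in units of the Stokes-Einstein value kBT/(6πηa) are assumptions. The far-field branch uses the standard Rotne-Prager prefactor kBT/(8πη rij), which equals 3a/(4rij) in these units and is continuous with the overlap expression at rij = 2a.

```python
import math


def diffusion_tensor(coords, a=5.3, d0=1.0, with_hi=True):
    """Build the 3N x 3N diffusion tensor as nested lists.

    coords  : list of (x, y, z) bead positions in Angstrom (distinct beads)
    a       : hydrodynamic radius in Angstrom (5.3 in the text)
    d0      : Stokes-Einstein value kB*T/(6*pi*eta*a); 1.0 means "reduced units"
    with_hi : False gives the uncoupled BD tensor (zero off-diagonal blocks)
    """
    n = len(coords)
    D = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for k in range(3):
            D[3 * i + k][3 * i + k] = d0  # diagonal blocks
        if not with_hi:
            continue
        for j in range(i + 1, n):
            rij = [coords[i][k] - coords[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in rij))
            if r >= 2.0 * a:
                # Rotne-Prager far field: kB*T/(8*pi*eta*r) = d0 * 3a/(4r)
                pre = d0 * 3.0 * a / (4.0 * r)
                c_iso = 1.0 + 2.0 * a * a / (3.0 * r * r)
                c_dyad = (1.0 - 2.0 * a * a / (r * r)) / (r * r)
            else:
                # Overlapping beads (Yamakawa form)
                pre = d0
                c_iso = 1.0 - 9.0 * r / (32.0 * a)
                c_dyad = 3.0 / (32.0 * a * r)
            for k in range(3):
                for l in range(3):
                    val = pre * (c_iso * (k == l) + c_dyad * rij[k] * rij[l])
                    D[3 * i + k][3 * j + l] = val
                    D[3 * j + l][3 * i + k] = val
    return D


# Two beads exactly 2a apart along x: both branches give the same coupling.
two_beads = [(0.0, 0.0, 0.0), (10.6, 0.0, 0.0)]
D_hi = diffusion_tensor(two_beads)
D_bd = diffusion_tensor(two_beads, with_hi=False)
print(D_hi[0][3], D_hi[1][4], D_bd[0][3])
```

The pairwise loop makes the cost of building D scale as N², which is manageable for chains of 57 or 64 beads.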

Gi(dt) in Eq. (2) is the random displacement that mimics the stochastic behavior on a Cα bead from the implicit solvent. The relation between the random displacement and the diffusion tensor is linked by Eq. (8), which ensures that the fluctuation-dissipation theorem is satisfied. 〈 〉 represents the ensemble average:

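$$\langle\mathbf{G}_i(dt)\,\mathbf{G}_j(dt)\rangle=6\,\mathbf{D}_{ij}\,dt\quad\text{and}\quad\langle\mathbf{G}_i(dt)\rangle=0. \tag{8}$$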
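One common way to draw random displacements with the required correlations is to factor the diffusion tensor with a Cholesky decomposition. The sketch below shows a single update of Eq. (2) under that assumption; it is not taken from HIBD, and the names `cholesky` and `bd_step` are hypothetical. It follows the standard Ermak-McCammon convention in which each Cartesian component has covariance 2 D dt, so that for an uncoupled bead the squared displacement summed over its three components averages 6 D dt.

```python
import math
import random


def cholesky(A):
    """Lower-triangular L with L * L^T = A for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def bd_step(x, force, D, dt, kT=1.0, rng=random):
    """One Brownian dynamics update for flattened coordinate and force vectors.

    x, force : lists of length 3N
    D        : 3N x 3N diffusion tensor (nested lists)
    rng      : object with a gauss(mu, sigma) method (assumption: random module)
    """
    n = len(x)
    L = cholesky(D)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(2.0 * dt)
    new_x = []
    for i in range(n):
        drift = sum(D[i][j] * force[j] for j in range(n)) * dt / kT
        noise = scale * sum(L[i][j] * z[j] for j in range(i + 1))
        new_x.append(x[i] + drift + noise)
    return new_x


# Toy 2 x 2 coupled example (illustrative numbers, reduced units).
D_toy = [[1.0, 0.5], [0.5, 1.0]]
print(bd_step([0.0, 0.0], [1.0, 0.0], D_toy, dt=0.1))
```

The off-diagonal entry of the toy tensor moves the second coordinate even though no force acts on it directly, which is the essence of hydrodynamic coupling.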

C. Equilibrium thermodynamic simulations

To evaluate the thermodynamic properties of CI2 and SH3, we utilized coarse-grained molecular simulations with BD and BDHI. The initial structures were chosen from an ensemble of unfolded structures that were annealed progressively until they reached the target temperature. The integration time step dt is 10⁻³ τ, where τ = (mσα²/ε)^(1/2), σα is the average distance between two consecutive Cα beads (3.8 Å), and m is 100 u, representing the mass of a bead. The interval between samples was longer than the correlation time. We used the replica exchange method (REM) [42] in aid of coarse-grained molecular simulations to enhance the sampling efficiency of the conformational space of a protein. For each protein, 20 temperatures were chosen for a set of REM simulations. The acceptance or rejection of each exchange between replicas follows the Metropolis criterion [43], min(1, exp{[βi − βj][ℋ(Γi) − ℋ(Γj)]}), where i and j are two consecutive replicas, β = 1/kBT, kB is the Boltzmann constant, T is the temperature, and ℋ is the potential energy of the system. The number of samples was determined according to the convergence of the potential energy for all temperatures. Ensembles of 2.4 × 10⁵ and 1.6 × 10⁵ statistically significant conformations were obtained for each replica of CI2 and SH3, respectively. The free energy profiles were obtained using the weighted histogram analysis method (WHAM) [44].
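The Metropolis criterion for exchanging two neighboring replicas can be sketched in a few lines. The function names and the numerical values in the example are illustrative assumptions; energies and inverse temperatures are taken to be in consistent reduced units.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """min(1, exp{(beta_i - beta_j) * (E_i - E_j)}) for replicas i and j."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Return True if the exchange is accepted (rng needs a random() method)."""
    return rng.random() < swap_probability(beta_i, beta_j, energy_i, energy_j)


# Cold replica (beta = 1.0) holds the higher energy: the swap is always accepted.
print(swap_probability(1.0, 0.5, -10.0, -20.0))
# Cold replica already holds the lower energy: accepted with probability exp(-5).
print(swap_probability(1.0, 0.5, -20.0, -10.0))
```

Branching on the sign of the exponent before calling `math.exp` avoids overflow for very favorable exchanges.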

D. Nonequilibrium kinetic simulations

1. Generation of the non-Arrhenius plot without HI

We simulated folding kinetics using Brownian dynamics in the absence of HI (BD) for CI2 and SH3 over wide ranges of temperatures. We represent the temperature used for each protein in units of its folding temperature Tf, defined as the temperature at which the free energy (F) of the unfolded state, expressed as a function of the fraction of native contact formation Q, equals that of the folded state; i.e., F(Qu) − F(Qf) = 0, where Qu and Qf are the basins of the unfolded and folded states, respectively. Thus, temperatures are represented in units of Tf^CI2 and Tf^SH3 for CI2 and SH3, respectively. We explored the range of temperatures from 0.1 Tf^CI2 to 1.3 Tf^CI2 for CI2 and from 0.1 Tf^SH3 to 1.25 Tf^SH3 for SH3. Each folding kinetic simulation started from an unfolded configuration, chosen randomly from an ensemble at high temperature, and was run until it reached the folded state for the first time (first passage time). We considered a protein to be folded when the nonbonded potential, the sum of the last two terms of Eq. (1), is less than or equal to 0.9 times the native state nonbonded potential for CI2 and SH3. The number of trajectories depends on the convergence of the mean first passage time (MFPT) at each temperature. The average folding time tfold is the MFPT. The maximum simulation time tmax is 9 × 10⁶ τ and was taken as the folding time for trajectories that did not reach the folded state.
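The folding criterion and the treatment of trajectories that never fold can be written compactly. The sketch below is illustrative: the function names are hypothetical, unfolded trajectories are marked with `None`, and the energies in the example are made-up numbers chosen only to exercise the 0.9 threshold.

```python
def is_folded(e_nonbonded, e_native, fraction=0.9):
    """Folded when the nonbonded energy is <= fraction * native nonbonded energy.

    Both energies are negative for a compact state, so a more negative
    e_nonbonded means more native contacts are formed.
    """
    return e_nonbonded <= fraction * e_native


def mean_first_passage_time(first_passage_times, t_max=9.0e6):
    """Average folding time in units of tau.

    first_passage_times : list of floats, with None for trajectories that did
                          not reach the folded state; these are counted as t_max.
    """
    times = [t_max if t is None else min(t, t_max) for t in first_passage_times]
    return sum(times) / len(times)


print(is_folded(-95.0, -100.0), is_folded(-80.0, -100.0))
print(mean_first_passage_time([1.0e6, 2.0e6, None]))
```

Assigning tmax to unfinished trajectories makes the reported MFPT a lower bound on the true value whenever some trajectories are censored, which matters mostly at the extremes of the temperature range.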

2. The impact of HI on protein folding kinetics

We selected another range of temperatures for simulations with HI for each protein, from 0.95 Tf^CI2 to 1.06 Tf^CI2 for CI2 and from 0.91 Tf^SH3 to 1.03 Tf^SH3 for SH3. The number of trajectories from the unfolded to the folded state depends on the convergence of tfold at each condition. The maximum simulation time tmax is 9 × 10⁶ τ and was taken as the folding time for trajectories that did not reach the folded state. For the analysis where the kinetic trajectories were projected on a two-dimensional energy landscape (see Sec. III E), the number of trajectories was increased to 1500 and 2500 for CI2 and SH3, respectively, to reduce statistical error.

E. Effective diffusion coefficient of a reaction coordinate

We expect changes in the diffusion coefficient of a reaction coordinate to reflect the changes in the predictions of folding rates from the energy landscape theory (ELT) because HI is a kinetic effect that will not alter the overall free energy profiles [i.e., the Hamiltonian is the same with or without HI; see Eq. (1)]. Thus, instead of comparing the analytically predicted folding rates, we computed an effective diffusion coefficient along a reaction coordinate, the fraction of the native contact formation Q. We used the following expression to fit Deff from mean-squared displacement (MSD) of Q over a lag time t′ [45]:

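$$D_{\mathrm{eff}}=\frac{1}{2d}\lim_{t'\to t_D}\frac{\left\langle\left[Q(t_0+t')-Q(t_0)\right]^2\right\rangle_{t_0,\Omega}}{t'}=\frac{1}{2d}\lim_{t'\to t_D}\frac{\left\langle\Delta Q^2\right\rangle_{t_0,\Omega}}{t'}, \tag{9}$$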

where 〈 〉t0,Ω represents the average over all simulated times t0 separated by a lag time t′ and over all trajectories Ω, and d = 1 is the dimensionality of the reaction coordinate. tD is the time scale where the MSD shows a linear behavior with time. We followed Whitford’s method to compute Deff [45] by fitting the diffusion coefficient in the regime where the MSD varies linearly with lag time.

The ensemble average of the MSD is performed over a selected number of kinetic trajectories. The MSD is measured from the unfolded basin (Qu = 0.2 and Qu = 0.1 for CI2 and SH3, respectively) to slightly above the top of the barrier (Q = 0.6 and Q = 0.5, for CI2 and SH3, respectively).
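A minimal sketch of the MSD-based estimate is given below, assuming each trajectory is stored as a list of Q values sampled at a fixed interval `dt`. The function names are hypothetical, and the estimate uses the least-squares slope of the MSD over the supplied lag range divided by 2d, which is one simple way to implement a fit in the linear regime; selecting that regime is left to the caller.

```python
def msd_of_q(trajectories, lag):
    """Mean-squared displacement of Q at a lag (in frames), averaged over all
    time origins t0 and all trajectories."""
    total, count = 0.0, 0
    for q in trajectories:
        for t0 in range(len(q) - lag):
            dq = q[t0 + lag] - q[t0]
            total += dq * dq
            count += 1
    return total / count


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


def effective_diffusion(trajectories, lags, dt, d=1):
    """D_eff = slope of MSD versus lag time, divided by 2d."""
    lags = list(lags)
    times = [lag * dt for lag in lags]
    msd = [msd_of_q(trajectories, lag) for lag in lags]
    return linear_slope(times, msd) / (2.0 * d)


# Tiny hand-checkable example: one trajectory with Q = 0, 1, 3 (illustrative).
print(msd_of_q([[0.0, 1.0, 3.0]], 1), effective_diffusion([[0.0, 1.0, 3.0]], [1, 2], dt=1.0))
```

In practice the trajectories would first be restricted to the segment between the unfolded basin and the top of the barrier, as described above.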

F. Data analysis

1. Differences in the probability of secondary structure formation

The probability of secondary structure formation as a function of time P(t) taken over all kinetic trajectories is estimated in the presence or in the absence of HI for both proteins. The differences in the probability of secondary structure formation between BDHI and BD [ΔP(t)] is defined as

$$\Delta P(t)=P_{\mathrm{BDHI}}(t)-P_{\mathrm{BD}}(t). \tag{10}$$

When ΔP(t) is positive, the average probability of secondary structure formation from BDHI is greater than BD, and vice versa when ΔP(t) is negative.

2. Displacement correlation

We calculated the displacement vector of a residue k, which is defined as sk(t) = xk(t) − xk(t − dt), throughout the kinetic simulation. Then, we investigated the displacement correlation Cij(t) by taking the cosine of the angle formed between two displacement unit vectors ŝ(t) of a pair of residues i and j (i ≠ j) at time t and averaging over all trajectories Ω as

$$C_{ij}(t)=\left\langle\hat{\mathbf{s}}_i(t)\cdot\hat{\mathbf{s}}_j(t)\right\rangle_\Omega. \tag{11}$$

The translation and rotation of the center of mass of each configuration was removed before calculating the displacement correlation.
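The displacement correlation for one residue pair can be sketched as below. The data layout is an assumption: `frames_prev` and `frames_now` hold, for each trajectory, the bead coordinates at t − dt and t, already corrected for rigid-body translation and rotation. The function names are hypothetical.

```python
import math


def unit(v):
    """Return v scaled to unit length (v must be nonzero)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def displacement_correlation(frames_prev, frames_now, i, j):
    """Average cosine between the displacement vectors of beads i and j.

    frames_prev, frames_now : one entry per trajectory, each a list of
                              (x, y, z) positions at t - dt and at t.
    """
    total = 0.0
    for prev, now in zip(frames_prev, frames_now):
        s_i = unit([now[i][k] - prev[i][k] for k in range(3)])
        s_j = unit([now[j][k] - prev[j][k] for k in range(3)])
        total += sum(a * b for a, b in zip(s_i, s_j))
    return total / len(frames_now)


# Two toy trajectories: beads move in parallel in the first, antiparallel in the second.
prev = [[(0, 0, 0), (5, 0, 0)], [(0, 0, 0), (5, 0, 0)]]
now = [[(1, 0, 0), (7, 0, 0)], [(1, 0, 0), (4, 0, 0)]]
print(displacement_correlation(prev, now, 0, 1))
```

Using unit vectors means that only the direction of motion enters the correlation, so large and small steps are weighted equally.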

3. Chance of occurrence

The chance of occurrence Π(|i − j|) at time t is defined as the ratio of the number of residue pairs with a sequence separation |i − j| > 0 whose displacement correlation is above a selected threshold to the total number of residue pairs at that sequence separation:

$$\Pi(|i-j|)=\frac{\sum_{i'>j'}\Theta(C_{i'j'}-\mu)\,\delta_{|i'-j'|,|i-j|}}{\sum_{i'>j'}\delta_{|i'-j'|,|i-j|}}, \tag{12}$$

where Θ is the Heaviside step function and δ is the Kronecker delta function. The chosen threshold μ is the average positive displacement correlation from the ensemble at time t:

$$\mu=\frac{\sum_{i>j}C_{ij}\,\Theta(C_{ij})}{\sum_{i>j}\Theta(C_{ij})}. \tag{13}$$

Negative displacement correlation is ignored because the signal is not as strong.
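The threshold and the chance of occurrence can be computed directly from a correlation matrix. In this sketch `C` is a square nested list holding Cij at one time point, only the lower triangle (i > j) is read, and a pair counts as a hit when its correlation is strictly greater than μ; the function names and the 4 × 4 example values are illustrative assumptions.

```python
def positive_mean_threshold(C):
    """Average of the positive entries of the lower triangle of C."""
    vals = [C[i][j] for i in range(len(C)) for j in range(i) if C[i][j] > 0.0]
    return sum(vals) / len(vals)


def chance_of_occurrence(C, separation, mu):
    """Fraction of pairs with i - j == separation whose correlation exceeds mu."""
    n = len(C)
    pairs = [(i, j) for i in range(n) for j in range(i) if i - j == separation]
    hits = sum(1 for i, j in pairs if C[i][j] > mu)
    return hits / len(pairs)


# Toy 4-bead correlation matrix (lower triangle filled, values are made up).
C = [
    [0.0, 0.0, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
    [0.3, -0.1, 0.0, 0.0],
    [-0.5, 0.1, 0.4, 0.0],
]
mu = positive_mean_threshold(C)
print(mu, [chance_of_occurrence(C, s, mu) for s in (1, 2, 3)])
```

Normalizing by the number of pairs at each separation keeps Π comparable between short and long sequence separations, even though far fewer pairs exist at large |i − j|.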

III. RESULTS

A. The impact of BDHI on the folding time depends on temperature

We explored the folding kinetics of CI2 and SH3 by comparing tfold with BD over a broad range of temperatures. Both proteins exhibit non-Arrhenius [46] behavior against temperature as shown in Figs. 2(a) and 2(b). At high temperatures, tfold increases because the thermal fluctuations exceed the stability of the protein, and at low temperatures, tfold increases because the protein is trapped in a local energy minimum [18,46,47]. The temperature that renders the fastest tfold is 0.95 Tf^CI2 and 0.91 Tf^SH3 for CI2 and SH3, respectively. We computed tfold for the proteins with BDHI over a narrow range of temperatures around Tf of the proteins, shown as dashed lines in Figs. 2(c) and 2(d). Our study shows that the impact of HI on tfold is small, within an order of magnitude, but statistically significant. What is most interesting is that HI either increases or decreases the folding time depending on whether the temperature is higher or lower than Tf. This distinctive “crossover” behavior occurs in the proximity of the folding temperature of CI2 (1.03 Tf^CI2) and SH3 (0.98 Tf^SH3). Thus, the impact of HI on protein folding kinetics is temperature dependent. However, the acceleration of folding is more prominent for CI2 than for SH3 at T < Tf. Therefore, HI effects also depend on the topology of a protein. We will further investigate the role of topology in the extent of impact from HI on protein folding in the following subsections at two temperatures for each protein: below Tf (0.95 Tf^CI2 for CI2 and 0.91 Tf^SH3 for SH3) and above Tf (1.06 Tf^CI2 for CI2 and 1.03 Tf^SH3 for SH3).

FIG. 2.

The average folding time tfold in units of reduced time τ with respect to temperature for CI2 and SH3 in the presence or in the absence of hydrodynamic interactions (HI). Panels (a) and (b) show tfold over a broad range of temperatures using the Brownian dynamics without HI (BD) for CI2 and SH3, respectively. The temperature for each protein is expressed in units of their corresponding folding temperature Tf. Note the U-shaped dependence of the folding time (non-Arrhenius behavior). tfold using BD with HI (BDHI) is compared to tfold using only BD in panel (c) for CI2 and (d) for SH3. The crossover occurs when the two curves intersect. Error bars are calculated using the jackknife method.

B. The effective diffusion coefficient of Q partly accounts for the crossover behavior of the folding kinetics

Can we capture this folding behavior using a global order parameter? A theoretical estimation of the folding kinetic rate k (the rate is the inverse of the folding time tfold) depends on the shape of the free energy surface and the effective diffusion coefficient Deff of an order parameter on the free energy surface [46–48] as follows:

$$k=\frac{1}{t_{\mathrm{fold}}}=\left(\frac{\beta}{2\pi}\right)^{1/2}D_{\mathrm{eff}}\,\omega\,\bar{\omega}\,\exp(-\beta\Delta F), \tag{14}$$

where ω and ω̄ are the curvatures of the unfolded state free energy well and barrier, respectively, β is the inverse temperature, and ΔF is the free energy barrier height with respect to the unfolded state free energy. However, since the Hamiltonians for BD and BDHI are identical, rendering the same free energy profiles (see Fig. 3), the change in the folding kinetic rates should be explained by the change in the diffusion of the order parameter. Here, the order parameter is the fraction of native contact formation Q. The mean-squared displacement (MSD) of Q is obtained as a function of time as shown in Fig. 4 to estimate the effective diffusion coefficients. The trajectories longer than 2 × 10⁵ τ at 0.95 Tf^CI2 and 8 × 10⁵ τ at 1.06 Tf^CI2 are considered for CI2, and those longer than 1.6 × 10⁵ τ at 0.91 Tf^SH3 and 4 × 10⁵ τ at 1.03 Tf^SH3 for SH3. The MSD is calculated for a time shorter than the average folding time for each condition. The initial phase is a transitional subdiffusive process characterized by MSD ~ t^α with α < 1. After the memory from the initial state dissipates [49,50], the MSD reaches a normal diffusive regime. Deff of Q is estimated from the linear region of the MSD of Q as a function of lag time as shown in the insets of Fig. 4. A smaller diffusion coefficient at T > Tf than at T < Tf follows simply from a thermal argument.

FIG. 3.

Free energy (F) in units of kBT with respect to fraction of native contact formation Q without or with HI (BD and BDHI, respectively) for (a) CI2 and (b) SH3 at a temperature below the folding temperature (Tf), at Tf, and above Tf. The transition state region is indicated by TS. Error bars are included.

FIG. 4.

The mean square displacement of the fraction of the native contact formation Q, 〈ΔQ²〉t0,Ω, as a function of lag time in units of reduced time τ in the absence or presence of HI [BD (solid black line) and BDHI (dash-dot red line), respectively] at T < Tf [(a) 0.95 Tf^CI2 and (b) 0.91 Tf^SH3 for CI2 and SH3, respectively], and T > Tf [(c) 1.06 Tf^CI2 and (d) 1.03 Tf^SH3 for CI2 and SH3, respectively]. 〈ΔQ²〉t0,Ω was averaged over all initial times t0 and all trajectories Ω. Shaded width of solid lines represents the error. The inset zooms in to the range of time used for the linear fit of data. Open circles (BD) and open squares (BDHI) are placed every 100 data points for T < Tf and 200 data points for T > Tf. The unit for the effective diffusion coefficient is 1/τ since Q is dimensionless.

If Q is a perfectly good reaction coordinate that captures the collective dynamics of a complex system, the folding rates computed directly from the folding kinetic simulations should be the same as the rates predicted by the energy landscape theory. When the simulated rate deviates from the prediction, it implies that the dynamics of a complex system with many degrees of freedom might not be adequately described by a single reaction coordinate. We obtained kBDHI/kBD (or tfold^BD/tfold^BDHI) from the kinetic simulations for CI2 and SH3 in Table I and Table II, respectively. Additionally, because HI only influences the pre-exponential factor but not the barrier height in Eq. (14), the ratio of the predicted rates from the energy landscape theory is equivalent to Deff^BDHI/Deff^BD. To test this, we compare the ratio Deff^BDHI/Deff^BD from the MSD calculation to the ratio tfold^BD/tfold^BDHI from kinetic simulations. We found that the ratio Deff^BDHI/Deff^BD is indeed not equal to the ratio tfold^BD/tfold^BDHI in Table I and Table II, although it shows the right trend of the crossover behavior. For CI2, the ratio of folding rates from BDHI and BD kinetic simulations is kBDHI/kBD = 1.37 at 0.95 Tf^CI2 and kBDHI/kBD = 0.83 at 1.06 Tf^CI2, whereas the ratio kBDHI/kBD from fitting the MSD is 1.14 at 0.95 Tf^CI2 and 0.15 at 1.06 Tf^CI2. A similar trend of crossover behavior is observed for SH3. The ratio kBDHI/kBD from fitting the MSD for either CI2 or SH3 is only slightly above 1 at T < Tf (1.14 and 1.09, respectively), while the kBDHI/kBD computed from the folding simulations is 1.19 or higher. The analysis of the diffusivity shows retarded dynamics due to HI at T > Tf. The results from the kinetic simulations also show retarded dynamics, but less so. We speculate that a mean-field description of overall folding with the collective order parameter Q along an energy landscape may not fully capture the kinetic effect of HI on folding. Thus, we will investigate the influence of HI on folding through the formation of local contacts or secondary structures in the next subsection.

TABLE I.

Folding time from kinetic simulations (tfold) and the effective diffusion coefficient (Deff) of Q for CI2 using BD or BDHI.

Columns 2 to 4 are from kinetic simulations; columns 5 to 7 are from the MSD analysis.

| T (in units of Tf^CI2) | tfold^BD (10⁶ τ) | tfold^BDHI (10⁶ τ) | kBDHI/kBD = tfold^BD/tfold^BDHI | Deff^BD (10⁻⁹ 1/τ) | Deff^BDHI (10⁻⁹ 1/τ) | kBDHI/kBD = Deff^BDHI/Deff^BD |
| --- | --- | --- | --- | --- | --- | --- |
| 0.95 | 0.52 ± 0.02 | 0.38 ± 0.01 | 1.37 ± 0.06 | 89.16 ± 0.08 | 101.22 ± 0.09 | 1.14 ± 0.00 |
| 1.06 | 3.64 ± 0.13 | 4.39 ± 0.14 | 0.83 ± 0.04 | 2.29 ± 0.01 | 0.35 ± 0.00 | 0.15 ± 0.00 |

TABLE II.

Folding time from kinetic simulations (tfold) and the effective diffusion coefficient (Deff) of Q for SH3 using BD or BDHI.

Columns 2 to 4 are from kinetic simulations; columns 5 to 7 are from the MSD analysis.

| T (in units of Tf^SH3) | tfold^BD (10⁶ τ) | tfold^BDHI (10⁶ τ) | kBDHI/kBD = tfold^BD/tfold^BDHI | Deff^BD (10⁻⁹ 1/τ) | Deff^BDHI (10⁻⁹ 1/τ) | kBDHI/kBD = Deff^BDHI/Deff^BD |
| --- | --- | --- | --- | --- | --- | --- |
| 0.91 | 0.19 ± 0.01 | 0.16 ± 0.01 | 1.19 ± 0.10 | 217.72 ± 0.30 | 236.71 ± 0.24 | 1.09 ± 0.00 |
| 1.03 | 1.08 ± 0.05 | 1.65 ± 0.07 | 0.65 ± 0.04 | 6.39 ± 0.01 | 2.56 ± 0.01 | 0.40 ± 0.00 |
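As a quick arithmetic check, the two kinds of rate ratio can be recomputed from the tabulated central values of Tables I and II. The dictionary layout and variable names below are illustrative; the numbers are copied from the tables, and small differences from the tabulated ratios are expected because the tables were presumably computed from unrounded data.

```python
# (t_fold_BD, t_fold_BDHI) in 1e6 tau and (D_eff_BD, D_eff_BDHI) in 1e-9 / tau
tables = {
    ("CI2", 0.95): {"t_fold": (0.52, 0.38), "d_eff": (89.16, 101.22)},
    ("CI2", 1.06): {"t_fold": (3.64, 4.39), "d_eff": (2.29, 0.35)},
    ("SH3", 0.91): {"t_fold": (0.19, 0.16), "d_eff": (217.72, 236.71)},
    ("SH3", 1.03): {"t_fold": (1.08, 1.65), "d_eff": (6.39, 2.56)},
}

ratios = {}
for key, row in tables.items():
    t_bd, t_bdhi = row["t_fold"]
    d_bd, d_bdhi = row["d_eff"]
    # k_BDHI / k_BD from kinetics is the inverse ratio of folding times;
    # from the MSD analysis it is the ratio of effective diffusion coefficients.
    ratios[key] = (t_bd / t_bdhi, d_bdhi / d_bd)

for (protein, temperature), (from_kinetics, from_msd) in ratios.items():
    print(protein, temperature, from_kinetics, from_msd)
```

Both estimates fall on the same side of 1 at every temperature, which is the crossover trend, while their magnitudes differ most strongly above Tf.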

C. HI facilitates the ordering of key structural regions at T < Tf

In the previous subsection, we have shown how HI impacts folding globally to explain the crossover behavior of the folding rates; however, HI also impacts local secondary structure formation. We investigated the temperature dependence of HI and the crossover behavior by analyzing the ordering of secondary structures of the proteins along a time axis normalized by the maximum time tmax. For the selected temperatures, we calculated the differences in the probability of secondary structure formation ΔP(t) between BDHI and BD as a function of normalized time (Fig. 5). For each protein, one temperature is slightly below Tf [Figs. 5(a) and 5(b) for CI2 and SH3, respectively] and the other is slightly above Tf [Figs. 5(c) and 5(d) for CI2 and SH3, respectively]. At T < Tf the folding time with BDHI decreases with respect to BD, and vice versa at T > Tf. We are interested in analyzing structure formation at times before the proteins reach the transition state at the top of the folding barrier. The transition state region is indicated by TS for each protein in Fig. 3, and it is a key part of the folding process. This stage of folding occurs before the dashed, vertical lines in each panel of Fig. 5 (see Fig. 6 for a complete temporal evolution of 〈Q〉Ω along normalized time, where 〈 〉Ω represents the average over all kinetic trajectories).

FIG. 5.

Impact of HI on key pairs of secondary structure interactions ΔP(t) at T < Tf [(a) 0.95 Tf^CI2 and (b) 0.91 Tf^SH3 for CI2 and SH3, respectively], and T > Tf [(c) 1.06 Tf^CI2 and (d) 1.03 Tf^SH3 for CI2 and SH3, respectively]. Time is normalized with respect to maximum simulation time (9 × 10⁶ τ) and shown in log scale. The dashed, vertical lines correspond to the time at which 〈Q〉Ω = 0.4 for CI2 and SH3. 〈Q〉Ω = 0.4 is the onset of the transition state. The curves were smoothed by a running average over every 500 data points. Legends on curves are displayed for visual guidance. Errors are included but too small to be visible. At the top of the panels, the location of secondary structure elements and unstructured regions along the sequence of CI2 (left) and SH3 (right) is indicated as visual guidance.

FIG. 6.

Temporal evolution of the fraction of native contact formation Q averaged over all trajectories Ω as a function of normalized time t/tmax at T < Tf and T > Tf for (a) CI2 and (b) SH3. Time is normalized with respect to maximum simulation time (9 × 10⁶ τ). Shaded width of lines represents the error, which is calculated using a jackknife method.

The two temperatures used to investigate CI2’s crossover behavior are 0.95 Tf^CI2 and 1.06 Tf^CI2. We grouped the native contact pairs into secondary structure segments to get a structural view of the impact of HI. Figure 5(a) shows that at T < Tf, the most positive ΔP(t) is observed for β1-β2 and β1-seg4, which form the minicore, and for seg3-β2, seg3-seg4, and seg4-β2, which are in the neighborhood of the minicore. It shows a modest positive ΔP(t) for 3₁₀-α (native pairs close to the N terminus) and β2-β3 (native pairs close to the C terminus). This suggests that BDHI enhances the formation of the secondary structures within the minicore and has less impact at the termini. In Fig. 5(c) at T > Tf, the native contacts most negatively affected by BDHI relative to BD [negative ΔP(t)] are those involving seg3-seg4, seg4-β2, the long-range contacts 3₁₀-seg5, and the C-terminal contacts β2-β3. This implies that the impact of HI has a longer range at T > Tf than at T < Tf.

Turning to SH3, the two temperatures used to investigate its crossover behavior are 0.91 Tf^SH3 and 1.03 Tf^SH3. At T < Tf [Fig. 5(b)], the most positive ΔP(t) is observed in the region around the RT loop (RT), the diverging turn (DT), and the distal loop (DL). The secondary structure pairs involved are DT-β4, DT-β5, DT-3₁₀, RT-β3, RT-β4, and N-Src-β5. BDHI promotes specific contacts in SH3 that are known for their obligatory role in the formation of the transition state. In addition, the contacts in the neighborhood of the distal loop and N-Src loop are also mildly enhanced with BDHI (β2-β3, β2-β4, β3-β5, and DT-β3). On the other hand, at T > Tf [Fig. 5(d)] the formation of the pairs mentioned above is negatively affected by BDHI, as indicated by a negative ΔP(t). However, the impact of HI on the folding time of SH3 is less than that of CI2. We will discuss the difference in the impact from the viewpoint of protein topology in the following subsections.

D. Hydrodynamic coupling of midrange and long-range contacts and their opposing impact on the general ordering of contact formation at T < Tf and T > Tf

Armed with the knowledge of the kinetics of specific structure formation from the previous subsection, we hypothesize that HI influences the self-organization of a protein configuration during the course of folding from an unfolded state to the formation of a transition state. Consequently, HI creates opposing effects on the folding time on either side of Tf, which is the crossover behavior in the folding kinetics. The addition of HI to the equation of motion introduces many-body coupling between all beads, whose motions are correlated inversely with their spatial separation r. To test this hypothesis, we compare the configurations in terms of the probability of contact formation of each native pair Qij(t) to the displacement correlation between a pair of contacts Cij(t) at a particular moment in the folding kinetics. We chose this moment to be the time at which the average probability of contact formation 〈Q〉Ω is 0.4 (see Fig. 6 for the temporal evolution of 〈Q〉Ω), as shown in Figs. 7 and 8 for CI2 and SH3, respectively.
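To make the comparison concrete, a displacement correlation between two beads can be sketched as follows. The precise definition of Cij(t) belongs to the methods section and is not reproduced in this part of the text, so the normalized dot-product form below, the function name, and the sample vectors are all assumptions made for illustration.

```python
import math

def displacement_correlation(disp_i, disp_j):
    """Normalized correlation between the displacements of beads i and j.

    disp_i, disp_j: lists of (dx, dy, dz) displacement vectors, one entry per
    trajectory, taken over the same time window. This normalized form is an
    assumption of this sketch, not the paper's stated definition.
    """
    dot = sum(a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
              for a, b in zip(disp_i, disp_j))
    norm_i = sum(a[0] ** 2 + a[1] ** 2 + a[2] ** 2 for a in disp_i)
    norm_j = sum(b[0] ** 2 + b[1] ** 2 + b[2] ** 2 for b in disp_j)
    return dot / math.sqrt(norm_i * norm_j)

# Hypothetical displacements of bead pairs in three trajectories.
bead_a = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 1.0)]
same_direction = [(2.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
opposite = [(-1.0, 0.0, 0.0), (0.0, -2.0, 0.0), (0.0, 0.0, -1.0)]
orthogonal = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

c_same = displacement_correlation(bead_a, same_direction)
c_opposite = displacement_correlation(bead_a, opposite)
c_orthogonal = displacement_correlation(bead_a, orthogonal)
```

With this convention, beads that move together give a value near +1, beads that move against each other give a value near -1, and uncorrelated motion averages toward zero.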

FIG. 7.

(a) The probability of contact formation of each native pair Qij(t) (upper triangle) and the displacement correlation Cij(t) (lower triangle) at 0.95TfCI2 for CI2 at a normalized time where 〈Q〉Ω = 0.4. (b) The distribution of native pairs from panel (a) with |i − j| ≥ 10. (c) The same presentation as (a) for Qij(t) and Cij(t) at 1.06TfCI2. (d) The distribution of native pairs from panel (c) with |i − j| ≥ 10. The arrows and rectangles along (a) and (c) represent β strands and helices, respectively. The orange and purple dashed boxes highlight the long-range contacts between the N terminus and C terminus, and the contacts in the neighborhood of the minicore, respectively. The black and green dashed boxes highlight the displacement correlation between the N terminus and C terminus, and between the N terminus and the α helix, respectively. (e) The displacement correlations at 0.95TfCI2 and 1.06TfCI2 for all pairs are classified into three sets based on the magnitude of positive correlations. If the magnitude of the correlation is similar at both temperatures, the pair is colored with a green edge on the left structure. If the magnitude of the correlation is greater at 0.95TfCI2 than at 1.06TfCI2, the pair is colored with a blue edge on the right structure. If the magnitude of the correlation is greater at 1.06TfCI2 than at 0.95TfCI2, the pair is colored with a red edge on the right structure. Only the pairs with a sequence separation greater than 8 residues and a magnitude of displacement correlation above the threshold μ = 0.061 are considered for this representation. The key residues for the hydrophobic core (A16, L49, and I57) and the minicore (L32, V38, and F50) are illustrated with green and orange beads, respectively. (f) Π(|i − j|) for all pairs whose magnitude of the displacement correlation is above μ, organized according to the sequence separation |i − j|.

FIG. 8.

(a) The probability of contact formation of each native pair Qij(t) (upper triangle) and the displacement correlation Cij(t) (lower triangle) at 0.91TfSH3 for SH3 at a normalized time where 〈Q〉Ω = 0.4. (b) The distribution of native pairs from panel (a) with |i − j| ≥ 10. (c) The same representation as (a) for Qij(t) and Cij(t) at 1.03TfSH3. (d) The distribution of native pairs from panel (c) with |i − j| ≥ 10. The arrows and rectangles along (a) and (c) represent β strands and helices, respectively. The orange and purple dashed boxes highlight the long-range contacts between the N terminus and C terminus, and the contacts between the diverging turn (DT) and the distal loop (DL), respectively. The black and green dashed boxes highlight the displacement correlation between the N terminus and C terminus, and between the N terminus and both DT and β2, respectively. (e) The displacement correlations at 0.91TfSH3 and 1.03TfSH3 for all pairs are classified into three sets based on the magnitude of positive correlations. The coloring rules are the same as in Fig. 7(e), applied to pairs with a sequence separation greater than 7 residues and a magnitude of displacement correlation above the threshold μ = 0.095. (f) Π(|i − j|) for all pairs whose magnitude of the displacement correlation is above μ, organized according to the sequence separation |i − j|.

The values of the displacement correlation Cij(t) at a particular time are shown in the lower triangles in Figs. 7 and 8 for CI2 and SH3, respectively. The comparison of Cij(t) with Qij(t) (upper triangles in Figs. 7 and 8) establishes a causal relationship between the dispersity of Qij(t) and the spatial pattern of Cij(t). At T < Tf, the probability of contact formation among midrange to long-range pairs (|i − j| ≥ 10) is dispersed [upper triangles in Figs. 7(a) and 8(a)]; the Fano factors (variance over mean) are 0.04 for CI2 [Fig. 7(b)] and 0.06 for SH3 [Fig. 8(b)]. At T > Tf, by contrast, the probability of contact formation for pairs with |i − j| ≥ 10 has a narrower distribution [upper triangles in Figs. 7(c) and 8(c)], as evidenced by lower Fano factors: 0.01 for CI2 [Fig. 7(d)] and 0.02 for SH3 [Fig. 8(d)]. The relatively high Fano factor at T < Tf implies that contact formation is localized to certain pairs when a protein folds from an unfolded state. The narrow dispersion at T > Tf [Figs. 7(d) and 8(d)] shows that contact formation is quite random as the protein folds and unfolds.
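The Fano factor used here (variance over mean) can be sketched for a map of contact probabilities. The dictionary layout, the use of the population variance, and the sample probabilities are assumptions of this illustration; only the definition as variance over mean and the cutoff of 10 residues in sequence separation come from the text.

```python
def fano_factor(contact_prob, min_separation=10):
    """Variance over mean of Q_ij for pairs with |i - j| >= min_separation.

    contact_prob: dict mapping a residue pair (i, j) to its probability of
    contact formation. The population variance is used (an assumption).
    """
    values = [q for (i, j), q in contact_prob.items()
              if abs(i - j) >= min_separation]
    n = len(values)
    mean = sum(values) / n
    variance = sum((q - mean) ** 2 for q in values) / n
    return variance / mean

# Hypothetical maps: a few contacts well formed and the rest not, versus
# every contact formed with the same modest probability.
localized = {(1, 20): 0.9, (2, 30): 0.1, (5, 40): 0.9, (6, 50): 0.1, (3, 5): 0.5}
uniform = {(1, 20): 0.4, (2, 30): 0.4, (5, 40): 0.4, (6, 50): 0.4, (3, 5): 0.9}

fano_localized = fano_factor(localized)
fano_uniform = fano_factor(uniform)
```

The short-range pair (3, 5) is excluded by the separation cutoff in both maps, so the localized map yields a clearly larger Fano factor than the uniform one, which mirrors the contrast drawn between T < Tf and T > Tf.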

1. CI2

For CI2, the localized contacts are around the minicore (highlighted in purple boxes). As shown in the lower triangles of Figs. 7(a) and 7(c) (at T < Tf and T > Tf, respectively), HI alters the pattern of motions for long-range contacts (black boxes) and thus impacts the dynamics of midrange contacts (green boxes). At T > Tf, the paired residues move cooperatively in the same direction, thus adversely affecting the formation of the midrange contacts around the minicore.

Notably, in addition to the native pairs, the surrounding non-native pairs are also correlated, which is not observed in the simulations with BD (Fig. 9). Several research groups have shown the importance of non-native pairs in dictating protein kinetics [51–53]. To better visualize the locations of the midrange to long-range contact pairs, both native and non-native, we projected the pairs with a sequence separation of |i − j| > 8 on the native structure of CI2 in Fig. 7(e) with colored edges. The contact pairs with a similar range of positive correlation at both temperatures (0.95TfCI2 and 1.06TfCI2) are shown with green edges. Most of these are located in the two regions formed by β1, β2, and β3, and by the C terminus with the connecting loop of β2 and β3. The pairs in which the magnitude of the correlation is greater at 0.95TfCI2 than at 1.06TfCI2 are shown with blue edges; they are located mostly between the α helix and the N terminus, and between the connecting loop of β2 and β3 and the C terminus. The pairs in which the magnitude of the correlation is greater at 1.06TfCI2 than at 0.95TfCI2 are shown with red edges. They are present between the N terminus and the C terminus, between the α helix and the following loop, and in the region of the minicore.

FIG. 9.

The Brownian motion of residues without HI (BD) shows small and random displacement correlation for native and non-native pairs of CI2 and SH3. Upper and lower triangles represent the probability of contact formation for each native pair Qij(t) and the displacement correlation Cij(t), respectively. The panels are plotted at T < Tf [(a) 0.95TfCI2 and (b) 0.91TfSH3 for CI2 and SH3, respectively], and T > Tf [(c) 1.06TfCI2 and (d) 1.03TfSH3 for CI2 and SH3, respectively] at a normalized time where 〈Q〉Ω = 0.4. The arrows and rectangles represent β strands and helices, respectively.

As hinted in the previous paragraph, we speculate that the sequence separations between contacts, involving both native and non-native contacts, play a significant role in the crossover behavior in the presence of HI. To extend our analysis and test this speculation, we plotted the chance of occurrence Π(|i − j|) (see definition in Sec. II F) in Fig. 7(f) against the sequence separation |i − j|. There is a stronger signal at midrange contacts (10 < |i − j| < 30) at T < Tf than at T > Tf. Most noticeably, there is a strong signal at |i − j| ≈ 60, which shows that long-range contacts are indeed correlated at T > Tf.
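The kind of bookkeeping behind such a plot can be sketched as follows. The exact definition of Π(|i − j|) is given in the methods section and is not reproduced here, so this sketch makes three labeled assumptions: the threshold μ is taken as the mean of the positive correlations, pairs are kept when the magnitude of their correlation exceeds μ, and Π is normalized by the total number of selected pairs. The function name and data are hypothetical.

```python
from collections import Counter

def chance_of_occurrence(corr, min_separation=8):
    """Histogram of strongly correlated pairs by sequence separation.

    corr: dict mapping a residue pair (i, j) to its displacement correlation.
    Returns (mu, {separation: fraction of selected pairs}). The threshold and
    the normalization are assumptions of this sketch.
    """
    positive = [c for c in corr.values() if c > 0]
    mu = sum(positive) / len(positive)
    selected = [abs(i - j) for (i, j), c in corr.items()
                if abs(i - j) > min_separation and abs(c) > mu]
    counts = Counter(selected)
    total = sum(counts.values())
    if total == 0:
        return mu, {}
    return mu, {sep: n / total for sep, n in sorted(counts.items())}

# Hypothetical correlations for a handful of residue pairs.
corr = {(1, 2): 0.5, (1, 15): 0.3, (2, 16): 0.02,
        (1, 60): 0.25, (5, 19): -0.4, (3, 40): 0.01}
mu, pi = chance_of_occurrence(corr)
```

In this toy input, two selected pairs sit at a separation of 14 and one at 59, so the histogram has a midrange peak and a smaller long-range peak.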

2. SH3

We found that HI affects SH3 (Fig. 8) in a similar way to CI2; however, the effect is not as strong. This is evident from the data collected at the time that corresponds to the transition state (〈Q〉Ω = 0.4). The contact formation is localized at midrange contacts between the diverging turn (DT) and the distal loop (DL) (in purple boxes), which are known experimentally to be critical to the formation of the transition state ensemble [37,38]. As in Fig. 7(e), we projected the pairs with displacement correlations greater than the average positive correlation on the native structure in Fig. 8(e). Pairs with a sequence separation of |i − j| > 7 are shown with colored edges. The green ones correspond to a similar magnitude of pair correlation at temperatures above and below Tf. The pairs in which the magnitude of the correlation is greater at 0.91TfSH3 than at 1.03TfSH3 are shown with blue edges; these are the pairs in the region of the RT and DT loops, the 3₁₀ helix, and β3. Furthermore, the pairs in which the magnitude of the correlation is greater at 1.03TfSH3 than at 0.91TfSH3 are shown with red edges; these are the pairs between β2 and the N terminus (from seg1 to DT), and the long-range pairs between seg1 and seg2. Again, we plotted Π(|i − j|) in Fig. 8(f) against sequence separation for SH3. As for CI2, there is a stronger signal at midrange contacts (10 < |i − j| < 20) at T < Tf than at T > Tf. At |i − j| ≈ 56, there are long-range contacts that are correlated at 1.03TfSH3.

The previous analysis is compared to the same plots without HI (BD) in Fig. 9. Although the contact maps are similar at the corresponding 〈Q〉Ω for both proteins, there is no clear pattern in the displacement correlation map for BD. The displacement correlation randomly fluctuates around zero. The Π(|i − j|) shown in Figs. 7(f) and 8(f) suggest that the crossover behavior in the presence of HI is due to the displacement correlation between the midrange contacts and long-range contacts. The midrange contacts for CI2 are the ones that form the minicore, and for SH3 are the ones between the diverging turn and distal loop. Although we employed a structure-based model where the native pairs are energetically attractive, we identified the importance of the dynamic correlation between residues that form a native pair and their neighboring non-native pairs, particularly between the α helix and the N terminus for CI2, and between both DT and β2 and the N terminus for SH3, for the retardation of the folding time at T > Tf.

E. HI can kinetically alter folding routes from multiple pathways

To further investigate the molecular underpinning of the crossover behavior, which cannot be explained simply by the ratio of the effective diffusion coefficients from Sec. III B, we explored the possible changes in the pathways due to HI by projecting the kinetic trajectories on a two-dimensional free energy landscape. An additional reaction coordinate QT, involving a selected group of midrange contacts from Figs. 7 and 8 (for CI2 and SH3, respectively), is employed to describe the folding process because we suspect the presence of hidden pathways that are not visible through a global parameter Q [31].

1. CI2

For CI2, QT is defined as a set of the native contacts that are located in the neighborhood of the minicore (contacts enclosed in the purple dashed rectangle of Fig. 7). Figure 10 reveals two distinct paths: one involves a high QT (0.8) and the other involves a low QT (0.2) both at about Q ≈ 0.5.
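The two coordinates can be sketched for a single configuration as follows: Q counts formed contacts over the full native set, while QT counts them over the chosen subset only. The contact lists, the tolerance around Q ≈ 0.5, and the dividing value of QT between the two paths are hypothetical choices for illustration, not parameters from the study.

```python
def fraction_formed(formed, reference):
    """Fraction of the contacts in `reference` that appear in `formed`."""
    reference = set(reference)
    return len(reference & set(formed)) / len(reference)

def classify_route(q_total, q_subset, barrier_q=0.5, tol=0.05, split=0.5):
    """Label a snapshot near the barrier top by its subset coordinate.

    Returns "I" for high QT, "II" for low QT, or None when the snapshot is
    not near Q = barrier_q. `tol` and `split` are illustrative assumptions.
    """
    if abs(q_total - barrier_q) > tol:
        return None
    return "I" if q_subset >= split else "II"

# Hypothetical native contact list; the first four pairs stand in for the
# contacts near the minicore that define QT.
native = [(i, i + 5) for i in range(10)]
minicore = native[:4]

snapshot_a = native[:5]    # minicore contacts formed first
snapshot_b = native[4:9]   # other contacts formed first
snapshot_c = native[:2]    # early in folding, far from the barrier top

routes = []
for snap in (snapshot_a, snapshot_b, snapshot_c):
    q = fraction_formed(snap, native)
    qt = fraction_formed(snap, minicore)
    routes.append(classify_route(q, qt))
```

Both snapshot_a and snapshot_b have Q = 0.5, yet they differ completely in QT, which is the sense in which a second coordinate separates paths that a global Q cannot distinguish.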

FIG. 10.

Two representative kinetic pathways are projected on a two-dimensional free energy landscape of the fraction of native contact formation Q and the fraction of native contact formation in the region of the minicore QT for CI2 at 0.95TfCI2. Panel (a) shows a major pathway of folding kinetics (route I), representing a fast route, in the presence of HI. Panel (b) shows a minor pathway (route II). The folding free energy is colored in gray scale in units of kBT. The kinetic trajectories are colored by a normalized time (time divided by their respective first passage times) and projected on the folding free energy. Key conformations were selected for visual guidance. The significant residues for the hydrophobic core (Ala16, Leu49, and Ile57) and minicore (Leu32, Val38, and Phe50) are illustrated with green and orange beads, respectively. The minicore forms before the hydrophobic core in panel (a), whereas the opposite occurs in panel (b). Structures were created with VMD [33].

We projected two representative kinetic trajectories over the two-dimensional free energy surface as a function of Q and QT. A similar method has been used to describe protein flows on a two-dimensional free energy surface [54]. In Fig. 10(a), the kinetic trajectory, named route I, begins from an unstructured chain. As time increases, the α helix and most of the contacts in QT are formed before reaching Q ≈ 0.5, which is the top of the barrier of the one-dimensional free energy profile as a function of Q. This involves the formation of the minicore and contacts in the C terminus before the formation of the hydrophobic core. After crossing the top of the barrier, the hydrophobic core starts to form. Figure 10(b) illustrates another kinetic trajectory, called route II, which starts from another unfolded structure. As time increases, the contacts of QT have not completely formed at Q ≈ 0.5, while the hydrophobic core is formed before the minicore. Then the contacts of the minicore start to form to reach the folded state. Table III shows the number of trajectories that visit routes I and II, and their corresponding average folding times. Route II is slower than route I for both BD and BDHI. BDHI accelerates the folding of both routes, and it reduces the fraction of trajectories that visit route II from 10.53% to 7.27%. BDHI not only reduces effective diffusivity; it also alters the folding route to favor the faster one over the slower one.

TABLE III.

Number of trajectories and their folding time (tfold) from a set of 1500 kinetic simulations that visit route I and II for CI2 at 0.95TfCI2.

Route | BD: number of trajectories | BD: tfoldBD (10⁶ τ) | BDHI: number of trajectories | BDHI: tfoldBDHI (10⁶ τ) | kBDHI/kBD = tfoldBD/tfoldBDHI
I | 1342 (89.47%) | 0.51 ± 0.01 | 1391 (92.73%) | 0.38 ± 0.01 | 1.34 ± 0.04
II | 158 (10.53%) | 0.76 ± 0.05 | 109 (7.27%) | 0.51 ± 0.04 | 1.49 ± 0.15
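The last column of Table III follows from the folding times in the other columns. As a worked check, the sketch below forms the ratio tfoldBD/tfoldBDHI and propagates the quoted uncertainties to first order, treating the two folding times as uncorrelated. Whether the authors used this exact propagation is an assumption; the function name is hypothetical.

```python
import math

def rate_ratio(t_bd, err_bd, t_bdhi, err_bdhi):
    """Ratio k_BDHI/k_BD = t_BD/t_BDHI with first-order error propagation."""
    ratio = t_bd / t_bdhi
    relative_error = math.sqrt((err_bd / t_bd) ** 2 + (err_bdhi / t_bdhi) ** 2)
    return ratio, ratio * relative_error

# Folding times and errors from Table III, in units of 10^6 tau.
table = {"I": (0.51, 0.01, 0.38, 0.01), "II": (0.76, 0.05, 0.51, 0.04)}

results = {route: rate_ratio(*values) for route, values in table.items()}
for route, (ratio, err) in results.items():
    print(f"route {route}: {ratio:.2f} +/- {err:.2f}")
# route I: 1.34 +/- 0.04
# route II: 1.49 +/- 0.15
```

Under these assumptions the propagated values agree with the tabulated ratios for both routes.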

2. SH3

As for SH3, QT is defined as a set of the native contacts that are located in the neighborhood of the diverging turn (DT) and the distal loop (DL) (contacts enclosed in the purple dashed rectangle in Fig. 8). We created a two-dimensional free energy landscape as a function of QT and Q in Fig. 11. There is one dominant folding path. To check whether a rare second path occurs, we raised the number of folding trajectories to 2500.

FIG. 11.

One representative kinetic pathway is projected on a two-dimensional free energy landscape of the fraction of native contact formation Q and the fraction of native contact formation in the region formed by the diverging turn (DT) and the distal loop (DL) QT for SH3 at 0.91TfSH3. It shows a dominant pathway of folding kinetics. The folding free energy is colored in gray scale in units of kBT. The kinetic trajectory is colored by a normalized time (time divided by its first passage time) and projected on the folding free energy. Key conformations were selected for visual guidance. The residues of the diverging turn (M20, K21, K22, G23, and D24) and distal loop (N42 and D43) are illustrated with green and orange beads, respectively. The native contacts between DT and DL are formed before reaching the top of the barrier. Structures were created with VMD [33].

We projected a representative kinetic folding trajectory on the landscape. Figure 11 shows the pathway where the contacts between DL (orange beads) and DT (green beads) are formed before reaching Q ≈ 0.4, which is the top of the barrier of the one-dimensional free energy as a function of Q. Then QT increases along Q and the rest of the protein forms to achieve the folded state. The formation of the contacts of DL and DT characterizes the selectivity of the transition state for SH3. This topological constraint may reduce the effect of HI on the folding kinetic rates at T < Tf.

IV. DISCUSSION AND CONCLUSION

A. Crossover behavior of folding kinetics on the non-Arrhenius curve

It has been shown extensively that protein-folding rates are temperature dependent. Plotted against temperature, folding times render a U-shaped, non-Arrhenius curve in which the rates are low at both low and high temperatures [46], and folding rates are fastest in a narrow range of temperature near Tf. Here, we show that the additional impact of HI is nontrivial: relative to BD without HI, HI accelerates folding at T < Tf, whereas it retards folding at T > Tf. To our knowledge, this crossover behavior of the folding times shown in Figs. 2(c) and 2(d) has never been observed or theoretically predicted. Tanaka previously speculated that HI might accelerate or retard protein folding kinetics depending on the stage of folding [55], without highlighting the extensive role of temperature in folding kinetics. The temperature dependence of the effect of HI on folding and the crossover behavior might explain the mixed results on the influence of HI on protein folding rates from several computational studies in the literature. In these previous studies, it was not clear whether the temperatures used were higher or lower than Tf. Rather, most simulation temperatures were justified by matching the experimentally measured diffusivity of a protein model. We discuss this previous work below.

Several groups used similar coarse-grained molecular simulations with a structure-based model for probing the impact of HI on protein folding. Their results vary. Kikuchi et al. [15] found no clear difference in the folding kinetics with or without HI for the protein CI2 and two secondary structures, an α helix and a β hairpin. Their Tf values were not reported, which makes it difficult to place their results in a temperature regime where HI can accelerate or retard folding dynamics. Frembgen-Kesner and Elcock [13] studied 11 small proteins and also two secondary structures, an α helix and a β hairpin. The folding time decreased with HI for all the proteins studied, but HI had the opposite effect on the secondary structures. It is inferred that they launched simulations at room temperature for all their systems. It may be that the folding temperatures of all the proteins are greater than the room temperature used in the simulations, but this may not be the case for the secondary structures. Another study, by Cieplak and Niewieczerzał [12], reported the folding time of three proteins (1CRN, 1BBA, and 1L2Y) over a range of temperatures. Although the difference in the folding time between their BD model with and without HI decreases at high temperature (above room temperature), there is no indication of a crossover behavior in their study. We speculate that their simulation temperatures are not close to Tf because, for a structure-based model that folds and unfolds in a two-state manner, the free energy barrier of protein folding is typically a few kBT at Tf; the folding time at Tf is exponentially longer than the fastest folding time with a minimal free energy barrier. In addition, there is no clear evidence that the protein models remained thermally unfolded at the maximum temperature studied with HI. Our work shows that justifying the simulation temperature against the folding temperature is a criterion for assessing the impact of HI on protein folding dynamics.

Additionally, Lipska et al. [14] argued that the use of a structure-based model with only favorable attraction between native contacts is the reason why the studies mentioned above [12,13,15] did not observe retarded dynamics under HI in their simulations. By studying the effects of HI on two proteins (1BDD and 1EOL) at distinct temperatures with a coarse-grained molecular simulation using the UNRES force field, they argued that the presence of intermediate states is key to retarded dynamics. Indeed, HI can alter kinetic paths to favor nonproductive intermediates at a temperature lower than the collapse temperature Tθ, as asserted by Tanaka [7]. Lipska et al. did not investigate folding at T > Tf; thus, their conjecture is compatible with our finding that HI can retard the folding dynamics at T > Tf.

B. Underlying kinetic principles of the crossover behavior

The impact of HI on protein folding that gives rise to the crossover behavior is subtle over a wide range of temperatures because HI affects the folding mechanism through three factors that are not necessarily of equal prominence: (1) the dynamics of crossing over an activation barrier, (2) the choice of folding pathways, and (3) the motions between beads in viscous solvents. HI is a kinetic effect that enters through a diffusion tensor in the equation of motion. It does not shape a folding energy landscape, but it governs the ordering of contact pairs across a complex folding energy landscape, particularly when more than one pathway from the unfolded state to the folded state exists. We argue that at T < Tf, the first two factors dominate and HI accelerates folding by creating a backflow. In addition, the extent of acceleration is dictated by a protein’s topology. At T > Tf, the third factor, arising from non-native contacts, comes into play and HI retards folding.

First, the development of an appropriate reaction coordinate that best describes the profile of a free energy barrier often relies on the characteristics of the energy landscape [56]. For a minimally frustrated energy landscape that resembles a funnel [19], it has been shown by the use of structure-based models, like the one we employed, that the fraction of native contact formation Q is a reasonably “good” one-dimensional reaction coordinate [29] for predicting protein folding rates simply from the features of an activation barrier and the shape of the unfolded basin [46]. As a protein model becomes complex, these features become less harmonic and the diffusivity of Q, D(Q), becomes dependent on Q itself. Wang et al. [31] have shown that the prediction of rates improves by including a Q-dependent diffusivity. However, their procedure to compute D(Q) is to constrain selected structures that diffuse with a specific Q under a harmonic potential well. Whitford et al. showed that an effective diffusivity Deff is sufficient for the computation of rates when it is fitted from a reasonably linear region of a mean-squared displacement versus time plot [45] for a specific site-tRNA movement in a ribosome. In our study, this projection was performed onto a single reaction coordinate, the fraction of all native contact formation Q, to obtain the one-dimensional free energy landscape. In a qualitative sense, the diffusion coefficient of Q can describe the ratio of the folding rates between BD and BDHI at T < Tf and T > Tf for CI2 and SH3. However, the ratio of the folding rates in the presence or absence of HI cannot be fully explained by the ratio of effective diffusion coefficients (Tables I and II). When we computed Deff from selected contact pairs that are pertinent to the formation of the transition states (QT), DBDHIeff/DBDeff based on QT (1.21 ± 0.00 for CI2 and 1.10 ± 0.00 for SH3) becomes closer to kBDHI/kBD than the values from Tables I and II at T < Tf. However, at T > Tf, the agreement becomes much worse (0.11 ± 0.00 for CI2 and 0.22 ± 0.00 for SH3). This indicates that there exist at least two competing mechanisms, requiring more than one reaction coordinate to fully describe an energy landscape, as discussed by Yang and Gruebele [57]. Using mutagenesis, they showed that at least two opposing reaction coordinates are required to describe the folding of a small protein over a full temperature range. This supports our speculation that the search for a single perfect order parameter may not fully solve the mystery of the crossover behavior under HI.
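The fit of an effective diffusivity from a linear region of a mean-squared displacement plot can be sketched as follows. The sketch assumes the standard one-dimensional relation MSD(t) ≈ 2 Deff t (plus a constant offset) and an ordinary least-squares slope; the function name and the sample data are hypothetical and are not the values behind Tables I and II.

```python
def effective_diffusivity(times, msd):
    """Least-squares slope of MSD versus time, converted to D = slope / 2.

    Assumes the points passed in already lie in a reasonably linear region
    and that the coordinate is one-dimensional.
    """
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    covariance = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    spread = sum((t - t_mean) ** 2 for t in times)
    return (covariance / spread) / 2.0

# Hypothetical MSD data for a coordinate such as Q, in arbitrary units.
times = [1.0, 2.0, 3.0, 4.0]
msd_bd = [0.12, 0.22, 0.32, 0.42]      # slope 0.10
msd_bdhi = [0.14, 0.26, 0.38, 0.50]    # slope 0.12

d_bd = effective_diffusivity(times, msd_bd)
d_bdhi = effective_diffusivity(times, msd_bdhi)
diffusivity_ratio = d_bdhi / d_bd
```

In a one-dimensional barrier-crossing picture where HI leaves the free energy profile unchanged, this ratio would be the predicted ratio of folding rates, which is the comparison against kBDHI/kBD made above.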

Second, since folding is a complex process (NP-complete [58,59]) moving on a high-dimensional energy surface, we lose information when projecting the landscape onto a single order parameter. We argue that a single reaction coordinate, no matter how optimally it is defined, will still fall short of addressing a folding mechanism that shows competing folding pathways. With the use of QT, the folding energy landscape is extended into another dimension to capture the folding kinetics. For CI2, there are two distinct pathways. One pathway forms the minicore first, while the other pathway forms the hydrophobic core first. At T < Tf, HI shifts the kinetics toward the pathway that forms the minicore first (route I), which is the faster of the two pathways. In the presence of HI, there is a flow in configurational space towards the folded state that increases the number of trajectories through route I. This, in turn, increases the folding rate. HI may guide the folding process and prevent kinetic traps from less populated routes. The presence of a “hidden” pathway for CI2 has also been addressed by Wang et al. as a justification for using kinetically determined variables to calibrate the equations from the energy landscape theory to predict folding rates [31]. The Weeks group used a simple HP model of a protein, consisting of only hydrophobic (H) and hydrophilic (P) spherical monomers, to show that two reaction coordinates with distinct diffusion coefficients are necessary for describing a meaningful collapse mechanism [60]. As for SH3, folding follows a specific nucleation site as an obligatory step. It folds through the formation of contacts among amino acids with high ϕ values (DT and DL) [37,38]. We argue that because SH3 has only one dominant pathway, HI has less of an effect on its folding rate than on that of CI2 at T < Tf. Additionally, several single-molecule pulling experiments have also shown that more than one order parameter is necessary to describe folding [61,62]. Even though Woodside and coworkers have shown that it is possible to use a single reaction coordinate to characterize the entire folding landscape for a specific protein, they are not able to claim that it is possible for all proteins [63]. In general, a multidimensional projection is necessary to capture all the information of the energy landscape.

Lastly, at T > Tf, protein polymers are in a good solvent where the beads favor interaction with the solvent; thus a protein model unfolds. Without a defined folding nucleus, the dispersion of the probability of contact formation among native contacts becomes narrow. In other words, contact formation between long-range pairs [Figs. 7(c) and 8(c), upper triangles] is more probable at T > Tf than at T < Tf. For BDHI, we noticed a strong displacement correlation between non-native pairs in the proximity of native pairs [Figs. 7(c) and 8(c), lower triangles]. Under HI at T > Tf, a protein takes more time to fold, reflecting excess hydrodynamic friction for fluid drainage due to the close proximity between patches of native and neighboring non-native contacts that are separated by a long distance in sequence space [Figs. 7(f) and 8(f)].

C. Possible experimental validation

Can we experimentally determine the effect of HI on protein folding? Is there evidence to support our investigation? We argue that the temperature-dependent mutagenesis experiments on protein folding (ϕ-value experiments) by the Gruebele group may provide experimental validation of our predictions. Experimentalists mutate amino acids on a protein one by one and then measure the change in the folding rates. By comparing this difference against the change in protein stability, experimentalists mapped out the transition state of protein folding for CI2 decades ago [64]. Normally the ϕ value is determined experimentally at room temperature. Each amino acid can have a ϕ value that ranges from 0 to 1. A ϕ value at midrange denotes the importance of that residue in forming transition states. However, Gruebele’s group has measured the ϕ values for a simple λ repressor over a full range of temperatures, which is a notable departure from single-temperature measurements [57]. As expected, they observed a quadratic curve of the logarithm of folding rates with respect to inverse temperature (shown in Fig. 3 of [57]). Most of their mutations were made to amino acids in the hydrophobic core, such that the mutation can impact both folding kinetics and stability. We noticed, however, that a few mutations (e.g., λsQ33Y, λsA37G, and λsA81G) close to the loop or the end of the protein show an interesting behavior: they fold faster than the wild type (WT) at low temperature and become slower than the WT at high temperature. We speculate that this can be a signature of the impact of HI, since these mutations are not in the hydrophobic core of the protein and do not directly affect its thermodynamic stability. To show the full impact of HI on folding, mutations should be performed on the loop regions that affect the kinetics but not necessarily the stability of folding. We predict that the outer loop mutations will show a crossover behavior over a full range of temperatures when compared to the wild-type protein, due to the impact of HI and not because of thermodynamic stability.

D. Concluding Remarks

In summary, we have settled the controversy of the extent of the impact of HI on folding kinetics by comparing the effect of HI over a range of temperatures instead of a single temperature. We found that HI can both accelerate protein folding at a temperature lower than the folding temperature and retard protein folding at a temperature higher than the folding temperature, in comparison with the folding dynamics without HI. Through this result, we have explored three different causal mechanisms: (1) the effective diffusive dynamics of Q over the free energy barrier crossing, (2) a multidimensional landscape that gives rise to a choice of multiple folding pathways, and (3) the kinetic ordering and hydrodynamically correlated motion between beads in viscous solvents. Finally, we have proposed experiments to test our predicted results. Our findings will provide theoretical insight to future protein folding kinetic experiments, and guide simulation design for coarse-grained models.

Acknowledgments

We thank the members from the Center for Theoretical Biological Physics (CTBP) at Rice University and Dr. Greg Morrison for stimulating discussions. M.S.C. thanks Dr. Jeff Skolnick for the software HIBD. We thank Basilio Cieza and Mohammadmehdi Ezzatabadipour for their early participation in this project. We thank the Center for Advanced Computing and Data Science (CACDS) and Research Computing Center (RCC) at the University of Houston for the computational resources. We thank the National Science Foundation for their funding support (MCB: 1412532, ACI: 1531814, PHY: 1427654).

References

  • 1. Levy Y, Onuchic JN. Annu Rev Biophys Biomol Struct. 2006;35:389. doi: 10.1146/annurev.biophys.35.040405.102134.
  • 2. Goldtzvik Y, Zhang Z, Thirumalai D. J Phys Chem B. 2016;120:2071. doi: 10.1021/acs.jpcb.5b11153.
  • 3. Chiricotto M, Melchionna S, Derreumaux P, Sterpone F. J Chem Phys. 2016;145:035102. doi: 10.1063/1.4958323.
  • 4. Mikhailov AS, Kapral R. Proc Natl Acad Sci USA. 2015;112:E3639. doi: 10.1073/pnas.1506825112.
  • 5. Pitard E. Eur Phys J B. 1999;7:665.
  • 6. Kuznetsov YA, Timoshenko EG, Dawson KA. J Chem Phys. 1996;104:3338.
  • 7. Kamata K, Araki T, Tanaka H. Phys Rev Lett. 2009;102:108303. doi: 10.1103/PhysRevLett.102.108303.
  • 8. Chang RW, Yethiraj A. J Chem Phys. 2001;114:7688.
  • 9. Das S, Chakraborty S. J Chem Phys. 2010;133:174904. doi: 10.1063/1.3495479.
  • 10. Kikuchi N, Gent A, Yeomans JM. Eur Phys J E. 2002;9:63. doi: 10.1140/epje/i2002-10056-6.
  • 11. Pham TT, Bajaj M, Prakash JR. Soft Matter. 2008;4:1196. doi: 10.1039/b717350d.
  • 12. Cieplak M, Niewieczerzal S. J Chem Phys. 2009;130:124906. doi: 10.1063/1.3050103.
  • 13. Frembgen-Kesner T, Elcock AH. J Chem Theory Comput. 2009;5:242. doi: 10.1021/ct800499p.
  • 14. Lipska AG, Seidman SR, Sieradzan AK, Gieldon A, Liwo A, Scheraga HA. J Chem Phys. 2016;144:184110. doi: 10.1063/1.4948710.
  • 15. Kikuchi N, Ryder JF, Pooley CM, Yeomans JM. Phys Rev E. 2005;71:061804. doi: 10.1103/PhysRevE.71.061804.
  • 16. Onuchic JN, Nymeyer H, Garcia AE, Chahine J, Socci ND. Adv Protein Chem. 2000;53:87. doi: 10.1016/s0065-3233(00)53003-4.
  • 17. Oliveberg M, Wolynes PG. Q Rev Biophys. 2005;38:245. doi: 10.1017/S0033583506004185.
  • 18. Bryngelson JD, Wolynes PG. Proc Natl Acad Sci USA. 1987;84:7524. doi: 10.1073/pnas.84.21.7524.
  • 19. Leopold PE, Montal M, Onuchic JN. Proc Natl Acad Sci USA. 1992;89:8721. doi: 10.1073/pnas.89.18.8721.
  • 20. Clementi C, Nymeyer H, Onuchic JN. J Mol Biol. 2000;298:937. doi: 10.1006/jmbi.2000.3693.
  • 21. Harpaz Y, Elmasry N, Fersht AR, Henrick K. Proc Natl Acad Sci USA. 1994;91:311. doi: 10.1073/pnas.91.1.311.
  • 22. Musacchio A, Noble M, Pauptit R, Wierenga R, Saraste M. Nature (London) 1992;359:851. doi: 10.1038/359851a0.
  • 23. Dokholyan NV, Li L, Ding F, Shakhnovich EI. Proc Natl Acad Sci USA. 2002;99:8637. doi: 10.1073/pnas.122076099.
  • 24. Hoang T, Cieplak M. J Chem Phys. 2000;113:8319.
  • 25. Weikl TR, Dill KA. J Mol Biol. 2003;332:953. doi: 10.1016/s0022-2836(03)00884-2.
  • 26. Wu L, Li W, Liu F, Zhang J, Wang J, Wang W. J Chem Phys. 2009;131:065105. doi: 10.1063/1.3200952.
  • 27. Camilloni C, Sutto L, Provasi D, Tiana G, Broglia RA. Protein Sci. 2008;17:1424. doi: 10.1110/ps.035105.108.
  • 28. Ermak DL, McCammon JA. J Chem Phys. 1978;69:1352.
  • 29. Nymeyer H, Socci ND, Onuchic JN. Proc Natl Acad Sci USA. 2000;97:634. doi: 10.1073/pnas.97.2.634.
  • 30. Sobolev V, Sorokine A, Prilusky J, Abola EE, Edelman M. Bioinformatics. 1999;15:327. doi: 10.1093/bioinformatics/15.4.327.
  • 31. Xu WX, Lai ZZ, Oliveira RJ, Leite VBP, Wang J. J Phys Chem B. 2012;116:5152. doi: 10.1021/jp212132v.
  • 32. Stivala A, Wybrow M, Wirth A, Whisstock JC, Stuckey PJ. Bioinformatics. 2011;27:3315. doi: 10.1093/bioinformatics/btr575.
  • 33. Humphrey W, Dalke A, Schulten K. J Mol Graphics Modell. 1996;14:33. doi: 10.1016/0263-7855(96)00018-5.
  • 34. Kabsch W, Sander C. Biopolymers. 1983;22:2577. doi: 10.1002/bip.360221211.
  • 35. Itzhaki LS, Otzen DE, Fersht AR. J Mol Biol. 1995;254:260. doi: 10.1006/jmbi.1995.0616.
  • 36. Kazmirski SL, Wong KB, Freund SMV, Tan YJ, Fersht AR, Daggett V. Proc Natl Acad Sci USA. 2001;98:4349. doi: 10.1073/pnas.071054398.
  • 37. Martinez JC, Pisabarro MT, Serrano L. Nat Struct Biol. 1998;5:721. doi: 10.1038/1418.
  • 38. Grantcharova VP, Riddle DS, Santiago JV, Baker D. Nat Struct Biol. 1998;5:714. doi: 10.1038/1412.
  • 39. Ando T, Skolnick J. Proc Natl Acad Sci USA. 2010;107:18457. doi: 10.1073/pnas.1011354107.
  • 40. Rotne J, Prager S. J Chem Phys. 1969;50:4831.
  • 41. Yamakawa H. J Chem Phys. 1970;53:436.
  • 42. Sugita Y, Okamoto Y. Chem Phys Lett. 1999;314:141.
  • 43. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. J Chem Phys. 1953;21:1087.
  • 44. Chodera JD, Swope WC, Pitera JW, Seok C, Dill KA. J Chem Theory Comput. 2007;3:26. doi: 10.1021/ct0502864.
  • 45. Whitford PC, Blanchard SC, Cate JHD, Sanbonmatsu KY. PLoS Comput Biol. 2013;9:e1003003. doi: 10.1371/journal.pcbi.1003003.
  • 46. Socci ND, Onuchic JN, Wolynes PG. J Chem Phys. 1996;104:5860.
  • 47. Bryngelson JD, Wolynes PG. J Phys Chem. 1989;93:6902.
  • 48. Hanggi P, Talkner P, Borkovec M. Rev Mod Phys. 1990;62:251.
  • 49. Krivov SV. PLoS Comput Biol. 2010;6:e1000938. doi: 10.1371/journal.pcbi.1000921.
  • 50. Neusius T, Daidone I, Sokolov IM, Smith JC. Phys Rev Lett. 2008;100:188103. doi: 10.1103/PhysRevLett.100.188103.
  • 51. Li L, Mirny LA, Shakhnovich EI. Nat Struct Biol. 2000;7:336. doi: 10.1038/74111.
  • 52. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE. Science. 2011;334:517. doi: 10.1126/science.1208351.
  • 53. Pande VS. Phys Rev Lett. 2010;105:198101. doi: 10.1103/PhysRevLett.105.198101.
  • 54. Chekmarev SF, Palyanov AY, Karplus M. Phys Rev Lett. 2008;100:018107. doi: 10.1103/PhysRevLett.100.018107.
  • 55. Tanaka H. J Phys: Condens Matter. 2005;17:S2795.
  • 56. Best RB, Hummer G. Proc Natl Acad Sci USA. 2010;107:1088. doi: 10.1073/pnas.0910390107.
  • 57. Yang WY, Gruebele M. Biochemistry. 2004;43:13018. doi: 10.1021/bi049113b.
  • 58. Berger B, Leighton T. J Comput Biol. 1998;5:27. doi: 10.1089/cmb.1998.5.27.
  • 59. Ngo JT, Marks J, Karplus M. In: The Protein Folding Problem and Tertiary Structure Prediction. Merz K, LeGrand SM, editors. Birkhäuser; Boston: 1994. p. 433.
  • 60. Denesyuk NA, Weeks JD. Phys Rev Lett. 2009;102:108101. doi: 10.1103/PhysRevLett.102.108101.
  • 61. Morrison G, Hyeon C, Hinczewski M, Thirumalai D. Phys Rev Lett. 2011;106:138102. doi: 10.1103/PhysRevLett.106.138102.
  • 62. Zhuravlev PI, Hinczewski M, Chakrabarti S, Marqusee S, Thirumalai D. Proc Natl Acad Sci USA. 2016;113:E715. doi: 10.1073/pnas.1515730113.
  • 63. Neupane K, Manuel AP, Woodside MT. Nat Phys. 2016;12:700.
  • 64. Jackson SE, elMasry N, Fersht AR. Biochemistry. 1993;32:11270. doi: 10.1021/bi00093a002.
