Biophysical Journal. 2016 Aug 23;111(4):756–767. doi: 10.1016/j.bpj.2016.07.013

Solvation Thermodynamics of Oligoglycine with Respect to Chain Length and Flexibility

Justin A Drake 1, Robert C Harris 2, B Montgomery Pettitt 1
PMCID: PMC5002085  PMID: 27558719

Abstract

Oligoglycine is a backbone mimic for all proteins and is prevalent in the sequences of intrinsically disordered proteins. We have computed the absolute chemical potential of glycine oligomers at infinite dilution by simulation with the CHARMM36 and Amber ff12SB force fields. We performed a thermodynamic decomposition of the solvation free energy (ΔGsol) of Gly2–5 into enthalpic (ΔHsol) and entropic (ΔSsol) components as well as their van der Waals and electrostatic contributions. Gly2–5 was either constrained to a rigid/extended conformation or allowed to be completely flexible during simulations to assess the effects of flexibility on these thermodynamic quantities. For both rigid and flexible oligoglycine models, the decrease in ΔGsol with chain length is enthalpically driven with only weak entropic compensation. However, the apparent rates of decrease of ΔGsol, ΔHsol, ΔSsol, and their elec and vdw components differ for the rigid and flexible models. Thus, we find solvation entropy does not drive aggregation for this system and may not explain the collapse of long oligoglycines. Additionally, both force fields yield very similar thermodynamic scaling relationships with respect to chain length despite both force fields generating different conformational ensembles of various oligoglycine chains.

Introduction

The protein trinity hypothesis states that proteins may exist in an ordered, collapsed-disordered (molten globule), or extended-disordered (random coil) conformational state, and that these states give rise to particular biological functions (1, 2). Intrinsically disordered proteins (IDPs) or regions (IDRs) within proteins present a diverse ensemble of structures with a range of molecular dimensions. To capture this diversity, Uversky (3) proposed the addition of an intermediate premolten globule thermodynamic state characterized by a compactness and residual structure between those of the collapsed and extended states. An IDP or IDR's occupancy of and interconversion between these states depend, in part, on sequence composition and chain length (2, 4), while other processes (e.g., binding, allostery, posttranslational modifications) shift the equilibrium between these states. Here we are interested in how the innate properties of a short, model IDP may affect the competition between intrapeptide and peptide-solvent interactions and thus result in extended or compact conformational ensembles of relevance to longer IDPs.

Extended IDPs tend to lack hydrophobic side chains and often have a large overall net charge (3, 4, 5, 6, 7). Many IDPs collapse or aggregate despite the absence of hydrophobic side chains and/or the presence of a large, net charge (8, 9, 10, 11). Oligoglycine (i.e., the common protein backbone) is a particularly interesting example, as it is found in many IDRs (12, 13), exhibits IDR qualities, and has been shown to collapse in a length-dependent manner (14, 15, 16, 17). High glycine content has also been associated with compact IDRs (5). Evidence suggests that as chain length increases, favorable protein backbone interactions out-compete backbone-solvent interactions resulting in oligoglycine collapse (15, 16). The fact that the solubility of oligoglycine decreases dramatically with chain length (N = 1–5) further supports this idea (18). However, whether favorable intrapeptide interactions are due to H-bonding, electrostatic (elec), and/or van der Waals (vdw) interactions, is a subject of debate (14, 15, 16, 19). Interestingly, the solvation free energy of fixed conformations of oligoglycine in infinite dilution is negative and continues to decrease with chain length (20, 21)—further suggesting that collapse may not be a consequence of unfavorable solvent interactions, but due to the availability of more potential intrapeptide interactions. Few studies have considered the entropic contributions to the solvation free energy of oligoglycines in particular—or IDPs in general—as a function of chain length. While the hydrophobic effect (i.e., solvation entropic penalty) is known to play a key role in protein folding, its importance to aggregation and collapse of IDPs is less clear.

Using computational free energy methods, we performed a decomposition of the solvation thermodynamics for successively longer oligoglycine polypeptides (Gly2-Gly5). We use multistage free energy perturbation (FEP) to calculate solvation free energy (ΔGsol) and its vdw (ΔGvdw) and elec (ΔGelec) components. We further decomposed ΔGsol, ΔGvdw, and ΔGelec into their entropic and enthalpic components. FEP simulations were performed with oligoglycine models either constrained to a rigid-extended conformation or completely flexible (i.e., without positional constraints) using both the CHARMM36 (C36) (22) and AMBER ff12SB (23) force fields. We consider rigid and flexible oligoglycine models because we have previously shown how ΔGsol depends on flexibility and geometry (20, 24, 25). C36 and ff12SB are used to discriminate and control for force field effects and to investigate how the differences observed in the structural properties of oligoglycine predicted by these two force fields (17) affect the solvation thermodynamics.

We find that ΔGvdw and ΔGelec decrease with chain length but their magnitudes and rates of decrease depend on conformational flexibility. Despite force-field-dependent structural properties of oligoglycine, both force fields yield very similar thermodynamic scaling profiles with respect to chain length. As chain length increases, TΔSsol becomes more unfavorable but is offset by a more favorable ΔHsol for both rigid and flexible oligoglycines. Differences in TΔSsol and ΔHsol between flexible and rigid oligoglycines are pronounced; however, due to their compensatory nature, they yield small differences in ΔGsol. Significant compaction of oligoglycine is predicted for chains containing >10 glycine residues (14, 15, 16). The question remains whether the solvation thermodynamics of short oligoglycines can predict (i.e., by group additivity) the solvation free energy of longer oligoglycine chains that can form complex, compact structures.

Both free energy error estimates and the necessary computational resources increase with chain length and limit this study to relatively short oligoglycines. However, our results help to provide mechanistic insight into the forces that may drive the collapse of longer oligoglycines.

Materials and Methods

FEP simulations were performed for a rigid-extended conformation of Gly2, Gly3, and Gly4 as well as for completely flexible Gly2–5 using NAMD 2.10 (26) with the CHARMM36 (C36) (22) and AMBER ff12SB force fields (23) at 300 and 320 K. Solvation free energy, ΔGsol, was calculated by first scaling the vdw interactions with the solvent followed by elec charging by means of a coupling parameter, λ. Below we describe the setup of the simulations, common simulation parameters, parameters specific to FEP, and the analysis methods.

System

For simulations with C36, oligoglycines were built in a fully extended state using the VMD plugin Molefacture (27) and capped with neutral acetyl (ACE) and N-methylamide (NME) groups. They were similarly built using XLeap in AmberTools (23) for simulations with ff12SB. Systems were solvated with TIP3P water using either VMD’s Solvate plugin or XLeap, ensuring at least a 10 Å padding between oligoglycine and the sides of the simulation box.

General simulation parameters

All simulations were performed at constant temperature (either 300 or 320 K) and pressure (1 atm) using a Langevin thermostat and barostat. NAMD’s default switching function at 1.0 nm was used to smoothly truncate nonbonded interactions at 1.2 nm. Electrostatic forces were calculated using particle mesh Ewald on a grid with a 1.0 Å spacing. The 1–4 scaling was set to 0.8333 for simulations with ff12SB to match AMBER’s nonbonded exclusion convention. The velocity Verlet algorithm with a 2 fs time step was used to integrate the equations of motion. Coordinates and energies were saved every 1 ps.

Free energy perturbation simulations

To set up the FEP simulations, we first performed a steepest descent minimization for each system, allowing only water coordinates to change for simulations with fixed oligoglycine. Configurations for FEP simulations of flexible oligoglycines were taken either from a short equilibration run or from the final configurations from long molecular dynamics (MD) simulations from previous studies in our lab at constant temperature and pressure. These configurations were used to initiate the FEP simulations at 300 K.

The process of solvating each oligoglycine was decomposed into two pathways following a protocol similar to that of Kokubo et al. (24). First, the vdw interactions between oligoglycine and the solvent are gradually turned on via a coupling parameter, λvdw, that takes on values between 0 (i.e., gas phase) and 1 (i.e., vdw interactions fully on). For intermediate λvdw values, the vdw interactions are scaled according to a soft-core potential (28). Then, with λvdw=1, the elec interactions are linearly scaled with λelec such that when λelec=1, oligoglycine is fully solvated. The interactions between atoms in oligoglycine were not scaled. For the vdw pathway, a Δλvdw value of 0.02 (50 windows) was used for rigid/fixed and flexible Gly2, Gly3, and Gly4 simulations, while a spacing of 0.04 was used for flexible Gly5. A Δλelec value of 0.0625 (16 windows) was used for all electrostatic pathways. Independent, concurrent MD simulations at 300 K were performed at each λ for 50 ns for fixed and flexible Gly2–4 and 80 ns for flexible Gly5. The change in free energy was computed from these simulations at each λvdw and λelec and is discussed in the following section. The simulations were then repeated at 320 K, using the final configuration from the simulations at 300 K, to measure the change in solvation entropy (29, 30). One nanosecond at each λ was attributed to equilibration. In total, 1748 simulations were performed for an aggregate simulation time of 92.3 μs. While the overall solvation free energy is a state function, its vdw and elec components, as defined here, are path/protocol-dependent.
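The simulation count and aggregate time quoted above can be checked with a short bookkeeping sketch. The window counts and run lengths are those stated in the protocol; the one assumption is that a Δλvdw spacing of 0.04 corresponds to 25 windows for flexible Gly5. Variable names are illustrative only.

```python
# Bookkeeping sketch for the FEP protocol described above.
# Assumption: a vdw spacing of 0.04 over [0, 1] corresponds to 25 windows.
N_ELEC = 16          # d_lambda_elec = 0.0625
N_FORCE_FIELDS = 2   # C36 and ff12SB
N_TEMPERATURES = 2   # 300 K and 320 K

# (label, number of vdw windows, ns simulated per window)
systems = [
    ("rigid Gly2", 50, 50),
    ("rigid Gly3", 50, 50),
    ("rigid Gly4", 50, 50),
    ("flexible Gly2", 50, 50),
    ("flexible Gly3", 50, 50),
    ("flexible Gly4", 50, 50),
    ("flexible Gly5", 25, 80),
]

# Windows and simulated time for one force field at one temperature.
windows_per_condition = sum(n_vdw + N_ELEC for _, n_vdw, _ in systems)
ns_per_condition = sum((n_vdw + N_ELEC) * ns for _, n_vdw, ns in systems)

total_sims = windows_per_condition * N_FORCE_FIELDS * N_TEMPERATURES
total_us = ns_per_condition * N_FORCE_FIELDS * N_TEMPERATURES / 1000.0

print(total_sims, total_us)
```

Under these assumptions the tally gives 1748 simulations and about 92.3 μs, in agreement with the totals reported.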

Solvation free energy

The vdw and elec contributions to solvation free energy are computed separately according to the exponential form of the FEP formula (31, 32):

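\[ \Delta G_{\mathrm{vdw}} = -\frac{1}{\beta} \sum_{i=1}^{n_{\mathrm{vdw}}} \ln \left\langle \exp\left(-\beta \Delta U_i^{\mathrm{vdw}}\right) \right\rangle_i, \tag{1} \]
\[ \Delta G_{\mathrm{elec}} = -\frac{1}{\beta} \sum_{i=1}^{n_{\mathrm{elec}}} \ln \left\langle \exp\left(-\beta \Delta U_i^{\mathrm{elec}}\right) \right\rangle_i, \tag{2} \]
\[ \Delta G_{\mathrm{sol}} = \Delta G_{\mathrm{vdw}} + \Delta G_{\mathrm{elec}}, \tag{3} \]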

where i indexes a particular $\lambda_i^{\mathrm{vdw}}$ or $\lambda_i^{\mathrm{elec}}$ value between 0 and 1; β is the inverse temperature; and $\Delta U_i = \Delta U(\lambda_{i+1}) - \Delta U(\lambda_i)$ is the difference in either the vdw or elec potential energy between a configuration generated in the $\lambda_i^{\mathrm{vdw}}$ or $\lambda_i^{\mathrm{elec}}$ ensemble and the same configuration evaluated in the $\lambda_{i+1}^{\mathrm{vdw}}$ or $\lambda_{i+1}^{\mathrm{elec}}$ ensemble, respectively. $\langle \cdots \rangle_i$ is an ensemble average and the summation is over either the λvdw or λelec values. Simulations at each λ yield trajectories of $\Delta U_i^{\mathrm{vdw}}$ or $\Delta U_i^{\mathrm{elec}}$ values. Note that the small contributions from pressure and volume are not included in the estimates of ΔG. For a detailed review on multistage FEP calculations, we refer the reader to Kokubo et al. (24) and Pohorille et al. (31). The total solvation free energy, ΔGsol, is the sum of ΔGvdw and ΔGelec.
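As a minimal sketch of the exponential-averaging estimator in Eqs. 1–3, the following assumes that the ΔU samples of each λ window are available as plain Python lists in kcal/mol. The function names and the log-sum-exp shift are illustrative choices, not the analysis code used in this study.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def window_free_energy(delta_u, beta):
    """Return -(1/beta) * ln< exp(-beta * dU) > for one lambda window."""
    # Shift by the minimum so every exponent is <= 0 (avoids overflow).
    shift = min(delta_u)
    mean_exp = sum(math.exp(-beta * (du - shift)) for du in delta_u) / len(delta_u)
    return shift - math.log(mean_exp) / beta

def fep_free_energy(windows, beta):
    """Sum the per-window free energy changes along one pathway."""
    return sum(window_free_energy(w, beta) for w in windows)

beta_300 = 1.0 / (KB * 300.0)
# Hypothetical samples: two windows with constant dU, so dG is just their sum.
dg = fep_free_energy([[1.0, 1.0], [2.0, 2.0, 2.0]], beta_300)
```

When ΔU fluctuates within a window, the exponential average is weighted toward low-energy samples, so the window estimate lies below the arithmetic mean of ΔU.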

Solvation entropy and enthalpy

Two approaches were used to decompose ΔGsol into its enthalpic and entropic components. In the first method, referred to as the finite difference (FD) approach, we further decompose ΔGvdw and ΔGelec into their enthalpic and entropic components. In this approach, the vdw and elec contributions to solvation entropy are given by a central finite difference approximation (29, 30):

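\[ \Delta S_{\mathrm{vdw}}(310\,\mathrm{K}) = -\left.\frac{\partial \Delta G_{\mathrm{vdw}}}{\partial T}\right|_{310\,\mathrm{K}} \approx -\frac{\Delta G_{\mathrm{vdw}}(310\,\mathrm{K} + \Delta T) - \Delta G_{\mathrm{vdw}}(310\,\mathrm{K} - \Delta T)}{2\Delta T}, \tag{4} \]
\[ \Delta S_{\mathrm{elec}}(310\,\mathrm{K}) = -\left.\frac{\partial \Delta G_{\mathrm{elec}}}{\partial T}\right|_{310\,\mathrm{K}} \approx -\frac{\Delta G_{\mathrm{elec}}(310\,\mathrm{K} + \Delta T) - \Delta G_{\mathrm{elec}}(310\,\mathrm{K} - \Delta T)}{2\Delta T}, \tag{5} \]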

where ΔT is 10 K. Similarly, ΔSsol is the sum of ΔSvdw and ΔSelec. To calculate solvation enthalpy and its components, we estimate ΔGsol, ΔGvdw, and ΔGelec at 310 K as the average of those measured at 300 and 320 K, and use the fact that ΔG = ΔH − TΔS.
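The FD decomposition can be sketched in a few lines, assuming free energies in kcal/mol at 300 and 320 K are already in hand. The function name and the input values below are hypothetical and chosen only so the result is easy to verify by hand.

```python
import math

def fd_decompose(g_low, g_high, t_mid=310.0, delta_t=10.0):
    """Central finite-difference entropy and the enthalpy at the midpoint.

    g_low, g_high: free energies at t_mid - delta_t and t_mid + delta_t.
    Returns (dS, dG at t_mid, dH at t_mid).
    """
    ds = -(g_high - g_low) / (2.0 * delta_t)  # dS = -d(dG)/dT
    g_mid = 0.5 * (g_low + g_high)            # dG at 310 K as the average
    dh = g_mid + t_mid * ds                   # dH = dG + T*dS
    return ds, g_mid, dh

# Hypothetical inputs generated from dH = -20 kcal/mol, dS = -0.03 kcal/(mol K).
ds, g_mid, dh = fd_decompose(g_low=-11.0, g_high=-10.4)
```

For free energies that are exactly linear in temperature, as in this synthetic case, the sketch recovers the input entropy and enthalpy; any curvature (a nonzero heat capacity of solvation) would appear as an error in both.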

Alternatively, we directly calculate solvation enthalpy at 300 and 320 K as the difference in the average, total potential energy of the fully solvated state (aqueous), and the gas phase reference state (referred to as the end point energy (EP) approach):

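\[ \Delta H_{\mathrm{sol}} \approx \Delta U_{\mathrm{sol}} = \langle U_{\mathrm{aq}} \rangle - \langle U_{\mathrm{gas}} \rangle, \tag{6} \]
\[ \Delta H_{\mathrm{sol}} \approx \langle U_{\mathrm{total}}(\lambda_{\mathrm{vdw}} = 1, \lambda_{\mathrm{elec}} = 1) \rangle - \langle U_{\mathrm{total}}(\lambda_{\mathrm{vdw}} = 0, \lambda_{\mathrm{elec}} = 0) \rangle. \tag{7} \]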

Here, Utotal is the sum of all bonded and nonbonded peptide-peptide, peptide-solvent, and solvent-solvent potential energy terms. Whereas the solvation free energy depends only on the peptide-solvent potential energy, the solvation enthalpy and entropy include contributions from all bonded and nonbonded interactions in the system (33). ΔSsol at 300 and 320 K is calculated as TΔS = ΔH − ΔG. Note that ΔSsol was not further decomposed into its vdw and elec components as in the FD approach. To directly compare to the FD approach, ΔSsol and ΔHsol were approximated at 310 K as the average of those measured at 300 and 320 K. The change in the heat capacity of solvation is minor and indicates that both the FD approach and the averaging of the thermodynamic quantities to yield estimates at 310 K are reasonable approximations for the temperature range considered.
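A minimal sketch of the EP approach follows, assuming the total potential energies of the fully coupled (aqueous) and fully decoupled (gas-phase reference) end states are available as lists of samples in kcal/mol. The function name and sample values are hypothetical.

```python
import math
from statistics import fmean

def ep_decompose(u_aq, u_gas, dg_sol, temperature):
    """End-point estimate of dH, followed by T*dS = dH - dG.

    u_aq, u_gas: total potential energy samples of the two end states.
    dg_sol: solvation free energy from FEP at the same temperature.
    Returns (dH, T*dS, dS).
    """
    dh = fmean(u_aq) - fmean(u_gas)  # pressure-volume term neglected
    t_ds = dh - dg_sol
    return dh, t_ds, t_ds / temperature

# Hypothetical energies; real trajectories would contain many more samples.
dh, t_ds, ds = ep_decompose([-105.0, -95.0], [-80.0, -80.0], -12.0, 300.0)
```

Because only the two end states enter the enthalpy, this route needs no second set of λ windows at a different temperature, although the end-state averages are differences of large total energies.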

Error analysis

We use two independent methods to estimate errors (i.e., statistical uncertainties) in ΔGvdw and ΔGelec at 300 and 320 K. We briefly describe these methods here; a more detailed description can be found in Appendix A. The first method involves estimating the correlation times from autocorrelation functions (ACFs) of $\exp(-\beta \Delta U_i(t))$ for each of the 1748 λ-simulations. We note that ACFs of $\Delta U_i(t)$ yielded very similar autocorrelation times. Error propagation, corrected for the number of independent observations using the correlation times, is applied to the exponential form of the FEP equation (31) to estimate errors, $\sigma_{\mathrm{ACF}}^{\mathrm{vdw}}$ and $\sigma_{\mathrm{ACF}}^{\mathrm{elec}}$, in ΔGvdw and ΔGelec, respectively. In the second method, we use a blocking approach (34, 35) in which trajectories of $\Delta U_i^{\mathrm{vdw}}$ and $\Delta U_i^{\mathrm{elec}}$ are broken up into blocks that span λ-space. ΔGvdw and ΔGelec are calculated using Eq. 1 and Eq. 2, respectively, for each block and the variance of these block estimates is used to calculate the block standard error (BSE). The BSEs are calculated as a function of block length, and the point at which the BSE stops varying with block length yields the error estimates, $\sigma_{\mathrm{BSE}}^{\mathrm{vdw}}$ and $\sigma_{\mathrm{BSE}}^{\mathrm{elec}}$, of ΔGvdw and ΔGelec, respectively. The errors measured by these two methods are then propagated, separately, to estimate errors in ΔGsol.
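The blocking idea can be illustrated with a simplified sketch for a single time series. In the procedure described above, each block yields its own ΔG estimate via Eqs. 1 and 2; here the block estimate is simply the block mean, which is an assumption made to keep the example short. The function name is hypothetical.

```python
import math
from statistics import fmean, stdev

def block_standard_error(series, block_len):
    """Standard error of the mean estimated from non-overlapping blocks."""
    n_blocks = len(series) // block_len
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    # Trailing samples that do not fill a whole block are discarded.
    means = [
        fmean(series[k * block_len:(k + 1) * block_len])
        for k in range(n_blocks)
    ]
    return stdev(means) / math.sqrt(n_blocks)

# Hypothetical, strongly alternating series: one-sample blocks see the
# alternation as noise, while two-sample blocks average it out exactly.
series = [1.0, 3.0] * 4
bse_1 = block_standard_error(series, 1)
bse_2 = block_standard_error(series, 2)
```

In practice one would scan the block length and take the plateau value, as described in the text.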

For the FD approach, ACF and BSE errors in ΔGsol, ΔGvdw, and ΔGelec are propagated to estimate errors in ΔSsol, ΔSvdw, ΔSelec, ΔHsol, ΔHvdw, and ΔHelec. For the EP approach, autocorrelation times of Uaq(t) and Ugas(t) were estimated from their respective ACFs as detailed in Appendix A. Errors in Uaq and Ugas were corrected for the number of independent observations and then propagated to estimate the error in ΔHsol. Additionally, BSEs were calculated in the manner detailed above but with Uaq or Ugas estimated within each block. The errors in the endpoint energy estimate of ΔHsol calculated via the ACF or BSE approaches were combined with the corresponding errors in ΔGsol to estimate the errors in ΔSsol.
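To illustrate how free energy errors carry over to the FD entropy, the sketch below applies standard first-order propagation to Eqs. 4 and 5. It assumes the errors at the two temperatures are independent, which is an assumption of this example; the propagation actually used is described in Appendix A. The function name and error values are hypothetical.

```python
import math

def fd_entropy_error(sigma_g_low, sigma_g_high, delta_t=10.0):
    """Error in dS = -(G_high - G_low) / (2*dT) for independent errors."""
    return math.sqrt(sigma_g_low**2 + sigma_g_high**2) / (2.0 * delta_t)

# Hypothetical free energy errors of 0.3 and 0.4 kcal/mol at 300 and 320 K.
sigma_s = fd_entropy_error(0.3, 0.4)  # kcal/(mol K)
sigma_t_ds = 310.0 * sigma_s          # error in T*dS at 310 K, kcal/mol
```

With these hypothetical inputs, sub-kcal/mol errors in ΔG become an error of several kcal/mol in TΔS, because the temperature (310 K) is much larger than the 20 K span of the finite difference.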

Results

Toward understanding the solvation thermodynamic mechanisms that dictate, in part, the structural properties of disordered oligoglycine chains, we have decomposed the solvation free energy (ΔGsol) of oligoglycine into its vdw (ΔGvdw) and elec (ΔGelec) components as a function of chain length and force field using stratified FEP. The solvation free energy was further decomposed into its entropic (ΔSsol) and enthalpic (ΔHsol) contributions using two independent approaches (i.e., the FD and EP approaches as discussed in the Materials and Methods). The FD approach also yields the vdw and elec components of ΔSsol and ΔHsol. Because the FD approach provides a more accurate estimate of ΔS at the midpoint (i.e., 310 K) and to compare results from both approaches, ΔSsol and ΔHsol calculated via the EP approach were approximated at 310 K as the average of those measured at 300 and 320 K. We also investigate the effects of conformational flexibility on these various thermodynamic quantities. Below we present the results for each quantity at 310 K, and provide the results at 300 and 320 K in the Supporting Material.

Solvation free energy

Fig. 1 depicts the scaling of ΔGsol, ΔGvdw, and ΔGelec with chain length (N) for fixed, rigid-extended (solid line) and flexible (dashed line) oligoglycines from FEP simulations with C36 and ff12SB at 310 K. The subscripts “fix” and “flex” are used to denote solvation free energies from simulations in which oligoglycine is fixed in an extended, rigid conformation or allowed to freely explore conformation space, respectively. We used two independent approaches to estimate errors. These approaches are detailed in the Materials and Methods and Appendix A. Both methods yield errors ∼1 kcal/mol or less for the ΔG values, suggesting our estimates have converged to sufficiently high precision to infer mechanism. Table S1 in the Supporting Material provides ΔGvdw, ΔGelec, and ΔGsol for each oligoglycine at 300 and 320 K, which were averaged to yield estimates of those at 310 K.

Figure 1.

Solvation free energy of rigid/extended (solid) and flexible (dashed) oligoglycines at 310 K as a function of chain length from simulations with the C36 (left) and ff12SB (right) force fields. The vdw, elec, and overall solvation free energy are shown in order from top to bottom. Errors associated with each quantity can be found in Table S1.

Both force fields yield similar trends in the scaling of ΔGfixsol, ΔGflexsol, and their vdw and elec components with N. For example, ΔGfixsol is negative and continues to decrease linearly as the number of residues increases. The average difference, or slope, in ΔGfixsol between Glyn and Glyn+1 is −5.12 and −5.44 kcal/mol per peptide unit for C36 and ff12SB, respectively. This scaling is consistent with the −5.00 and −5.48 kcal/mol per peptide unit reported by Tomar et al. (36) and Hu et al. (21), respectively, using different force fields and conformations of oligoglycine. Compared to ΔGfixsol, ΔGflexsol decreases at a slower rate with respect to N with an average contribution per peptide of −4.27 kcal/mol for C36 and −3.54 kcal/mol for ff12SB. There appears to be a transition occurring at Gly3 that estimates the scaling of the longer oligoglycines. In terms of the free energy components, both ΔGelec and ΔGvdw are negative for rigid and flexible oligoglycines and also depend on conformational flexibility. ΔGflexvdw is consistently, slightly less favorable than ΔGfixvdw, but their contributions to ΔGsol are small in comparison to their dominant electrostatic counterparts. The differences in ΔGelec between force fields (Fig. S1 in the Supporting Material) are ∼2 kcal/mol or less, with those for ΔGsol at ∼1 kcal/mol at both temperatures.
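The per-peptide-unit contribution quoted above is an average of successive differences. A minimal sketch, using hypothetical free energies rather than the values in Table S1:

```python
import math

def mean_increment(values):
    """Average difference between consecutive chain lengths."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical solvation free energies (kcal/mol) for three chain lengths.
slope = mean_increment([-10.0, -15.5, -20.0])
```

Note that the mean of successive differences depends only on the first and last values, so it matches a least-squares slope only when the data are close to linear in N.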

Solvation entropy

In the FD approach, solvation entropy was calculated from the numerical derivative with respect to temperature using the free energy at 300 and 320 K. Fig. 2 (solid triangles) shows the scaling of ΔSsol and its elec (ΔSelec) and vdw (ΔSvdw) components for rigid and flexible oligoglycine. Data are provided in Table S2. Similar to the trends in solvation free energy, we find ΔSfixsol, ΔSfixvdw, and ΔSfixelec to be reasonably linear with respect to the number of residues with average contributions per peptide being −13.55, −8.68, and −4.90 cal/mol/K, respectively, for C36 and −13.48, −7.85, and −5.58 cal/mol/K for ff12SB. ΔSflexsol, while still unfavorable, is more positive than ΔSfixsol, indicating that conformational flexibility reduces the entropic penalty of solvating these short oligoglycines. The separation between ΔSflexsol and ΔSfixsol is largely due to the differences in the scaling of the vdw component, which contributes more to the overall solvation entropy than the elec component.

Figure 2.

Solvation entropy of rigid/extended (solid) and flexible (dashed) oligoglycine as a function of chain length from simulations with the C36 (left) and ff12SB (right) force fields using the FD (solid triangle) and EP (open triangle) approaches. The vdw, elec, and overall solvation entropy are shown in order from top to bottom. Error estimates are provided in Tables S2 and S3.

In the EP approach, ΔSsol at 300 and 320 K was calculated first by directly estimating ΔHsol as the difference in the average potential energy of the system in the fully solvated state relative to the gas phase at both temperatures, then using ΔGsol = ΔHsol − TΔSsol to solve for ΔSsol (Fig. S2; data provided in Table S3). To compare to the FD approach, ΔSsol at 310 K was approximated as the average of ΔSsol at 300 and 320 K. ΔSfixsol and ΔSflexsol calculated by the EP approach scale in a manner consistent with that observed from the FD approach (Fig. 2, open triangle). Again, both force fields yield a similar scaling of ΔSsol, ΔSelec, and ΔSvdw for rigid and flexible oligoglycine with N (Fig. S3).

Solvation enthalpy

Next, for the FD approach, we calculated solvation enthalpy (ΔHsol) and its components at 310 K using ΔH=ΔG+TΔS. We selected a temperature of 310 K because the finite difference used to approximate ΔS is more precise at the midpoint than at 300 or 320 K. ΔG at 310 K was estimated as the average of ΔG at 300 and 320 K. For the EP approach, ΔHsol was directly calculated at 300 and 320 K, then averaged to give ΔHsol at 310 K. Fig. 3 shows ΔHvdw, ΔHelec, and ΔHsol as a function of chain length, force field, and approach for rigid and flexible oligoglycines. Data are provided in Tables S4 and S5. Both ΔHfixsol and ΔHflexsol are large and negative, but the rate of decrease with respect to length of ΔHflexsol is slightly less. The scaling of ΔHsol with chain length is predominantly determined by ΔHelec. The average rate of decrease of ΔHflexvdw is −1.09 and −1.25 kcal/mol/unit for C36 and ff12SB, respectively, which is less than half of those observed for ΔHfixvdw. Differences between force fields were slightly larger for ΔH than ΔS and ΔG, with differences ranging between −4 and 4 kcal/mol (Fig. S5).

Figure 3.

Solvation enthalpy of rigid/extended (solid) and flexible (dashed) oligoglycine at 310 K as a function of chain length from simulations with the C36 (left) and ff12SB (right) force fields calculated by the FD (solid triangle) and EP (open triangle) approaches. The vdw, elec, and overall solvation enthalpy are shown in order from top to bottom.

Whereas ΔGsol depends only on peptide-solvent interaction energy, ΔHsol and ΔSsol include bonded and nonbonded intrapeptide contributions as well as a solvent reorganization energy (33). To parse out these contributions, we decomposed ΔUsol (i.e., the EP approximation of ΔHsol) into its various peptide-peptide ($\Delta U_{u,u}^{\mathrm{vdw}}$, $\Delta U_{u,u}^{\mathrm{elec}}$), peptide-solvent ($\Delta U_{u,v}^{\mathrm{vdw}}$, $\Delta U_{u,v}^{\mathrm{elec}}$), solvent-solvent ($\Delta U_{v,v}^{\mathrm{vdw}}$, $\Delta U_{v,v}^{\mathrm{elec}}$), and intrapeptide-bonded ($\Delta U_b$) average energies for flexible Gly2 and Gly5. Fig. 4 depicts this energetic breakdown. We find that ΔHsol is favorable and continues to decrease with chain length due primarily to a favorable, decreasing change in the elec peptide-solvent interaction energy. As expected, this is slightly offset by a positive $\Delta U_{v,v}^{\mathrm{vdw}}$ and much more so by $\Delta U_{v,v}^{\mathrm{elec}}$, or a solvent reorganization penalty. For rigid oligoglycines, bonded and nonbonded intrapeptide energies cancel and favorable peptide-peptide interactions yield a favorable, decreasing ΔHsol with chain length. However, for flexible oligoglycines, solvation has the effect of reducing intrapeptide electrostatic energy, resulting in a positive $\Delta U_{u,u}^{\mathrm{elec}}$, and, to a lesser extent, $\Delta U_{u,u}^{\mathrm{vdw}}$, in a length-dependent manner. The bonded, intrapeptide potential energy contributes little to the solvation enthalpy. That the electrostatics dictates the scaling of ΔHsol with N is consistent with what we observe by the FD approach.

Figure 4.

Differences in the average energies of the components contributing to ΔHsol ≈ ΔUsol for flexible Gly2 (left) and Gly5 (right). The subscript u denotes the solute/peptide and v denotes the solvent. ΔUb is the difference between the average bonded intrapeptide energies (e.g., dihedral, bond, angle energies) of oligoglycine in the final, solvated state and the gas phase.

Error analysis

Calculation of entropy differences from free energies requires strict control of errors. Accurately estimating errors in free energy calculations continues to be a challenging problem (31). Here, we used two independent methods to estimate errors, the details of which can be found in the Materials and Methods and Appendix A. Table S1 provides errors in the free energy and its components from the autocorrelation function (ACF) and the BSE approaches. Both approaches demonstrate that the error in ΔGflexsol is largely due to the error in ΔGflexelec whereas the errors in ΔGfixvdw and ΔGfixelec are of similar magnitudes. Errors also increase with chain length. However, the ACF errors are systematically larger than those calculated using the BSE approach. Two reasons may explain, in part, this observation.

First, the procedure used to fit the ACFs to estimate the correlation times appears to be sensitive to weak, long-time correlations (<5% self-similarity and oscillation), which are predominantly present in simulations of flexible oligoglycines in which the electrostatic interactions are turned on. The correlation times across the λelec simulations for flexible oligoglycines ranged from 1 ps to 34 ns and increased with chain length. The BSE approach appears to be less sensitive to these weak, long-time correlations. To test this, we calculated the BSE at each λvdw and λelec at 300 and 320 K for flexible Gly5 and calculated the correlation times using $\mathrm{BSE} = \sqrt{\sigma^2 \tau / N}$, where σ² is the variance, τ is the correlation time, and N is the number of samples. While the profiles of correlation times across λelec were similar for both approaches, the correlation times calculated with the BSE approach were orders of magnitude smaller (data not shown) than those reported by the ACF approach. Others have noted that this BSE approach typically underestimates the true correlation time of dynamic observables (34, 37). In contrast, the vdw correlation times differed, at most, by a factor of 2.
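The conversion from a BSE to a correlation time can be sketched as follows. It assumes the relation BSE² = σ²τ/N, with σ² the variance of the observable and τ expressed in units of the sampling interval; both this form and the function name are assumptions of the example.

```python
import math

def correlation_time_from_bse(bse, variance, n_samples, dt=1.0):
    """Invert BSE**2 = variance * tau / N for tau.

    dt is the sampling interval (e.g., 1 ps), so tau is returned in
    the same time units as dt.
    """
    return dt * n_samples * bse**2 / variance

# Hypothetical numbers: unit variance, 1000 samples saved every 1 ps.
tau_ps = correlation_time_from_bse(bse=0.1, variance=1.0, n_samples=1000)
```

Because τ scales with the square of the BSE, an underestimated plateau in the blocking curve translates into a much larger underestimate of the correlation time.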

Second, we took a conservative approach in estimating the correlation times as four times the decay constant from a best fit of the ACFs to an exponentially decaying function. For these reasons, we interpret the errors from the BSE and ACF approaches as loose lower and upper bounds, respectively, on the true error. The ACF errors, although larger, still indicate that our estimates of ΔGvdw, ΔGelec, and ΔGsol have converged to a precision <1 kcal/mol for rigid and flexible Gly2–4 (Table S1) and slightly larger for flexible Gly5. Due to the propagation of uncertainty and the fact that we are taking a finite derivative in the FD approach, the ACF and BSE errors are larger for ΔSsol, ΔHsol, and their vdw and elec components (Tables S2 and S4). Interestingly, persistent, weak correlations were not present in the ACFs of Uaq(t) and Ugas(t) as compared to the ACFs of exp(−βΔUi(t)). This led to errors (Tables S3 and S5) in ΔHsol and ΔSsol that were systematically smaller than those calculated via the FD approach. The BSEs reported by both approaches are more similar. We also do not observe as strong a length dependence of the errors estimated by the EP approach.

Discussion

Scaling of solvation free energy, enthalpy, and entropy with chain length and the effects of flexibility

Computational free energy methods can be used to measure the thermodynamic properties of disordered polypeptides that are otherwise difficult to access with experiment. Here, we have performed a complete thermodynamic decomposition of the solvation free energy of oligoglycine as a function of chain length (N), force field, and positional constraints. Few studies have considered the scaling of solvation entropy of the protein backbone with the number of residues and the effects of flexibility on this scaling relationship. Accurately estimating the solvation free energy and entropy even for these short oligoglycine models required extensive sampling, particularly for flexible oligoglycines. This study required a total of 1748 simulations for an aggregate simulation time of 92.3 μs.

We find that both force fields yield very similar trends in the scaling of ΔGsol, ΔHsol, and ΔSsol and their vdw and elec components with N (Figs. 1, 2, and 3). Fig. 5 shows ΔGsol, ΔHsol, and TΔSsol for rigid-extended and flexible oligoglycines as a function of chain length and force field at 310 K (Tables S6 and S7) calculated by the FD and EP approaches. ΔGsol is negative and continues to decrease at a rate similar to that observed in previous studies of oligoglycine fixed in various conformations (21, 36, 38). Unlike for successively larger alkanes (39), the scaling of ΔGsol is enthalpically driven and only moderately compensated by unfavorable entropy changes. This is consistent with the transfer free energy of N-methylacetamide (NMA), a peptide group analog, from the gas to solvent phase (14, 40). However, ΔGsol of NMA is −10 kcal/mol at 298 K, which is nearly twice as large as the glycine peptide contribution per monomer we find for rigid oligoglycines and larger yet for flexible oligoglycines.

Figure 5.

Thermodynamic decomposition of solvation free energy for rigid/extended (solid) and flexible (dashed) oligoglycines as a function of chain length and force field at 310 K calculated by the FD and EP approaches. (Solid circle) ΔGsol calculated by free energy perturbation; (solid and open triangles) ΔHsol and TΔSsol calculated by the FD and EP approaches, respectively. Note that ΔHsol and TΔSsol at 310 K were approximated as their averages at 300 and 320 K.

When considering the effects of flexibility, we find that TΔSsol increases and ΔHsol decreases at a slower rate than their rigid counterparts. However, the increase in entropy associated with solvating flexible oligoglycines offsets the reduction in enthalpic interactions relative to that observed for rigid oligoglycines, and, as a result, differences in ΔGflexsol and ΔGfixsol are less pronounced. The FD and EP approaches predict very similar thermodynamic scaling profiles for the rigid and flexible oligoglycine models. However, the EP approach yields smaller statistical uncertainties and requires half as many FEP simulations to obtain ΔHsol and ΔSsol as the FD approach.

Given the trends in the scaling relationships established here for short oligoglycines, we expect the differences in ΔGflexsol and ΔGfixsol, and their entropic and enthalpic components, to be amplified for longer oligoglycines. As we will discuss below, this will have implications for the validity of group-additive models to predict the thermodynamics of long, disordered polypeptides from the thermodynamic properties of their isolated chemical groups. Future work will address the predictive capabilities of these scaling relationships for longer oligoglycines.

Decomposing ΔGsol, ΔHsol, and ΔSsol into vdw and elec components provides insight into the mechanisms that yield their scaling relationships with N. ΔGelec dominates ΔGvdw and largely dictates the scaling of ΔGsol, as has been previously noted for various oligoglycine and oligoalanine chains (21, 24). On the other hand, ΔSvdw is of the same order of magnitude as ΔSelec but contributes more to ΔSsol. This is consistent with the idea that inserting a large vdw cavity in water reduces the translational entropy of the solvent and disrupts the water-water interaction network more so than the subsequent charging of the cavity (41, 42). The inverse relationship between enthalpy and entropy is also apparent in their vdw and elec components, with ΔHelec being ∼1.5–3 times more favorable than ΔHvdw for rigid and flexible oligoglycines. In general, both the absolute value and the rate of decrease of the vdw and elec components of ΔG, ΔH, and ΔS change when positional constraints are not imposed on the oligoglycine chain. Most notably, ΔGflexvdw, ΔHflexvdw, and ΔSflexvdw are consistently more positive than their fixed counterparts. We note that in a previous study in which the solvation free energies of alanine oligomers were decomposed in a similar fashion, vdw solvation free energies were found to be consistently more positive for flexible alanine polypeptides than for their rigid-extended counterparts (24). Flexibility has the consequence of giving less instantaneous vdw exposure to the solvent for most conformations, as favorable peptide-solvent interactions may be screened. This is understandable, given the range of vdw attractions to solvent. Note that, for flexible alanine chains, the slope of ΔGflexvdw turned positive. But just as in this study, the overall solvation free energy was dominated by the elec component.

Force-field comparison

Figs. S1, S3, and S5 show the differences in ΔGsol, ΔHsol, and ΔSsol and their components between force fields. Differences in ΔGsol between C36 and ff12SB are within 1–2 kcal/mol for rigid and flexible Gly2–5. Interestingly, there appears to be a compensatory relationship between force fields in that ff12SB consistently underestimates ΔGvdw relative to C36, but overestimates ΔGelec for Gly2–4, resulting in smaller differences in ΔGsol. These cancellations are also observed for flexible Gly5, but do not follow the aforementioned trend. Hu et al. (21) previously measured ΔGsol for various fixed conformations of Gly2–5 using the CHARMM27 (C27) force field and found an average contribution per peptide of −5.48 kcal/mol, which is consistent with the −5.26 and −5.57 kcal/mol/unit we find with C36 and ff12SB, respectively. However, the absolute values of ΔGsol were different. For example, they reported a ΔGsol of Gly5 that was roughly 10 kcal/mol less than what we find with C36 and ff12SB, which further highlights the effects of flexibility on solvation free energy.

While employing different strategies, C36 and ff12SB were parameterized to match experimental and theoretical structural and thermodynamic data (22, 23). We have previously shown that C36 generates structural ensembles of Gly3 and, to a greater extent, Gly10 that are more extended and less compact than those generated by ff12SB, with significant differences in individual dihedral angle populations (17). The small differences we observe in ΔGsol between C36 and ff12SB may suggest either that the solvation thermodynamic properties of oligoglycine are robust to differences in force-field structural parameters or that the chain lengths considered here are too short to realize significant differences in structural ensembles generated by C36 and ff12SB. The former implies a degenerate relationship between the structural manifold of oligoglycine and its solvation thermodynamics. Even for Gly5, differences between the force fields are pronounced, with ff12SB sampling more compact conformations than C36 (data not shown). Given the striking differences in the structural ensemble of Gly10 produced by these two force fields (17), we expect differences in the scaling of ΔGsol to become more apparent for larger N. While this clearly needs to be tested, we note that caution should be exercised when attempting to connect structural ensembles to biological mechanisms from such potential models (i.e., force fields) even when those models are in thermodynamic agreement.

Implications for group-additivity

The notion of group-additivity posits that the solvation free energy of a macromolecule is the result of additive contributions of the chemical groups that comprise it. Additive principles have been applied to and observed in numerous studies of hydrocarbon and protein systems (18, 21, 43, 44, 45, 46, 47, 48, 49). However, evidence continues to mount that additivity is context-specific, and depends on local chemical environment and molecular conformation/geometry. While rough estimates are possible, the method may not accurately predict the solvation free energy of amino acids and proteins (20, 25, 36, 38, 50, 51, 52, 53, 54, 55). From a study of 50 conformations of decaglycine, we have shown that even an individual atom’s contribution to ΔGvdw and its repulsive component are not well defined and vary with respect to conformation; the result depends on local chemical environment and force field, which is contrary to what one would expect if additivity held (20).

While chemical context and conformation matter, the apparent linear scaling of ΔGfixsol with N and the relatively small differences between ΔGflexsol and ΔGfixsol observed here are consistent with an additive contribution of the glycine peptide unit to the overall solvation free energy. However, our data also indicate different rates of decrease of ΔGflexsol and ΔGfixsol with N. A recent study (36) has shown that the binding energy of a glycine peptide with the solvent depends on or is correlated with that of its neighboring peptides for a series of extended, fixed oligoglycines. The authors note that correlated fluctuations can change the free energy of the solute and that this effect occurs over spatially distant moieties. For longer, flexible oligoglycines, structural fluctuations and long-range peptide-peptide interactions alter the local solvent structure around any one glycine peptide in the chain from that of the linear peptide. As a result, the contribution of each glycine peptide to the overall solvation free energy of the entire chain is conditioned by both its near and spatially more distant neighbors. These correlations or cooperative effects could not have been predicted from the solvation free energy of an isolated glycine and, thus, lead to a departure from group-additivity. This may explain why ΔGflexsol and ΔGfixsol decrease at different rates initially for short oligoglycines and oligoalanines (24) and suggests that longer chain lengths will be needed to better resolve the effects of flexibility and structural correlations on the solvation free energy of each peptide unit.

Implications for aggregation of short oligoglycines and collapse of long oligoglycines

If the scaling relationships we establish here hold, then favorable enthalpic interactions with the solvent will continue to drive the decrease in ΔGsol for long oligoglycines. However, long oligoglycines (N ≥ ∼15) collapse in relatively dilute or infinitely dilute solutions as observed in both experiment and simulation (14, 15, 16). In addition, short oligoglycines become exceedingly insoluble with visible aggregates forming for Gly5 at finite (millimolar) concentrations (18, 19). Thus, solvation does not drive the aggregation of short oligoglycines and the collapse of long oligoglycines.

We previously investigated the hydrophobic effect as a potential force driving the collapse of decaglycine (20). ΔGvdw is often defined as the hydrophobic solvation free energy, and, as seen for macromolecules and various alkanes (39), it is expected to increase roughly linearly with solvent-exposed surface area, ΔGvdw = γvdwA, where γvdw is the implied surface tension, independent of the chemical properties of the solute, and A is the exposed surface area (56, 57). Others have suggested that ΔGvdw should be split into an attractive (ΔGatt) and a repulsive (ΔGrep) component and that ΔGrep should increase linearly with A (ΔGrep = γrepA) (58, 59). For a detailed review of various theories of hydrophobicity, we refer to Harris and Pettitt (54) and Ben-Amotz (60). A solvation free energy that increases with A would favor the initial folding or collapse of proteins by limiting the surface area exposed to solvent. Interestingly, from our study of 50 conformations of decaglycine, we found that ΔGvdw actually favors extended conformations with γvdw < 0 because γatt > γrep and that γrep was weakly correlated, if at all, with A and depended on the conformation of decaglycine (20). Taking N as a proxy for A, in this study we also find that γvdw < 0, but it depends both on force field and flexibility. While we did not further decompose ΔGvdw here, it is reasonable to assume that the attractive component outweighs the repulsive component. Tomar et al. (61) similarly find that attractive peptide-solvent dispersion interactions can prevent the collapse of deca-alanine. This seemingly counterintuitive observation that water-mediated, hydrophobic interactions promote chain expansion rather than collapse as the surface area (i.e., solute size or chain length) increases is discussed in detail in Ben-Amotz (62).
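The implied surface tension discussed above can be estimated as the slope of a straight-line fit of ΔGvdw against A (or against N as a proxy). The following is a minimal sketch of such a fit by ordinary least squares. The function name `fit_surface_tension` is hypothetical, the input numbers are synthetic placeholders rather than results from this study, and the intercept term is an added assumption that lets the line miss the origin.

```python
def fit_surface_tension(areas, free_energies):
    """Least-squares fit of dG = gamma * A + b; returns (gamma, b).

    gamma plays the role of the implied surface tension; its sign tells
    whether dG rises (gamma > 0) or falls (gamma < 0) with exposed area.
    """
    n = len(areas)
    if n < 2 or n != len(free_energies):
        raise ValueError("need two or more paired (A, dG) points")
    mean_a = sum(areas) / n
    mean_g = sum(free_energies) / n
    sxx = sum((a - mean_a) ** 2 for a in areas)
    if sxx == 0.0:
        raise ValueError("all areas are identical; slope is undefined")
    sxy = sum((a - mean_a) * (g - mean_g)
              for a, g in zip(areas, free_energies))
    gamma = sxy / sxx
    return gamma, mean_g - gamma * mean_a


# Synthetic placeholder data, NOT values from this study.
areas = [300.0, 400.0, 500.0, 600.0]   # exposed area, arbitrary units
dg_vdw = [-1.0, -1.5, -2.0, -2.5]      # free energy, arbitrary units
gamma, intercept = fit_surface_tension(areas, dg_vdw)
print(f"gamma = {gamma:.4f}, intercept = {intercept:.2f}")
```

A negative fitted slope corresponds to the γvdw < 0 case described in the text, in which the vdw term favors more exposed, extended conformations.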

A reasonable explanation for the observed collapse of long oligoglycines and their insolubility is that peptide-peptide interactions outcompete the still favorable peptide-solvent interactions, an observation that could not have been predicted from the solvation free energies of short, isolated oligoglycines. Using single molecule fluorescence correlation spectroscopy, Teufel et al. (15) found the hydrodynamic radius (Rh) of Gly20 to be significantly less than that of NME-Gly20 (wherein the amide hydrogens of Gly20 were replaced with methyl groups). They conjectured that the methyl groups preclude the formation of backbone H-bonds, resulting in the observed increase in solubility and Rh of NME-Gly20 relative to Gly20. In another study using MD simulations, Karandur et al. (16) found that the collapse of Gly15 and Gly25 may be explained by nonhydrogen-bonding, intrapeptide dipole-dipole interactions. In a similar manner, these dipole-dipole correlations and charge layerings were also observed to stabilize clusters or aggregates of Gly5 (19). It appears that as chain length or concentration increases, and with it the number of available intra- or interpeptide interactions, these interactions are favored over peptide-solvent interactions. Uversky et al. (4) have noted that the extent of collapse or compaction of IDPs is, in part, a length-dependent property and that “…weak, intramolecular interactions can be unspecifically amplified by an increase in protein size.”

Alternatively, if the thermodynamic scaling relationships presented here do not hold, then at some longer chain length, TΔSsol may overtake ΔHsol and the initial collapse of long oligoglycines would be entropically driven, as others have noted (14). In this situation, collapsed states may be further favored via backbone interactions. So group-additive models would fail to predict the solvation thermodynamics of long oligoglycines from short ones. However, the scaling of ΔGsol, ΔHsol, and ΔSsol with chain length for shorter oligoglycines may provide upper bounds on these quantities for longer oligoglycines by explicitly neglecting structural correlations between short segments that comprise the larger chain. Further work will need to assess the predictive capabilities of the scaling relationships we established, and whether a thermodynamic transition exists in which a negative entropic component to the free energy drives collapse of longer oligoglycines.

Conclusion

The structural ensemble of an IDP/IDR is the result of a complex competition between intrapeptide and peptide-solvent interactions (63, 64, 65). A number of determinants, like chain length (4), sequence composition (5), and charge patterning (66), influence the propensity of an IDP/IDR to adopt compact or extended structures. The protein trinity hypothesis (1, 2), and its variants (3), have been useful in classifying or characterizing the structural properties of IDPs/IDRs and relating those properties to biological function. Here, we sought to gain insight into the solvation thermodynamics of long chains by considering isolated, short oligoglycine chains more amenable to computational free energy methods. IDRs, especially those found in the disordered regulatory domains of nuclear transcription factors (13), are enriched with oligoglycine tracts and glycine content (5, 12).

We have performed a complete thermodynamic decomposition of the solvation free energy into its enthalpic and entropic components as well as each of their vdw and elec contributions as functions of chain length. For both rigid and flexible oligoglycine models, the decrease in ΔGsol with chain length is enthalpically driven with only a weak entropic compensation. Solvation free energies of alkanes and polypeptides are often measured by imposing positional constraints on their structures. Here we have shown that the rate of decrease of ΔGsol, ΔSsol, ΔHsol, and their vdw and elec components with N depends on conformational flexibility; however, longer chain lengths may be needed to better resolve differences between ΔGflexsol and ΔGfixsol. The differences in the initial rate of decrease of ΔGflexsol and ΔGfixsol for shorter oligoglycines may be due to local solvent reorganization around individual peptide groups brought on by correlated, structural fluctuations (i.e., near and spatially distant neighbor effects) as proposed by Tomar et al. (36). For these reasons, we anticipate differences between ΔGflexsol and ΔGfixsol to become more pronounced for longer chain lengths, where long-range interactions (e.g., excluded volume) are more prominent and weak peptide-peptide interactions are amplified.

Previously, we have shown that the C36 and ff12SB force fields yield considerably different conformational ensembles of Gly3 and Gly10, with C36 sampling more extended oligoglycine structures (17). Even for Gly5 we find ff12SB to sample more compact structures than C36 (data not published). Interestingly, these force fields, which were parameterized with different structural and thermodynamic target data, yield remarkably similar scaling relationships of ΔGflexsol, ΔSflexsol, and ΔHflexsol with chain length. This seems to imply a degenerate relationship between conformational ensemble and solvation thermodynamics, and complicates the ability to make consistent mechanistic inferences.

A favorable, enthalpically driven solvation free energy that continues to decrease with chain length (or surface area) does not account for the observed insolubility of Gly5 (18, 19) or the preferred collapse of longer oligoglycines (14, 15, 16). Instead, our work supports the idea that peptide-peptide interactions outcompete favorable peptide-solvent interactions. Whether solvation entropy promotes the collapse of longer oligoglycines is outside the scope of this work and remains to be seen. Attempting to predict the solvation thermodynamics of longer, disordered polypeptides from that observed for shorter chain segments, where intrapeptide interactions are limited, may be problematic and render group-additive models less useful.

While there is a tendency for long oligoglycines to collapse, they adopt a diverse set of conformations (14, 16, 17, 21, 67) with shorter chain segments assuming extended conformations more exposed to the solvent. This observation, in combination with the fact that we find short oligoglycine tracts to favorably interact with solvent, may suggest that the conformational ensemble of long oligoglycine chains could be shifted to favor more extended or collapsed states without the need to overcome excessively high free energy barriers. In this way, the protein backbone provides a scaffold amenable to both collapsed and extended disorder, and evolutionary forces may have selected unique combinations of side chains to alter the distribution of these two states.

Author Contributions

J.A.D. was responsible for performing the free energy calculations, error analysis, interpretation, and cowriting the article. R.C.H. was responsible for thermodynamic analysis and interpretation. B.M.P. was responsible for aspects of the project design, interpretation of the results, and cowriting the article.

Acknowledgments

Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications.

The Robert A. Welch Foundation (grant No. H0037), the National Science Foundation (grant No. CHE1152876), the National Institutes of Health (grant No. GM037657), and the NCSA Blue Waters Graduate Research Fellowship are thanked for partial support of this work. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant No. ACI-1053575. Additionally, this research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (award Nos. OCI-0725070 and ACI-1238993) and the state of Illinois. Visual analysis was aided by equipment from the National Science Foundation (grant No. CNS-1338192).

Editor: Amedeo Caflisch.

Footnotes

Five figures and seven tables are available at http://www.biophysj.org/biophysj/supplemental/S0006-3495(16)30545-8.

Appendix A

The free energy change at a particular λ, ΔGi, is a nonlinear function of ΔUi measured from an MD simulation and is given by Eq. A.1 below. We drop the vdw and elec superscripts as the following applies to both components. While the simulations were performed in the NPT ensemble, we are neglecting the small contributions of pressure and volume to the free energy.

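$$\Delta G_i = -\frac{1}{\beta}\ln\left[\frac{1}{N_i}\sum_{j=1}^{N_i}\exp\left(-\beta\,\Delta U_{i,j}\right)\right], \tag{A.1}$$

where $N_i$ is the number of observations or frames from a simulation with $\lambda_i$, $j$ indexes each frame, and $\Delta U_{i,j} = U(\lambda_{i+1})_j - U(\lambda_i)_j$. ΔG is then the sum of the contributions ΔGi:

$$\Delta G = \sum_{i=1}^{n}\Delta G_i, \tag{A.2}$$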

where n is the number of λ-values and will differ for the vdw and elec components.
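As an illustration of Eqs. A.1 and A.2, the following minimal sketch evaluates the exponential average for each λ window and sums the windows. The function names `fep_window` and `fep_total` are hypothetical and not taken from the authors' scripts; energies are assumed to be in kcal/mol, and the log-sum-exp shift is a numerical-stability choice added here rather than something stated in the text.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def fep_window(delta_u, temperature):
    """Eq. A.1: dG_i = -(1/beta) * ln[(1/N_i) * sum_j exp(-beta * dU_ij)]."""
    if not delta_u:
        raise ValueError("need at least one energy difference")
    beta = 1.0 / (KB * temperature)
    # Subtract the largest exponent before exponentiating to avoid overflow.
    shift = max(-beta * du for du in delta_u)
    total = sum(math.exp(-beta * du - shift) for du in delta_u)
    return -(shift + math.log(total / len(delta_u))) / beta


def fep_total(windows, temperature):
    """Eq. A.2: sum the per-window free energy changes."""
    return sum(fep_window(w, temperature) for w in windows)


# Placeholder energy differences (kcal/mol), NOT data from this study.
windows = [[0.5, 0.5, 0.5], [-0.2, -0.2, -0.2]]
print(fep_total(windows, 300.0))  # constant dU per window, so the sum is 0.3
```

When every frame in a window has the same ΔU, the estimator returns that value, which is a convenient sanity check; for fluctuating ΔU the result lies below the arithmetic mean.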

Time correlation analysis

For a function f(x1, x2, …) that is a linear combination of independent random variables x1, x2, …, the variance of f is given by Eq. A.3 (i.e., error propagation):

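$$\sigma_f^2 = \left(\frac{\partial f}{\partial x_1}\right)^2\sigma_{x_1}^2 + \left(\frac{\partial f}{\partial x_2}\right)^2\sigma_{x_2}^2 + \cdots \tag{A.3}$$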
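For a linear combination with constant coefficients, the partial derivatives in Eq. A.3 are just those coefficients, so the propagated error is a root sum of squares. The sketch below shows this for two cases: a sum of vdw and elec components, and a finite difference between two temperatures. The function name `propagate_linear` is hypothetical, the uncertainties are placeholders rather than values from this study, and the finite-difference form used for the entropy is an assumption made here for illustration.

```python
import math


def propagate_linear(coeffs, sigmas):
    """Eq. A.3 for f = sum_k c_k * x_k with independent x_k.

    Returns sigma_f = sqrt(sum_k (c_k * sigma_k)^2).
    """
    if len(coeffs) != len(sigmas):
        raise ValueError("coeffs and sigmas must have the same length")
    return math.sqrt(sum((c * s) ** 2 for c, s in zip(coeffs, sigmas)))


# Case 1: dG_sol = dG_vdw + dG_elec (placeholder errors in kcal/mol).
sigma_sol = propagate_linear([1.0, 1.0], [0.3, 0.4])

# Case 2 (assumed form): dS ~ -[dG(T2) - dG(T1)] / (T2 - T1), T1 = 300 K, T2 = 320 K.
d_temp = 320.0 - 300.0
sigma_s = propagate_linear([-1.0 / d_temp, 1.0 / d_temp], [0.2, 0.2])

print(sigma_sol, sigma_s)
```

Because the temperature difference appears in the denominator, small errors in the two free energies are amplified in the finite-difference entropy, which is one reason careful error estimates matter for the entropic and enthalpic components.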

To calculate the variance of ΔGi at λi (σΔGi2), we apply Eq. A.3 but with only one independent variable (ΔUi), to Eq. A.1, which gives

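$$\sigma_{\Delta G_i}^2 = \frac{1}{N_i^{\mathrm{eff}}\beta^2}\,\frac{\left\langle \exp\left(-2\beta\,\Delta U_i\right)\right\rangle_i}{\left\langle \exp\left(-\beta\,\Delta U_i\right)\right\rangle_i^2} - \frac{1}{N_i^{\mathrm{eff}}\beta^2}, \tag{A.4}$$

$$N_i^{\mathrm{eff}} = \frac{N_i}{2\tau_i/\Delta t + 1}. \tag{A.5}$$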

Nieff is the effective number of independent observations from a simulation, τi is the correlation time, and Δt is the time interval at which the ΔUi values were saved. To determine τi, we first generated the time ACF, C(t), of exp(βΔUi(t)) for each simulation. Visual inspection of the ACFs showed an initial, rapid decay in correlations, followed in many cases by a second, slower decay as the ACF approaches zero. Due to the large number of simulations and to capture the two-phase decay, we fit the ACFs with a double exponential function:

$$C_{\mathrm{fit}}(t;\gamma_1,\gamma_2) = A_1\exp\left(-\frac{t}{\gamma_1}\right) + A_2\exp\left(-\frac{t}{\gamma_2}\right), \tag{A.6}$$

where γ1 and γ2 are the decay rate constants and γ1 < γ2. If the double exponential fit failed or A2 < 0.01, the ACFs were fit with a single exponential. To improve the likelihood of capturing the true correlation time, the correlation times were conservatively estimated as τi=4γ2 or τi=4γ1 for the double or single exponential fits, respectively. In the situations where the fits yielded τi < 0, we took a conservative approach and set τi to 1 ps (i.e., Δt). These correlation times were used with Eq. A.4 to estimate the errors, σACFvdw and σACFelec (square root of the variance), in ΔGvdw and ΔGelec at 300 and 320 K. For the FD approach, Eq. A.3 was used to estimate errors in ΔGsol at both temperatures, ΔSsol, ΔHsol, and their vdw and elec components. For the EP approach, ACFs of Uaq(t) and Ugas(t) were similarly fit to Eq. A.6 and their autocorrelation times estimated as described above. Variances of Uaq and Ugas were adjusted for the effective number of independent observations (Eq. A.5), and Eq. A.3 was used to estimate the error in ΔHsol at 300 and 320 K. Equation A.3 was then used to estimate the error in ΔSsol at both temperatures.
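The variance estimate for a single λ window can be sketched as follows, given a correlation time that has already been obtained from the ACF fit. This is a minimal illustration under stated assumptions: Eq. A.5 is read as N_eff = N / (2τ/Δt + 1), Eq. A.4 is read as (⟨exp(−2βΔU)⟩/⟨exp(−βΔU)⟩² − 1) / (N_eff β²), energies are in kcal/mol, and the function names are hypothetical. The ACF fitting step itself is not shown.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def effective_samples(n_frames, tau, dt):
    """Eq. A.5 as read here: N_eff = N / (2 * tau / dt + 1)."""
    if tau < 0.0:
        tau = dt  # the text falls back to tau = dt when a fit yields tau < 0
    return n_frames / (2.0 * tau / dt + 1.0)


def fep_window_variance(delta_u, temperature, tau, dt):
    """Eq. A.4 as read here, for one lambda window."""
    beta = 1.0 / (KB * temperature)
    n = len(delta_u)
    # A common shift of the exponents cancels in <x^2>/<x>^2 and avoids overflow.
    shift = max(-beta * du for du in delta_u)
    x = [math.exp(-beta * du - shift) for du in delta_u]
    mean_x = sum(x) / n
    mean_x2 = sum(v * v for v in x) / n
    n_eff = effective_samples(n, tau, dt)
    return (mean_x2 / mean_x ** 2 - 1.0) / (n_eff * beta ** 2)


# Placeholder energy differences (kcal/mol), NOT data from this study.
sample = [0.0, 0.5, 1.0, 0.2]
sigma = math.sqrt(fep_window_variance(sample, 300.0, tau=4.0, dt=1.0))
print(sigma)
```

A longer correlation time lowers the effective sample count and therefore raises the estimated error, which is why the text describes the choice of four times the fitted decay constant as conservative.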

BSE analysis

Separately for the vdw and elec components, we construct a matrix of ΔUi values in which the columns are particular λ-values that span either λvdw or λelec and the rows are time points. The matrix has dimensions N × n, where N is the total number of frames and n is the number of λ-values (subscripts have been dropped because all simulations for the vdw and elec pathways, respectively, were run for the same length of time and number of λ-values). The matrix is then broken into blocks of a particular length such that N = M × b, where M is the number of submatrices of dimensions b × n and b is the block length. For a given b, ΔG values are calculated using Eq. 1 for each of the M blocks. The BSE is the standard deviation of these block free energies divided by the square root of the number of blocks, M. A series of BSE values is computed for a range of block lengths, and the error is estimated at the point at which the BSE curve plateaus, that is, where the BSE stops varying with block length. Maximum block lengths of 8 and 12 ns were used for fixed and flexible Gly2–4 and for flexible Gly5, respectively. Taking a conservative approach, we estimate the errors σBSEvdw and σBSEelec in ΔGvdw and ΔGelec by taking the maximum BSE across the range of block lengths. Equation A.3 was used to estimate the errors in solvation free energy at 300 and 320 K, solvation entropy, solvation enthalpy, and their vdw and elec components. For the EP approach, BSEs were calculated in a similar manner except with Uaq or Ugas estimated within each block. Equation A.3 was then used to estimate the BSE of ΔHsol from the BSEs of Uaq and Ugas, which was further combined with the BSE of ΔGsol to yield the BSE of ΔSsol at both temperatures.
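The blocking procedure described above can be sketched for a single time series as follows. This is a simplified illustration, not the authors' implementation: it handles one series rather than the full N × n matrix, takes the per-block estimator as a function argument (a plain mean in the example, where the text uses the free energy estimator), and uses the sample standard deviation, which is an assumption. The function names are hypothetical.

```python
import math
import statistics


def block_standard_error(values, block_length, estimator):
    """Split the series into M = N // b contiguous blocks, apply the
    estimator to each block, and return stdev(estimates) / sqrt(M)."""
    n_blocks = len(values) // block_length
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    estimates = [
        estimator(values[k * block_length:(k + 1) * block_length])
        for k in range(n_blocks)
    ]
    return statistics.stdev(estimates) / math.sqrt(n_blocks)


def bse_curve(values, block_lengths, estimator):
    """BSE as a function of block length, for locating the plateau."""
    return {b: block_standard_error(values, b, estimator) for b in block_lengths}


# Placeholder series, NOT data from this study.
series = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
curve = bse_curve(series, [1, 2, 4], statistics.mean)
conservative_error = max(curve.values())  # maximum BSE over block lengths
print(curve, conservative_error)
```

Taking the maximum over the curve mirrors the conservative choice described in the text; in practice one would also inspect the curve to confirm that it has reached a plateau.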

Supporting Material

Document S1. Figs. S1–S5 and Tables S1–S7
Document S2. Article plus Supporting Material

References

  • 1. Dunker A.K., Lawson J.D., Obradovic Z. Intrinsically disordered protein. J. Mol. Graph. Model. 2001;19:26–59. doi: 10.1016/s1093-3263(00)00138-8.
  • 2. Dunker A.K., Brown C.J., Obradovic Z. Intrinsic disorder and protein function. Biochemistry. 2002;41:6573–6582. doi: 10.1021/bi012159+.
  • 3. Uversky V.N. Natively unfolded proteins: a point where biology waits for physics. Protein Sci. 2002;11:739–756. doi: 10.1110/ps.4210102.
  • 4. Uversky V.N., Santambrogio C., Grandori R. Length-dependent compaction of intrinsically disordered proteins. FEBS Lett. 2012;586:70–73. doi: 10.1016/j.febslet.2011.11.026.
  • 5. Marsh J.A., Forman-Kay J.D. Sequence determinants of compaction in intrinsically disordered proteins. Biophys. J. 2010;98:2383–2390. doi: 10.1016/j.bpj.2010.02.006.
  • 6. Oldfield C.J., Dunker A.K. Intrinsically disordered proteins and intrinsically disordered protein regions. Annu. Rev. Biochem. 2014;83:553–584. doi: 10.1146/annurev-biochem-072711-164947.
  • 7. Weathers E.A., Paulaitis M.E., Hoh J.H. Reduced amino acid alphabet is sufficient to accurately recognize intrinsically disordered protein. FEBS Lett. 2004;576:348–352. doi: 10.1016/j.febslet.2004.09.036.
  • 8. Crick S.L., Jayaraman M., Pappu R.V. Fluorescence correlation spectroscopy shows that monomeric polyglutamine molecules form collapsed structures in aqueous solutions. Proc. Natl. Acad. Sci. USA. 2006;103:16764–16769. doi: 10.1073/pnas.0608175103.
  • 9. Möglich A., Joder K., Kiefhaber T. End-to-end distance distributions and intrachain diffusion constants in unfolded polypeptide chains indicate intramolecular hydrogen bond formation. Proc. Natl. Acad. Sci. USA. 2006;103:12394–12399. doi: 10.1073/pnas.0604748103.
  • 10. Walters R.H., Murphy R.M. Examining polyglutamine peptide length: a connection between collapsed conformations and increased aggregation. J. Mol. Biol. 2009;393:978–992. doi: 10.1016/j.jmb.2009.08.034.
  • 11. Vitalis A., Wang X., Pappu R.V. Quantitative characterization of intrinsic disorder in polyglutamine: insights from analysis based on polymer theories. Biophys. J. 2007;93:1923–1937. doi: 10.1529/biophysj.107.110080.
  • 12. Uversky V.N., Oldfield C.J., Dunker A.K. Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling. J. Mol. Recognit. 2005;18:343–384. doi: 10.1002/jmr.747.
  • 13. Bondos S.E., Tan X.-X., Matthews K.S. Physical and genetic interactions link hox function with diverse transcription factors and cell signaling proteins. Mol. Cell. Proteomics. 2006;5:824–834. doi: 10.1074/mcp.M500256-MCP200.
  • 14. Tran H.T., Mao A., Pappu R.V. Role of backbone-solvent interactions in determining conformational equilibria of intrinsically disordered proteins. J. Am. Chem. Soc. 2008;130:7380–7392. doi: 10.1021/ja710446s.
  • 15. Teufel D.P., Johnson C.M., Neuweiler H. Backbone-driven collapse in unfolded protein chains. J. Mol. Biol. 2011;409:250–262. doi: 10.1016/j.jmb.2011.03.066.
  • 16. Karandur D., Harris R.C., Pettitt B.M. Protein collapse driven against solvation free energy without H-bonds. Protein Sci. 2016;25:103–110. doi: 10.1002/pro.2749.
  • 17. Drake J.A., Pettitt B.M. Force field-dependent solution properties of glycine oligomers. J. Comput. Chem. 2015;36:1275–1285. doi: 10.1002/jcc.23934.
  • 18. Auton M., Bolen D.W. Additive transfer free energies of the peptide backbone unit that are independent of the model compound and the choice of concentration scale. Biochemistry. 2004;43:1329–1342. doi: 10.1021/bi035908r.
  • 19. Karandur D., Wong K.-Y., Pettitt B.M. Solubility and aggregation of Gly5 in water. J. Phys. Chem. B. 2014;118:9565–9572. doi: 10.1021/jp503358n.
  • 20. Harris R.C., Drake J.A., Pettitt B.M. Multibody correlations in the hydrophobic solvation of glycine peptides. J. Chem. Phys. 2014;141:22D525. doi: 10.1063/1.4901886.
  • 21. Hu C.Y., Kokubo H., Pettitt B.M. Backbone additivity in the transfer model of protein solvation. Protein Sci. 2010;19:1011–1022. doi: 10.1002/pro.378.
  • 22. Best R.B., Zhu X., MacKerell A.D., Jr. Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone φ, ψ and side-chain χ1 and χ2 dihedral angles. J. Chem. Theory Comput. 2012;8:3257–3273. doi: 10.1021/ct300400x.
  • 23. Case D.A., Berryman J.T., Kollman P.A. University of California-San Francisco; San Francisco, CA: 2012. AMBER 12.
  • 24. Kokubo H., Harris R.C., Pettitt B.M. Solvation free energies of alanine peptides: the effect of flexibility. J. Phys. Chem. B. 2013;117:16428–16435. doi: 10.1021/jp409693p.
  • 25. Harris R.C., Pettitt B.M. Effects of geometry and chemistry on hydrophobic solvation. Proc. Natl. Acad. Sci. USA. 2014;111:14681–14686. doi: 10.1073/pnas.1406080111.
  • 26. Phillips J.C., Braun R., Schulten K. Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005;26:1781–1802. doi: 10.1002/jcc.20289.
  • 27. Humphrey W., Dalke A., Schulten K. VMD: visual molecular dynamics. J. Mol. Graph. 1996;14:33–38. doi: 10.1016/0263-7855(96)00018-5.
  • 28. Beutler T.C., Mark A.E., van Gunsteren W.F. Avoiding singularities and numerical instabilities in free energy calculations based on molecular simulations. Chem. Phys. Lett. 1994;222:529–539.
  • 29. Choudhury N., Pettitt B.M. Enthalpy-entropy contributions to the potential of mean force of nanoscopic hydrophobic solutes. J. Phys. Chem. B. 2006;110:8459–8463. doi: 10.1021/jp056909r.
  • 30. Pettitt B.M., Rossky P.J. Alkali halides in water: ion-solvent correlations and ion-ion potentials of mean force at infinite dilution. J. Chem. Phys. 1986;84:5836–5844.
  • 31. Pohorille A., Jarzynski C., Chipot C. Good practices in free-energy calculations. J. Phys. Chem. B. 2010;114:10235–10253. doi: 10.1021/jp102971x.
  • 32. Zwanzig R.W. High-temperature equation of state by a perturbation method. I. Nonpolar gases. J. Chem. Phys. 1954;22:1420–1426.
  • 33. Yu H.-A., Karplus M. A thermodynamic analysis of solvation. J. Chem. Phys. 1988;89:2366–2379.
  • 34. Grossfield A., Zuckerman D.M. Quantifying uncertainty and sampling quality in biomolecular simulations. Annu. Rep. Comput. Chem. 2009;5:23–48. doi: 10.1016/S1574-1400(09)00502-7.
  • 35. Flyvbjerg H., Petersen H.G. Error estimates on averages of correlated data. J. Chem. Phys. 1989;91:461.
  • 36. Tomar D.S., Asthagiri D., Weber V. Solvation free energy of the peptide group: its model dependence and implications for the additive-transfer free-energy model of protein stability. Biophys. J. 2013;105:1482–1490. doi: 10.1016/j.bpj.2013.08.011.
  • 37. Berg B.A., Harris R.C. From data to probability densities without histograms. Comput. Phys. Commun. 2008;179:443–448.
  • 38. Gu W., Rahi S.J., Helms V. Solvation free energies and transfer free energies for amino acids from hydrophobic solution to water solution from a very simple residue model. J. Phys. Chem. B. 2004;108:5806–5814.
  • 39. Gallicchio E., Kubo M.M., Levy R.M. Enthalpy-entropy and cavity decomposition of alkane hydration free energies: numerical results and implications for theories of hydrophobic solvation. J. Phys. Chem. B. 2000;104:6271–6285.
  • 40. Makhatadze G.I., Lopez M.M., Privalov P.L. Heat capacities of protein functional groups. Biophys. Chem. 1997;64:93–101. doi: 10.1016/s0301-4622(96)02234-x.
  • 41. Pratt L.R. Molecular theory of hydrophobic effects: “She is too mean to have her name repeated.”. Annu. Rev. Phys. Chem. 2002;53:409–436. doi: 10.1146/annurev.physchem.53.090401.093500.
  • 42. Hummer G., Garde S., Pratt L.R. New perspectives on hydrophobic effects. Chem. Phys. 2000;258:349–370.
  • 43. Tanford C. Contribution of hydrophobic interactions to the stability of the globular conformation of proteins. J. Am. Chem. Soc. 1962;84:4240–4247.
  • 44. Rytting J.H., Huston L.P., Higuchi T. Thermodynamic group contributions for hydroxyl, amino, and methylene groups. J. Pharm. Sci. 1978;67:615–618. doi: 10.1002/jps.2600670510.
  • 45. Wolfenden R., Andersson L., Southgate C.C.B. Affinities of amino acid side chains for solvent water. Biochemistry. 1981;20:849–855. doi: 10.1021/bi00507a030.
  • 46. Eisenberg D., McLachlan A.D. Solvation energy in protein folding and binding. Nature. 1986;319:199–203. doi: 10.1038/319199a0.
  • 47. Canchi D.R., García A.E. Backbone and side-chain contributions in protein denaturation by urea. Biophys. J. 2011;100:1526–1533. doi: 10.1016/j.bpj.2011.01.028.
  • 48. Dill K.A. Additivity principles in biochemistry. J. Biol. Chem. 1997;272:701–704. doi: 10.1074/jbc.272.2.701.
  • 49. Pace C.N., Fu H., Grimsley G.R. Contribution of hydrophobic interactions to protein stability. J. Mol. Biol. 2011;408:514–528. doi: 10.1016/j.jmb.2011.02.053.
  • 50. König G., Bruckner S., Boresch S. Absolute hydration free energies of blocked amino acids: implications for protein solvation and stability. Biophys. J. 2013;104:453–462. doi: 10.1016/j.bpj.2012.12.008.
  • 51. Staritzbichler R., Gu W., Helms V. Are solvation free energies of homogeneous helical peptides additive? J. Phys. Chem. B. 2005;109:19000–19007. doi: 10.1021/jp052403x.
  • 52. Chang J., Lenhoff A.M., Sandler S.I. Solvation free energy of amino acids and side-chain analogues. J. Phys. Chem. B. 2007;111:2098–2106. doi: 10.1021/jp0620163.
  • 53. Avbelj F., Baldwin R.L. Limited validity of group additivity for the folding energetics of the peptide group. Proteins. 2006;63:283–289. doi: 10.1002/prot.20756.
  • 54. Harris R.C., Pettitt B.M. Reconciling the understanding of ‘hydrophobicity’ with physics-based models of proteins. J. Phys. Condens. Matter. 2016;28:083003. doi: 10.1088/0953-8984/28/8/083003.
  • 55. Della Gatta G., Usacheva T., Ichim D. Thermodynamics of solvation of some small peptides in water at T = 298.15 K. J. Chem. Thermodyn. 2006;38:1054–1061.
  • 56. Baldwin R.L. Energetics of protein folding. J. Mol. Biol. 2007;371:283–301. doi: 10.1016/j.jmb.2007.05.078.
  • 57. Meyer E.E., Rosenberg K.J., Israelachvili J. Recent progress in understanding hydrophobic interactions. Proc. Natl. Acad. Sci. USA. 2006;103:15739–15746. doi: 10.1073/pnas.0606422103.
  • 58. Weeks J.D., Chandler D., Andersen H.C. Role of repulsive forces in determining the equilibrium structure of simple liquids. J. Chem. Phys. 1971;54:5237–5247.
  • 59. Gallicchio E., Zhang L.Y., Levy R.M. The SGB/NP hydration free energy model based on the surface generalized Born solvent reaction field and novel nonpolar hydration free energy estimators. J. Comput. Chem. 2002;23:517–529. doi: 10.1002/jcc.10045.
  • 60. Ben-Amotz D. Water-mediated hydrophobic interactions. Annu. Rev. Phys. Chem. 2016;67:617–638. doi: 10.1146/annurev-physchem-040215-112412.
  • 61. Tomar D.S., Weber V., Asthagiri D. Importance of hydrophilic hydration and intramolecular interactions in the thermodynamics of helix-coil transition and helix-helix assembly in a deca-alanine peptide. J. Phys. Chem. B. 2016;120:69–76. doi: 10.1021/acs.jpcb.5b09881.
  • 62. Ben-Amotz D. Hydrophobic ambivalence: teetering on the edge of randomness. J. Phys. Chem. Lett. 2015;6:1696–1701. doi: 10.1021/acs.jpclett.5b00404.
  • 63. Anfinsen C.B. The formation and stabilization of protein structure. Biochem. J. 1972;128:737–749. doi: 10.1042/bj1280737.
  • 64. Anfinsen C.B. Principles that govern the folding of protein chains. Science. 1973;181:223–230. doi: 10.1126/science.181.4096.223.
  • 65.Flory P.J. Hanser-Gardner; Cincinnati, OH: 1989. Statistical Mechanics of Chain Molecules. [Google Scholar]
  • 66.Das R.K., Pappu R.V. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc. Natl. Acad. Sci. USA. 2013;110:13392–13397. doi: 10.1073/pnas.1304749110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Hu C.Y., Lynch G.C., Pettitt B.M. Trimethylamine N-oxide influence on the backbone of proteins: an oligoglycine model. Proteins. 2010;78:695–704. doi: 10.1002/prot.22598. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Document S1. Figs. S1–S5 and Tables S1–S7
mmc1.pdf (1.2MB, pdf)
Document S2. Article plus Supporting Material
mmc2.pdf (2.1MB, pdf)

Articles from Biophysical Journal are provided here courtesy of The Biophysical Society