Protein Science: A Publication of the Protein Society. 2008 Apr;17(4):644–651. doi: 10.1110/ps.073105408

Side chain burial and hydrophobic core packing in protein folding transition states

Patrick J Farber 1, Anthony Mittermaier 1
PMCID: PMC2271162  PMID: 18305200

Abstract

A critical step in the folding pathway of globular proteins is the formation of a tightly packed hydrophobic core. Several mutational studies have addressed the question of whether tight packing interactions are present during the rate-limiting step of folding. In some of these investigations, substituted side chains have been assumed to form native-like interactions in the transition state when the folding rates of mutant proteins correlate with their native-state stabilities. Alternatively, it has been argued that side chains participate in nonspecific hydrophobic collapse when the folding rates of mutant proteins correlate with side-chain hydrophobicity. In a reanalysis of published data, we have found that folding rates often correlate similarly well, or poorly, with both native-state stability and side-chain hydrophobicity, and it is therefore not possible to select an appropriate transition state model based on these one-parameter correlations. We show that this ambiguity can be resolved using a two-parameter model in which side chain burial and the formation of all other native-like interactions can occur asynchronously. Notably, the model agrees well with experimental data, even for positions where the one-parameter correlations are poor. We find that many side chains experience a previously unrecognized type of transition state environment in which specific, native-like interactions are formed, but hydrophobic burial dominates. Implications of these results for the design and analysis of protein folding studies are discussed.

Keywords: protein folding, phi value analysis, hydrophobic core, hydrophobicity, side chain packing, mutagenesis


Proteins that fold with two-state kinetics are widely studied as model systems for the folding process. Although these molecules adopt their native conformations without the detectable accumulation of partly structured intermediates, information on rate-determining steps during the reaction can be obtained by comparing the effects of mutations on folding kinetics and native-state stability. In a typical analysis, a parameter Φ is calculated for each mutation (Matouschek et al. 1989) according to

$$\Phi = \frac{RT \ln\left(k_F^{\mathrm{wt}} / k_F^{\mathrm{mut}}\right)}{G_N} \tag{1}$$

where kFwt and kFmut are the folding rates of the wild-type and mutant proteins and GN is the change in the stability of the native state caused by the mutation,

$$G_N = \Delta G_{\mathrm{U \to N}}^{\mathrm{mut}} - \Delta G_{\mathrm{U \to N}}^{\mathrm{wt}} \tag{2}$$

If it is assumed that protein folding follows Arrhenius kinetics with a mutation-independent prefactor, then the numerator in Equation 1 is equal to the change in the stability of the transition state,

$$G_T = RT \ln\left(\frac{k_F^{\mathrm{wt}}}{k_F^{\mathrm{mut}}}\right) \tag{3}$$

Mutations of side chains that are unstructured in the transition state produce Φ values of 0, while mutations of side chains that are natively structured in the transition state produce Φ values of 1. Intermediate values of Φ are obtained when interactions are only partially formed in the transition state or when folding proceeds through parallel transition states of which only a subset contain native-like interactions (Matouschek et al. 1989). Discrimination between single and parallel pathways can be achieved with Brønsted analyses, in which changes in transition-state stability, GT, are plotted as a function of changes in native stability, GN, for multiple mutations (Fersht et al. 1994). These plots can also provide evidence for the existence of native-like interactions in the transition state: When GT depends linearly on GN,

$$G_T = \Phi \, G_N \tag{4}$$

Φ is taken as the fraction of native contacts formed in the transition state of a unique folding pathway (Fersht et al. 1994).
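As a rough illustration of the Φ calculation described above, the following Python sketch converts a pair of folding rates into a change in transition-state free energy and divides by the change in native-state stability. It assumes Arrhenius kinetics with a mutation-independent prefactor and counts destabilization as positive; the function names, the temperature, and the numerical values are hypothetical and are not taken from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def delta_g_transition(kf_wt, kf_mut, temp_k=298.15):
    """Change in transition-state free energy, RT ln(kF_wt / kF_mut), in kcal/mol.

    Assumed sign convention: a mutation that slows folding gives a positive value.
    """
    return R_KCAL * temp_k * math.log(kf_wt / kf_mut)


def phi_value(kf_wt, kf_mut, g_n, temp_k=298.15):
    """Phi = G_T / G_N; undefined when the mutation leaves stability unchanged."""
    if g_n == 0:
        raise ValueError("phi is undefined when G_N is zero")
    return delta_g_transition(kf_wt, kf_mut, temp_k) / g_n


# Hypothetical mutation: folding slows 5-fold, native state loses 1.5 kcal/mol.
g_t = delta_g_transition(100.0, 20.0)
phi = phi_value(100.0, 20.0, 1.5)
print(f"G_T = {g_t:.2f} kcal/mol, phi = {phi:.2f}")
```

Because Φ is a ratio, it becomes numerically unstable when the change in native-state stability is small, which is one reason a single Φ value is harder to interpret than a fit over several mutations at the same position.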

Φ value analysis has been used to investigate whether the tightly packed hydrophobic core characteristic of native globular proteins is formed in the transition state or at some later point in the folding process. In some cases, Brønsted analyses suggest that native-like hydrophobic core packing is present in the transition state. For example, in the chymotrypsin inhibitor CI2, folding data for multiple mutations at three buried positions in the “minicore” region of the protein produce a linear Brønsted plot with a slope of 0.3 (Itzhaki et al. 1995). This suggests that within the transition state, all minicore side chains experience interactions similar to those present in the folded state, but weakened by two-thirds, on average. Recently, a different relationship has been reported for the folding rates of hydrophobic core mutants. GT values obtained for multiple substitutions at individual positions in the N-terminal domain of ribosomal protein L9 (NTL9) and the SH3 domain from the Fyn tyrosine kinase (FYN) correlate with changes in side-chain hydrophobicity, such that

$$G_T = \eta \, G_H \tag{5}$$

where GH is the change in side-chain hydrophobicity as calculated from experimental or theoretical water/octanol transfer free energies, GH = ΔG transfer mut − ΔG transfer wt, and η is a position-specific constant of proportionality (Northey et al. 2002a; Anil et al. 2005). The implication is that side chains at these positions do not form specific native-like interactions in the transition state and instead influence transition-state stability purely through hydrophobic burial. Larger, more hydrophobic side chains provide greater transition-state stability and faster folding rates without the specific shape requirements that characterize packing in the folded state (Anil et al. 2005).

If folding data agree with Equation 4 but not with Equation 5, or vice versa, it is possible to conclude that a position is involved in a subset of native-like interactions or, conversely, in nonspecific hydrophobic collapse in the transition state. However, in most cases, GT values correlate with both GN and GH, so it is not possible to unambiguously assign a core position to either of the two models. This is perhaps not surprising, since core mutations that reduce side-chain hydrophobicity also tend to destabilize the native state. Another issue in interpreting the folding rates of hydrophobic core mutants is that correlations between GT and both GH and GN can be quite poor. One hypothesis is that disruptive mutations have perturbed the structure of the transition state such that the folding pathways of the wild-type and mutant proteins are significantly different (Northey et al. 2002a). As described below, we offer an alternative explanation.

We have investigated a model in which shape-specific, native-like interactions are present at the rate-limiting step of folding, but do not necessarily form simultaneously with side chain burial. Taking an approach similar to one used by Dill and coworkers to investigate α-helix formation in transition states (Weikl and Dill 2007), we fit data from multiple mutations at individual hydrophobic core positions in CI2, NTL9, and FYN to an equation that accounts for both native-like interactions and additional hydrophobic collapse,

$$G_T = \chi_N G_N + \chi_H G_H \tag{6}$$

The first term corresponds to both hydrophobic burial and specific packing interactions that are shared by the native and transition states. The second term accounts for additional, i.e., nonnative, contributions of hydrophobic burial to the stability of the transition state. The changes in the stabilities of the native and transition states for each mutant, GN and GT, were taken from published experimental data, and the changes in side-chain hydrophobicity, GH, were calculated based on experimental or theoretical octanol/water transfer free energies (Hansch and Leo 1979; Fauchere and Pliska 1983). Using Equation 6, the separate contributions of native-like interactions and additional hydrophobic collapse to the stability of the transition state can be evaluated individually in the form of χN and χH parameters. In this way it is possible to unambiguously determine whether native-like core interactions are present at the rate-limiting step of folding. Furthermore, we show that the weakness of correlations between GT and GN observed for some positions is not due to significant disruption of the folding pathway. The experimental data are well fit by Equation 6, suggesting that all mutants at these positions fold via similar pathways with similar rate-limiting steps at which side chain burial and the formation of native-like interactions play separate but important roles.

Theory

The parameters χN and χH provide specific information regarding transition state interactions for the side chain at a given position in the primary amino acid sequence. In cases where the position is fully solvent exposed in the unfolded state and fully buried in the native state, the sum (χN + χH) corresponds approximately to the extent of side chain burial in the transition state. Even when the position is not fully buried and exposed in the folded and unfolded states, respectively, nonzero values of χH arise only when side chain burial is asynchronous with the formation of native-like packing interactions. This follows from a consideration of the change in native-state stability resulting from a hydrophobic core mutation, GN, which may be separated into two components,

$$G_N = G_{\mathrm{burial}} + G_{\mathrm{reorg}} \tag{7}$$

Gburial refers to the contribution of changes in side-chain hydrophobicity to the stabilization or destabilization of the native state, and Greorg contains all other contributions to GN, including changes in side chain packing, geometric strain, hydrogen bonding and electrostatic interactions, and conformational entropy in both the native and unfolded states (Eriksson et al. 1992; Richards and Lim 1994). When a position is fully exposed in the unfolded state and fully buried in the folded state, Gburial = GH, and

$$G_T = \chi_N G_{\mathrm{reorg}} + (\chi_N + \chi_H)\, G_H \tag{8}$$

The parameter χN can be interpreted qualitatively as the fraction of the specific, native interactions contributing to Greorg that are formed in the transition state, including both local and long-range effects, while (χH + χN) can be interpreted as the fractional burial of the side chain in the transition state.

Even in cases where a position is partially solvent exposed in the folded state or partially buried in the unfolded state, the parameter χH still provides clear information about the structure of the transition state. The fraction of the change in native-state stability that is due to changes in hydrophobic desolvation is given by

$$f_N^{\mathrm{hyd}} = \frac{G_{\mathrm{burial}}}{G_N} \tag{9}$$

where Gburial may be less than or equal to GH. Combining Equations 6 and 7, the fraction of GT due to hydrophobic burial is given by

$$f_T^{\mathrm{hyd}} = \frac{\chi_N G_{\mathrm{burial}} + \chi_H G_H}{\chi_N G_N + \chi_H G_H} \tag{10}$$

When χH > 0, nonspecific hydrophobic interactions play a proportionally larger role in the stability of the transition state than they do in the native state and fThyd > fNhyd, regardless of the solvent exposure of the position in the folded and unfolded states.
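The inequality above can be checked numerically. The sketch below assumes that the hydrophobic part of GT is χN·Gburial + χH·GH, which is what combining the two-parameter model with the burial/reorganization split of GN suggests; the function names and all input values are hypothetical.

```python
def native_hydrophobic_fraction(g_burial, g_reorg):
    """Fraction of G_N attributed to hydrophobic burial."""
    return g_burial / (g_burial + g_reorg)


def transition_hydrophobic_fraction(chi_n, chi_h, g_burial, g_reorg, g_h):
    """Fraction of G_T attributed to hydrophobic burial (assumed form, see lead-in)."""
    g_n = g_burial + g_reorg
    g_t = chi_n * g_n + chi_h * g_h
    return (chi_n * g_burial + chi_h * g_h) / g_t


# Hypothetical partially buried position: G_burial is smaller than G_H.
g_burial, g_reorg, g_h = 0.8, 1.2, 1.0

f_n = native_hydrophobic_fraction(g_burial, g_reorg)
f_t = transition_hydrophobic_fraction(0.4, 0.3, g_burial, g_reorg, g_h)
f_t_no_excess = transition_hydrophobic_fraction(0.4, 0.0, g_burial, g_reorg, g_h)

print(f"f_N = {f_n:.3f}, f_T = {f_t:.3f}, f_T with chi_H = 0: {f_t_no_excess:.3f}")
```

With χH set to zero the two fractions coincide, and any positive χH raises the transition-state fraction above the native-state fraction, provided all free energy terms are positive as in this example.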

Results

We have analyzed data for three proteins that fold with two-state kinetics and where at least one hydrophobic core position has been substituted with four or more different amino acids: the chymotrypsin inhibitor 2 (CI2) (one position) (Itzhaki et al. 1995), the N-terminal domain of ribosomal protein L9 (NTL9) (four positions) (Anil et al. 2005), and the SH3 domain from the Fyn tyrosine kinase (FYN) (six positions) (Northey et al. 2002a; de los Rios et al. 2005). The one position in CI2 examined (Ile30) does not participate in the main hydrophobic core; however, it is largely buried in the protein interior and contacts several core residues (Itzhaki et al. 1995); therefore we believe that data for this position are suitable for the two-parameter analysis. For each position, we fit the parameters χN and χH to the set of changes in native-state stability, GN, transition-state stability, GT, and amino acid water/octanol transfer free energy, GH, using Equation 6 and two-parameter linear regression through the origin. Values of χN, χH, and (χN + χH) are listed in Table 1, together with 90% confidence intervals computed using a Monte Carlo approach (Efron and Tibshirani 1986). When the 90% confidence interval does not include zero, the value of χN, χH, or (χN + χH) is considered to be statistically significant.
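The fitting step described above can be sketched in a few lines. This is a minimal, unweighted least-squares fit through the origin written in plain Python; the original calculations were performed in MATLAB, and the function name and the synthetic data here are illustrative assumptions rather than values from Table 1.

```python
def fit_through_origin(g_n, g_h, g_t):
    """Least-squares chi_N and chi_H for g_t ~ chi_N * g_n + chi_H * g_h (no intercept)."""
    s_nn = sum(n * n for n in g_n)
    s_hh = sum(h * h for h in g_h)
    s_nh = sum(n * h for n, h in zip(g_n, g_h))
    s_nt = sum(n * t for n, t in zip(g_n, g_t))
    s_ht = sum(h * t for h, t in zip(g_h, g_t))
    det = s_nn * s_hh - s_nh ** 2
    # det is zero when G_N and G_H are exactly proportional (or all zero);
    # in that case the two parameters cannot be separated.
    if det <= 1e-12 * s_nn * s_hh:
        raise ValueError("G_N and G_H are collinear; chi_N and chi_H are not separable")
    chi_n = (s_hh * s_nt - s_nh * s_ht) / det
    chi_h = (s_nn * s_ht - s_nh * s_nt) / det
    return chi_n, chi_h


# Synthetic data generated with chi_N = 0.4 and chi_H = 0.3 (illustrative only).
g_n = [1.0, 2.0, 3.0, 1.5]
g_h = [0.5, 0.8, 2.0, 1.5]
g_t = [0.4 * n + 0.3 * h for n, h in zip(g_n, g_h)]

chi_n, chi_h = fit_through_origin(g_n, g_h, g_t)
print(f"chi_N = {chi_n:.3f}, chi_H = {chi_h:.3f}")
```

The collinearity check matters in practice: when GN and GH are nearly proportional at a position, the determinant approaches zero and the individual parameters become poorly determined even though their sum may still be well defined.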

Table 1. Two-parameter fits of folding kinetics. [Table provided as an image in the source; values not reproduced here.]

The motivation for this analysis was to extract information on side chain interactions at the rate-limiting step of folding, particularly in cases where it has not been possible to firmly establish whether specific, native-like interactions are present in the transition state. This is true for most positions used in this study. Pearson correlation coefficients between GT and GN, rTN, and between GT and GH, rTH, listed in Table 1, are both much larger than zero in most cases, and it is not possible to unequivocally select either Equation 4 or 5 as an appropriate model. For example, mutations of Ile28 in FYN produce rTN and rTH values of 0.89 and 0.83, respectively, and plots of GT versus GN and GH, shown in Figure 1A,B, are qualitatively similar. The GT values correlate to some extent with both GN and GH, although deviations from the linear fits are much greater than the uncertainties in GT. Significantly better agreement is obtained when χN and χH values are extracted from linear regression using Equation 6 and used to back-calculate GT, as shown in Figure 1C. Mutations of FYN Ala39 produce much lower values of the correlation coefficients rTN and rTH (0.76 and 0.58), and Φ values range from −0.06 for A39F to 0.86 for A39G. Plots of GT versus GN, shown in Figure 1D, and GT versus GH, shown in Figure 1E, are not linear. Even in this case, fits using Equation 6 are able to closely reproduce experimental GT values, as shown in Figure 1F. Notably, most of the Ala39 mutations were excluded from the original analysis, due to concerns that such disruptive, volume-increasing mutations could perturb the folding pathway (Northey et al. 2002a). The good agreement obtained using the two-parameter model suggests that all Ala39 mutants fold with similar mechanisms, passing through similar transition states. Overall, Equation 6 provides significant improvements in fit over Equations 4 and 5 for mutational data from CI2 Ile30, NTL9 Ile4, and FYN Phe4, Ile28, and Ala39.
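The ambiguity of the one-parameter correlations is easy to reproduce with synthetic numbers. In the sketch below, GT is generated from both GN and GH, yet it correlates strongly with each of them individually because GN and GH are themselves correlated. The data and the helper name `pearson` are hypothetical and serve only to illustrate the point.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic mutations at one position; G_T depends on both G_N and G_H.
g_n = [1.0, 2.0, 3.0, 1.5]
g_h = [0.5, 0.8, 2.0, 1.5]
g_t = [0.4 * n + 0.3 * h for n, h in zip(g_n, g_h)]

r_tn = pearson(g_t, g_n)
r_th = pearson(g_t, g_h)
print(f"r_TN = {r_tn:.2f}, r_TH = {r_th:.2f}")
```

Both coefficients come out high for these data, so neither one-parameter model could be rejected on the basis of correlation alone.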

Figure 1. Experimentally determined changes in transition-state free energies, GT exp, plotted as a function of changes in native-state stability, GN (A,D), side-chain hydrophobicity, GH (B,E), and back-calculated values computed using Equation 11, GTcalc (C,F) for mutations of Ile28 (A,B,C) and Ala39 (D,E,F) in the Fyn SH3 domain. Equations for the lines in A, B, D, and E were obtained by one-parameter linear regression through the origin. Lines in C and F have a slope of 1 and a y-intercept of 0. Error bars corresponding to experimental uncertainties are smaller than the symbols shown.

When the analysis is extended to the 11 positions considered in this study, a high level of agreement is obtained; considering all 59 mutations, the root-mean-square deviation (RMSD) between back-calculated and experimental GT values is 0.16 kcal/mol. In addition, we performed leave-one-out cross-validation (LOOCV) on a per-position basis to test the predictive ability of χN and χH. Data for each mutation were systematically removed to produce a validation data set, and two-parameter fitting was applied to the remaining n − 1 data points. Values of χN and χH thus obtained were used to predict the validation data set. The experimental and predicted validation data sets were in good agreement, with an overall correlation coefficient of 0.83 and an RMSD of 0.33 kcal/mol. This confirms that Equation 6 is applicable to a wide array of mutations at different hydrophobic core positions in a variety of proteins. More importantly, it shows that the environments of hydrophobic core side chains in the transition state can be well described by just two parameters: one that depends only on the degree of hydrophobic burial and another that depends on the formation of all other native-like interactions.
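The leave-one-out procedure described above can be sketched as follows. The code assumes an unweighted two-parameter fit through the origin and uses synthetic data with small, hand-picked perturbations; the function names and numbers are hypothetical and do not reproduce the published analysis.

```python
import math


def fit_through_origin(g_n, g_h, g_t):
    """Least-squares chi_N and chi_H for g_t ~ chi_N * g_n + chi_H * g_h (no intercept)."""
    s_nn = sum(n * n for n in g_n)
    s_hh = sum(h * h for h in g_h)
    s_nh = sum(n * h for n, h in zip(g_n, g_h))
    s_nt = sum(n * t for n, t in zip(g_n, g_t))
    s_ht = sum(h * t for h, t in zip(g_h, g_t))
    det = s_nn * s_hh - s_nh ** 2
    return (s_hh * s_nt - s_nh * s_ht) / det, (s_nn * s_ht - s_nh * s_nt) / det


def leave_one_out_predictions(g_n, g_h, g_t):
    """Predict each G_T from parameters fitted to the other n - 1 mutations."""
    predictions = []
    for left_out in range(len(g_t)):
        keep = [i for i in range(len(g_t)) if i != left_out]
        chi_n, chi_h = fit_through_origin(
            [g_n[i] for i in keep], [g_h[i] for i in keep], [g_t[i] for i in keep]
        )
        predictions.append(chi_n * g_n[left_out] + chi_h * g_h[left_out])
    return predictions


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Synthetic position with four mutations (illustrative only).
g_n = [1.0, 2.0, 3.0, 1.5]
g_h = [0.5, 0.8, 2.0, 1.5]
g_t_exact = [0.4 * n + 0.3 * h for n, h in zip(g_n, g_h)]
g_t_noisy = [t + e for t, e in zip(g_t_exact, [0.05, -0.03, 0.02, -0.04])]

rmsd_exact = rmsd(leave_one_out_predictions(g_n, g_h, g_t_exact), g_t_exact)
rmsd_noisy = rmsd(leave_one_out_predictions(g_n, g_h, g_t_noisy), g_t_noisy)
print(f"LOOCV RMSD: exact data {rmsd_exact:.2e}, perturbed data {rmsd_noisy:.3f}")
```

With only four or five mutations per position, each left-out fit rests on three or four points, so a cross-validated RMSD larger than the in-sample RMSD is to be expected.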

Notably, values of χH for all positions considered, with the exception of NTL9 Leu6 and FYN Val55, are significantly larger than zero. This implies that for essentially all positions considered here, the process of nonspecific hydrophobic collapse is more complete in the transition state than is the formation of native interactions. This result agrees with the nucleation–condensation model of protein folding in which the transition state is relatively compact even as native interactions are still in the process of being consolidated (Fersht 1999). In the cases where χN is not significantly greater than zero, data may be adequately fit by Equation 5, and contributions to transition-state stability are due primarily to nonspecific hydrophobic collapse. In contrast, when both χH and χN are significantly greater than zero, as is seen for five positions, the two-parameter model predicts that side chains participate in both specific native-like interactions and additional nonspecific hydrophobic collapse. Notably, this analysis shows that the interactions of a single side chain in the transition state often comprise a mixture of specific packing and nonspecific burial. Such mixed-mode interactions at the individual side chain level have not been previously considered in the interpretation of protein folding data.

The two-parameter analysis also provides an estimate of total side chain burial, via the sum (χH + χN). Three positions, NTL9 Val3 and FYN Ile28 and Ala39, are highly buried, with (χH + χN) = 0.78, 0.78, and 1.19, respectively. The burial of NTL9 Val3 is primarily due to nonspecific collapse, since χN is essentially zero for this position, while for FYN Ile28 and Ala39, χN = 0.38 and 0.61, respectively, implying that significant native-like interactions are made in the transition state. Note that the value of (χH + χN) = 1.19 obtained for Ala39 does not differ significantly from the maximum expected value of (χH + χN) = 1.0, since 1.0 lies within the 90% confidence region calculated for this parameter and reported in Table 1.

There are two hydrophobic core positions for which neither χN nor χH is statistically significant. In the case of FYN Val55, a detailed analysis has shown that hydrophobic and hydrophilic substitutions produce quite different effects on the folding rate through a mechanism that is not well understood (de los Rios et al. 2005). It is therefore expected that GT values would not correlate strongly with either GN or GH. For NTL9 Leu6, the sum (χH + χN) is significantly larger than zero, implying that this position is partially buried, although neither χN nor χH is statistically significant on its own. This anomalous behavior is largely due to multi-collinearity of native-state stability and side-chain hydrophobicity for this position. The correlation coefficient between GN and GH is 0.95 and the regression line passes nearly through the origin. Data for Leu6 agree with Equations 4, 5, and 6 to such an extent that it is not possible to discriminate specific from nonspecific interactions in this case, although it is clear from (χH + χN) that contacts are formed by Leu6 in the transition state (see supplemental material).

Discussion

We have analyzed protein folding data using a two-parameter model that relates changes in folding rates to both changes in native-state stability and changes in side-chain hydrophobicity. This method provides information on the structure of the transition state that is not available from single-factor analyses of protein folding data. Based on correlations between folding rates and side-chain hydrophobicity, it had been proposed that side chains at a number of hydrophobic core positions in NTL9 and FYN contribute to the stability of the transition state purely via the hydrophobic effect in a non-shape-specific manner (Northey et al. 2002a; Anil et al. 2005). However, for several of these positions (NTL9 Ile4, Leu6; FYN Phe4, Phe20) non-negligible correlations also exist between folding rates and native-state stability, which suggests that some native-like interactions could be present in the transition state as well. Similarly, several mutations at positions believed to be partially natively structured in the transition state (Itzhaki et al. 1995; Northey et al. 2002a) produce folding rates that correlate with side-chain hydrophobicity (CI2 Ile30; FYN Ile28, Ala39, Ile50), raising the possibility that nonspecific hydrophobic collapse could play a dominant role. The two-parameter model resolves these ambiguities by effectively factoring out the effects of changes in side-chain hydrophobicity, GH, and testing whether the dependence of GT on GN remains. In some cases, there is no significant correlation between GT and GN, once the dependence of GT on GH is taken into account. This provides a much stronger argument for the absence of shape-specific, native-like contacts in the transition state than simply a correlation between GT and GH. In other cases, GT values show a significant dependence on GN (χN > 0), even after factoring out the influence of GH, which clearly demonstrates that specific, native-like interactions are present in the transition state.

These results shed additional light on the folding mechanisms of the proteins studied here. For example, the good agreement between GT and GH values for NTL9 suggested that the transition state of this molecule is stabilized entirely by nonspecific hydrophobic collapse, with all other native-like interactions formed later in the folding process (Anil et al. 2005). The two-parameter fits employed here agree with this conclusion to some extent. Val3 and Val21 have significant χH values, indicative of nonspecific hydrophobic collapse, and nonsignificant values of χN, implying that side chains at these positions lack specific contacts in the transition state. However, χN is significantly greater than zero for Ile4, which suggests that it participates in specific, native-like contacts at the rate-limiting step of folding. These interactions could potentially involve Leu6. This position is immediately adjacent in the native structure, as shown in Figure 2A, and may or may not be involved in specific interactions, as discussed above. Thus at the rate-limiting step of folding, the NTL9 hydrophobic core contains a few specific interactions involving Ile4 that are stabilized by extensive nonspecific collapse.

Figure 2. Structures of (A) the N-terminal domain of ribosomal protein L9 (NTL9: PDB accession code 1DIV) (Hoffman et al. 1994) and (B) the SH3 domain from the Fyn tyrosine kinase (FYN: PDB accession code 1SHF) (Noble et al. 1993) generated using the MOLMOL software package (Koradi et al. 1996). Side chains participating in specific native-like interactions in the transition state (χN > 0) are colored orange. Side chains participating purely in nonspecific hydrophobic collapse (χH > 0, χN ≈ 0) are colored blue.

In the case of FYN, the original interpretation of the kinetic data suggested that the transition state contains a core folding nucleus comprising positions Ile28, Ala39, and Ile50, surrounded by a loosely packed region encompassing Phe4, Ala6, Leu18, Phe20, and Phe26 (Northey et al. 2002a). The two-parameter analysis is in partial agreement with this model. Ile28 and Ala39 have the largest χN values of all positions considered here, reflecting their participation in native-like contacts within the FYN transition state. However, in contrast with the previous model, we have shown that the formation of native-like contacts lags behind hydrophobic burial for side chains at these positions. Within the loosely packed region, only positions 4 and 20 have data available for four or more mutations. χN is not significant for Phe20, indicating a near absence of specific contacts, while χH is significantly larger than zero. Together, these parameters are consistent with purely nonspecific hydrophobic collapse. Data for Phe4 produce a value of χN = 0.12 that is much lower than those of Ile28 and Ala39 yet significantly larger than zero, implying that the side chain at this position is involved in weak, native-like interactions in the transition state. Ile50 was originally included in the core folding nucleus due to the large changes in folding rates caused by mutations at this position. However GT values show a much stronger dependence on GH (rTH = 0.90) than on GN (rTN = 0.39), and χH is significantly larger than zero while χN is not. This suggests that Ile50 participates in nonspecific hydrophobic collapse but does not form native-like interactions in the transition state. Estimates of side chain burial based on (χN + χH) values show that although the interactions of Ile50 are nonspecific, it is more deeply buried at the rate-limiting step of folding than is Phe4, whose weak interactions have more native character. 
Thus, hydrophobic core interactions in the FYN transition state involve asynchronous formation of native contacts and side chain burial. The adjacent positions Ala39 and Ile28, shown in Figure 2B, are both deeply buried and involved in specific interactions. Ile50 is significantly buried but does not participate in specific interactions that impose strong steric constraints, whereas Phe4 is more solvent exposed but forms weak, native-like contacts.

These results have implications for the design and analysis of protein-folding experiments and highlight the detailed information that can be obtained by performing multiple substitutions at individual positions (Mok et al. 2001; Northey et al. 2002a,b; Anil et al. 2005; de los Rios et al. 2005). Disruptive mutations are typically avoided in Φ value analyses because of concerns that these changes may alter the folding pathway (Fersht 1999). We have found that just two parameters can account for the effects of multiple mutations on protein folding rates. Many of the mutations considered in the analysis are disruptive, since they increase side chain volume or introduce polar groups at buried positions. The agreement of these folding data with the two-parameter fits suggests that transition-state structures are quite robust with respect to amino acid sequence changes; therefore a wide range of substitutions can be used in folding studies. In fact, disruptive mutations of the hydrophobicity- and volume-increasing type can yield useful information on the interactions of core side chains in the transition state, since they will tend to decrease folding rates at positions that are specifically packed and increase folding rates at positions that are nonspecifically buried. If only a single mutation is to be made at a hydrophobic core position, then one that increases side chain volume may therefore be preferable, due to the potential for discriminating specific from nonspecific interactions. Care must be taken when interpreting Φ values obtained from such disruptive mutations, since they are more qualitative indicators of the ability of the transition state as a whole to accommodate a different side chain than they are quantitative measures of the number of side chain contacts. Nevertheless, they can provide useful information on whether a side chain is specifically packed in the protein folding transition state.

In conclusion, we have shown that folding kinetic data for multiple mutations at individual hydrophobic core positions can be well accounted for by two parameters describing the dependence of transition-state stability on the destabilization of the native state and the difference in water/octanol transfer free energies between the wild-type and mutant side chains. These parameters can be interpreted in terms of the formation of native-like structure and additional hydrophobic collapse on a per-site basis. The analysis shows that the process of nonspecific burial is generally further progressed than the formation of specific interactions in the transition state, although some native-like contacts are present at the rate-limiting step of folding in all three proteins considered here. We find that the folding transition states examined in this study are fairly tolerant of disruptive mutations, and argue that such substitutions can provide important information regarding side chain interactions in the transition state. The two-parameter fits reported here represent a general approach for interpreting folding data for multiple mutations at hydrophobic core positions and provide more detailed information on the structure of the transition state than do existing methods.

Materials and Methods

We analyzed protein folding data for three proteins in which four or more mutations were made at individual hydrophobic core positions: chymotrypsin inhibitor 2 (CI2) [I30V,A,G,T], the N-terminal domain of ribosomal protein L9 from Bacillus stearothermophilus (NTL9) [V3I,L,Tle,Abu,A; I4Nva,V,Abu,A; L6I,Nva,V,Abu,A; and V21I,L,Abu,A], and the SH3 domain from the Fyn tyrosine kinase (FYN) [F4S,A,V,L; F20S,A,V,L; I28S,A,V,L,F; A39S,V,L,F,G; I50A,V,L,F; V55F,L,W,M,A,H,T,G,S]. In the case of NTL9, both naturally and non-naturally occurring amino acids were employed. The abbreviations Tle, Abu, and Nva refer to tert-leucine, amino-butyric acid, and norvaline, respectively. For CI2, GT and GN were taken from the tabulated folding data of Itzhaki et al. (1995). NTL9 data were taken from Anil et al. (2005). FYN Val55 data were taken from de los Rios et al. (2005), while data for all other FYN positions were taken from Northey et al. (2002a). Only positions with greater than 75% burial, as calculated using MOLMOL (Koradi et al. 1996), were considered. For FYN and CI2, GH values were calculated using the water/octanol transfer free energy scale of Fauchere and Pliska (1983). The non-naturally occurring amino acids employed in some NTL9 variants are not included in this scale; therefore, all GH values for this protein were obtained from calculations based on N-acetyl amino acid amides using software from ChemAxon.

The change in transition-state free energy for a substitution to amino acid i was equated to the change in native-state stability and change in hydrophobicity produced by the mutation according to

$$\left(G_T^{\mathrm{calc}}\right)_i = \chi_N \left(G_N\right)_i + \chi_H \left(G_H\right)_i \tag{11}$$

where

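$$\left(G_N\right)_i = RT \ln\left(\frac{k_F^{\mathrm{wt}} \left(k_U^{\mathrm{mut}}\right)_i}{\left(k_F^{\mathrm{mut}}\right)_i k_U^{\mathrm{wt}}}\right) \tag{12}$$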

(kFmut)i and (kUmut)i are the folding and unfolding rates of the mutant protein, kFwt and kUwt are the folding and unfolding rates of the wild-type protein, and (GH)i is the octanol to water transfer free energy of the wild-type side chain subtracted from the octanol to water transfer free energy of the substituted side chain i. The parameters χN and χH were obtained for each hydrophobic core position by linear regression through the origin (Walpole et al. 1998),

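$$\chi_N = \frac{\sum_i (G_H)_i^2 \sum_i (G_N)_i (G_T^{\mathrm{exp}})_i - \sum_i (G_N)_i (G_H)_i \sum_i (G_H)_i (G_T^{\mathrm{exp}})_i}{\sum_i (G_N)_i^2 \sum_i (G_H)_i^2 - \left[\sum_i (G_N)_i (G_H)_i\right]^2} \tag{13}$$

$$\chi_H = \frac{\sum_i (G_N)_i^2 \sum_i (G_H)_i (G_T^{\mathrm{exp}})_i - \sum_i (G_N)_i (G_H)_i \sum_i (G_N)_i (G_T^{\mathrm{exp}})_i}{\sum_i (G_N)_i^2 \sum_i (G_H)_i^2 - \left[\sum_i (G_N)_i (G_H)_i\right]^2} \tag{14}$$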

where the subscript i runs from 1 to n, the number of mutations made at a given position, and (GT exp)i is the experimentally observed change in transition-state free energy produced by mutation i, calculated according to Equation 3. Confidence intervals for χN and χH were estimated from Monte Carlo simulations in which linear regression through the origin was repeated on samples randomized according to the uncertainty in GT exp values (Efron and Tibshirani 1986). One hundred thousand repetitions were performed for each position, where for iteration j,

$$\left(G_T^{\mathrm{exp}}\right)_{j,i} = \left(G_T^{\mathrm{exp}}\right)_i + \varepsilon_{j,i} \tag{15}$$

and εj,i is drawn randomly from a normal distribution with a mean of zero and a standard deviation equal to the uncertainty in (GT exp)i. The 90% confidence region was taken to be the interval with 5% of iterations lying above the upper bound and 5% of iterations lying below the lower bound. Uncertainties in (GT exp)i values, σi, were set equal to the reported experimental error or calculated from residuals to the best fit regression line according to

[Equation: expression for σi in terms of the residuals to the best-fit regression line; equation image not available in this copy]

selecting the larger of the two estimates for each mutation. All calculations were performed using the MATLAB software package.
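The regression and Monte Carlo procedure described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the authors' MATLAB code. It assumes an unweighted least-squares fit with no intercept, takes the uncertainties σi as given inputs rather than deriving them from residuals, and uses hypothetical function names and synthetic input values that do not come from any of the proteins analyzed.

```python
import random


def fit_through_origin(g_n, g_h, g_t):
    """Least-squares fit of g_t = chi_n * g_n + chi_h * g_h (no intercept),
    solved from the 2x2 normal equations."""
    s_nn = sum(x * x for x in g_n)
    s_hh = sum(x * x for x in g_h)
    s_nh = sum(x * y for x, y in zip(g_n, g_h))
    s_nt = sum(x * y for x, y in zip(g_n, g_t))
    s_ht = sum(x * y for x, y in zip(g_h, g_t))
    det = s_nn * s_hh - s_nh ** 2
    if abs(det) < 1e-12:
        raise ValueError("G_N and G_H are collinear; the fit is undetermined")
    chi_n = (s_hh * s_nt - s_nh * s_ht) / det
    chi_h = (s_nn * s_ht - s_nh * s_nt) / det
    return chi_n, chi_h


def monte_carlo_ci(g_n, g_h, g_t, sigma, n_iter=100_000, seed=0):
    """90% intervals for chi_n and chi_h: refit after adding zero-mean
    Gaussian noise with standard deviation sigma[i] to each g_t[i]."""
    rng = random.Random(seed)
    fits = []
    for _ in range(n_iter):
        noisy = [t + rng.gauss(0.0, s) for t, s in zip(g_t, sigma)]
        fits.append(fit_through_origin(g_n, g_h, noisy))

    def interval(values):
        ordered = sorted(values)
        # 5% of iterations fall below the lower bound, 5% above the upper.
        lower = ordered[int(0.05 * len(ordered))]
        upper = ordered[int(0.95 * len(ordered)) - 1]
        return lower, upper

    return interval([f[0] for f in fits]), interval([f[1] for f in fits])


if __name__ == "__main__":
    # Synthetic values (kcal/mol) for one hypothetical core position.
    g_n = [1.0, 2.0, 3.0, 4.0]
    g_h = [0.5, 1.5, 1.0, 2.5]
    g_t = [0.6, 1.5, 1.5, 2.7]
    sigma = [0.1, 0.1, 0.1, 0.1]
    print("chi_N, chi_H:", fit_through_origin(g_n, g_h, g_t))
    print("90% intervals:", monte_carlo_ci(g_n, g_h, g_t, sigma, n_iter=10_000))
```

The collinearity check matters in practice: when GN and GH vary almost in proportion across the substitutions at a position, the two parameters cannot be separated and the simulated intervals become very wide.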

Electronic supplemental material

Plots of GT exp as a function of GN, GH, and GTcalc for each of the 11 hydrophobic core positions analyzed are provided in the supplemental material.

Acknowledgments

We thank L.E. Kay, J.D. Forman-Kay, and A.R. Davidson for critical reading of this manuscript. This work was supported by grants from the Canada Foundation for Innovation, the Natural Sciences and Engineering Research Council of Canada, and Le Fonds Québécois de la Recherche sur la Nature et les Technologies.

Footnotes

Supplemental material: see www.proteinscience.org

Reprint requests to: Anthony Mittermaier, Department of Chemistry, McGill University, 801 Sherbrooke Street West, Room 322, Montreal H3A 2K6, Canada; e-mail: anthony.mittermaier@mcgill.ca; fax: (514) 398-3797.

Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.073105408.

References

  1. Anil, B., Sato, S., Cho, J.-H., Raleigh, D.P. Fine structure analysis of a protein folding transition state; distinguishing between hydrophobic stabilization and specific packing. J. Mol. Biol. 2005;354:693–705. doi: 10.1016/j.jmb.2005.08.054.
  2. de los Rios, M.A., Daneshi, M., Plaxco, K.W. Experimental investigation of the frequency and substitution dependence of negative φ-values in two-state proteins. Biochemistry. 2005;44:12160–12167. doi: 10.1021/bi0505621.
  3. Efron, B., Tibshirani, R. Bootstrap methods for standard errors, confidence intervals and other measures of statistical accuracy. Stat. Sci. 1986;1:54–77.
  4. Eriksson, A.E., Baase, W.A., Zhang, X.-J., Heinz, D.W., Blaber, M., Baldwin, E.P., Matthews, B.W. Response of a protein structure to cavity-creating mutations and its relation to the hydrophobic effect. Science. 1992;255:178–183. doi: 10.1126/science.1553543.
  5. Fauchere, J.-L., Pliska, V. Hydrophobic parameters π of amino acid side chains from the partitioning of N-acetyl-amino-acid amides. Eur. J. Med. Chem. 1983;18:369–375.
  6. Fersht, A. Structure and mechanism in protein science. W.H. Freeman and Company; New York: 1999.
  7. Fersht, A.R., Itzhaki, L.S., Elmasry, N., Matthews, J.M., Otzen, D.E. Single versus parallel pathways of protein-folding and fractional formation of structure in the transition-state. Proc. Natl. Acad. Sci. 1994;91:10426–10429. doi: 10.1073/pnas.91.22.10426.
  8. Hansch, C., Leo, A. Substituent constants for correlation analysis in chemistry and biology. Wiley; New York: 1979.
  9. Hoffman, D.W., Davies, C., Gerchman, S.E., Kycia, J.H., Porter, S.J., White, S.W., Ramakrishnan, V. Crystal structure of prokaryotic ribosomal protein L9: A bi-lobed RNA-binding protein. EMBO J. 1994;13:205–212. doi: 10.1002/j.1460-2075.1994.tb06250.x.
  10. Itzhaki, L.S., Otzen, D.E., Fersht, A.R. The structure of the transition state for folding of chymotrypsin inhibitor 2 analysed by protein engineering methods: Evidence for a nucleation-condensation mechanism for protein folding. J. Mol. Biol. 1995;254:260–288. doi: 10.1006/jmbi.1995.0616.
  11. Koradi, R., Billeter, M., Wüthrich, K. MOLMOL: A program for display and analysis of macromolecular structures. J. Mol. Graph. 1996;14:51–55. doi: 10.1016/0263-7855(96)00009-4.
  12. Ludvigsen, S., Shen, H., Kjaer, M., Madsen, J.C., Poulsen, F.M. Refinement of the three-dimensional solution structure of barley serine proteinase inhibitor 2 and comparison with the structures in crystals. J. Mol. Biol. 1991;222:621–635. doi: 10.1016/0022-2836(91)90500-6.
  13. Matouschek, A., Kellis, J.T., Jr., Serrano, L., Fersht, A.R. Mapping the transition state and pathway of protein folding by protein engineering. Nature. 1989;340:122–126. doi: 10.1038/340122a0.
  14. Mok, Y.K., Elisseeva, E.L., Davidson, A.R., Forman-Kay, J.D. Dramatic stabilization of an SH3 domain by a single substitution: Roles of the folded and unfolded states. J. Mol. Biol. 2001;307:913–928. doi: 10.1006/jmbi.2001.4521.
  15. Noble, M.E., Musacchio, A., Saraste, M., Courtneidge, S.A., Wierenga, R.K. Crystal structure of the SH3 domain in human Fyn; comparison of the three-dimensional structures of SH3 domains in tyrosine kinases and spectrin. EMBO J. 1993;12:2617–2624. doi: 10.2210/pdb1shf/pdb.
  16. Northey, J.G.B., Di Nardo, A.A., Davidson, A.R. Hydrophobic core packing in the SH3 domain folding transition state. Nat. Struct. Biol. 2002a;9:126–130. doi: 10.1038/nsb748.
  17. Northey, J.G.B., Maxwell, K.L., Davidson, A.R. Protein folding kinetics beyond the Φ value: Using multiple amino acid substitutions to investigate the structure of the SH3 domain folding transition state. J. Mol. Biol. 2002b;320:389–402. doi: 10.1016/S0022-2836(02)00445-X.
  18. Richards, F., Lim, W. An analysis of packing in the protein folding problem. Q. Rev. Biophys. 1994;26:423–498. doi: 10.1017/s0033583500002845.
  19. Walpole, R.E., Myers, R.H., Myers, S.L. Probability and statistics for engineers and scientists. 6th ed. Prentice Hall; Upper Saddle River, NJ: 1998.
  20. Weikl, T.R., Dill, K.A. Transition-states in protein folding kinetics: The structural interpretation of Φ values. J. Mol. Biol. 2007;365:1578–1586. doi: 10.1016/j.jmb.2006.10.082.

