Abstract
Here we present a method for determining the inference of non-native conformations in the folding of a small domain, α-spectrin Src homology 3 domain. This method relies on the preservation of all native interactions after Tyr/Phe exchanges in solvent-exposed, contact-free positions. Minor changes in solvent exposure and free energy of the denatured ensemble are in agreement with the reverse hydrophobic effect, as the Tyr/Phe mutations slightly change the polypeptide hydrophilic/hydrophobic balance. Interestingly, more important Gibbs energy variations are observed in the transition state ensemble (TSE). Considering the small changes induced by the H/OH replacements, the observed energy variations in the TSE are rather notable, but of a magnitude that would remain undetected under regular mutations that alter the folded structure free energy. Hydrophobic residues outside of the folding nucleus contribute to the stability of the TSE in an unspecific nonlinear manner, producing a significant acceleration of both unfolding and refolding rates, with little effect on stability. These results suggest that sectors of the protein transiently reside in non-native areas of the landscape during folding, with implications in the reading of φ values from protein engineering experiments. Contrary to previous proposals, the principle that emerges is that non-native contacts, or conformations, could be beneficial in evolution and design of some fast folding proteins.
Studies of in vitro folding, especially protein engineering experiments (1), performed on several proteins have provided an increasing amount of kinetic and thermodynamic information that is now becoming ready for compiling and systematic analysis (2, 3). Experimentally obtained φ‡-U (where ‡ is TSE, U is unfolded) values (1), which summarize the mutagenesis effects on protein folding, are used to improve simple potentials that attempt to model folding landscapes. Because of the computational difficulty in contemplating non-native interactions and other effects, the physical parameters determined by simple models could be biased until realistic contributions of the many factors affecting apparent φ‡-U values are considered. A large proportion of the φ‡-U values lie in the range 0 < φ‡-U < 1, expected for a partially condensed native structure (2). This finding, together with the relative success of simple structure-based models (4–7), suggests that there is a bias toward native interactions in the folding reaction. Non-native interactions are, however, detected in some representations of transition states (8, 9). Experimentally, non-native contacts could be identified by φ‡-U values negative or larger than unity. Such noncanonical φ‡-U values are obtained in a non-negligible fraction of examined positions: 18% of 192 mutations for seven different proteins (2). Also non-native contacts and conformations could be masked under φ‡-U values in the range from 0 to 1, but their contribution is extremely difficult to establish. Additionally, it recently has been illustrated that negative φ‡-U values also may arise from multiple transition states (10). Considering that φ‡-U values are one of the most valuable experimental parameters to be used in testing, tuning, and improvement of potential functions of theoretical folding models, it seems crucial to discern to what extent observed φ‡-U values could be the result of phenomena different from the progressive acquisition of the native form.
Here we introduce a strategy to assess the effect of non-native interactions in the folding of a protein. To do so, it is particularly important to avoid the disruption, or creation, of interactions in the native conformation that could affect the denatured-native energy gap, thus altering the folding driving force. There are some previous attempts to experimentally explore the consequences of the induction of non-native secondary structure propensities in certain regions of proteins (11, 12). Results obtained in those examples are, however, obscured by the fact that pivotal native interactions are simultaneously eliminated, thus altering folding energetics by several means. The impact of the mutations in the fully folded state could be minimized if they eliminate highly exposed polar groups that hold no contact with any other protein moiety. Protein groups that are equally exposed to the solvent in the native, denatured, and partially folded species, relevant for the folding reaction, should neither contribute to nor detract from stability of native state or transition state ensemble (TSE), because the change in accessible surface area (ΔASAF-U and ΔASA‡-U, where F is folded, U is unfolded, and ‡ is TSE) is zero. On the other hand, some nonpolar side chains may be less exposed to the solvent in the denatured than in the native state. In this case, the protein would be destabilized because the ΔASAF-U is negative (reverse hydrophobic effect, ref. 13). Similarly, if the apolar side chain is more buried in the TSE than in the unfolded state (ΔΔASA‡-U > 0), the refolding rate should be faster.
We have chosen to make Tyr/Phe mutations in solvent-exposed positions of α-spectrin Src homology 3 (SH3) domain. The OH/H replacement occurs in a location remote from the residue backbone, so that shifts in local structural tendencies at the mutation sites are not expected, as they would be for Ala/Ser mutations. Thus, the methodology explored here preserves all of the native contacts while introducing small hydrophilic/hydrophobic switches.
Materials and Methods
Protein Mutagenesis and Purification.
Mutants were obtained by PCR (14) and expressed in Escherichia coli, BL21-DE3 strain by using pBAT-4 plasmid (15). Proteins were purified as described in the Methods published as supporting information on the PNAS web site, www.pnas.org, and their concentration was determined by using the method of Gill and von Hippel (16).
Kinetic Measurements.
Folding and unfolding kinetics were followed in a Biologic (Grenoble, France) stopped-flow machine by fluorescence emission selected with a 305-nm cut-off filter upon excitation at 290 nm. The buffer used was 50 mM sodium phosphate, pH 7.0. The cell chamber and the syringes were kept at 298 K. The kinetic phases were fitted to a mono-exponential function by means of algorithms provided by Biologic, from which kobs, the apparent rate constant at a given concentration of denaturant, was calculated. The cis-trans Pro isomerization slow refolding phase is not considered in this analysis.
Determination of the Kinetic and Thermodynamic Parameters: Chevron Plots.
The constant kobs versus [urea] was then fitted to a two-state kinetic model by the following equation, which takes into account deviations from linearity at high [urea] (17).
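$$\ln k_{\mathrm{obs}} = \ln\left[k_{\ddagger\text{-}U}\,\exp\left(m_{\ddagger\text{-}U}\,[\mathrm{urea}]\right) + k_{\ddagger\text{-}F}\,\exp\left(m_{\ddagger\text{-}F}\,[\mathrm{urea}] - 0.014\,[\mathrm{urea}]^{2}\right)\right] \tag{1}$$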
where k‡-U and k‡-F are the refolding and unfolding rate constants in water, respectively. m‡-U and m‡-F are the slopes of ln k versus [urea] in the refolding and unfolding reactions, respectively. The 0.014 M−2 prefactor accounts for the curvature found for a large number of unstable SH3 mutants (18). ΔΔGF-U (the destabilization energy induced by the mutation, where ΔG is the Gibbs energy change) can be calculated and dissected into its components, ΔΔG‡-U and ΔΔG‡-F, from:
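$$\Delta G_{F\text{-}U} = -RT\,\ln\left(\frac{k_{\ddagger\text{-}U}}{k_{\ddagger\text{-}F}}\right) \tag{2}$$

$$\Delta\Delta G_{\ddagger\text{-}U} = -RT\,\ln\left(\frac{k_{\ddagger\text{-}U}^{\mathrm{mut}}}{k_{\ddagger\text{-}U}^{\mathrm{ref}}}\right) \tag{3}$$

$$\Delta\Delta G_{\ddagger\text{-}F} = -RT\,\ln\left(\frac{k_{\ddagger\text{-}F}^{\mathrm{mut}}}{k_{\ddagger\text{-}F}^{\mathrm{ref}}}\right) \tag{4}$$

$$\Delta\Delta G_{F\text{-}U} = \Delta\Delta G_{\ddagger\text{-}U} - \Delta\Delta G_{\ddagger\text{-}F} \tag{5}$$

where the superscripts mut and ref denote the mutant and the reference protein, respectively.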
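The two-state chevron model of Eq. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the fitting routine used in the study: it assumes the usual form in which kobs is the sum of a refolding and an unfolding term, each exponential in [urea], with the 0.014 M−2 curvature term subtracted in the unfolding exponent. The sign of that term and the function name `ln_kobs` are assumptions; the parameters are the wt values from Table 1.

```python
import math

CURVATURE = 0.014  # M^-2, quadratic term quoted in the text


def ln_kobs(urea, k_u, m_u, k_f, m_f):
    """Natural log of the observed rate constant at a given [urea] (M).

    k_u, m_u: refolding rate in water (s^-1) and its slope (M^-1, negative).
    k_f, m_f: unfolding rate in water (s^-1) and its slope (M^-1, positive).
    ASSUMPTION: the curvature term enters the unfolding limb with a minus sign.
    """
    refolding = k_u * math.exp(m_u * urea)
    unfolding = k_f * math.exp(m_f * urea - CURVATURE * urea ** 2)
    return math.log(refolding + unfolding)


# Wild-type kinetic parameters copied from Table 1
wt = dict(k_u=3.94, m_u=-0.87, k_f=0.0045, m_f=0.47)

for urea in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(f"[urea] = {urea:.1f} M   ln k_obs = {ln_kobs(urea, **wt):6.2f}")
```

With these parameters the curve falls along the refolding limb, passes through a minimum, and rises again along the unfolding limb, which is the chevron shape fitted in Figs. 2 and 3.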
Results
Mutational Strategy.
Tyr/Phe substitutions were made on the α-spectrin SH3 domain. This 62-residue polypeptide folds into an orthogonal β-sandwich (19) and follows a two-state transition at low ionic strength (20). Local propensities (21) and the denatured state under near-native conditions (22, 23), as well as the folding nucleus (24), have been previously characterized. The α-spectrin SH3 domain contains three Tyr residues (Y13, Y15, and Y57, with side-chain hydroxyl group solvent accessibility of 99%, 15%, and 98%, respectively) and a Phe residue (F52 with a solvent accessibility of 72% for its Cζ atom). Thus, Y13, F52, and Y57 seem to be appropriate targets for the Tyr/Phe mutational screening. To increase the number of markers, we also selected, by visual inspection, other positions occupied by hydrophilic, solvent-exposed residues, amenable to Tyr/Phe mutations. K18, K27, and K59 were selected (Cɛ atoms show an accessibility of 87%, 97%, and 81%, respectively). A model of the protein with Phe residues in the six indicated positions is shown in Fig. 1. No bumps are introduced by Lys/Phe mutations and the locations and orientations of the Phe ring appear to be appropriate, in principle, for convenient solvation of the hydroxyl group of tyrosines occupying these sites. The selected residues probe different regions of the protein, in terms of secondary structure and overall spatial distribution. Previous studies have shown that key residues in the folding of SH3 domains are concentrated mainly in the region corresponding to the distal hairpin (18, 26). Of the residues considered here, only F52 shows a significantly high φ‡-U value, whereas positions 13, 18, 27, 57, and 59 do not seem to contribute substantially to the stability of the “folding nucleus.”
Figure 1.

Crystallographic structure of α-spectrin SH3 domain (1SHG) in which phenylalanine side chains have been modeled in place of Tyr-13, Tyr-57, Lys-18, Lys-27, and Lys-59 (whatif program, ref. 25). α-Spectrin SH3 domain contains three tyrosines (Y13, Y15, and Y57) and one phenylalanine (F52). Y15 was not considered because its OH group is involved in hydrogen bonding with E22 side chain. Position K27 was discarded because the mutants with three Lys to Tyr/Phe substitutions were too unstable. F52 was also not mutated in the last set of mutants because OH of Tyr occupying this position was able to participate in a water-mediated salt bridge with R21 and N35 side chains. The final analysis was limited to positions shown in yellow: Y13, K18, Y57, and K59 in wild-type sequence. Figure was prepared with the program WEBLAB VIEWERLITE 3.5 (Molecular Simulations, Waltham, MA).
After some preliminary mutant analysis, only four of the six potential positions (Y13, K18, Y57, and K59) were selected for the final survey of single and multiple Tyr/Phe screening. The other two residues, K27 and F52, were discarded. Fig. 2 Top and Table 1 illustrate the influence of single, double, and triple Lys/Tyr-Phe substitutions. A single Lys/Tyr-Phe mutation destabilizes the protein ≈0.8 kcal⋅mol−1 at position 59, ≈1.7 kcal⋅mol−1 at position 18, and ≈2.7 kcal⋅mol−1 at position 27. In the following steps, Lys was kept at position 27 to work in a convenient range of stabilities. The final group of mutants always contained Phe at position 52, because the imposed prerequisite to select contact-free OH groups cannot be satisfied at this site. Mutation of Phe-52 into Tyr provokes a stabilization of 0.4 kcal⋅mol−1 (wt_K18Y+F52Y+K59Y versus wt_K18Y+K59Y and wt_F52Y versus wt, Table 1 and Fig. 2 Middle) that can be explained by changes in the folded state. In the crystal structure of wt_F52Y the hydroxyl group of Tyr-52 is involved in water-mediated intramolecular interactions that are obviously absent in wt (see additional Methods and Fig. 5, which are published as supporting information on the PNAS web site).
Figure 2.
Urea concentration dependence of the natural logarithm of the rate constants for refolding and unfolding at 298 K in 50 mM sodium phosphate, pH 7.0, monitored by fluorescence. (Top) Wild-type parent SH3 domain (wt, ●) and mutants yfg (wt_K59Y, ▵), yfp (wt_K59Y/K18Y, ○), and yfc (wt_K59Y/K18Y/K27Y/F52Y, ×). As expected from the previous analysis of the α-spectrin SH3 TSE (18), the Lys/Tyr mutations, at the three positions, affect mainly the unfolding rate. This finding confirms the limited participation of the selected positions in the folding nucleus. (Middle) wt (▵), yfa (wt_F52Y, ●), yfp (wt_K18Y/K59Y, ×), and yfe (wt_K18Y/K59Y/F52Y, ○). The φ‡-U value for F52Y is ≈0.65, similar to 0.58 ± 0.2 obtained for a F52A mutation (24). (Bottom) Single Tyr-Phe mutations over the parent yfp (wt_K18Y/K59Y) mutant (●), yfh (yfp_Y13F, ▵), yfs (yfp_Y18F, ○), yfi (yfp_Y57F, ×), and yft (yfp_Y59F, ◊). Legends inside the figure indicate the sequence at positions 13, 18, 27, 52, 57, and 59. Continuous lines represent best fits to Eq. 1.
Table 1.
Kinetic parameters, folding free energies, ΔGF-U, and changes upon mutation relative to wild-type SH3 domain, ΔΔGF-U, of most relevant mutants, at 298 K, pH 7.0 in 50 mM sodium phosphate
| Protein | Mutations | Sequence (a) | k‡-F, s−1 | m‡-F, M−1 | k‡-U, s−1 | m‡-U, M−1 | ΔGF-U, kcal⋅mol−1 | ΔΔGF-U, kcal⋅mol−1 | mF-U (b), kcal⋅mol−1⋅M−1 |
|---|---|---|---|---|---|---|---|---|---|
| yfa (3Y:0F) | wt_F52Y | YKKYYK | 0.0037 ± 0.0005 | 0.47 ± 0.01 | 6.8 ± 0.1 | −0.89 ± 0.007 | −4.45 | −0.44 | 0.81 |
| wt (2Y:1F) | wt | YKKFYK | 0.0045 ± 0.0005 | 0.47 ± 0.01 | 3.94 ± 0.09 | −0.87 ± 0.01 | −4.01 | 0 | 0.79 |
| yfb (0Y:2F) | wt_Y13F+Y57F | FKKFFK | 0.0097 ± 0.0006 | 0.45 ± 0.01 | 5.49 ± 0.07 | −0.86 ± 0.006 | −3.76 | 0.25 | 0.77 |
| yfg (3Y:1F) | wt_K59Y | YKKFYY | 0.015 ± 0.0007 | 0.428 ± 0.007 | 3.16 ± 0.04 | −0.85 ± 0.007 | −3.17 | 0.84 | 0.76 |
| yfk (2Y:2F) | wt_K59F | YKKFYF | 0.018 ± 0.0009 | 0.417 ± 0.006 | 3.31 ± 0.04 | −0.86 ± 0.007 | −3.09 | 0.92 | 0.76 |
| yfj (2Y:2F) | wt_Y57F+K59Y | YKKFFY | 0.021 ± 0.001 | 0.426 ± 0.006 | 3.58 ± 0.05 | −0.83 ± 0.007 | −3.04 | 0.97 | 0.75 |
| yfe (5Y:0F) | wt_K18Y+F52Y+K59Y | YYKYYY | 0.107 ± 0.003 | 0.375 ± 0.004 | 2.67 ± 0.04 | −0.91 ± 0.01 | −1.90 | 2.11 | 0.76 |
| yfp (4Y:1F) | wt_K18Y+K59Y | YYKFYY | 0.139 ± 0.004 | 0.384 ± 0.004 | 1.76 ± 0.03 | −0.82 ± 0.02 | −1.50 | 2.51 | 0.72 |
| yfh (3Y:2F) | yfp_Y13F | FYKFYY | 0.224 ± 0.005 | 0.364 ± 0.003 | 2.72 ± 0.03 | −0.87 ± 0.01 | −1.48 | 2.53 | 0.73 |
| yfs (3Y:2F) | yfp_Y18F | YFKFYY | 0.116 ± 0.003 | 0.397 ± 0.004 | 1.98 ± 0.03 | −0.83 ± 0.01 | −1.68 | 2.33 | 0.73 |
| yfi (3Y:2F) | yfp_Y57F | YYKFFY | 0.200 ± 0.007 | 0.366 ± 0.005 | 2.26 ± 0.05 | −0.90 ± 0.02 | −1.44 | 2.57 | 0.75 |
| yft (3Y:2F) | yfp_Y59F | YYKFYF | 0.150 ± 0.006 | 0.388 ± 0.006 | 1.89 ± 0.04 | −0.82 ± 0.02 | −1.50 | 2.51 | 0.72 |
| yfr (2Y:3F) | yfp_Y18/59F | YFKFYF | 0.159 ± 0.005 | 0.386 ± 0.004 | 2.47 ± 0.04 | −0.90 ± 0.02 | −1.62 | 2.39 | 0.77 |
| yfo (2Y:3F) | yfp_Y13F+Y57F | FYKFFY | 0.298 ± 0.009 | 0.369 ± 0.005 | 3.26 ± 0.05 | −0.83 ± 0.02 | −1.41 | 2.60 | 0.71 |
| yfl (1Y:4F) | yfp_Y18/59F+Y57F | YFKFFF | 0.215 ± 0.005 | 0.377 ± 0.003 | 2.94 ± 0.04 | −0.85 ± 0.01 | −1.55 | 2.46 | 0.73 |
| yfm (1Y:4F) | yfp_Y13F+Y18/59F | FFKFYF | 0.260 ± 0.007 | 0.378 ± 0.004 | 3.67 ± 0.05 | −0.82 ± 0.01 | −1.57 | 2.44 | 0.71 |
| yff (0Y:5F) | yfp_Y13F+Y18/59F+Y57F | FFKFFF | 0.36 ± 0.01 | 0.366 ± 0.005 | 4.04 ± 0.08 | −0.77 ± 0.02 | −1.53 | 2.48 | 0.67 |
| yfc (6Y:0F) | wt_K18Y+K27Y+F52Y+K59Y | YYYYYY | 2.1 ± 0.1 | 0.26 ± 0.01 | 0.6 ± 0.4 | −0.87 (c) | 0.74 (c) | 4.75 (c) | 0.67 (c) |
| yfd (0Y:6F) | wt_Y13F+K18F+K27F+Y57F+K59F | FFFFFF | 8.8 ± 0.7 | 0.26 (d) | (d) | (d) | (d) | (d) | (d) |
(a) The abbreviated sequence (positions 13, 18, 27, 52, 57, and 59) of the mutants is shown.
(b) mF-U = RT*(m‡-F − m‡-U).
(c) The refolding limb is too short to obtain a good estimate of m‡-U; the data are fitted with a fixed m‡-U value, obtained for wt.
(d) The mutant was too unstable to obtain reliable data.
The rows are further subdivided according to the number of Lys residues in each mutant. The errors in free energy are estimated to be ± 0.02 kcal⋅mol−1.
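The derived columns of Table 1 follow directly from the kinetic parameters. The sketch below recomputes ΔGF-U, ΔΔGF-U, and mF-U for three rows, assuming ΔGF-U = −RT ln(k‡-U/k‡-F), R = 1.987 × 10−3 kcal⋅mol−1⋅K−1, and the footnote relation mF-U = RT(m‡-F − m‡-U). The function names are illustrative only.

```python
import math

R = 1.987e-3   # gas constant, kcal mol^-1 K^-1
T = 298.0      # temperature of the experiments, K
RT = R * T


def delta_g_fu(k_u, k_f):
    """Folding free energy (kcal/mol) from refolding (k_u) and unfolding (k_f) rates in water."""
    return -RT * math.log(k_u / k_f)


def m_fu(m_f, m_u):
    """Equilibrium m value (kcal mol^-1 M^-1) from the two kinetic slopes."""
    return RT * (m_f - m_u)


# (k_f, m_f, k_u, m_u) copied from Table 1
table1 = {
    "wt":  (0.0045, 0.47, 3.94, -0.87),
    "yfg": (0.015, 0.428, 3.16, -0.85),
    "yfh": (0.224, 0.364, 2.72, -0.87),
}

dg_wt = delta_g_fu(table1["wt"][2], table1["wt"][0])
for name, (k_f, m_f, k_u, m_u) in table1.items():
    dg = delta_g_fu(k_u, k_f)
    print(f"{name}: dG = {dg:.2f}  ddG = {dg - dg_wt:.2f}  m = {m_fu(m_f, m_u):.2f}")
```

Because the tabulated rate constants are rounded, recomputed values can differ from the table in the last digit.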
Tyr/Phe Exchanges and the Effect on Folding Kinetics.
Based on the above results, four potential sites could be used for Tyr/Phe exchange: 13, 18, 57, and 59. Thus, for the following, we will consider as our reference protein, instead of wt, the mutant yfp (wt_K18Y/K59Y) containing Tyr at these four positions and we will refer to the mutants by the residue type occupying these sites (e.g., yfp = wt_K18Y/K59Y = YYYY). On this reference protein we made a series of individual and multiple Tyr/Phe mutations. The reduced sequence space between YYYY and FFFF covered here is shown below in a four-letter code,
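[Scheme not recovered: four-letter sequences (positions 13, 18, 57, and 59) of the mutants between YYYY and FFFF, with the Tyr:Phe count of each.]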
where the numbers on top indicate the positions in the protein and the last column, the total number of Tyr and Phe residues in each mutant at the four considered locations.
The refolding/unfolding rate constants versus denaturant concentration of the reference protein YYYY and each of the single Tyr/Phe mutants at the four sites are shown in Fig. 2 Bottom. The folding rates of the mutants are similarly affected although to a different extent. Both limbs of the chevron representations, unfolding and refolding, are accelerated in the Phe-containing mutants, indicating that the TSE (‡) tends to be more affected by the substitution than the folded (F) and unfolded (U) states. Changes in ΔGF-U are quite small (Table 2) and the computed absolute value |φ‡-U| would be higher than unity in all cases. The four positions are differently sensitive to the TSE stabilization by Phe versus Tyr, following the order 18 < 59 < 57 < 13 (Fig. 2 Bottom; Table 1). Thus, we decided to group mutations Y18+Y59 and then combine the individual Y13, Y57, and Y18+Y59 replacements to phenylalanine for a subsequent multiple mutant thermodynamic analysis.
Table 2.
Refolding and unfolding activation energies relative to yfp (Y13Y18Y57Y59) for the mutants comprising the triple mutant set, at 298 K, pH 7.0 in 50 mM sodium phosphate
| Protein | Sequence | Mutations | Substitutions (b) | k‡-F, s−1 | ΔΔG‡-F (c), kcal⋅mol−1 | k‡-U, s−1 | ΔΔG‡-U (d), kcal⋅mol−1 | ΔΔGF-U (e), kcal⋅mol−1 |
|---|---|---|---|---|---|---|---|---|
| yfp (a) | YYYY | 0 | — | 0.139 | 0.000 | 1.76 | 0.000 | 0.000 |
| yfh | FYYY | Y13F | a | 0.224 | −0.283 | 2.72 | −0.258 | 0.025 |
| yfr | YFYF | Y18/59F | b | 0.159 | −0.080 | 2.47 | −0.201 | −0.121 |
| yfi | YYFY | Y57F | c | 0.200 | −0.216 | 2.26 | −0.148 | 0.067 |
| yfm | FFYF | Y13F+Y18/59F | ab | 0.260 | −0.371 | 3.67 | −0.435 | −0.064 |
| yfo | FYFY | Y13F+Y57F | ac | 0.298 | −0.452 | 3.26 | −0.365 | 0.087 |
| yfl | YFFF | Y18/59F+Y57F | bc | 0.215 | −0.258 | 2.94 | −0.304 | −0.046 |
| yff | FFFF | Y13F+Y18/59F+Y57F | abc | 0.360 | −0.564 | 4.04 | −0.492 | 0.072 |
(a) The double mutant yfp (Y13Y18Y57Y59), which contains two more aromatic residues than wt α-spectrin SH3 domain (wt_K18Y/K59Y), is taken as the reference of the final triple mutation set. Tyr/Phe substitutions are applied on YYYY in positions 13, 18 and 59 (simultaneously), and 57.
(b) These substitutions are referred to as a, b, and c, respectively, for clarity. The second column indicates the sequence of the mutant in positions 13, 18, 57, and 59.
(c–e) Calculated as indicated under Materials and Methods (Eqs. 3–5). The error in ΔΔG‡-U and ΔΔG‡-F is ± 0.02 kcal⋅mol−1.
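The activation energies in Table 2 can be recomputed from the listed rate constants. The sketch below assumes ΔΔG‡ = −RT ln(k_mut/k_ref) for each limb and ΔΔGF-U = ΔΔG‡-U − ΔΔG‡-F, with yfp (YYYY) as the reference; the helper names `ddg_activation` and `dissect` are hypothetical.

```python
import math

RT = 1.987e-3 * 298.0  # kcal/mol at 298 K


def ddg_activation(k_mut, k_ref):
    """Change in activation free energy (kcal/mol); negative means a faster reaction."""
    return -RT * math.log(k_mut / k_ref)


REF = {"k_f": 0.139, "k_u": 1.76}      # yfp (YYYY), rate constants in s^-1
MUTANTS = {                             # rate constants copied from Table 2
    "FYYY": {"k_f": 0.224, "k_u": 2.72},
    "YFYF": {"k_f": 0.159, "k_u": 2.47},
    "YYFY": {"k_f": 0.200, "k_u": 2.26},
    "FFFF": {"k_f": 0.360, "k_u": 4.04},
}


def dissect(mut, ref=REF):
    """Return (ddG of unfolding barrier, ddG of refolding barrier, ddG of stability)."""
    ddg_f = ddg_activation(mut["k_f"], ref["k_f"])   # unfolding limb
    ddg_u = ddg_activation(mut["k_u"], ref["k_u"])   # refolding limb
    return ddg_f, ddg_u, ddg_u - ddg_f


for seq, rates in MUTANTS.items():
    ddg_f, ddg_u, ddg_fu = dissect(rates)
    print(f"{seq}: unfolding {ddg_f:+.3f}  refolding {ddg_u:+.3f}  stability {ddg_fu:+.3f}")
```

Both barriers drop in every Phe-containing mutant while the net stability change stays below 0.15 kcal⋅mol−1, which is the pattern the text attributes to a preferential stabilization of the TSE.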
Cooperative and Context Effects in Multiple Tyr/Phe Mutations.
The effect at any single position can be tested under different contexts by comparing four different pairs of mutants: e.g., for Y13F, YYYY/FYYY, YFYF/FFYF, YYFY/FYFY, and YFFF/FFFF.
Fig. 3 Top shows the comparison of the different chevron plots obtained for two pairs of mutants with Y13F substitutions. For clarity, we only show the first and last pairs of the four couples enumerated above: YYYY versus FYYY, where the rest of the positions are occupied by Tyr, and YFFF versus FFFF, where Phe occupies the other sites. The same is shown in Fig. 3 Middle for the Y57F mutation and in Fig. 3 Bottom for the double mutant Y18F+Y59F. Comparison of the Tyr/Phe mutants in a different background allows the detection of context effects. In other words, if there are no context effects and mutations are mutually independent, a Tyr/Phe exchange in one position would provoke identical changes independently of the nature of the residues at the other three sites.
Figure 3.
Chevron curves of the final collection of mutants with Tyr or Phe occupying positions Y13, K18, Y57, and K59. (Top) Effect of Y13F mutation: YYYY vs. FYYY and YFFF vs. FFFF. (Middle) Y57F: YYYY vs. YYFY and FFYF vs. FFFF. (Bottom) Y18F/Y59F: YYYY vs. YFYF and FYFY vs. FFFF. Legends inside the figure indicate the sequence at positions 13, 18, 57, and 59.
Fig. 4 shows ΔΔG‡-U and ΔΔG‡-F calculated according to Eqs. 3 and 4 (see Materials and Methods) for the three groups of mutations studied here: Y13F (x axis), Y57F (y axis), and Y18F+Y59F (z axis). It is observed that effects on unfolding are larger than those observed in refolding for Y13F and Y57F, whereas the opposite is observed for Y18F+Y59F. As a consequence, Y13F and Y57F provoke a destabilization of the protein whereas Y18F+Y59F slightly stabilizes it. The sum of the differential refolding activation energies for individual mutations, ΣΔΔG‡-U = 0.61 kcal⋅mol−1, is slightly higher than ΔΔG‡-U = 0.49 kcal⋅mol−1, obtained for the triple mutant FFFF. On the other hand, the sum of differential unfolding activation energy ΣΔΔG‡-F = 0.58 kcal⋅mol−1 approaches the value obtained for the triple mutant, 0.56 kcal⋅mol−1. Although the numbers are small, the effect in refolding appears to level off with increasing numbers of Phe residues.
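The additivity comparison in the paragraph above can be checked with the Table 2 values. The sketch keeps the signs as tabulated (negative values mean a lower barrier), so the sums appear as −0.61 and −0.58 kcal⋅mol−1 where the text quotes magnitudes. The dictionary layout and the name `additivity_gap` are illustrative.

```python
# ddG values (kcal/mol) relative to YYYY, copied from Table 2
SINGLE = {
    "Y13F":    {"refolding": -0.258, "unfolding": -0.283},
    "Y18/59F": {"refolding": -0.201, "unfolding": -0.080},
    "Y57F":    {"refolding": -0.148, "unfolding": -0.216},
}
TRIPLE = {"refolding": -0.492, "unfolding": -0.564}   # FFFF


def additivity_gap(limb):
    """Return (sum of single-substitution effects, observed FFFF effect, difference)."""
    expected = sum(effects[limb] for effects in SINGLE.values())
    observed = TRIPLE[limb]
    return expected, observed, expected - observed


for limb in ("refolding", "unfolding"):
    expected, observed, gap = additivity_gap(limb)
    print(f"{limb}: sum of singles {expected:+.3f}  FFFF {observed:+.3f}  gap {gap:+.3f}")
```

The gap is about −0.12 kcal⋅mol−1 for refolding and about −0.02 kcal⋅mol−1 for unfolding, so only the refolding limb departs from simple additivity by more than the stated ±0.02 kcal⋅mol−1 error of a single value.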
Figure 4.
Three-dimensional representation of the energetic effect in refolding (ΔΔG‡-U, pink, Eq. 3) and unfolding (ΔΔG‡-F, blue, Eq. 4) of the three groups of mutations studied here: Y13F (x axis), Y57F (y axis), and Y18F/Y59F (z axis). The way between YYYY (bottom rear corner) and FFFF (upper front corner) may be covered by six trajectories. The cube dimensions (in gray for refolding and light blue for unfolding) correspond to the energetic change induced by the first Tyr/Phe substitution on YYYY and correspond to the expected energy changes in the absence of context effects. Subsequent mutations provoke, however, smaller effects in refolding and measured values (pink) deviate from the model cube trace; ΔΔG‡-U (FFFF-YYYY) is somewhat smaller than the addition of individual mutations. On the contrary, the FFFF vertex is reached in the unfolding representation; ΔΔG‡-F (FFFF-YYYY) coincides within error with the sum of individual mutations.
Context effects can be more properly analyzed by the method previously described by Horovitz and Fersht (27, 28). When this analysis is applied to the Tyr/Phe mutants, the energetic differences fall below the reliable level (Table 3, which is published as supporting information on the PNAS web site). Second-order effects on refolding activation energies are just above the significance margin, whereas higher-order variations are too low. Thus, an accurate dissection of cooperative effects does not seem pertinent with our data.
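For orientation, a generic multiple-mutant-cycle calculation on the refolding activation energies of Table 2 can be sketched as follows. It is not a reproduction of the supporting Table 3: the pairwise coupling is taken as ΔΔG(xy) − ΔΔG(x) − ΔΔG(y), the third-order term as the inclusion-exclusion sum over all seven mutants, and the sign convention and function names are assumptions made for illustration.

```python
# ddG of the refolding barrier relative to YYYY (kcal/mol), copied from Table 2.
# a = Y13F, b = Y18F+Y59F, c = Y57F
DDG_REFOLDING = {
    "a": -0.258, "b": -0.201, "c": -0.148,
    "ab": -0.435, "ac": -0.365, "bc": -0.304,
    "abc": -0.492,
}


def pair_coupling(ddg, x, y):
    """Second-order coupling: effect of the double mutant minus the two single effects."""
    return ddg[x + y] - ddg[x] - ddg[y]


def triple_coupling(ddg):
    """Third-order coupling by inclusion-exclusion over the full cube of mutants."""
    return (ddg["abc"]
            - ddg["ab"] - ddg["ac"] - ddg["bc"]
            + ddg["a"] + ddg["b"] + ddg["c"])


for x, y in (("a", "b"), ("a", "c"), ("b", "c")):
    print(f"coupling {x}{y}: {pair_coupling(DDG_REFOLDING, x, y):+.3f} kcal/mol")
print(f"coupling abc: {triple_coupling(DDG_REFOLDING):+.3f} kcal/mol")
```

Under these assumptions the pairwise terms come out at 0.02 to 0.05 kcal⋅mol−1 and the third-order term at about 0.005 kcal⋅mol−1, of the same order as the ±0.02 kcal⋅mol−1 error on each input value.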
Other parameters extracted from the kinetic experiments, m‡-F and m‡-U, show some sensitivity to Tyr/Phe exchanges. Single Tyr/Phe substitutions provoke a decrease in both m‡-F and m‡-U. Changes are more dramatic in refolding; m‡-U can be diminished by 10% whereas m‡-F changes less than 3%. Only mutant YYYY, with the highest Tyr content, escapes from this tendency (Fig. 6, which is published as supporting information on the PNAS web site).
Discussion
In this work we describe a strategy to elucidate the role of non-native interactions in protein folding. In many cases these interactions could be weak and masked in the classical mutagenesis strategy, when groups that hold extensive contacts with the rest of the protein are eliminated. By deleting polar groups, fully solvent-exposed and remote from other protein moieties (Tyr to Phe mutations), we minimize the effects in the folded state, except for changes in the first hydration shell.
The Hydrophobic Effect and Protein Stability.
The major driving force in protein folding is considered to be the hydrophobic effect that arises because of unfavorable interactions between apolar side chains and water. This favors polypeptide arrangements in which the side chains of hydrophobic amino acids are packed in the interior of the protein (29–31). Opposite to this, a hydrophobic side chain, placed at certain solvent-exposed positions in a protein, might be buried more effectively in the ensemble of denatured states than in the native protein. Such behavior, termed reverse hydrophobic effect, would be expected to shift the folding equilibrium toward denaturation (11, 32–36). Only a very weak effect on protein stability has been observed here for Tyr/Phe mutations. These changes in stability are nonetheless compatible with the reverse hydrophobic effect, described above. Thus, Tyr-containing proteins are in general slightly more stable than those having Phe at equivalent solvent-exposed positions, with the exception of position 18.
Simple mutations have been described that considerably modify the slope mU-F of the denaturant-induced unfolding transition of staphylococcal nuclease (37, 38) and iso-1-cytochrome c (32, 34). The changes in mU-F have been interpreted as shifts in the relative populations of conformational substates of the denatured state that have different degrees of partial structure. In our case we observe a slight reduction of the mU-F value when the number of Phe residues increases. This result denotes that very minor conformational changes could be taking place in the denatured state ensemble, in agreement with the small changes in ΔGF-U. These putative small hydrophobic clusters do not concern the TSE, as indicated by the small variation in m‡-F.
The Hydrophobic Effect and Protein Folding.
Little work has been done on the kinetic consequences of polar/apolar surface mutations. Two recent studies have tried to address this question by mutating solvent-exposed residues to alter their hydrophobic nature (39, 40). Gu et al. (39) concluded that destabilizing surface hydrophobic substitutions have much larger effects on the rate of unfolding than on the rate of folding, suggesting that non-native hydrophobic interactions do not significantly interfere with the rate of core assembly. On the other hand, Poso et al. (40) found that the rapidly formed intermediate state and, to a greater extent, the TSE are generally stabilized by hydrophobic surface mutations, resulting in a faster acquisition of the folded state. In both reports, however, studied mutations do not only change the hydrophobic nature of the mutated residue, but also delete or introduce new interactions with the rest of the protein, complicating the analysis.
Most mutations in a protein affect non-native and native contacts simultaneously. In our study, by mutating Tyr residues into Phe, we minimize any effect not related to the pure deletion of a polar group in the surface and the consecutive exposure of a hydrophobic group. We observe that, beyond the folding nucleus, Tyr/Phe mutations either do not affect the energetics of the TSE or produce a relative stabilization with respect to the folded and unfolded states. This effect is more apparent when the number of Phe residues increases. Thus, the TSE of mutant FFFF is relatively stabilized by ≈0.5 kcal⋅mol−1 with respect to that of YYYY. As a result both folding and unfolding rates are significantly accelerated.
The Hydrophobic Effect and Solvation Energy.
It is generally accepted that protein-solvent interactions play a major role in protein folding and that the hydration structure mediated by means of hydrogen bonds covering the protein surface plays an important role in the conformational stability of proteins (41, 42). Thus, substituting a polar residue by a hydrophobic one can produce changes in stability and folding rates caused by changes in the first hydration shell of the protein. The observed ΔΔGF-U values are very small, indicating that this aspect is not playing a major role in native state energetics. However, the solvent contribution remains difficult to resolve, and the extent to which the observed kinetic effects are also related to a disruption of the first solvation shell cannot be easily discerned. Structural water molecules hydrogen-bonded to tyrosine hydroxyl groups are found frequently in crystallographic structures of proteins. Buried tyrosine residues, or those establishing hydrogen bonds with other protein groups, are expected to shed water molecules on their way to the native state, thus linking the folding reaction to dramatic changes in hydration structures surrounding the protein. However, transient or permanent interactions of the Tyr hydroxyl group with isolated water molecules should in principle not be very different in the denatured, native, or transition state in our experiment. All of the H/OH exchanges have been done in free, exposed protein loci; the hydroxyl group and its solvation molecules can be considered as a structural entity. Nevertheless, the current data do not allow us to rule out a major role of cooperative interactions among the water molecules closest to the protein surface in the TSE, which could give rise to a penalty for forming a network of hydrogen bonds at the protein folding saddle point.
Non-Native Interactions and Protein Folding.
It has been proposed that, even for apparent two-state folders, misfolding and nonobligatory intermediates are likely to occur and slow down overall folding (43–45). Thus, it is usually thought that an essential element for fast folding is elimination of nonobligatory intermediates that can act as traps. The occurrence and destruction of alternative contacts not present in the final folded form may be raising enthalpic barriers. Also, the futile journey of the polypeptide through non-native regions of the landscape could distract protein molecules from entering in the fast track of efficient folding intermediates. If there were a negative selection for non-native interactions, native-like regions of the protein folding landscape would be more populated during folding, keeping the protein in the native-like productive paths. Consequently, folding of a protein is expected to be more efficient if the conformational space accessible to the polypeptide chain during folding is more restricted, provided that native interactions are favored over non-native ones. However, this could be denoting an important misconception. Non-native conformations by definition cannot change the energy of the native state, but could affect denatured and transition state ensembles for two-state folding proteins. Tyr/Phe exchanges operated in α-spectrin SH3 do show that the TSE is more stabilized than F (folded) and U (unfolded) when a free hydroxyl group is eliminated from the protein surface.
Aromatic side chains frequently are found as part of hydrophobic clusters retained in many denatured states of proteins (32, 46, 47). Here, it is expected that Phe side chains could be involved in the formation of non-native hydrophobic clusters and minicores in which their Cζ atoms are buried. The Tyr side chain, on the contrary, would not favor the formation of such substructures and would restrict the conformational space available to the polypeptide more closely to native-like tracks, with exposed Cζ and OH groups. The comparison of the kinetics of mutants bearing Tyr or Phe side chains at a given position could tell us about the potential dispersion of this protein locus from native-like conformations along the folding route. If the protein conformation were restricted to native-like forms in the fully unfolded state and at the saddle point of the TSE, the kinetic consequences of the presence of the hydroxyl group, including its solvation shell, would be expected to be negligible. Otherwise, if the deletion of a small protein group that is not involved in any intramolecular interaction has any consequences for folding kinetics, these should be interpreted in terms of colonization of non-native regions of the folding landscape by the Phe-containing mutants.
Transition states are regarded as the ensemble of the most stable conformational collection where the minimal quantity of entropy is lost by the formation of a set of key contacts (folding nucleus, ref. 48). The relative scarcity of TSE conformations with respect to the unfolded state makes it more sensitive to individual contacts while it still benefits from a relative degeneracy of states. Increasing the conformational variability in the TSE in regions outside the folding nucleus could, in principle, stabilize the ensemble, accelerating folding. Accordingly, the more restricted conformational space available for a Tyr-containing mutant, caused by the difficulty in burying the OH and possibly accompanying water molecules, could result in an entropic destabilization of the TSE. Non-native interactions in a protein folding nucleus have been investigated on a lattice model (8). The authors proposed that non-native nucleus contacts assist folding by providing energetically accessible pathways to the native state. The experimental results obtained here seem to support the above hypothesis.
Similarly, the TSE, which contains less than ≈50% of the native contacts, could be stabilized by unspecific (individually less represented) contacts with other hydrophobic residues in unstructured regions outside the nucleus, without a significant loss of entropy. m Values could help discriminate between an entropic stabilization and a collapse-driven one. The lower mF-U values of the Phe-containing mutants are caused mainly by a decrease in the refolding m‡-U values, whereas m‡-F remains nearly invariant. The mF-U values imply a slight change in the compactness of the denatured state. The m‡-F values observed here, on the other hand, do not argue for a notable change in the degree of collapse of the TSE. m‡-F Values are, however, not completely constant; they could vary by 3% upon a single Tyr/Phe exchange. The most realistic views probably involve both entropic and enthalpic features. Considering that many of the discriminating conformations (available to Phe but not to Tyr) are those in which the Cζ is buried and/or in contact with other hydrophobic regions of the protein, it is conceivable that both terms of the free energy are affected. Stabilization of the TSE is common to all of the positions tested, although to different extents. Loci showing a lower kinetic impact seem to be those within less-ordered regions of the native structure; Lys-18 is located in the rather disordered RT loop and Lys-59 is close to the C terminus. This position dependence (the more rigid the region that accepts the mutation, the higher the TSE stabilization) suggests that some restriction is required to induce the effect, which argues in favor of entropy being quantitatively more important, although this is still compatible with changes in m‡-F values.
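The bookkeeping behind this m-value argument can be sketched in a few lines. The example assumes the usual two-state relation in which the equilibrium m value is the sum of the two kinetic m values, with all three taken as positive magnitudes; the numbers are hypothetical and are not the values measured in this work. It shows how a drop in m‡-U alone lowers mF-U, and how a relative change such as the 3% variation in m‡-F is computed.

```python
def equilibrium_m(m_ts_u, m_ts_f):
    """Equilibrium m value as the sum of the kinetic m values.

    Assumes two-state folding and positive magnitudes for both inputs.
    """
    return m_ts_u + m_ts_f

def relative_change(value, reference):
    """Fractional change of `value` with respect to `reference`."""
    return (value - reference) / reference

# Hypothetical m values (arbitrary but consistent units), for illustration only.
tyr = {"m_ts_u": 1.50, "m_ts_f": 0.500}
phe = {"m_ts_u": 1.40, "m_ts_f": 0.515}  # refolding term lower, unfolding term ~3% higher

m_eq_tyr = equilibrium_m(**tyr)
m_eq_phe = equilibrium_m(**phe)

print("equilibrium m, Tyr variant:", round(m_eq_tyr, 3))
print("equilibrium m, Phe variant:", round(m_eq_phe, 3))
print("relative change in unfolding m:",
      round(relative_change(phe["m_ts_f"], tyr["m_ts_f"]), 3))
```

In this toy case the equilibrium m value falls almost entirely because of the refolding term, mirroring the pattern described in the text, while the unfolding term changes by only a few percent.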
As our probes are located in regions that seem to be rather disordered before downhill folding, the free energy benefit in the TSE could be revealed more easily. Residues belonging to the nucleus, which support a high density of contacts in the TSE, are expected to have far less freedom. For them, the possibility of adopting alternative conformations would decrease the probability of forming the right contacts, and the effect could in fact be null or even opposite to that observed here. It is conceivable that the consequence of this type of mutation depends on the role that the region containing the mutated residue plays during the folding reaction.
It has been shown for typical medium-sized globular proteins that a significant fraction of the solvent-accessible surface area (ASA) is hydrophobic (49). Here we illustrate that hydrophobic residues in solvent-exposed positions can be not only tolerated from the stability viewpoint but also beneficial for achieving faster folding rates.
Acknowledgments
M.C.V. acknowledges the European Union for their Training and Mobility in Research postdoctoral fellowship. This work was supported by European Union Network Grant CT96-0013.
Abbreviations
- ASA: accessible surface area
- ΔG: Gibbs energy change
- SH3: Src homology 3
- TSE: transition state ensemble
Footnotes
This paper was submitted directly (Track II) to the PNAS office.
References
- 1. Fersht A R. Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding. New York: Freeman; 1999.
- 2. Goldenberg D P. Nat Struct Biol. 1999;6:987–990. doi: 10.1038/14866.
- 3. Oliveberg M. Curr Opin Struct Biol. 2001;11:94–100. doi: 10.1016/s0959-440x(00)00171-8.
- 4. Galzitskaya O V, Finkelstein A V. Proc Natl Acad Sci USA. 1999;96:11299–11304. doi: 10.1073/pnas.96.20.11299.
- 5. Alm E, Baker D. Proc Natl Acad Sci USA. 1999;96:11305–11310. doi: 10.1073/pnas.96.20.11305.
- 6. Muñoz V, Eaton W A. Proc Natl Acad Sci USA. 1999;96:11311–11316. doi: 10.1073/pnas.96.20.11311.
- 7. Clementi C, Nymeyer H, Onuchic J N. J Mol Biol. 2000;298:937–953. doi: 10.1006/jmbi.2000.3693.
- 8. Li L, Mirny L A, Shakhnovich E L. Nat Struct Biol. 2000;7:336–342. doi: 10.1038/74111.
- 9. Vendruscolo M, Paci E, Dobson C M, Karplus M. Nature (London) 2001;409:641–645. doi: 10.1038/35054591.
- 10. Ozkan S B, Bahar I, Dill K A. Nat Struct Biol. 2001;8:765–769. doi: 10.1038/nsb0901-765.
- 11. Cregut D, Civera C, Macias M J, Wallon G, Serrano L. J Mol Biol. 1999;292:389–401. doi: 10.1006/jmbi.1999.2966.
- 12. Prieto J, Wilmans M, Jimenez M A, Rico M, Serrano L. J Mol Biol. 1997;268:760–778. doi: 10.1006/jmbi.1997.0984.
- 13. Pakula A A, Sauer R T. Nature (London) 1990;344:363–364. doi: 10.1038/344363a0.
- 14. Higuchi R, Krummel B, Saiki R K. Nucleic Acids Res. 1988;16:7351–7367. doi: 10.1093/nar/16.15.7351.
- 15. Peranen J, Rikkonen M, Hyvonen M, Kaariainen L. Anal Biochem. 1996;236:371–373. doi: 10.1006/abio.1996.0187.
- 16. Gill S C, von Hippel P H. Anal Biochem. 1989;182:319–326. doi: 10.1016/0003-2697(89)90602-7.
- 17. Johnson C M, Fersht A R. Biochemistry. 1995;34:6795–6804. doi: 10.1021/bi00020a026.
- 18. Martínez J C, Pisabarro M T, Serrano L. Nat Struct Biol. 1998;5:721–729. doi: 10.1038/1418.
- 19. Musacchio A, Noble M, Pauptit R, Wierenga R, Saraste M. Nature (London) 1992;359:851–855. doi: 10.1038/359851a0.
- 20. Viguera A R, Martinez J C, Filimonov V V, Mateo P L, Serrano L. Biochemistry. 1994;33:2142–2150. doi: 10.1021/bi00174a022.
- 21. Viguera A R, Jimenez M A, Rico M, Serrano L. J Mol Biol. 1996;255:507–521. doi: 10.1006/jmbi.1996.0042.
- 22. Blanco F J, Serrano L, Forman-Kay J D. J Mol Biol. 1998;284:1153–1164. doi: 10.1006/jmbi.1998.2229.
- 23. Kortemme T, Kelly M J, Kay L E, Forman-Kay J, Serrano L. J Mol Biol. 2000;297:1217–1229. doi: 10.1006/jmbi.2000.3618.
- 24. Martinez J C, Serrano L. Nat Struct Biol. 1999;6:1010–1016. doi: 10.1038/14896.
- 25. Vriend G. J Mol Graphics. 1990;8:52–56. doi: 10.1016/0263-7855(90)80070-v.
- 26. Grantcharova V P, Riddle D S, Santiago J V, Baker D. Nat Struct Biol. 1998;5:714–720. doi: 10.1038/1412.
- 27. Horovitz A, Fersht A R. J Mol Biol. 1990;214:613–617. doi: 10.1016/0022-2836(90)90275-Q.
- 28. Horovitz A, Fersht A R. J Mol Biol. 1992;224:733–740. doi: 10.1016/0022-2836(92)90557-z.
- 29. Eriksson A E, Baase W A, Zhang X J, Heinz D W, Blaber M, Baldwin E P, Matthews B W. Science. 1992;255:178–183. doi: 10.1126/science.1553543.
- 30. Silverstein K A T, Haymet A D J, Dill K A. J Chem Phys. 1999;111:8000–8009.
- 31. Ratnaparkhi G S, Varadarajan R. Biochemistry. 2000;39:12365–12374. doi: 10.1021/bi000775k.
- 32. Bowler B E, May K, Zaragoza T, York P, Dong A, Caughey W S. Biochemistry. 1993;32:183–190. doi: 10.1021/bi00052a024.
- 33. Munoz V, Lopez E M, Jager M, Serrano L. Biochemistry. 1994;33:5858–5866. doi: 10.1021/bi00185a025.
- 34. Herrmann L, Bowler B E, Dong A, Caughey W S. Biochemistry. 1995;34:3040–3047. doi: 10.1021/bi00009a035.
- 35. Schindler T, Perl D, Graumann P, Sieber V, Marahiel M A, Schmid F X. Proteins. 1998;30:401–406. doi: 10.1002/(sici)1097-0134(19980301)30:4<401::aid-prot7>3.0.co;2-l.
- 36. Cordes M H, Sauer R T. Protein Sci. 1999;8:318–325. doi: 10.1110/ps.8.2.318.
- 37. Creighton T E, Shortle D. J Mol Biol. 1994;242:670–682. doi: 10.1006/jmbi.1994.1616.
- 38. Shortle D. Adv Protein Chem. 1995;46:217–247. doi: 10.1016/s0065-3233(08)60336-8.
- 39. Gu H, Doshi N, Kim D E, Simons K T, Santiago J V, Nauli S, Baker D. Protein Sci. 1999;8:2734–2741. doi: 10.1110/ps.8.12.2734.
- 40. Poso D, Sessions R B, Lorch M, Clarke A R. J Biol Chem. 2000;275:35723–35726. doi: 10.1074/jbc.M001747200.
- 41. Petukhov M, Cregut D, Soares C M, Serrano L. Protein Sci. 1999;8:1982–1989. doi: 10.1110/ps.8.10.1982.
- 42. Funahashi J, Takano K, Yamagata Y, Yutani K. Biochemistry. 2000;39:14448–14456. doi: 10.1021/bi0015717.
- 43. Sosnick T R, Mayne L, Hiller R, Englander S W. Nat Struct Biol. 1994;1:149–156. doi: 10.1038/nsb0394-149.
- 44. Bryngelson J D, Onuchic J N, Socci N D, Wolynes P G. Proteins. 1995;21:167–195. doi: 10.1002/prot.340210302.
- 45. Fersht A R. Proc Natl Acad Sci USA. 1995;92:10869–10873. doi: 10.1073/pnas.92.24.10869.
- 46. Clarke J, Hounslow A M, Bond C J, Fersht A R, Daggett V. Protein Sci. 2000;9:2394–2404. doi: 10.1110/ps.9.12.2394.
- 47. Wong K B, Clarke J, Bond C J, Neira J L, Freund S M, Fersht A R, Daggett V. J Mol Biol. 2000;296:1257–1282. doi: 10.1006/jmbi.2000.3523.
- 48. Fersht A R. Curr Opin Struct Biol. 1997;7:3–9. doi: 10.1016/s0959-440x(97)80002-4.
- 49. Eisenhaber F. Protein Sci. 1996;5:1676–1686. doi: 10.1002/pro.5560050821.