Proceedings of the National Academy of Sciences of the United States of America. 2019 Mar 15;116(14):6806–6811. doi: 10.1073/pnas.1818744116

Networks of electrostatic and hydrophobic interactions modulate the complex folding free energy surface of a designed βα protein

Sujit Basak a, R Paul Nobrega a,1, Davide Tavella a, Laura M Deveau a,1, Nobuyasu Koga b, Rie Tatsumi-Koga b, David Baker c,d, Francesca Massi a,2, C Robert Matthews a,2
PMCID: PMC6452746  PMID: 30877249

Significance

Natural protein sequences have evolved to fold efficiently to their functional forms, traversing a relatively smooth energy surface with minimal frustration. The optimization of stability in de novo-designed proteins, without regard to dynamical processes required for function, can have unintended consequences on energy surfaces. Electrostatic and hydrophobic networks of side chains for Di-III_14, a small βα protein with a natural topology, create a rough energy surface whose high-energy states reflect the progressive unlocking of a tightly packed interior. Unbiased explorations of sequence space can reveal complex properties not seen in natural proteins, expanding our understanding of the relationships among sequence, structure, folding, and function of these ubiquitous biopolymers.

Keywords: folding free energy surface, de novo-designed proteins, hydrogen exchange, partially folded states

Abstract

The successful de novo design of proteins can provide insights into the physical chemical basis of stability, the role of evolution in constraining amino acid sequences, and the production of customizable platforms for engineering applications. Previous guanidine hydrochloride (GdnHCl; an ionic denaturant) experiments on Di-III_14, a designed protein with a naturally occurring βα fold, revealed a cooperative, two-state unfolding transition and a modest stability. Continuous-flow mixing experiments in our laboratory revealed a simple two-state reaction in the microsecond to millisecond time range, consistent with the thermodynamic results. In striking contrast, the protein remains folded up to 9.25 M in urea, a neutral denaturant, and hydrogen exchange (HDX) NMR analysis in water revealed the presence of numerous high-energy states that interconvert on a time scale greater than seconds. The complex protection pattern for HDX corresponds closely with a pair of electrostatic networks on the surface and an extensive network of hydrophobic side chains in the interior of the protein. Mutational analysis showed that electrostatic and hydrophobic networks contribute to the resistance to urea denaturation for the WT protein; remarkably, single charge reversals on the protein surface restore the expected urea sensitivity. The roughness of the energy surface reflects the densely packed hydrophobic core; the removal of only two methyl groups eliminates the high-energy states and creates a smooth surface. The design of a very stable βα fold containing electrostatic and hydrophobic networks has created a complex energy surface rarely observed in natural proteins.


The sequence space occupied by natural proteins is a small fraction of the total space accessible to a heteropolymer with 20 monomeric units (1, 2). Evolved sequences are constrained by the requirements to support ∼1,000 natural motifs (3) and to fold to a thermodynamically stable structure within seconds while maintaining biological function and avoiding aggregation. One might wonder if de novo-designed sequences not limited by history and biological constraints are capable of folding to unique, stable structures on a smooth energy surface typically observed for natural proteins. In the late 1980s, Regan and DeGrado (4) systematically aimed to synthesize a four-helix bundle protein that adopts a stable folded structure in aqueous solution. Later, Harbury et al. (5) designed an unnatural right-handed tetrameric coiled coil based on design principles extracted from natural left-handed dimeric coiled coils. Subsequently, Kuhlman et al. (6) designed the sequence of a βα protein, Top7, also with a topology not found in nature. Top7 was also capable of folding to a well-defined structure (6), but its kinetic folding mechanism was complex and plagued by aggregation (7). Whether these unintended properties are a consequence of the novel topology or inherent to the sequence remains to be determined.

The Baker group has gone on to design small βα proteins (8, 9), multisubunit proteins (10, 11), membrane proteins (12), and even virus-like capsids (13) that achieve the desired structures with substantial stability. The overall design approach is to maximize favorable side-chain interactions and minimize backbone strain in the target folded state; partially folded states and possible folding pathways are not considered. Koga et al. (8) created sequences that adopted five different arrangements of α-helices and β-strands with the desired structures and stabilities: the Ferredoxin-like fold, the Rossmann 2 × 2 fold, the IF3-like fold, the P-loop 2 × 2 fold, and the Rossmann 3 × 1 fold. Each contains a four-stranded β-sheet with a nonsequential strand order. The Rossmann 2 × 2 fold, the P-loop 2 × 2 fold, and the Rossmann 3 × 1 fold flank the β-sheet with α-helices on both faces, whereas the Ferredoxin-like fold and the IF3-like fold have α-helices on only one face. Although these designed βα proteins achieved the desired structures and stability, their kinetic folding mechanisms, the level at which the complexities in the folding free-energy surface of Top7 were encountered, were not explored.

A thermodynamic and kinetic analysis of the folding mechanism of one of the designed proteins, Di-III_14, a candidate from the IF3-like fold family and containing 74 aa, revealed a simple transition on a smooth energy surface from a guanidine hydrochloride (GdnHCl)-denatured state to the native state that occurs in a few dozen microseconds upon dilution into buffer. Surprisingly, hydrogen exchange (HDX) NMR analysis in water revealed the presence of numerous slowly interchanging high-energy states that require days for the complete exchange of main-chain amide hydrogens (NHs) with solvent. The observed protection patterns are closely associated with a pair of electrostatic networks on the surface and a complementary network of nonpolar interactions in the hydrophobic core. Evidently, there are unexplored regions of sequence space for a natural fold that have complex free-energy surfaces not revealed by a classical chemical denaturation analysis. These observations have implications for protein design.

Results

Chemical Denaturation of Di-III_14 with GdnHCl Reveals a Smooth Energy Surface.

Di-III_14 was reversibly denatured with GdnHCl, a chemical denaturant that typically induces a self-avoiding random coil structure for globular proteins (14). In excellent agreement with a previous measurement, the CD data were well described by a simple two-state process (N ⇌ U) and yielded a predicted stability in water of 5.54 ± 0.18 kcal⋅mol−1 (Fig. 1A); compare with 5.6 kcal⋅mol−1 from the Baker laboratory (8). Further support for the two-state model was obtained by extracting a nearly identical stability from the unfolding transition monitored by the fluorescence lifetime of the single tryptophan at position 63 in β3 (SI Appendix, Fig. S1 and Table S1).

Fig. 1.


Thermodynamic and kinetic analyses of Di-III_14. (A) GdnHCl denaturation melts of the His6 (○, ●) and ΔHis6 (◊, ♦) constructs at pH 6.0 (open symbols) and pH 7.4 (closed symbols) at 25 °C. The data were globally fit to a two-state model (solid line). (B) A semilog plot of the relaxation times, τ, acquired from SF and CF kinetic folding experiments against final denaturant concentration; the symbols are identified in A. The solid line illustrates the fit of the kinetic data to a two-state model. The buffer in all experiments contained 100 mM NaCl, 5.6 mM Na2HPO4, and 1.1 mM KH2PO4. (C) Urea titration of Di-III_14 in the absence of NaCl (○) and in the presence of 2.5 M NaCl (●) monitored by the ellipticity at 222 nm for pH 7.4 and 25 °C. The expected urea titration curve for a stability of 5.54 kcal⋅mol−1 and an m-value of 1.2 kcal⋅mol−1⋅M−1 is shown as a dashed and dotted line. The fit of the partial titration curve in the presence of 2.5 M NaCl (dotted line) yields a predicted stability, ΔG°, of 11.4 kcal⋅mol−1. A minimal estimate of the predicted stability in water, 14 kcal⋅mol−1 (dashed line), reflects the stable native baseline up to 9.25 M urea. The solid black line indicates the GdnHCl denaturation melt of Di-III_14. (D) Urea melts of Di-III_14 variants E32R (■, solid line), R69E (◊, dotted line), I70A (♦, dashed dotted line), V31A (hexagons, dashed line), and WT (●) measured by CD spectroscopy at 222 nm. The transition curves were measured at 4 μM protein in 5.6 mM Na2HPO4, 1.1 mM KH2PO4, and 100 mM NaCl at pH 7.4 and 25 °C. All curves were fitted to a two-state model as described in the footnote to SI Appendix, Table S4.

Stopped-flow (SF) total intensity and continuous-flow (CF) time-resolved fluorescence were employed to monitor the dynamic unfolding and refolding reactions of Di-III_14 in GdnHCl for a time range from microseconds to seconds. A chevron plot of these data, i.e., the log of the relaxation time, τ, as a function of denaturant concentration, is shown in Fig. 1B. The unfolding and refolding reactions display simple exponential behavior with a maximum relaxation time of ∼20 ms near 2.4 M GdnHCl, the midpoint, Cm, of the equilibrium unfolding reaction. Unfolding and refolding reactions accelerate exponentially above and below this denaturant concentration, as expected for a two-state mechanism. Fitting the chevron to a two-state model (15), the unfolding and refolding relaxation times in the absence of denaturant were obtained by linear extrapolation and found to be 479 ms and 48 μs, respectively. These relaxation times correspond to rate constants for unfolding and refolding, ku and kr, of 2.09 s−1 and 2.08 × 10^4 s−1, respectively (τ = 1/k). The stability calculated from the kinetic data for the transition over a single barrier from the N to the U state, 5.44 ± 0.12 kcal⋅mol−1 (Fig. 1B and SI Appendix, Table S1), is in excellent agreement with that measured from equilibrium experiments. The concordant thermodynamic and kinetic results provide further support for a simple two-state folding mechanism for Di-III_14 when denatured by GdnHCl.
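As a rough consistency check on the kinetic numbers quoted above, the relaxation times extrapolated to water can be converted to rate constants (k = 1/τ) and then to a two-state stability through ΔG° = RT ln(kr/ku). The sketch below is illustrative only: the function names are hypothetical, and the values R = 1.987 × 10^−3 kcal⋅mol−1⋅K−1 and T = 298.15 K are assumptions. The published stability came from a fit of the full chevron, so small differences from this back-of-the-envelope value are expected.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K); assumed value
T_KELVIN = 298.15   # 25 °C, the temperature quoted for the experiments


def rate_constant(tau_seconds):
    """Convert a relaxation time (s) to a first-order rate constant (1/s)."""
    return 1.0 / tau_seconds


def two_state_stability(k_refold, k_unfold, temperature=T_KELVIN):
    """Stability of N relative to U (kcal/mol) for a single-barrier N <-> U model."""
    return R_KCAL * temperature * math.log(k_refold / k_unfold)


# Relaxation times in the absence of denaturant, as quoted in the text.
k_u = rate_constant(479e-3)   # unfolding, tau = 479 ms
k_r = rate_constant(48e-6)    # refolding, tau = 48 microseconds

dg_kinetic = two_state_stability(k_r, k_u)

print(f"k_u = {k_u:.2f} 1/s")
print(f"k_r = {k_r:.3g} 1/s")
print(f"Stability from kinetics = {dg_kinetic:.2f} kcal/mol")
```

With these inputs the estimate lands close to 5.5 kcal⋅mol−1, within the reported uncertainty of the fitted kinetic stability.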

HDX Reveals a Rough Energy Surface for Di-III_14 in Water.

HDX of NHs with water provides an independent assessment of the folding free-energy surface of Di-III_14. There are two limiting scenarios. (i) First, the NHs exchange with solvent deuterium via an EX1 mechanism in which refolding from the exchange competent state to the protected state is slower than exchange of the exposed NH with solvent deuterium. In this case, exchange is limited by the rate constant of unfolding (16). (ii) Alternatively, the NHs exchange via an EX2 mechanism, whereby the rate constant of refolding is faster than the rate constant of exchange. In this case, exchange is modulated by the free energy difference between the protected and unprotected states. The definitive test of the mechanism and the correct interpretation of the HDX results involves measurement of the exchange at different pH values. The rate constants for exchange are independent of pH for the EX1 mechanism and increase by 10-fold for an increase in pH of 1 unit for the EX2 mechanism (17).
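The two limiting scenarios can be illustrated with the standard two-step exchange scheme (closed ⇌ open → exchanged), in which the observed rate constant is approximately k_open × k_int / (k_close + k_int) under native conditions. The sketch below assumes that the intrinsic exchange rate constant is base catalyzed and therefore rises tenfold per pH unit. The opening rate constant, the intrinsic rate constant at pH 6, and the slow closing rate constant are hypothetical values chosen only to show the two limits; the fast closing rate constant reuses the 2.08 × 10^4 s−1 refolding rate constant from the GdnHCl kinetics.

```python
def observed_exchange_rate(k_open, k_close, k_intrinsic):
    """Observed HDX rate constant for closed <-> open -> exchanged (native conditions)."""
    return k_open * k_intrinsic / (k_close + k_intrinsic)


def intrinsic_rate(k_ref, ph_ref, ph):
    """Base-catalyzed intrinsic exchange: tenfold increase per pH unit (assumption)."""
    return k_ref * 10.0 ** (ph - ph_ref)


K_OPEN = 1e-5      # 1/s, hypothetical opening (unfolding) rate constant
K_INT_PH6 = 1.0    # 1/s, hypothetical intrinsic exchange rate constant at pH 6


def ph_response(k_close, ph_low=6.0, ph_high=7.0):
    """Ratio of observed rate constants at two pH values for a given closing rate."""
    low = observed_exchange_rate(K_OPEN, k_close, intrinsic_rate(K_INT_PH6, 6.0, ph_low))
    high = observed_exchange_rate(K_OPEN, k_close, intrinsic_rate(K_INT_PH6, 6.0, ph_high))
    return high / low


ex1_ratio = ph_response(k_close=0.01)     # closing much slower than exchange: EX1 limit
ex2_ratio = ph_response(k_close=2.08e4)   # closing much faster than exchange: EX2 limit

print(f"EX1-like response per pH unit: {ex1_ratio:.3f}")
print(f"EX2-like response per pH unit: {ex2_ratio:.3f}")
```

In the slow-closing case the observed rate constant is nearly pH independent and approaches k_open, whereas in the fast-closing case it tracks the intrinsic rate constant and rises almost tenfold per pH unit.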

We found that the exchange process requires days for completion (SI Appendix, Fig. S2) and is independent of pH across pH 6.0, 6.5, and 7.4 for 29 measurably protected NHs, consistent with an EX1 process (Fig. 2A). The exchange of these NHs is limited by access to numerous exchange-competent states and is a highly noncooperative process. EX1 exchange kinetics for Di-III_14 require that the rate constant for refolding to the state offering protection against exchange be at least 10-fold slower than the intrinsic rate constant for exchange of the exposed NH. Based on published intrinsic exchange rate constants (18), the refolding rate constants must be <0.1 s−1 and the lifetimes of the exchange competent states >10 s. This behavior is in distinct contrast to the smooth energy surface found for GdnHCl denaturation, in which the 2.08 × 10^4 s−1 refolding rate constant in water leads to the expectation of an EX2 mechanism for exchange.

Fig. 2.


(A) Exchange-rate constants measured at pH 6.5 (○) and pH 7.4 (●) plotted against the rate constants at pH 6.0 at 25 °C. The solid line indicates the response expected for an EX1 mechanism, and the dotted and dashed lines indicate the responses expected for an EX2 mechanism at pH 6.5 and 7.4, respectively. (B) Exchange-rate constants at pH 7.4 of V31A (●) and E32R (○) are plotted against the rate constants at pH 6.5 at 25 °C. The solid line indicates the response expected for an EX1 mechanism, and the dashed lines indicate the responses expected for an EX2 mechanism at pH 7.4 for the Di-III_14 variants.

Urea Denaturation as a Test of the Role of Electrostatic Interactions in Defining the Energy Surface of Di-III_14.

The contrasting views of the folding free-energy surface for Di-III_14 by GdnHCl denaturation and HDX in water motivated the use of an uncharged chemical denaturant, urea. Di-III_14 has an unnaturally high but balanced content of cationic and anionic side chains, 35%, that form multiple salt bridges on the surface. We tested whether the electrostatic screening by the high salt concentrations with GdnHCl is responsible for differing views of the energy surface by employing urea to denature Di-III_14. Generally, GdnHCl and urea provide comparable estimates of stability for natural proteins (19).

We were surprised to find that the protein remains folded up to the solubility limit of urea (Fig. 1C), 9.25 M. Based on the thermodynamic parameters for the GdnHCl melt, ΔG = 5.54 kcal/mol, and m = 2.4 kcal/mol/M (SI Appendix, Table S1), the m-value for urea is expected to be ∼1.2 kcal/mol/M (20) and the Cm of Di-III_14 is predicted to be ∼4.6 M urea (Fig. 1C). Its resistance to unfolding up to 9.25 M urea implies a stability of greater than 14 kcal/mol for a two-state process. To test the possible role of the salt in modulating the resistance to urea denaturation, we added 2.5 M NaCl to the solvent. As shown in Fig. 1C, Di-III_14 begins to unfold at ∼7 M urea in the presence of 2.5 M salt, well above the expected Cm if urea and GdnHCl were to yield comparable measures of the stability. Screening of electrostatic interactions alone does not account for the discrepancy between the urea and GdnHCl denaturation of Di-III_14. The very high apparent stability when measured by urea denaturation is consistent with its resistance to thermal unfolding, with a midpoint of equilibrium thermal denaturation >>95 °C (8).
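The expected urea midpoint quoted above follows from the linear extrapolation form of the two-state model, ΔG([D]) = ΔG(H2O) − m[D], so that Cm = ΔG(H2O)/m. The sketch below reproduces that arithmetic and evaluates the fraction unfolded that such a model would predict at 9.25 M urea. It assumes T = 298.15 K and R = 1.987 × 10^−3 kcal⋅mol−1⋅K−1, and the function names are illustrative rather than taken from the original analysis.

```python
import math

RT = 1.987e-3 * 298.15   # kcal/mol at 25 °C; assumed constants


def midpoint(dg_water, m_value):
    """Denaturant concentration (M) at which N and U are equally populated."""
    return dg_water / m_value


def fraction_unfolded(denaturant, dg_water, m_value):
    """Two-state fraction unfolded under the linear extrapolation model."""
    dg = dg_water - m_value * denaturant   # stability at this denaturant concentration
    return 1.0 / (1.0 + math.exp(dg / RT))


DG_WATER = 5.54   # kcal/mol, from the GdnHCl melt
M_UREA = 1.2      # kcal/mol/M, expected urea m-value quoted in the text

cm_urea = midpoint(DG_WATER, M_UREA)
fu_at_limit = fraction_unfolded(9.25, DG_WATER, M_UREA)

print(f"Expected urea midpoint = {cm_urea:.2f} M")
print(f"Expected fraction unfolded at 9.25 M urea = {fu_at_limit:.4f}")
```

Under these assumptions a protein with the GdnHCl-derived stability would be essentially fully unfolded at the urea solubility limit, which is what makes the observed native baseline up to 9.25 M urea so striking.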

Structural Correlates of Protection Against HDX in Di-III_14.

To gain further insight into the contradictory perspectives on the folding free-energy surface of Di-III_14, we mapped the 29 strongly protected NHs onto the structure. We found that these NHs correspond to a 15-residue hydrophobic core surrounded by 14 solvent-exposed polar and charged side chains. Interestingly, 12 of the latter 14 side chains are components of two electrostatic networks (Fig. 3). One spans the surface of α2 and links it to α1, and the other contains a quartet of salt bridges that link the two internal β-strands, β2 and β4. The most slowly exchanging V31 (β2) and I70 (β4) NHs are not only flanked by two of these salt bridges, but are also hydrogen-bonded to each other through their amide NHs and carbonyl oxygens. The twofold slower exchange for V31 vs. I70, however, shows that their shared H-bonds do not break in a concerted fashion (SI Appendix, Fig. S3 and Tables S2 and S3). The observed exchange-rate constants are inversely correlated to the packing density (Fig. 4A) (21), as might be expected for an EX1 mechanism whereby the molecular rearrangements required for exchange are limited by high barriers. Comparisons with the average packing densities of natural mesophilic, thermophilic, and hyperthermophilic IF3-like proteins (Fig. 4B) find Di-III_14 to be as well packed as the latter group of exceedingly stable proteins.

Fig. 3.


Structural correlates of exchange-rate constants at pH 7.4 and 25 °C. (A) Measurable exchange-rate constants are superimposed on a ribbon diagram of the NMR structure of Di-III_14 (Protein Data Bank ID code 2LN3). V31 and I70, which have observable exchange over a 3-d collection window, are shown in black. The color scale from orange to blue follows the quintiles of the observable range of the rate constants as indicated. Weakly protected residues (2.5 × 10^−4 s−1 < kex < 8.3 × 10^−4 s−1) are shown in red. (B) Networks of electrostatic interactions spanning the surface of helix α2 and linked to helix α1 (Left) and linking β2 and β4 (Right). The orange dashed lines indicate distances of less than 8 Å between the carboxyl oxygens on aspartic acid or glutamic acid side chains and nitrogens on arginine or lysine side chains. The images were created with the PyMOL Molecular Graphics System (version 1.2r3pre; Schrödinger).

Fig. 4.


(A) Correlation between the normalized packing densities of each amino acid in WT Di-III_14 calculated with Voronoia software (21) and observed exchange-rate constants of their amide hydrogens at pH 7.4 and 25 °C. (B) Comparative global analysis of the normalized packing densities of naturally occurring IF3-like fold proteins from different mesophilic (37 °C), thermophilic (75 °C), and hyperthermophilic (95 °C) organisms with Di-III_14. The width of the vertical line indicates the statistical dispersion, and the middle bar represents the median of the distribution. The small cross symbols represent the outliers. The vertical brackets indicate the maximum and minimum values of packing densities. The small square boxes correspond to the representative averaged value of the packing density for the mesophilic (0.715), thermophilic (0.764), and hyperthermophilic (0.83) variants, respectively.

Exchange-rate constants were uniformly accelerated by increasing the concentration of NaCl by 0.25 M and 0.5 M (SI Appendix, Fig. S3). By contrast, nonmonotonic protection for NHs in α1, α2, β2, and β4, increasing and then decreasing, was observed when increasing the GdnHCl concentration from 0.5 to 1.0 M. The monotonic response to NaCl likely reflects the screening of electrostatic interactions on the surface of Di-III_14. The nonmonotonic response to GdnHCl suggests that the guanidinium cation selectively binds in the vicinity of the electrostatic networks in the native conformation, slowing exchange, before favoring denaturation and accelerating exchange. Crystallographic studies, molecular dynamics simulations, and solution-phase studies have previously revealed the binding of GdnHCl to the surface of native proteins, before its preferential binding to the unfolded state (22–24).

Mutational Analysis of Electrostatic and Hydrophobic Networks in Di-III_14.

Electrostatic networks.

To test the role of electrostatic networks in defining the folding free-energy surface of Di-III_14, we targeted the E32 and R69 salt bridge, linking β2 and β4, and R46 in α2, the central cationic charge surrounded by a ring of anionic side chains in α2 (Fig. 3), for mutagenesis. In all three cases, the neutralization of the charge by mutation to alanine weakened the resistance to urea unfolding, with unfolding beginning at approximately 5 M urea that was not complete by 9 M urea (SI Appendix, Fig. S5B). By contrast, the individual reversal of the charges at all three positions had a dramatic effect on the stability of the variants. The E32R, R46E, and R69E variants are readily unfolded in urea with modest stabilities comparable to those from the GdnHCl melts, 3.32–5.48 kcal/mol (Fig. 1D and SI Appendix, Fig. S5 A and B and Table S4). E32R and R46E retain native-like secondary structure, but R69E loses approximately 25% of the CD signal at 222 nm (SI Appendix, Fig. S6). Notably, all three charge-reversal variants retain a detectable fraction of their ellipticity at 222 nm in the urea-denatured state relative to the GdnHCl-denatured state (Fig. 1D and SI Appendix, Fig. S5 A and B). Thus, the estimates of stability from urea melts are a lower bound on the difference in free energies between the native and fully unfolded states of this set of proteins.

To further probe the E32/R69 electrostatic interaction, we constructed the R69K variant to test the sensitivity of the energy surface to a change in sequence while conserving the charge. The urea melt for R69K began at ∼3 M urea and was only partially complete at 9 M urea (SI Appendix, Fig. S7). Lysine, a primary amine, is not capable of maintaining the remarkable resistance of Di-III_14 to urea denaturation observed for the WT protein. Evidently, the structure of the guanidinium moiety, not simply a positive charge, is crucial to the electrostatic network.

The dramatic decrease in resistance to urea denaturation and retention of structure for the E32R mutation led to an HDX NMR study to determine if the mutation altered the exchange properties. We found that the E32R variant retained the same pH-independent EX1 exchange kinetics as the WT protein (Fig. 2B), albeit in a faster time range. Therefore, the surface roughness of Di-III_14 in water can persist in the absence of its unusual resistance to urea denaturation.

Hydrophobic network.

The uniquely strong protection for V31 and I70, deeply buried in the hydrophobic network, motivated a mutational analysis of their roles in defining the folding free-energy surface of Di-III_14. The replacement of V31 or I70 with alanine had a striking effect on the chemical denaturation profiles (Fig. 1D and SI Appendix, Fig. S5A). The V31A and I70A variants now unfold in urea and yield stabilities very similar to those for their respective GdnHCl titrations (Fig. 1D and SI Appendix, Fig. S5A and Table S4). Remarkably, the HDX experiment on V31A revealed a much-accelerated exchange process governed by an EX2 mechanism (Fig. 2B). It is astonishing that the deletion of only two methyl groups when valine is replaced by alanine is sufficient to eliminate the surface roughness in Di-III_14.

Discussion

The de novo design of proteins has been remarkably successful, with a variety now available for novel applications (6, 25–27). Failed sequences indicate the need for further refinement (28), but targeting the final structure, presumably the global minimum in free energy, has been a winning strategy. However, as we found in the present study of a small βα protein, the underlying free-energy surface may be very complex. The determination of the molecular basis for these complexities proved to be instructive for the next generation of design strategies.

Molecular Basis for the Complex Folding Free-Energy Surface of Di-III_14.

The folding free-energy surface of Di-III_14 is remarkably more complex than that of almost all other proteins studied thus far. The surface as determined by a GdnHCl melt is very simple, with only the native and unfolded states being highly populated and separated by a single barrier. The complete resistance to urea denaturation implies a stability far greater than that observed in GdnHCl, reflecting contributions from the extensive electrostatic and hydrophobic networks of side chains in this small βα protein. The free-energy surface in water is exceedingly rough, with multiple high-energy states that slowly interconvert to enable exchange of main-chain NHs with solvent (Fig. 5).

Fig. 5.


The free-energy landscape of Di-III_14 at pH 7.4 and 25 °C in water, illustrating the barrier heights and energies of a few representative exchange-competent states relative to the native state, N. The exchange-competent states are labeled by the residue whose amide hydrogen atom exchanges with the solvent. The data do not distinguish between stochastic and sequential mechanisms for exchange. The procedures used to calculate the barrier heights and free energies of the exchange-competent states are described in the SI Appendix. The relative free energy of the unfolded state, U, and the transition state to the native basin in water are not known and are represented as a dashed line.

We conjecture that the dramatically different responses to the chemical denaturants reflect several properties of GdnHCl. First, the ionic nature of GdnHCl screens surface electrostatic interactions that contribute to stability (Fig. 1C). Second, the guanidinium moiety binds selectively to the surface of Di-III_14 in the vicinity of salt bridges whose cationic component is arginine (SI Appendix, Fig. S3). Direct competition between the denaturant and R30 and R32 would further weaken the electrostatic network. Third, GdnHCl is a more potent denaturant than urea, binding more tightly to exposed main-chain amide linkages in the unfolded state (29) and increasing the solubility of nonpolar side chains in water (30). Point mutations in the electrostatic and hydrophobic networks can enable two-state unfolding in urea; however, residual structure persists in the urea-denatured state. We presume that this residual structure, along with the hydrophobic network, also stabilizes the high-energy states that enable HDX in water. GdnHCl unfolds these states and results in a simple two-state folding reaction when extrapolated to water.

Electrostatic networks.

A significant fraction of the charged side chains in Di-III_14 are involved in a pair of electrostatic networks, one on the solvent-exposed faces of α2 and α1 and the other linking β2 and β4 (Fig. 3). Charge reversal at E32 (β2) and R69 (β4) eliminates the resistance to urea denaturation and reveals comparable stabilities to GdnHCl melts. Looking more closely at the β2 and β4 strands, we find a striking segregation of charge (Fig. 3). β2 contains four anionic side chains in alternating positions, D28, E30, E32, and D34, and β4 contains a corresponding number of cationic side chains, K73, R71, R69, and K67, that form interstrand salt bridges. R69 bridges E30 and E32 in a three-way salt bridge. Placing a cationic side chain, R32, adjacent to E30 and D34 in β2 and opposite R69 in β4 introduces intrastrand attractive interactions in β2 and repulsive interstrand interactions between β2 and β4, effectively uncoupling the network that stabilizes the native conformation. The similar effects of replacing R69 in β4 and R46 in α2 with an anionic glutamic acid reinforce the conclusion that both networks are crucial to the resistance to urea denaturation. Several groups (31–33) have explored the relationship between electrostatic interactions on the surface of proteins and their stabilities. Makhatadze and coworkers (31) have found that the optimization of electrostatic contributions to stability reflects less the pair-wise short-range interactions and more the combined effect of multiple long-range interactions defined by the global distribution of charges. The pair of electrostatic networks in Di-III_14 illustrates another source of collective interactions that serve to enhance its stability. Although not an explicit feature of the de novo design, the effect on the stability to thermal or urea denaturation is dramatic for such a small protein.

Hydrophobic network.

The strong protection against HDX in water for Di-III_14 also draws attention to a 15-residue network of nonpolar side chains, FILV (phenylalanine-isoleucine-leucine-valine), sandwiched between the surface α-helices and the underlying β-sheet. Especially noteworthy are V31 and I70 that exchange over a period of days. The replacement of the branched aliphatic side chains in V31 or I70 with alanine eliminates the resistance to urea denaturation, demonstrating that the hydrophobic network also plays a crucial role in the differential effects of the denaturants. In contrast to the disruption of the electrostatic network by charge reversal, the replacement of the isopropyl side chain of V31 with the methyl group of alanine eliminates the rough energy surface of Di-III_14. As the last NH to exchange, V31 must stabilize all of the preceding high-energy states on the folding free-energy surface of Di-III_14 and support the barriers between these states that give rise to EX1 exchange kinetics.

There are two scenarios for the EX1 HDX kinetics. One scenario imagines a progressive unfolding of the protein, dictated by the packing density. The most deeply buried V31 is the last to be exposed to solvent as its hydrophobic environment melts away. The second scenario imagines a stochastic exposure of the main-chain NHs, with the hydrophobic core largely intact but dynamic, as each NH is uniquely exposed to solvent before the core reforms around it. The first scenario is favored by the observation of residual secondary structure in urea-denatured E32R (Fig. 1D and SI Appendix, Fig. S5A). The persistence of residual structure in 9.25 M urea for E32R makes it very likely that residual structure also exists in the exchange-competent state for V31 in the parent Di-III_14 sequence in water. We presume that this residual structure is the platform that stabilizes the hydrophobic core and, in combination with the core, results in a rough energy surface. In either scenario, the inverse correlation of exchange-rate constants with packing density in the native conformation means that exchange must take place from higher-energy states in the native basin. These exchange-competent states in the native basin are separated by substantial barriers as a result of the large hydrophobic and electrostatic networks that characterize the structure of Di-III_14.

In addition, the protection from hydrogen exchange observed at 0.5 M GdnHCl (SI Appendix, Fig. S3) is likely a result of the interaction of the guanidinium cation with the electrostatic networks present in the native conformation. This observation further supports our hypothesis that the rugged folding free-energy surface of Di-III_14, characterized by multiple exchange-competent states, exists in the native basin. Acceleration of the exchange process is observed at 1.0 M GdnHCl as the denaturation potential of the guanidinium cation begins to modulate the free-energy surface.

Charge Segregation and the Folding Free-Energy Surface of Di-III_14.

Das and Pappu (34) have simulated the behavior of polypeptides containing various compositions and distributions of oppositely charged side chains, lysine and glutamic acid, and constructed a phase diagram for polyampholytes. They found that the total fraction of charged residues, the net charge per residue (NCPR), and a normalized charge segregation parameter, κ, were sufficient to predict chain compaction. Di-III_14, with 35% charged residues, an NCPR of −0.06, and a κ value of 0.25, is expected to form a compact unfolded state in water (34, 35). The predicted compactness derives in part from the preferred proximity of oppositely charged segments in the unfolded state. Not considered in their analysis was the potential effect of charge segregation on the native state of a resident protein. The dramatic effect on urea denaturation by neutralizing or reversing the charge of either partner in the E32/R69 salt bridge in Di-III_14 points to a potential role for charge segregation in creating electrostatic networks that make significant contributions to the free energy of the native state.
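Two of the three composition parameters named above are simple counts over the sequence: the fraction of charged residues is (N+ + N−)/N and the signed net charge per residue is (N+ − N−)/N. The sketch below shows that bookkeeping under stated assumptions: lysine and arginine are counted as cationic, aspartate and glutamate as anionic, and histidine as neutral. It does not use the CIDER software and does not compute κ, which requires a sliding-window analysis of charge patterning. The example sequence is a made-up toy string, not the Di-III_14 sequence.

```python
POSITIVE = set("KR")   # assumed cationic residues at neutral pH
NEGATIVE = set("DE")   # assumed anionic residues at neutral pH


def charge_fractions(sequence):
    """Return (fraction of charged residues, signed net charge per residue)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n_pos = sum(residue in POSITIVE for residue in seq)
    n_neg = sum(residue in NEGATIVE for residue in seq)
    length = len(seq)
    fcr = (n_pos + n_neg) / length
    ncpr = (n_pos - n_neg) / length
    return fcr, ncpr


# Hypothetical 10-residue toy sequence used only to exercise the function.
toy_sequence = "DEEDKKRLLA"
fcr, ncpr = charge_fractions(toy_sequence)

print(f"FCR = {fcr:.2f}, NCPR = {ncpr:+.2f}")
```

A balanced but highly charged sequence, as described for Di-III_14, would give a large fraction of charged residues together with a net charge per residue near zero.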

Implications from Evolution for Protein Design.

The complex energy surface observed for Di-III_14 in water has implications for the design of proteins with folding landscapes more similar to those of naturally occurring proteins.

Electrostatics.

The unusually high but balanced composition of charged side chains and their segregation into positive and negative segments of the sequence in Di-III_14 differentiate this designed protein from almost the entire proteome of the thermophile Sulfolobus solfataricus (Fig. 6). Charge neutralization or reversal of individual residues that participate in the pair of electrostatic networks overcomes the complete resistance to urea denaturation, resulting in an estimate of stability comparable to that for GdnHCl denaturation, as seen for numerous other naturally occurring proteins. Decreasing the composition of charged side chains and avoiding charge segregation to mimic those properties found in proteomes whose sequences were filtered by evolution should be an important design feature.

Fig. 6.

Distribution of the normalized charge segregation parameter, κ, vs. the fraction of charged residues for the S. solfataricus proteome from UniProt, calculated with the CIDER algorithm (34). Scatter-plot color coding is scaled from high disorder (red) to low disorder (blue). The bold red circle corresponds to the value for Di-III_14. Representations of the integration of the scatter plots along the x-axis (Top) and y-axis (Right) are also shown.

Hydrophobics.

The sequence of Di-III_14 has a distinct bias toward larger aliphatic side chains, 10 leucines and 6 isoleucines vs. 1 valine and 1 alanine; toward larger polar/charged side chains, 10 glutamic acids and 6 glutamines vs. 6 aspartic acids and 2 asparagines; and away from aromatic side chains, 3 phenylalanines, 1 tryptophan, 0 tyrosines, and 0 histidines. The bias toward larger side chains in very stable thermophilic proteins was observed long ago (36); however, the present study argues that there are limits to the benefits of an increased packing density. The β- and γ-branched side chains in the isoleucine–leucine–valine (ILV) subset can also create an interlocked network that could impede interconversions between high-energy states and contribute to a rough energy surface. The introduction of even a single alanine in place of a valine, V31A, unlocks the hydrophobic network and allows the rapid interchange between high-energy states in the native basin. Perhaps the lesson from evolution is that the inclusion of a more diverse set of nonpolar side chains in design, e.g., alanine, cysteine, and methionine, would enable the retention of the high packing density seen in thermophilic proteins (Fig. 4B), but reduce the probability of an interlocked network of side chains that creates the rough energy surface seen for Di-III_14.

The rough energy surface in this small designed protein, created by a gradient in the packing density and enhanced by a pair of electrostatic networks, is a consequence of a design algorithm that logically favors stability (i.e., packing density) and solubility (i.e., charged side chains on the surface). The lessons from evolution—to increase the compositional complexity, reduce the composition of charged side chains, and avoid charge segregation—can readily be incorporated into future design projects.

Materials and Methods

Protein Expression and Purification.

Di-III_14 was engineered with an N-terminal His6 tag and an intervening TEV protease site (GenScript), ligated into the expression plasmid pGS-21a, and transformed into the Escherichia coli strain BL21 CodonPlus (DE3)RIL for expression. The purification procedures are provided in the SI Appendix.

CD and Fluorescence Analysis.

Far-UV CD experiments were carried out in 100 mM NaCl, 5.6 mM Na2HPO4, and 1.1 mM KH2PO4 buffer at different pHs at 25 °C on a JASCO model J810 CD spectrophotometer. Intrinsic tryptophan fluorescence was monitored on a Horiba Fluorolog-3 by using a 0.5-cm path length, a slit width of 5 nm, and excitation at 295 nm. The fluorescence and CD data were globally fit to a two-state N ⇌ U model as a function of denaturant by using our in-house Savuka and the commercial Origin software packages. More details about the fitting are provided in the SI Appendix.
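For readers unfamiliar with the two-state N ⇌ U model, the following minimal sketch shows how the fraction unfolded and an observed signal depend on denaturant concentration. It assumes the common linear extrapolation of the unfolding free energy (ΔG = ΔG(H2O) − m[denaturant]) and denaturant-independent baselines; the fitting procedure actually used is described in the SI Appendix, and every parameter value below is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal mol^-1 K^-1


def fraction_unfolded(denaturant, dg_h2o, m_value, temp_k=298.15):
    """Two-state fraction unfolded under the linear extrapolation assumption.

    denaturant: molar concentration; dg_h2o: unfolding free energy in water
    (kcal/mol); m_value: denaturant dependence of dG (kcal/mol/M).
    """
    dg = dg_h2o - m_value * denaturant
    k_unf = math.exp(-dg / (R_KCAL * temp_k))  # equilibrium constant [U]/[N]
    return k_unf / (1.0 + k_unf)


def two_state_signal(denaturant, dg_h2o, m_value, s_native, s_unfolded,
                     temp_k=298.15):
    """Population-weighted signal, assuming flat native/unfolded baselines."""
    f_u = fraction_unfolded(denaturant, dg_h2o, m_value, temp_k)
    return (1.0 - f_u) * s_native + f_u * s_unfolded


# Hypothetical parameters: midpoint at dg_h2o / m_value = 2.5 M denaturant.
dg_h2o, m_value = 5.0, 2.0
for conc in (0.0, 2.5, 5.0):
    f_u = fraction_unfolded(conc, dg_h2o, m_value)
    print(f"[denaturant] = {conc:.1f} M -> fraction unfolded = {f_u:.4f}")
```

In a global fit of this kind, the thermodynamic parameters (here `dg_h2o` and `m_value`) would be shared between the CD and fluorescence datasets, while each dataset keeps its own baseline signals.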

Kinetic Studies.

The time dependence of the Trp fluorescence intensity during unfolding and refolding reactions was monitored in the millisecond time regime on an Applied Photophysics stopped-flow (SF) system, model SX.18MV, with a dead time of 2 ms. Di-III_14 was injected from a 40 μM stock and diluted 1:10, to 4 μM, with 100 mM NaCl, 5.6 mM Na2HPO4, and 1.1 mM KH2PO4 at varying pHs to reach the desired final GdnHCl concentration.

Kinetic folding reactions in the microsecond time regime were measured on a home-built continuous-flow (CF) time-correlated single photon counting instrument with a mixing time of 60 μs. Further details are provided in the SI Appendix.

HDX NMR.

Uniform labeling of Di-III_14 with 15N was accomplished by growing cells in isotopically enriched M9 medium containing 1 g 15NH4Cl per liter. The sample was lyophilized after purification. The experimental buffer used in the HDX NMR experiments contained 100 mM NaCl, 5.6 mM Na2HPO4, and 1.1 mM KH2PO4 at pH 7.4, pH 6.5, or pH 6.0. All pH values reported are pD values obtained by correcting the pH meter readings: pD = pH (meter reading) + 0.41. Hydrogen exchange rates were measured in two ways: (i) lyophilized samples of 15N Di-III_14 were dissolved in the 100% D2O experimental buffer described above or (ii) samples of 15N Di-III_14 dissolved in 100% H2O experimental buffer were diluted in 100% D2O buffer to a final ratio of 90% D2O/10% H2O. Further details are provided in the SI Appendix.

Acknowledgments

We thank all members of the laboratories of F.M. and C.R.M. for helpful discussions. This work was supported by National Science Foundation Grant MCB 1121942 (to C.R.M.), National Institutes of Health Grant GM117008 (to F.M.), and the Howard Hughes Medical Institute (D.B.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1818744116/-/DCSupplemental.

References

  • 1. Huang P-S, et al. De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy. Nat Chem Biol. 2016;12:29–34. doi: 10.1038/nchembio.1966.
  • 2. Keefe AD, Szostak JW. Functional proteins from a random-sequence library. Nature. 2001;410:715–718. doi: 10.1038/35070613.
  • 3. Sillitoe I, et al. New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures. Nucleic Acids Res. 2013;41:D490–D498. doi: 10.1093/nar/gks1211.
  • 4. Regan L, DeGrado WF. Characterization of a helical protein designed from first principles. Science. 1988;241:976–978. doi: 10.1126/science.3043666.
  • 5. Harbury PB, Plecs JJ, Tidor B, Alber T, Kim PS. High-resolution protein design with backbone freedom. Science. 1998;282:1462–1467. doi: 10.1126/science.282.5393.1462.
  • 6. Kuhlman B, et al. Design of a novel globular protein fold with atomic-level accuracy. Science. 2003;302:1364–1368. doi: 10.1126/science.1089427.
  • 7. Watters AL, et al. The highly cooperative folding of small naturally occurring proteins is likely the result of natural selection. Cell. 2007;128:613–624. doi: 10.1016/j.cell.2006.12.042.
  • 8. Koga N, et al. Principles for designing ideal protein structures. Nature. 2012;491:222–227. doi: 10.1038/nature11600.
  • 9. Rocklin GJ, et al. Global analysis of protein folding using massively parallel design, synthesis, and testing. Science. 2017;357:168–175. doi: 10.1126/science.aan0693.
  • 10. Bale JB, et al. Accurate design of megadalton-scale two-component icosahedral protein complexes. Science. 2016;353:389–394. doi: 10.1126/science.aaf8818.
  • 11. Boyken SE, et al. De novo design of protein homo-oligomers with modular hydrogen-bond network–mediated specificity. Science. 2016;352:680–687. doi: 10.1126/science.aad8865.
  • 12. Lu P, et al. Accurate computational design of multipass transmembrane proteins. Science. 2018;359:1042–1046. doi: 10.1126/science.aaq1739.
  • 13. Butterfield GL, et al. Evolution of a designed protein assembly encapsulating its own RNA genome. Nature. 2017;552:415–420. doi: 10.1038/nature25157.
  • 14. Plaxco KW, Simons KT, Ruczinski I, Baker D. Topology, stability, sequence, and length: Defining the determinants of two-state protein folding kinetics. Biochemistry. 2000;39:11177–11183. doi: 10.1021/bi000200n.
  • 15. Matthews CR. Effect of point mutations on the folding of globular proteins. Methods Enzymol. 1987;154:498–511. doi: 10.1016/0076-6879(87)54092-7.
  • 16. Krishna MMG, Hoang L, Lin Y, Englander SW. Hydrogen exchange methods to study protein folding. Methods. 2004;34:51–64. doi: 10.1016/j.ymeth.2004.03.005.
  • 17. Maity H, Lim WK, Rumbley JN, Englander SW. Protein hydrogen exchange mechanism: Local fluctuations. Protein Sci. 2003;12:153–160. doi: 10.1110/ps.0225803.
  • 18. Zhang YZ. 1995 Protein and peptide structure and interactions studied by hydrogen exchange and NMR. PhD thesis (Univ Pennsylvania, Philadelphia). Available at https://books.google.com/books?id=p-KcNwAACAAJ.
  • 19. Pace CN. Measuring and increasing protein stability. Trends Biotechnol. 1990;8:93–98. doi: 10.1016/0167-7799(90)90146-o.
  • 20. Myers JK, Pace CN, Scholtz JM. Denaturant m values and heat capacity changes: Relation to changes in accessible surface areas of protein unfolding. Protein Sci. 1995;4:2138–2148. doi: 10.1002/pro.5560041020.
  • 21. Rother K, Hildebrand PW, Goede A, Gruening B, Preissner R. Voronoia: Analyzing packing in protein structures. Nucleic Acids Res. 2009;37:D393–D395. doi: 10.1093/nar/gkn769.
  • 22. Gangadhara BN, Laine JM, Kathuria SV, Massi F, Matthews CR. Clusters of branched aliphatic side chains serve as cores of stability in the native state of the HisF TIM barrel protein. J Mol Biol. 2013;425:1065–1081. doi: 10.1016/j.jmb.2013.01.002.
  • 23. Gu Z, Zitzewitz JA, Matthews CR. Mapping the structure of folding cores in TIM barrel proteins by hydrogen exchange mass spectrometry: The roles of motif and sequence for the indole-3-glycerol phosphate synthase from Sulfolobus solfataricus. J Mol Biol. 2007;368:582–594. doi: 10.1016/j.jmb.2007.02.027.
  • 24. Jha SK, Marqusee S. Kinetic evidence for a two-stage mechanism of protein denaturation by guanidinium chloride. Proc Natl Acad Sci USA. 2014;111:4856–4861. doi: 10.1073/pnas.1315453111.
  • 25. Chevalier A, et al. Massively parallel de novo protein design for targeted therapeutics. Nature. 2017;550:74–79. doi: 10.1038/nature23912.
  • 26. Hosseinzadeh P, et al. Comprehensive computational design of ordered peptide macrocycles. Science. 2017;358:1461–1466. doi: 10.1126/science.aap7577.
  • 27. Hsia Y, et al. Design of a hyperstable 60-subunit protein dodecahedron. [corrected] Nature. 2016;535:136–139, and erratum (2016) 540:150. doi: 10.1038/nature18010.
  • 28. Kuhlman B, Baker D. Native protein sequences are close to optimal for their structures. Proc Natl Acad Sci USA. 2000;97:10383–10388. doi: 10.1073/pnas.97.19.10383.
  • 29. Tanford C. 1970. Protein Denaturation: Part C. Theoretical Models for the Mechanism of Denaturation. Advances in Protein Chemistry, eds Anfinsen CB, Edsall JT, Richards FM (Academic, New York), Vol 24, pp 1–95.
  • 30. Nozaki Y, Tanford C. The solubility of amino acids and related compounds in aqueous ethylene glycol solutions. J Biol Chem. 1965;240:3568–3575.
  • 31. Tzul FO, Schweiker KL, Makhatadze GI. Modulation of folding energy landscape by charge–charge interactions: Linking experiments with computational modeling. Proc Natl Acad Sci USA. 2015;112:E259–E266. doi: 10.1073/pnas.1410424112.
  • 32. Grimsley GR, et al. Increasing protein stability by altering long-range coulombic interactions. Protein Sci. 1999;8:1843–1849. doi: 10.1110/ps.8.9.1843.
  • 33. Spector S, et al. Rational modification of protein stability by the mutation of charged surface residues. Biochemistry. 2000;39:872–879. doi: 10.1021/bi992091m.
  • 34. Das RK, Pappu RV. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc Natl Acad Sci USA. 2013;110:13392–13397. doi: 10.1073/pnas.1304749110.
  • 35. Huihui J, Firman T, Ghosh K. Modulating charge patterning and ionic strength as a strategy to induce conformational changes in intrinsically disordered proteins. J Chem Phys. 2018;149:085101. doi: 10.1063/1.5037727.
  • 36. Vieille C, Zeikus GJ. Hyperthermophilic enzymes: Sources, uses, and molecular mechanisms for thermostability. Microbiol Mol Biol Rev. 2001;65:1–43. doi: 10.1128/MMBR.65.1.1-43.2001.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences