SUMMARY
The folding fate of a protein in vivo is determined by the interplay between a protein’s folding energy landscape and the actions of the proteostasis network, including molecular chaperones and degradation enzymes. The mechanisms of individual components of the E. coli proteostasis network have been studied extensively, but much less is known about how they function as a system. We used an integrated experimental and computational approach to quantitatively analyze the folding outcomes (native folding vs. aggregation vs. degradation) of three test proteins biosynthesized in E. coli under a variety of conditions. Overexpression of the entire proteostasis network benefited all three test proteins, but the effect of upregulating individual chaperones or the major degradation enzyme, Lon, varied for proteins with different biophysical properties. In sum, the impact of the E. coli proteostasis network is a consequence of concerted action by the Hsp70 system (DnaK/DnaJ/GrpE), the Hsp60 system (GroEL/GroES), and Lon.
INTRODUCTION
Protein homeostasis, or proteostasis, is achieved when an organism has enough natively folded proteins to carry out its essential functions but not enough misfolded and aggregated proteins to interfere with organismal fitness (Balch et al., 2008; Powers et al., 2009). In a simplified view of proteostasis, new proteins can have three fates: they can fold to their native states; they can misfold and/or aggregate; or they can be degraded (Figure 1). Proteins that experience the latter two fates are not functional. All organisms regulate the health of their proteomes via a collection of chaperones, folding enzymes, proteases and other components that together make up the proteostasis network, or PN (Kim et al., 2013; Powers and Balch, 2013).
Figure 1. Schematic of Kinetic Partitioning During Protein Folding In Vivo.
Protein folding in vivo begins with ribosomal synthesis (“Synthesis”), which yields unfolded protein molecules (“U”; note that we neglect the possibility of co-translational folding). Unfolded protein can fold to the native state (“N”), misfold to the misfolded state (“M”) or be degraded (“Degradation”). Misfolded protein can be degraded or self-associate to form aggregates (“A”; note that aggregation is reversible in principle, as shown here, but in many cases is irreversible in practice unless assisted by the proteostasis network). A protein’s folding energy landscape dictates its partitioning among the unfolded, native, misfolded, and aggregated states in vitro (red text). However, each folding process is modulated by components of the proteostasis network in vivo. In E. coli, DnaK, DnaJ, and GrpE (KJE; the Hsp70–Hsp40 system) oppose misfolding by binding to misfolded protein molecules and forcing them to resume the unfolded state. GroEL and GroES (GroELS; the chaperonin system) promote folding by encapsulating unfolded protein molecules and enabling them to fold in an isolated cavity. Degradation is carried out by proteases, in particular Lon. Finally, ClpB (Hsp104) collaborates with KJE to solubilize aggregates.
Since protein folding, misfolding, and aggregation equilibria are linked, PN components that modulate any of these processes indirectly affect the others. However, each PN component seems to have an “assigned responsibility”: a process that it affects most directly. Using the E. coli PN as an example, native folding is promoted most directly by GroEL and GroES, the E. coli chaperonin–co-chaperonin pair (Chapman et al., 2006; Horwich and Fenton, 2009). Misfolding and aggregation are opposed most directly by DnaK, DnaJ, and GrpE, the E. coli Hsp70–Hsp40–nucleotide exchange factor trio (Calloni et al., 2012; Mayer and Bukau, 2005; Sharma et al., 2010), and by the collaboration of this trio with the disaggregating chaperone ClpB (Doyle et al., 2013). Finally, many proteases degrade proteins in E. coli, but Lon appears to be the most important for degrading misfolded protein (Gottesman, 1996; Gur and Sauer, 2008).
While the main functions of individual PN components are fairly well understood, their contributions to the integrated, system-level function of the whole PN are not as clear (Bershtein et al., 2013; Dickson and Brooks, 2013; Hingorani and Gierasch, 2014; Kim et al., 2013; Powers et al., 2012; Wiseman et al., 2007). How do PN components complement each other? Do they perform multiple or redundant functions? To what extent does a protein’s folding energy landscape determine its route through the PN and its fate? We have used a combination of experiments and computational modeling to address these and related questions.
Here, we focus on how proteins with low stabilities behave when overexpressed in E. coli because expression of such proteins challenges proteostasis (Gidalevitz et al., 2006; Olzscha et al., 2011). We chose E. coli as a model organism because of the availability of FoldEco (Powers et al., 2012), a computational model of E. coli’s PN that is essential for answering the mechanistic questions posed above. In addition, E. coli is widely used as a microbial factory for producing heterologous proteins. Failures in proteostasis were quantified by measuring total expression levels and the amount of aggregated vs. soluble protein. These quantities report on the extent of degradation, aggregation, and native folding experienced by our test proteins, and therefore cover each of a protein’s potential fates. We interrogated the roles of the DnaK–DnaJ–GrpE pathway (KJE), the GroEL–GroES pathway (GroELS), and Lon by overexpressing these PN components individually or in combinations. The point at which, and the extent to which, proteostasis failed for the test proteins then informed us as to the limits of the E. coli PN.
The test proteins in this work are unstable variants of E. coli dihydrofolate reductase (EcDHFR), murine cellular retinoic acid-binding protein 1 (MmCRABP1), and a de novo designed retroaldolase enzyme (RA114.3). These proteins span a range of origins (endogenous E. coli vs. mammalian vs. de novo designed, respectively) and folds (αβα sandwich, β barrel, and α/β barrel, respectively; Figure 2) (Bjelic et al., 2014; Kleywegt et al., 1994; Liu et al., 2014; Sawaya and Kraut, 1997), and have no significant sequence similarity. By examining how each member of this diverse group partitions between being soluble, aggregating, and being degraded as a function of the composition of the E. coli PN, we hoped to extract general lessons about the attributes of the PN as a system in its interactions with as broad as possible a selection of proteins, as well as lessons about the dominant contributors to the PN’s various functions. It is important to note here that this undertaking requires the assumption that the PN of E. coli handles heterologous proteins and its own endogenous proteins similarly. While a PN component from one organism generally cannot complement the loss of the orthologous component in another organism, chaperones from one organism are generally capable of assisting the folding of proteins from another. For example, upregulation of E. coli chaperones improved the expression yields for most of a set of 64 heterologous proteins (de Marco et al., 2007). Thus, KJE, GroELS, and Lon from E. coli appear to be quite general in their selection and handling of substrates.
Figure 2. Test Proteins Used in this Work.
(A) Structures of wild type test proteins: EcDHFR (PDB ID: 1RX2) (Sawaya and Kraut, 1997), MmCRABP1 (PDB ID: 2CBR) (Kleywegt et al., 1994), and RA114 (PDB ID: 4OU1) (Liu et al., 2014). The sites of the mutations in the destabilized test proteins are highlighted as pink spheres. (B) Bands corresponding to the test proteins in SDS-PAGE gels run on samples derived from E. coli overexpressing the wild type (“wt”) and mutant forms of the test proteins for 2 h at 30 °C. The lane labeled “total” is from pre-centrifugation cell lysates and the lanes labeled “soluble” and “aggregated” are from the supernatants and pellets, respectively, after cell lysates were centrifuged for 10 min at 13,500×g. The wild type variants of the test proteins form little or no aggregates when overexpressed in E. coli, but the mutants aggregate substantially. See also Figure S1.
RESULTS
The Test Proteins and their Overexpression in the E. coli Proteostasis Network in the Absence of Other Perturbations
The test proteins studied here are the M42T/H114R mutant of EcDHFR (m-EcDHFR), the R131Q/Y133S mutant of MmCRABP1 (m-MmCRABP1), and the E10K/D120V/N124S/L225P mutant of RA114.3 with a C-terminal His tag (m-RA114). These proteins are small- to medium-sized (DHFR: 159 amino acids; CRABP1: 137 amino acids; RA114: 258 amino acids) and monomeric (Figure 2). That each of these mutants is less stable than the corresponding wild type protein is demonstrated by their susceptibilities to urea denaturation for m-EcDHFR and m-RA114 (Figure S1A, B), and was reported previously for m-MmCRABP1 (Budyak et al., 2013).
Each of these test proteins was expressed by isopropyl β-D-1-thiogalactopyranoside (IPTG) induction of a plasmid under control of the lac promoter in E. coli K12 DE3 (HMS174) cells growing in LB media at 30 °C. After 2 h induction, the cells were lysed, and aggregated and soluble proteins were separated by centrifugation. Total, soluble, and aggregated (i.e., in the pellet) protein fractions were analyzed by SDS-PAGE and the absolute amount of test protein in each was determined by comparison to a calibration line constructed using purified recombinant protein (Figure S1C; Supplemental Experimental Procedures). The total protein concentrations present at the end of the 2 h expression period were 498 ± 58, 106 ± 5, and 385 ± 40 μM (mean ± SEM) for m-EcDHFR, m-MmCRABP1, and m-RA114, respectively. Substantial aggregation was observed for each test protein, with the aggregated fractions amounting to 46 ± 3%, 76 ± 1% and 86 ± 1% of total protein for m-EcDHFR, m-MmCRABP1, and m-RA114, respectively (Figure 2B, Figure 3). Thus, m-EcDHFR appears to be the best behaved of our test proteins, while m-MmCRABP1 and m-RA114 are more aggregation prone. Expression of the wild type versions of these proteins under the same conditions resulted in negligible (wt-EcDHFR and wt-MmCRABP1) to low (wt-RA114) levels of aggregates (Figure 2B).
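The totals and aggregated fractions quoted above can be combined into approximate absolute concentrations of aggregated and soluble protein. The short sketch below does this arithmetic. It assumes that the soluble amount is simply the total minus the aggregated amount and it ignores the reported SEMs; the function name `split_total` is illustrative only.

```python
# Totals (uM) and aggregated fractions quoted in the text for the
# adapted-basal condition. SEMs are ignored in this sketch.
reported = {
    "m-EcDHFR": (498.0, 0.46),
    "m-MmCRABP1": (106.0, 0.76),
    "m-RA114": (385.0, 0.86),
}


def split_total(total_uM, aggregated_fraction):
    """Return (aggregated, soluble) concentrations in uM.

    Assumption: soluble = total - aggregated, i.e. the two fractions
    account for all of the measured protein.
    """
    aggregated = total_uM * aggregated_fraction
    return aggregated, total_uM - aggregated


partition = {name: split_total(*values) for name, values in reported.items()}

for name, (agg, sol) in partition.items():
    print(f"{name}: aggregated ~{agg:.0f} uM, soluble ~{sol:.0f} uM")
```

On these numbers, m-RA114 has the highest aggregated fraction but, because m-EcDHFR is expressed at a higher total level, both proteins carry a few hundred micromolar of aggregated material.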
Figure 3. Folding Fates of Test Proteins upon Overexpression in E. coli under Adapted-Basal Conditions and with Individual PN Components Upregulated.
(A) Experimental scheme for overexpression of test proteins. All experiments were performed at 30 °C. (B) Bar graph showing the results of overexpressing m-EcDHFR in E. coli under various conditions. Left axis: cytoplasmic concentration of m-EcDHFR as determined by quantitative analysis of gels like those shown in Figures 2B and S2E. Right axis: concentration of m-EcDHFR relative to the total concentration of m-EcDHFR (soluble + aggregated) produced under adapted-basal conditions. These concentrations are referred to in the text as [Sol]rel,X and [Agg]rel,X for soluble and aggregated forms of a given test protein “X”. White bars: total protein concentration under each condition. Blue bars: concentration of soluble protein under each condition (i.e., the concentration in the supernatant after lysis and centrifugation). Red bars: concentration of aggregated protein under each condition (i.e., the concentration in the pellet after lysis, centrifugation, and re-suspension). Error bars: standard errors of the mean. Numbers above bars: percentage of the total protein that is soluble or aggregated under each condition ± SEM. (C) As (B), but for m-MmCRABP1. (D) As (B), but for m-RA114. The “high” and “low” upregulation levels resulted in respective concentration increases of ~4- to 8-fold and ~2- to 3-fold for the major PN components (DnaK, GroEL, and Lon; Figure S2C, D). The “medium” upregulation level (for Lon only) resulted in a ~4-fold increase in the concentration of Lon (Figure S2C, D). See also Figure S2 and Table S1.
Heterologous protein expression can cause stress and lead to the upregulation of PN components (Gasser et al., 2008; Hoffmann and Rinas, 2004). We therefore measured the levels of GroEL, DnaK, and Lon before and 2 h after induction of m-EcDHFR and m-RA114. Compared to an empty vector control, DnaK and GroEL increased ~20% to 40% and Lon ~80% to 160% (Figure S2A). Similar increases in PN component levels were observed when wt-RA114, which is more stable than m-RA114 but still aggregates, was overexpressed. In contrast, overexpression of wt-EcDHFR, which is stable and well behaved, resulted in much smaller changes in PN component levels (Figure S2A). These results suggest that it is the overexpression of aggregation-prone proteins, and not the overexpression of proteins per se, that causes PN component levels to increase. The PN as it exists after being perturbed by expression of the test proteins will be referred to as the “adapted-basal” PN.
Individual Upregulation of KJE, GroELS, or Lon Moderately Decreases Test Protein Aggregation
To assess the effects of KJE, GroELS, and Lon on the test proteins, we introduced them into pBAD expression vectors (Figure S2B) so that their expression could be titratably induced by arabinose. Co-transformation with separate plasmids carrying the test protein and the PN components enabled us to express each independently. The expression of KJE, GroELS, or Lon was induced by adding arabinose 1 h before induction of the destabilized test protein (Figure 3A). The level of upregulation of the PN components was controlled by the concentration of arabinose added. The “high” and “low” upregulation levels resulted in respective concentration increases of ~4- to 8-fold and ~2- to 3-fold for the major PN components (DnaK, GroEL, and Lon; Figure S2C, D). The “medium” upregulation level, which was used only for Lon because the high-Lon conditions caused a drastic decrease in total protein levels (see below), increased Lon ~4-fold (Figure S2C, D).
Upregulation of KJE decreased, but did not eliminate, aggregation for each of the test proteins. The aggregated fraction decreased significantly from 46 ± 3% to 25 ± 4% of the total protein for m-EcDHFR (p = 0.001, one-tailed t-test); from 76 ± 1% to 46 ± 5% for m-MmCRABP1 (p < 0.001, one-tailed t-test); and from 86 ± 1% to 37 ± 3% for m-RA114 (p < 0.001, one-tailed t-test) under the high-KJE upregulation conditions (Figure 3B–D, Figure S2E, Table S1). In addition, the total test protein concentrations decreased by 30–40% under the high-KJE upregulation conditions (Figure 3B–D, Table S1). Low-KJE upregulation conditions also decreased total concentrations of the three proteins, but to a lesser extent. We cannot exclude the possibility that this result is due to direct delivery of substrates to proteases by DnaK or DnaJ (Sherman and Goldberg, 1992). However, based on modeling results with FoldEco that are presented in a later section, it appears more likely that the observed decreases in protein concentration are due to proteins that are rescued from misfolding and/or aggregation by KJE being degraded before they can fold or re-aggregate.
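The comparisons above are between two means, each reported as mean ± SEM. As a rough illustration of the size of these effects, the sketch below computes a two-sample t statistic directly from the reported means and SEMs. It assumes independent groups; with equal group sizes the pooled-variance and unequal-variance (Welch) forms give the same statistic. The number of replicates is not stated in this section, so no degrees of freedom or p-values are recomputed here, and the function name `t_from_sem` is illustrative.

```python
import math


def t_from_sem(mean_a, sem_a, mean_b, sem_b):
    """t statistic for the difference of two independent means.

    The standard error of the difference is sqrt(SEM_a^2 + SEM_b^2).
    Converting t to a p-value would additionally require the degrees
    of freedom, which depend on the (unstated) sample sizes.
    """
    return (mean_a - mean_b) / math.sqrt(sem_a**2 + sem_b**2)


# Aggregated fraction of m-EcDHFR (% of total), values from the text:
# adapted-basal 46 +/- 3 vs. high-KJE 25 +/- 4.
t_dhfr = t_from_sem(46.0, 3.0, 25.0, 4.0)
print(f"t = {t_dhfr:.2f}")
```

A positive statistic corresponds to the direction tested by the one-tailed comparison, namely that aggregation is lower when KJE is upregulated.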
Upregulation of GroELS similarly decreased aggregation of m-EcDHFR and m-RA114. The aggregated fractions decreased from 46 ± 3% to 22 ± 9% for m-EcDHFR (p = 0.004, one-tailed t-test) and from 86 ± 1% to 39 ± 3% for m-RA114 (p < 0.001, one-tailed t-test) under the high-GroELS upregulation conditions (Figure 3B, D; Table S1). In contrast, GroELS upregulation did not significantly decrease the aggregation of m-MmCRABP1 (from 76 ± 1% to 66 ± 11%; p = 0.26, one-tailed t-test) (Figure 3C, Table S1). This observation suggests that m-MmCRABP1 is not a good substrate for GroELS; we examine this possibility in a later section.
Upregulation of Lon decreased the levels of both the soluble and aggregated forms of the test proteins, but, especially at the low and medium upregulation levels, the decrease was larger for the aggregates than for the soluble protein for two of the three (m-MmCRABP1 and m-RA114; compare Figure 3B to Figures 3C and D).
Upregulation of KJE, GroELS, or Lon in Pairs Further Decreases Test Protein Aggregation
To determine the effects of joint upregulation of KJE, GroELS, and Lon on our test proteins, we introduced pairs of these systems into the same expression vectors with one system under an arabinose promoter and the other under a tetracycline promoter (Figure S2B). We then repeated the test protein overexpression experiments described above (Figure 3A), except that the PN pathways were upregulated in pairs using inducer concentrations at the high upregulation level. In all cases, the levels of the PN components did not increase as much as when they were overexpressed on their own (~2- to 4-fold instead of 4- to 8-fold; Figure S3A).
GroELS+Lon and KJE+Lon were the most effective pairs of PN components for suppressing test protein aggregation (Figure 4, Table S1). The GroELS+KJE combination was less effective at suppressing aggregation (i.e., the fraction aggregated was higher under this condition) than either GroELS+Lon or KJE+Lon for m-MmCRABP1 and m-RA114 (for m-MmCRABP1: p = 0.0025 and 0.0016 for GroELS+KJE vs. KJE+Lon and GroELS+Lon; for m-RA114: p = 0.001 and 0.023 for GroELS+KJE vs. KJE+Lon and GroELS+Lon, one-tailed t-test). The same trends are apparent for m-EcDHFR, although the null hypothesis cannot be rejected with the same level of confidence (p = 0.089 and 0.176 for GroELS+KJE vs. KJE+Lon and GroELS+Lon, one-tailed t-test).
Figure 4. Folding Fates of Test Proteins upon Overexpression in E. coli under Adapted-Basal Conditions and with PN Components Upregulated in Pairs or after Overexpression of I54N σ32.
(A) As in Figure 3B, but with multiple PN components upregulated. (B) As in Figure 3C, but with multiple PN components upregulated. (C) As in Figure 3D, but with multiple PN components upregulated. The levels of DnaK, GroEL, and Lon increased by ~2- to 4-fold when they were upregulated in pairs or via overexpression of I54N σ32 (Figure S3A). See also Figure S3 and Table S1.
Simultaneous Upregulation of KJE, GroELS, and Lon via Expression of σ32 Virtually Eliminates Aggregation
As noted above, pair-wise upregulation of KJE, GroELS, and Lon resulted in lower levels of these PN components than upregulating them individually. Extrapolating this result suggests that it could be difficult to attain sufficiently high levels of upregulation if all three PN pathways were upregulated using arabinose- and tetracycline-induced expression systems. We therefore sought another way to simultaneously upregulate KJE, GroELS, and Lon.
KJE, GroELS, and Lon, as well as many other PN components, are in the regulon of the heat shock transcription factor, σ32 (Guisbert et al., 2008; Zhang et al., 2014). We therefore overexpressed the I54N mutant of σ32, which evades the post-translational regulation of σ32 (Guisbert et al., 2008; Yura et al., 2007; Zhang et al., 2014), to simultaneously increase the levels of KJE, GroELS, and Lon. Induction of I54N σ32 for 1 h prior to inducing the test proteins yielded 3- to 4-fold increases in the levels of DnaK, GroEL, and Lon (Figure S3A), as reported previously (Zhang et al., 2014).
Using I54N σ32 expression to upregulate the σ32 regulon nearly eliminated aggregation of the test proteins (Figure 4A–C). To determine the extent to which this result was due to KJE, GroELS, and Lon upregulation and not to other PN pathways that are part of the σ32 regulon, we examined the effect of upregulating ClpB (Doyle et al., 2013), or HtpG, the E. coli Hsp90 (Pearl and Prodromou, 2006), together with KJE, since both of these chaperones can cooperate with KJE (Doyle et al., 2013; Genest et al., 2011; Nakamoto et al., 2014). Upregulating ClpB+KJE or HtpG+KJE by expressing them simultaneously did not decrease the aggregated fractions of the test proteins beyond what was observed by upregulating KJE alone, even at the low level of upregulation (compare Figure S3B to the low-KJE upregulation results in Figure 3B–D). Simultaneous upregulation of IbpA and IbpB, the small heat shock proteins of E. coli (Kuczynska-Wisnik et al., 2002; Thomas and Baneyx, 1998), tended to increase the extent of aggregation of m-EcDHFR and m-MmCRABP1 relative to adapted-basal conditions, although this increase was only significant for m-EcDHFR (p = 0.003, one-tailed t-test). Upregulation of IbpA and IbpB had no effect on m-RA114 (Figure S3B).
Taken together, these results show that upregulating the PN using σ32, which evolved to counter the protein folding stress caused by heat shock, enables E. coli to suppress aggregation even for highly destabilized proteins at high expression levels, consistent with previous results (Zhang et al., 2014). Moreover, our results argue that KJE, GroELS, and Lon are primarily responsible for the effects observed with σ32 overexpression.
Analysis of Test Protein Folding Fates
Understanding how the E. coli PN manages proteostasis for our test proteins requires a quantitative analysis of our data. Thus, we first extract protein-specific trends, so far as they exist, by a phenomenological method. We then model how a test protein’s response to the PN reports on its energy landscape by using FoldEco, a mechanistic model for proteostasis in E. coli.
Phenomenological Models
To quantify how KJE, GroELS, and Lon affect our test proteins, we fit the overexpression data for each test protein to the phenomenological models below:
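$$[\mathrm{Agg}]_{\mathrm{rel},X} = a_{K,X}\,[\mathrm{DnaK}]_{\mathrm{rel}} + a_{G,X}\,[\mathrm{GroEL}]_{\mathrm{rel}} + a_{L,X}\,[\mathrm{Lon}]_{\mathrm{rel}} + c_{\mathrm{Agg},X} \tag{1}$$
$$[\mathrm{Sol}]_{\mathrm{rel},X} = s_{K,X}\,[\mathrm{DnaK}]_{\mathrm{rel}} + s_{G,X}\,[\mathrm{GroEL}]_{\mathrm{rel}} + s_{L,X}\,[\mathrm{Lon}]_{\mathrm{rel}} + c_{\mathrm{Sol},X} \tag{2}$$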
where [Agg]rel,X and [Sol]rel,X are the concentrations of the aggregated and soluble forms of test protein “X” normalized to the total concentration under adapted-basal conditions; [DnaK]rel, [GroEL]rel, and [Lon]rel are the concentrations of the PN components relative to their adapted-basal concentrations; aK,X, aG,X, and aL,X are the gradients of [Agg]rel,X with respect to [DnaK]rel, [GroEL]rel, and [Lon]rel; sK,X, sG,X, and sL,X are the gradients of [Sol]rel,X with respect to [DnaK]rel, [GroEL]rel, and [Lon]rel; and cAgg,X and cSol,X are the model intercepts. Values for [Agg]rel,X and [Sol]rel,X under the various PN conditions can be read off the right-hand axes in Figures 3B–D and 4A–C. Values for [DnaK]rel, [GroEL]rel, and [Lon]rel under all conditions are shown in Figures S2D and S3A. The gradient parameters quantify the efficacies of the PN components for a given test protein. For example, a protein that benefits greatly from GroELS would have a large, positive value of sG,X (indicating that [Sol]rel,X increases sharply as GroELS is upregulated) and/or a large negative value of aG,X (indicating that [Agg]rel,X decreases sharply as GroELS is upregulated). Because of the differences in the expression levels of the three test proteins and the inherent non-linearity of aggregation kinetics with respect to protein concentration, one must be cautious when comparing the magnitudes of the gradient parameters of different proteins. However, differences in the signs of the gradient parameters, and whether or not they differ significantly from 0, are not subject to such concerns. Also, we have chosen to use relative concentrations of the PN components in our model rather than their absolute concentrations for ease of presentation. However, we also report the gradient parameters scaled to the absolute test protein and PN component concentrations in Table S2.
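A model of this kind, linear in the three relative PN concentrations with an intercept, can be fit by ordinary least squares. The sketch below is a minimal standard-library illustration of such a fit and not the authors' fitting procedure. The condition table and the "true" gradients are hypothetical values invented for the example (they are not data from Figures 3, 4, S2D, or S3A), and the function names are illustrative.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_model(pn_levels, response):
    """Least-squares fit of response = g_K*K + g_G*G + g_L*L + c.

    pn_levels: list of (DnaK_rel, GroEL_rel, Lon_rel) tuples.
    Returns [g_K, g_G, g_L, c] from the normal equations X^T X b = X^T y.
    """
    rows = [list(levels) + [1.0] for levels in pn_levels]  # intercept column
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(n)]
    return solve_linear_system(xtx, xty)


# Hypothetical relative PN levels (DnaK, GroEL, Lon) for eight conditions.
conditions = [
    (1.0, 1.0, 1.0), (6.0, 1.0, 1.0), (1.0, 6.0, 1.0), (1.0, 1.0, 6.0),
    (2.5, 1.0, 1.0), (3.0, 3.0, 1.0), (3.0, 1.0, 3.0), (1.0, 3.0, 3.0),
]
# Hypothetical gradients and intercept used to generate noise-free data.
true_params = [-0.05, -0.02, -0.03, 0.60]
agg_rel = [
    true_params[0] * k + true_params[1] * g + true_params[2] * l + true_params[3]
    for k, g, l in conditions
]

fitted = fit_linear_model(conditions, agg_rel)
print([round(p, 4) for p in fitted])
```

Because the synthetic data are noise-free, the fit recovers the generating gradients. With real measurements, the residuals and the standard errors of the gradients would also need to be examined.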
The qualities of the fits of Equation (1) to the normalized concentrations of aggregated protein are moderate to good for all three test proteins (adjusted R² = 0.73 for m-EcDHFR, 0.83 for m-RA114, and 0.68 for m-MmCRABP1; Figure 5A, red data points; see Table S1 for fit residuals). The parameters aK, aG, and aL are negative for each test protein, indicating that all of the PN components decrease aggregation (Figure 5B, red bars). KJE decreases the extent of aggregation the most for a given fold-change in its concentration, followed by Lon and GroELS (aK,X < aL,X < aG,X for all test proteins X). This result is consistent with KJE specifically antagonizing misfolding and aggregation, whereas GroELS and Lon affect these processes less directly.
Figure 5. Dependences of Test Protein Folding Fates on Different PN Components Based on Phenomenological Fits of Overexpression Data.
(A) Plots of the experimental values of [Agg]rel,X and [Sol]rel,X of m-EcDHFR (left), m-MmCRABP1 (middle), and m-RA114 (right) from Figures 3 and 4 vs. the corresponding model-derived values from the fits of Equations (1) or (2). Red data points: data points for [Agg]rel,X, fit with Equation (1). Blue data points: data points for [Sol]rel,X, fit with Equation (2). Dashed line: Line through the origin with a slope of 1. The extent to which the data points fall on the dashed line indicates the goodness-of-fit of the model. The circled data points have the largest residuals in the fit of Equation (2) to the [Sol]rel,RA114 data. (B) Bar graph showing the gradient parameters and their standard errors from the fits of Equations (1) (red bars) and (2) (blue bars). Negative values indicate that increasing the concentration of a PN component decreases the concentration of the aggregated (red bars) or soluble (blue bars) form of a test protein. Positive values indicate the opposite. The blue bars for m-RA114 are bordered by dashed lines because these parameter values were obtained from a poor quality fit. The p-values for the gradient parameters are indicated as follows: ***, p < 0.0001; **, p < 0.001; *, p < 0.01; and n.s., p > 0.05. Tests were not performed for sK,RA114, sG,RA114, or sL,RA114 since the p-value for the fit as a whole was > 0.01. See also Tables S1 and S2.
The fits of Equation (2) to the normalized concentrations of soluble protein are good for m-EcDHFR and m-MmCRABP1 (adjusted R² = 0.82 for m-EcDHFR and 0.71 for m-MmCRABP1), but poor for m-RA114 (adjusted R² = 0.10) (Figure 5A, blue data points; see Table S1 for fit residuals). The poor fit for m-RA114 is likely due to the small range of [Sol]rel,RA114 combined with its surprisingly low value when KJE+GroELS are jointly upregulated and its surprisingly high value when GroELS+Lon are jointly upregulated (highlighted data points in Figure 5A, right-hand panel). This observation suggests that the KJE and GroELS pathways may interfere with each other whereas the GroELS and Lon pathways cooperate to handle m-RA114.
The parameter sL,X is negative for both m-EcDHFR and m-MmCRABP1, but sL,DHFR is roughly the same as aL,DHFR whereas sL,CRABP1 is much smaller in magnitude than aL,CRABP1 (Figure 5B). This observation indicates that upregulating Lon preferentially depletes aggregates for m-MmCRABP1, but not for m-EcDHFR. In addition, sK,DHFR is close to 0, indicating that while KJE is very effective for diminishing aggregation for m-EcDHFR, it does not increase the concentration of soluble protein (Figure 5B). In contrast, sK,CRABP1 is substantial and positive for m-MmCRABP1 (Figure 5B). The situation is reversed for the GroELS system: sG,DHFR is substantial and positive but sG,CRABP1 is much smaller (Figure 5B). These results suggest that m-MmCRABP1 may be a poor substrate for GroELS. While one should be cautious when interpreting this observation because of the expression level differences of m-EcDHFR and m-MmCRABP1, an analysis of the data using FoldEco (see below), which explicitly accounts for these expression level differences, corroborates this notion.
While fitting Equations (1) and (2) to our data has enabled us to quantify the effects of KJE, GroELS, and Lon on our test proteins, these equations are phenomenological and cannot inform us about the causes of a given protein’s behavior. Based on previous results with FoldEco, the extent to which a protein benefits from different chaperoning mechanisms should be a function of that protein’s folding energetics (Dickson and Brooks, 2013; Powers et al., 2012). Thus, the values of the best-fit parameters for Equations (1) and (2) should reflect the folding energetics of our test proteins. To explore this possibility, we used FoldEco to fit our data by using the folding energetics as adjustable parameters.
Analysis of Test Protein Folding Fates Using FoldEco: General
FoldEco comprises a system of ordinary differential equations that represent the kinetics of the processes undergone by proteins in vivo: synthesis, folding, misfolding and aggregation, interaction with KJE and GroELS, and degradation by Lon (Figure 1) (Powers et al., 2012). To fit FoldEco to our data, we vary its parameters until the model optimally matches the experimental data. FoldEco has many dozens of parameters, but many of these are likely to be independent of, or only weakly dependent on, the nature of the protein (Powers et al., 2012). For example, chaperone–co-chaperone interaction parameters should not be differentially affected by bound substrates (that is, the effect of one bound substrate on chaperone–co-chaperone interaction parameters should be similar to the effect of another). Furthermore, chaperones bind to substrates promiscuously (Aoki et al., 2000; Landry and Gierasch, 1991; Rudiger et al., 1997; Wang et al., 1999); GroEL can even chaperone a substrate composed entirely of D-amino acids (Weinstock et al., 2014). Chaperone–substrate interaction parameters are therefore likely to be similar for most substrates. Values for such parameters can be obtained from the literature, and the available data are broadly (though not perfectly) consistent with the assertions in the preceding sentences (see Powers et al., 2012, and Figures S1–S5 therein). Thus, to a first approximation, the only adjustable parameters needed to fit FoldEco to our data are those for the folding energy landscapes and synthesis rates of the test proteins. These parameters include the folding rate and equilibrium constants (kf and Kf); the misfolding rate and equilibrium constants (km and Km); the aggregation rate and equilibrium constants (ka and Ka, where Ka is the equilibrium constant for adding a misfolded monomer to an aggregate); and the steady-state protein synthesis rate (σ) (Figure 1).
Importantly, FoldEco can account for how the different expression levels of the test proteins affect their overall behavior and in particular their tendency to aggregate via its built-in nucleated polymerization model for protein aggregation (Powers et al., 2012).
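To make the idea of a kinetic model of partitioning concrete, the sketch below integrates a heavily reduced version of the scheme in Figure 1 with the explicit Euler method. It is not FoldEco: the chaperone cycles are omitted, aggregation is represented by a simple irreversible second-order term rather than nucleated polymerization, and degradation is a first-order process acting on the unfolded and misfolded states. All parameter values and the name `simulate` are hypothetical and chosen only for illustration.

```python
def simulate(sigma, kf, Kf, km, Km, ka, kdeg, t_end, dt=0.05):
    """Euler integration of a toy U/N/M/A partitioning scheme.

    Concentrations are in uM and times in s. Reverse rate constants
    follow from the equilibrium constants: kf/Kf for N -> U and
    km/Km for M -> U.
    """
    U = N = M = A = degraded = synthesized = 0.0
    for _ in range(int(round(t_end / dt))):
        fold = kf * U - (kf / Kf) * N       # net flux U -> N
        misfold = km * U - (km / Km) * M    # net flux U -> M
        aggregate = ka * M * M              # M -> A (toy second-order term)
        deg_U = kdeg * U                    # degradation of unfolded protein
        deg_M = kdeg * M                    # degradation of misfolded protein
        U += dt * (sigma - fold - misfold - deg_U)
        N += dt * fold
        M += dt * (misfold - aggregate - deg_M)
        A += dt * aggregate
        degraded += dt * (deg_U + deg_M)
        synthesized += dt * sigma
    return {"U": U, "N": N, "M": M, "A": A,
            "degraded": degraded, "synthesized": synthesized}


# Hypothetical parameters; 7200 s corresponds to a 2 h expression period.
result = simulate(sigma=0.1, kf=0.5, Kf=100.0, km=0.1, Km=0.5,
                  ka=0.01, kdeg=0.001, t_end=7200.0)

accounted = sum(result[key] for key in ("U", "N", "M", "A", "degraded"))
print({key: round(value, 2) for key, value in result.items()})
print("mass balance error:", abs(accounted - result["synthesized"]))
```

Every flux leaves one pool and enters another, so the sum of all pools plus degraded protein should equal the total amount synthesized. This provides a simple check on the integration.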
It is important to note that the parameters derived from FoldEco fits of our data are “effective parameters”, since we are applying FoldEco’s generic folding mechanism to the test proteins. For example, EcDHFR folds through a multi-step mechanism (Frieden, 1990). Thus, the single folding rate constant derived from FoldEco fits to the m-EcDHFR expression data (kf) subsumes the rate constants for the individual folding steps. In addition, the equilibrium denaturation of m-RA114 reveals at least one intermediate (Figure S1B). Thus, the best-fit folding equilibrium constant for m-RA114 encompasses the energetics for all of the states in the native conformational ensemble. Nevertheless, we expect that the effective parameters derived from fitting FoldEco will faithfully capture the essences of the true folding and misfolding processes.
The qualities of the FoldEco fits range from moderate to good, with the best-fit values of [Agg]rel,X and [Sol]rel,X deviating from their experimental values on average by 0.05 for m-EcDHFR, 0.12 for m-MmCRABP1, and 0.11 for m-RA114 (Figure 6; see Table S1 for fit residuals). Unfortunately, only the fit for m-EcDHFR permitted parameters to be estimated with acceptable precision (Table S3). These are discussed in the next section. The FoldEco fits for m-MmCRABP1 and m-RA114, despite not providing definite parameter estimates, nevertheless define some relationships among the parameters and thereby enable us to discern some general features of the in vivo folding energy landscapes of these two proteins.
Figure 6. FoldEco-Derived Fits of Experimental Data from the Overexpression of Test Proteins.

As in Figure 5A, except that the model-derived relative concentrations are determined from the FoldEco fits to the data. The circled points are those for m-MmCRABP1 under the low- and high-upregulation GroELS conditions, which have particularly large residuals. See also Figure S4, Figure S5, Table S1, and Table S3.
Analysis of the Folding Fate of m-EcDHFR Using FoldEco
The effective folding parameters for m-EcDHFR are: km = 0.06 s−1; Km = 0.3 (corresponding to ΔGm = +0.7 kcal mol−1); kf = 0.3 s−1; Kf = 130 (corresponding to ΔGf = −2.9 kcal mol−1); and ka = 4 × 10^9 M−1 s−1 (see Table S3 for the errors in these parameters). Only a lower limit could be determined for Ka, which was 1.4 × 10^9 M−1 (corresponding to ΔGa = −12.7 kcal mol−1). These parameters indicate that m-EcDHFR folds about five times faster than it misfolds, but its native state is not very stable, and once it misfolds it aggregates rapidly to form stable aggregates (Figure 7A). In fact, the aggregation rate constant is around the diffusion-controlled limit, possibly indicating that m-EcDHFR aggregates directly from the unfolded state, rather than through a monomeric misfolded intermediate as assumed in FoldEco.
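The free energies quoted alongside the equilibrium constants follow from ΔG = −RT ln K. The short Python check below reproduces them, assuming T = 30 °C (the expression temperature; the temperature used for the conversion is not stated in this section, so this is an assumption) and R = 1.987 × 10^−3 kcal mol−1 K−1. The helper name `delta_g` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal mol^-1 K^-1
T_KELVIN = 303.15      # 30 degrees C; assumed temperature for the conversion


def delta_g(K, temperature=T_KELVIN):
    """Standard free energy (kcal/mol) for an equilibrium constant K."""
    return -R_KCAL * temperature * math.log(K)


dG_m = delta_g(0.3)      # misfolding, Km = 0.3
dG_f = delta_g(130.0)    # folding, Kf = 130
dG_a = delta_g(1.4e9)    # aggregation, lower limit on Ka (M^-1, 1 M standard state)
fold_vs_misfold = 0.3 / 0.06   # kf / km

print(round(dG_m, 1), round(dG_f, 1), round(dG_a, 1), round(fold_vs_misfold, 1))
```

Under these assumptions the values round to +0.7, −2.9, and −12.7 kcal mol−1, and the ratio kf/km is 5, matching the statement that folding is about five times faster than misfolding.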
Figure 7. Partitioning of the Three Test Proteins under Adapted-Basal Conditions at t = 2 h Based on FoldEco Simulations.
(A) Summary diagram for m-EcDHFR, laid out as in Figure 1. The sizes of the colored circles indicate the concentrations of the unfolded (U), native (N), misfolded (M), and aggregated (A) states. The radii of the circles are proportional to the cube roots of the concentrations. Cube roots are used to enable the lowest and highest concentrations to be shown on the same diagram. The circles for “Synthesis” and “Degradation” represent the total concentration of protein synthesized and degraded over the 2 h time course of the simulation. The numerical values of each concentration are written below the circles. Blue text: best-fit biophysical parameters from the fit of FoldEco to the data. Red numbers: percentages of misfolded protein molecules that aggregate, engage the KJE recovery pathway, or engage the Lon degradation pathway. Black italic numbers: percentages of degraded protein taken from the unfolded or misfolded states. (B) As in (A), but for m-MmCRABP1. (C) As in (A) but for m-RA114. Note that qualitative descriptors are used for the biophysical processes in (B) and (C). See also Table S3.
It is of interest to compare the parameters obtained from our FoldEco fits to the analogous parameters measured in vitro. However, because parameters for processes that are not on the folding pathway, like the rate and equilibrium constants for misfolding and aggregation, can be difficult to determine, we limit our comparisons to the folding equilibrium and rate constants, which can be determined using well-established methods. The in vitro free energy and rate constant for folding of m-EcDHFR were found to be ΔGf = −2.6 kcal mol−1 and kf = 1.1 s−1 by equilibrium denaturation (Figure S1A) and fluorescence-monitored refolding kinetics (Figure S4; kf here refers to the rate of tertiary structure acquisition, not subsequent steps required for NADP binding that were identified in past mechanistic studies of wt-EcDHFR folding (Frieden, 1990)). These values were determined at [m-EcDHFR] = 3.5 μM and 25 °C; aggregation was not detected under these conditions. The values determined for these parameters from our FoldEco fit were ΔGf = −2.9 kcal mol−1 and kf = 0.3 s−1 (Table S3, Figure 7A). The good correspondence between these parameter estimates and their experimental values supports our use of FoldEco to understand the in vivo folding of our test proteins.
Some aspects of the sensitivities of m-EcDHFR proteostasis to the PN components (quantified in Figure 5B) can be explained by the FoldEco fit to the m-EcDHFR expression data. For example, Lon affects the levels of aggregated and soluble m-EcDHFR about equally (aL,DHFR = −0.080 vs. sL,DHFR = −0.076; Figure 5B). Since the relatively slow misfolding of m-EcDHFR is followed by fast formation of stable aggregates (Figure 7A), the misfolded state of m-EcDHFR has a short lifetime and a low concentration, and Lon cannot intercept and degrade it before it aggregates. In contrast, Lon appears to preferentially diminish the aggregated state for m-MmCRABP1 (aL,CRABP1 = −0.129 vs. sL,CRABP1 = −0.014; Figure 5B). The same is qualitatively true for m-RA114 (Figure 3D), although a quantitative comparison cannot be made for m-RA114 because of the poor fit of Equation (2).
This point can be further illustrated using the FoldEco simulation of m-EcDHFR under adapted-basal conditions. In this simulation, only 0.1% of the protein molecules that sample the misfolded state are degraded by Lon (Figure 7A, red numbers), and furthermore, of the almost 188 μM of m-EcDHFR that is degraded over the 2 h time course of the expression simulation, only 0.2% is taken from the misfolded state (Figure 7A, black numbers). The remaining 99.8% of the m-EcDHFR that is degraded comes from the unfolded state. Since the unfolded state is the progenitor of both natively folded and aggregated protein, Lon affects the levels of each similarly for m-EcDHFR.
The short lifetime of the misfolded state also explains why KJE decreases levels of aggregated m-EcDHFR without increasing soluble m-EcDHFR. KJE affects the partitioning between aggregated and soluble protein only when it directly reverses the process of misfolding by converting misfolded protein to unfolded protein. Because misfolded m-EcDHFR exists so briefly and at such a low concentration, DnaJ and DnaK, like Lon, cannot bind to it before it aggregates. In our FoldEco simulation of m-EcDHFR under adapted-basal conditions, only 1% of the m-EcDHFR molecules that sample the misfolded state are engaged by KJE (Figure 7A, red numbers). Although KJE, especially with the assistance of ClpB, can recover protein from the aggregated state and return it to the unfolded state, this process happens after the “product determining step” of the protein folding pathway—that is, after unfolded protein is partitioned between folding and misfolding/aggregation pathways—and therefore cannot influence the folding/misfolding decision. The protein that the ClpB+KJE pathway recovers from aggregates is, however, susceptible to degradation, which at least partly explains why upregulating KJE leads to lower total protein levels for m-EcDHFR as well as m-MmCRABP1 and m-RA114.
Analysis of Folding Fates of m-MmCRABP1 and m-RA114 Using FoldEco
Although the fits of FoldEco to the data for m-MmCRABP1 and m-RA114 did not permit precise estimation of the biophysical parameters for these two proteins, the fits did define some relationships between the parameters that illuminate some general features of the folding energy landscapes of m-MmCRABP1 and m-RA114. For example, the folding rate constant (kf) of m-MmCRABP1 can be increased from its optimal value with very little decrease in the quality of the FoldEco fit provided that, first, the misfolding equilibrium constant (Km) also increases, and second, the misfolding rate constant (km) is much greater than kf so that misfolding is nearly at equilibrium with respect to folding. Then, increases in kf and Km have offsetting effects on the unfolded-to-native-state flux. Given that this flux is equal to the product kf × [U] (where [U] is the concentration of unfolded protein), increasing kf directly increases this product, while increasing Km decreases this product by shifting the misfolding equilibrium away from the unfolded state, thereby decreasing [U]. The overall effect of these combined changes on the behavior of m-MmCRABP1 is therefore minimal.
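This compensation can be made concrete under a simplifying assumption that is ours rather than FoldEco's: if misfolding is a rapid pre-equilibrium ([M] = Km[U]) and all other processes are ignored, then a combined pool P = [U] + [M] gives [U] = P/(1 + Km) and a folding flux of kf × P/(1 + Km). When Km ≫ 1, scaling kf and Km by the same factor leaves this flux nearly unchanged. The Python sketch below uses hypothetical numbers to show this.

```python
def folding_flux(kf, Km, pool):
    """U -> N flux (M/s) when U and M are in rapid equilibrium.

    pool is the combined concentration [U] + [M]; with [M] = Km * [U],
    the unfolded concentration is pool / (1 + Km).
    """
    unfolded = pool / (1.0 + Km)
    return kf * unfolded


pool = 1e-6  # hypothetical combined concentration of U and M, in M
baseline = folding_flux(kf=0.1, Km=100.0, pool=pool)
both_raised = folding_flux(kf=1.0, Km=1000.0, pool=pool)   # kf and Km both x10
kf_only = folding_flux(kf=1.0, Km=100.0, pool=pool)        # kf alone x10

print(baseline, both_raised, kf_only)
```

Raising kf alone increases the flux tenfold, whereas raising kf and Km together changes it by about 1%, which is why a fit to folding outcomes alone constrains the combination of these parameters better than either one individually.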
Despite the uncertainties in the parameters from the FoldEco fits for m-MmCRABP1 and m-RA114, certain aspects of their folding energy landscapes are nevertheless clear. Both m-MmCRABP1 and m-RA114 misfold faster than they fold (km > kf) but do not aggregate as fast as m-EcDHFR, resulting in appreciable accumulation of misfolded protein under adapted-basal PN conditions (Table S3; Figure 7B, C). This behavior can explain the response of m-MmCRABP1 and m-RA114 to KJE and Lon overexpression. The higher concentration of misfolded protein increases the utilization of the KJE recovery and Lon degradation pathways by almost 10-fold compared to m-EcDHFR (compare Figure 7A to Figure 7B and C, red numbers). Thus, Lon preferentially reduces the concentration of aggregates of m-MmCRABP1 and m-RA114 by degrading mainly misfolded protein. Similarly, KJE can directly convert misfolded m-MmCRABP1 and m-RA114 to the unfolded state, simultaneously increasing the levels of soluble protein and decreasing levels of aggregated protein.
The FoldEco fits, however, are unable to account for the poor chaperoning of m-MmCRABP1 by GroELS. According to the FoldEco fits, all of the test proteins should benefit similarly from GroELS upregulation. In reality, m-MmCRABP1 benefits much less from GroELS than do m-EcDHFR or m-RA114 (Figure 3B–D), as can be seen in the residuals of the FoldEco fit for [Agg]rel,CRABP1 and [Sol]rel,CRABP1 under the GroELS-low and -high upregulation conditions (Figure 6, middle panel, highlighted data points; Table S1). These residuals average 0.21 for m-MmCRABP1, compared to 0.05 for m-EcDHFR and 0.12 for m-RA114, suggesting that m-MmCRABP1 may interact differently with GroEL than the other test proteins do.
To explore this notion, we fit FoldEco to the expression data for m-MmCRABP1 while varying the equilibrium association constant between m-MmCRABP1 and GroEL (KGro–CRABP1). The best fit was obtained when KGro–CRABP1 was 1000-fold lower than its default value (10^3 M−1 vs. 10^6 M−1; Figure S5A), in which case the overall FoldEco fit improved by about 20% (mean residuals = 0.10 vs. 0.12), and the residuals for the GroELS upregulation conditions decreased from 0.21 to 0.15 (Figure S5B). The improved fit still did not yield precise estimates of the biophysical parameters of m-MmCRABP1, but it is clear that the fit improves because of the large decrease in KGro–CRABP1, suggesting that m-MmCRABP1 binds weakly to GroEL. This observation based solely on fitting the folding fate of this protein under different PN conditions is consistent with experimental findings in vitro that GroELS has no effect on MmCRABP1 folding and that there is no detectable binding between GroEL and incompletely folded MmCRABP1 variants (I. Budyak, H.-P. Feng, and L. M. Gierasch, unpublished results).
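To give a sense of scale for a 1000-fold change in association constant, the sketch below evaluates a simple 1:1 binding isotherm, fraction bound = K[GroEL]/(1 + K[GroEL]). This is a textbook approximation, not the FoldEco binding scheme, and the free GroEL concentration of 1 μM is a hypothetical value chosen only for illustration.

```python
def fraction_bound(K_assoc, free_chaperone):
    """Fraction of substrate bound for 1:1 binding (K_assoc in M^-1, concentration in M)."""
    x = K_assoc * free_chaperone
    return x / (1.0 + x)


free_groel = 1e-6  # hypothetical free GroEL concentration, in M
default_binding = fraction_bound(1e6, free_groel)   # default K of 10^6 M^-1
weak_binding = fraction_bound(1e3, free_groel)      # best-fit K of 10^3 M^-1

print(default_binding, weak_binding)
```

With these illustrative numbers, about half of the substrate would be bound at the default constant but only about 0.1% at the lower one, which is in line with a protein that barely responds to GroELS upregulation.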
We note that adding more adjustable parameters does not dramatically improve the FoldEco fits for m-EcDHFR or m-RA114, nor does varying parameters beyond KGro–CRABP1 yield further substantive improvements to the FoldEco fit for m-MmCRABP1. Thus, the default, literature-derived parameters used in FoldEco appear to be sufficiently accurate in most cases, and when they are not, a lack of fit can indicate an unusual interaction between a PN component and a substrate.
DISCUSSION
Our results show that increasing the levels of PN components enables cells to better handle situations that involve protein misfolding, consistent with previous studies of the effect of chaperone upregulation on, for example, the yields of overexpressed heterologous proteins (de Marco, 2007; de Marco et al., 2007; Makino et al., 2011; Zhang et al., 2014), protein evolution (Bershtein et al., 2013; Bogumil and Dagan, 2012; Queitsch et al., 2002; Tokuriki and Tawfik, 2009), and stress tolerance (Feder et al., 1996; Welte et al., 1993). The “pro-folding” (GroELS), “anti-misfolding” (KJE), and “concentration control” (Lon) arms of the E. coli PN form an efficacious triad for maintaining proteostasis for a broad range of proteins, in that the concentration of soluble protein is generally maintained or increased by chaperone upregulation even when the total amount of protein decreases (see Figures 3 and 4). These arms are especially effective when upregulated in their native ratios as part of the σ32 transcriptional program (Figure 4). The two quantitative models of E. coli proteostasis, the phenomenological Equations (1) and (2) and FoldEco, provide insight into how these systems contribute to proteostasis, and how their contributions depend on their substrates’ folding energy landscapes.
The two models, which generally explain 60–80% of the variation in our data (Figures 5A and 6), share two important features: first, the only components of the PN that they include are KJE, GroELS, and Lon; and second, they do not account for any direct mechanisms by which these components can collaborate. The success of the models in fitting the data in light of the first feature suggests that KJE, GroELS, and Lon are the primary contributors to the proteostasis of our test proteins. This does not mean that other PN components, such as ClpB, HtpG, and IbpA/IbpB, are not important, but rather that their importance lies outside generic proteostasis in the absence of environmental stress. ClpB is vital for recovery from heat shock (Doyle et al., 2013; Squires et al., 1991; Weibezahn et al., 2004), but it is not necessary for maintaining normal growth rates under non-stress conditions (Squires et al., 1991). Overexpressing ClpB in our system has little effect on the folding fates of our test proteins, probably because protein disaggregation by ClpB+KJE is simply not as fast as protein synthesis. Similarly, HtpG may be important for the folding of specific substrates (Yosef et al., 2011), but it is not necessary for either normal growth or thermotolerance (Bardwell and Craig, 1988).
The ability of the models to fit our data without invoking collaboration among KJE, GroELS, and Lon suggests that these PN components act independently under our experimental conditions and that their effects are therefore largely, though not perfectly, additive. This assertion is corroborated by the residuals of the fits of our models to the test protein data, ~70% of which are less than ±0.1 and ~90% of which are less than ±0.2 (Table S1). However, we emphasize that many examples of collaboration between chaperone systems have been reported. In eukaryotes, the Hsp70 and Hsp90 systems work together in the proteostasis of, for example, steroid hormone receptors (Pratt and Toft, 2003; Wegele et al., 2006). PN components may also “hand off” substrates from one to another—for example, DnaK to GroEL (Langer et al., 1992)—and this sequential activity is important for in vivo protein folding. However, the large amounts of unfolded protein that are being produced in our experiments may lead to unoccupied chaperones binding to newly synthesized protein before they can accept the transfer of a substrate from another chaperone, thereby suppressing chaperone collaboration.
While the major PN systems in E. coli appear to act independently under high protein folding loads, there may still be occasional exceptions to this rule. Such an exception may have caused the poor fit of Equation (2) to the [Sol]rel,RA114 data. While we can only speculate that direct interactions between PN components are the reason that Equation (2) does not fit the [Sol]rel,RA114 data well, it remains an intriguing possibility that some client proteins can access chaperone collaboration pathways that others cannot.
FoldEco simulations allow us to go beyond the phenomenological treatment of Equations (1) and (2) because they enable correlations between folding energetics and chaperone mechanisms. For example, the response of m-EcDHFR to KJE and Lon was due to a combination of slow misfolding and fast aggregation. Even the cases where a good fit of the model to the data was elusive could be a clue to underlying biophysics. The poor fit of FoldEco to the data for m-MmCRABP1 under GroELS upregulation suggested weak binding of m-MmCRABP1 to GroEL. It should be noted that some aspects of proteostasis that could contribute to protein misfolding and aggregation are not yet modeled in FoldEco, like the effect of translation rates on proteostasis. Given that the m-MmCRABP1 and m-RA114 genes were not codon optimized for E. coli, such effects could contribute to the high aggregation propensities observed for these proteins. However, previous FoldEco simulations of luciferase expression found that ignoring the effect of co-translational folding did not greatly diminish the performance of FoldEco (Powers et al., 2012).
FoldEco has enabled us, in a sense, to invert the usual reductionist process for studying in vivo protein folding. Rather than analyzing our test proteins’ folding in vitro and then using this information to rationalize their behavior in vivo, we have used FoldEco to translate the behavior of our test proteins in vivo into information about their folding energy landscapes. In addition, since FoldEco embodies our best understanding of chaperone mechanisms based on the wealth of biochemical literature that exists on this subject, using FoldEco to fit our data constitutes a test of this understanding. We believe that the results of these fits were good enough to indicate that this understanding is fundamentally sound, albeit incomplete. We expect that as FoldEco is refined and expanded, its ability to explain the nuances of how proteostasis is managed for proteins with different folding energy landscapes will improve, providing a foundation for understanding processes like organismal stress responses and protein evolution that are intimately linked to protein folding energetics.
EXPERIMENTAL PROCEDURES
E. coli Strains and Plasmids
E. coli K12 strain HMS174 (DE3) was used for the overexpression experiments. Vectors under pBAD and/or pTet promoters on low copy number plasmids were used to express chaperones, Lon, and I54N σ32. Test proteins were expressed using a low copy number pET29b vector. Further details can be found in the Supplemental Experimental Procedures.
Measurement of the Levels of Aggregated and Soluble Protein
Overexpression of desired PN components was induced by adding arabinose to E. coli harboring a plasmid containing the genes for the desired PN component(s) and a plasmid containing the gene for the desired test protein. After 1 h of overexpressing the PN component(s) at 30 °C, test protein overexpression was induced by adding IPTG. After 2 h at 30 °C, cells were lysed by sonication at 4 °C. Portions of the lysates were centrifuged for 10 min at 13,500 × g at 4 °C. The test protein in the supernatant and in the pellet was defined as the soluble and aggregated fraction, respectively. The total (i.e., not centrifuged), soluble, and aggregated fractions were analyzed by SDS-PAGE (12 or 15% Bio-Rad gel) and then stained with Coomassie blue (m-EcDHFR) or subjected to western blot analysis (m-MmCRABP1 and m-RA114) to quantify protein levels. Adapted-basal PN controls were included in all gels so that the total test protein expression levels under perturbed PN conditions and adapted-basal PN conditions could be directly compared. Further details can be found in the Supplemental Experimental Procedures.
Fits of Equations (1) and (2) and FoldEco to Test Protein Overexpression Data
Equations (1) and (2) were fit to the data by linear regression. FoldEco was fit to the data using a least squares approach. Details can be found in the Supplemental Experimental Procedures.
HIGHLIGHTS
DnaK/DnaJ/GrpE, GroEL/GroES, and Lon dominate the E. coli proteostasis network.
The effects of the proteostasis network components are largely additive.
Modeling reveals relationships between chaperone efficacy and folding energetics.
Acknowledgments
We thank Gareth Morgan, Luke Wiseman, and Bernd Bukau for valuable discussions about this project and Colleen Fearns for critical reading of the manuscript. The anti-Lon and anti-GrpE antibodies used herein were kindly provided by the Sauer and Bukau laboratories, respectively. X.Z. was a Howard Hughes Medical Institute Fellow of the Helen Hay Whitney Foundation and is currently supported by a Burroughs Wellcome Fund Career Award at the Scientific Interface. This work was supported by the Skaggs Institute for Chemical Biology, the Lita Annenberg Hazen Foundation, and NIH grants GM101644 to L.M.G. and E.T.P. and DK075295 to J.W.K.
Footnotes
Supplemental Information includes Supplemental Experimental Procedures, five figures, and three tables.
AUTHOR CONTRIBUTIONS
Y.H.C., X.Z., K.F.R.P., J.W.K., L.M.G., and E.T.P. conceived the project, designed the experiments, and wrote the manuscript. Y.H.C., X.Z., K.F.R.P., and Y.L. performed the experiments. D.L.P. and E.T.P. fit the models to the data. All authors were involved in interpreting and discussing the results.
References
- Aoki K, Motojima F, Taguchi H, Yomo T, Yoshida M. GroEL binds artificial proteins with random sequences. J Biol Chem. 2000;275:13755–13758. doi: 10.1074/jbc.275.18.13755. [DOI] [PubMed] [Google Scholar]
- Balch WE, Morimoto RI, Dillin A, Kelly JW. Adapting proteostasis for disease intervention. Science. 2008;319:916–919. doi: 10.1126/science.1141448. [DOI] [PubMed] [Google Scholar]
- Bardwell JC, Craig EA. Ancient heat shock gene is dispensable. J Bacteriol. 1988;170:2977–2983. doi: 10.1128/jb.170.7.2977-2983.1988. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bershtein S, Mu W, Serohijos AW, Zhou J, Shakhnovich EI. Protein quality control acts on folding intermediates to shape the effects of mutations on organismal fitness. Mol Cell. 2013;49:133–144. doi: 10.1016/j.molcel.2012.11.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bjelic S, Kipnis Y, Wang L, Pianowski Z, Vorobiev S, Su M, Seetharaman J, Xiao R, Kornhaber G, Hunt JF, et al. Exploration of alternate catalytic mechanisms and optimization strategies for retroaldolase design. J Mol Biol. 2014;426:256–271. doi: 10.1016/j.jmb.2013.10.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bogumil D, Dagan T. Cumulative impact of chaperone-mediated folding on genome evolution. Biochemistry. 2012;51:9941–9953. doi: 10.1021/bi3013643. [DOI] [PubMed] [Google Scholar]
- Budyak IL, Krishnan B, Marcelino-Cruz AM, Ferrolino MC, Zhuravleva A, Gierasch LM. Early folding events protect aggregation-prone regions of a beta-rich protein. Structure. 2013;21:476–485. doi: 10.1016/j.str.2013.01.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Calloni G, Chen T, Schermann SM, Chang HC, Genevaux P, Agostini F, Tartaglia GG, Hayer-Hartl M, Hartl FU. DnaK functions as a central hub in the E. coli chaperone network. Cell Rep. 2012;1:251–264. doi: 10.1016/j.celrep.2011.12.007. [DOI] [PubMed] [Google Scholar]
- Chapman E, Farr GW, Usaite R, Furtak K, Fenton WA, Chaudhuri TK, Hondorp ER, Matthews RG, Wolf SG, Yates JR, et al. Global aggregation of newly translated proteins in an Escherichia coli strain deficient of the chaperonin GroEL. Proc Natl Acad Sci U S A. 2006;103:15800–15805. doi: 10.1073/pnas.0607534103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- de Marco A. Protocol for preparing proteins with improved solubility by co-expressing with molecular chaperones in Escherichia coli. Nat Protoc. 2007;2:2632–2639. doi: 10.1038/nprot.2007.400. [DOI] [PubMed] [Google Scholar]
- de Marco A, Deuerling E, Mogk A, Tomoyasu T, Bukau B. Chaperone-based procedure to increase yields of soluble recombinant proteins produced in E. coli. BMC Biotechnol. 2007;7:32. doi: 10.1186/1472-6750-7-32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dickson A, Brooks CL., 3rd Quantifying chaperone-mediated transitions in the proteostasis network of E. coli. PLoS Comput Biol. 2013;9:e1003324. doi: 10.1371/journal.pcbi.1003324. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Doyle SM, Genest O, Wickner S. Protein rescue from aggregates by powerful molecular chaperone machines. Nat Rev Mol Cell Biol. 2013;14:617–629. doi: 10.1038/nrm3660. [DOI] [PubMed] [Google Scholar]
- Feder ME, Cartano NV, Milos L, Krebs RA, Lindquist SL. Effect of engineering Hsp70 copy number on Hsp70 expression and tolerance of ecologically relevant heat shock in larvae and pupae of Drosophila melanogaster. J Exp Biol. 1996;199:1837–1844. doi: 10.1242/jeb.199.8.1837. [DOI] [PubMed] [Google Scholar]
- Frieden C. Refolding of Escherichia coli dihydrofolate reductase: sequential formation of substrate binding sites. Proc Natl Acad Sci USA. 1990;87:4413–4416. doi: 10.1073/pnas.87.12.4413. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gasser B, Saloheimo M, Rinas U, Dragosits M, Rodriguez-Carmona E, Baumann K, Giuliani M, Parrilli E, Branduardi P, Lang C, et al. Protein folding and conformational stress in microbial cells producing recombinant proteins: a host comparative overview. Microb Cell Fact. 2008;7:11. doi: 10.1186/1475-2859-7-11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Genest O, Hoskins JR, Camberg JL, Doyle SM, Wickner S. Heat shock protein 90 from Escherichia coli collaborates with the DnaK chaperone system in client protein remodeling. Proc Natl Acad Sci USA. 2011;108:8206–8211. doi: 10.1073/pnas.1104703108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gidalevitz T, Ben-Zvi A, Ho KH, Brignull HR, Morimoto RI. Progressive disruption of cellular protein folding in models of polyglutamine diseases. Science. 2006;311:1471–1474. doi: 10.1126/science.1124514. [DOI] [PubMed] [Google Scholar]
- Gottesman S. Proteases and their targets in Escherichia coli. Annu Rev Genet. 1996;30:465–506. doi: 10.1146/annurev.genet.30.1.465. [DOI] [PubMed] [Google Scholar]
- Guisbert E, Yura T, Rhodius VA, Gross CA. Convergence of molecular, modeling, and systems approaches for an understanding of the Escherichia coli heat shock response. Microbiol Mol Biol Rev. 2008;72:545–554. doi: 10.1128/MMBR.00007-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gur E, Sauer RT. Recognition of misfolded proteins by Lon, a AAA(+) protease. Genes Dev. 2008;22:2267–2277. doi: 10.1101/gad.1670908. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hingorani KS, Gierasch LM. Comparing protein folding in vitro and in vivo: foldability meets the fitness challenge. Curr Opin Struct Biol. 2014;24:81–90. doi: 10.1016/j.sbi.2013.11.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hoffmann F, Rinas U. Stress induced by recombinant protein production in Escherichia coli. Adv Biochem Eng Biotechnol. 2004;89:73–92. doi: 10.1007/b93994. [DOI] [PubMed] [Google Scholar]
- Horwich AL, Fenton WA. Chaperonin-mediated protein folding: using a central cavity to kinetically assist polypeptide chain folding. Q Rev Biophys. 2009;42:83–116. doi: 10.1017/S0033583509004764. [DOI] [PubMed] [Google Scholar]
- Kim YE, Hipp MS, Bracher A, Hayer-Hartl M, Hartl FU. Molecular chaperone functions in protein folding and proteostasis. Annu Rev Biochem. 2013;82:323–355. doi: 10.1146/annurev-biochem-060208-092442. [DOI] [PubMed] [Google Scholar]
- Kleywegt GJ, Bergfors T, Senn H, Le Motte P, Gsell B, Shudo K, Jones TA. Crystal structures of cellular retinoic acid binding proteins I and II in complex with all-trans-retinoic acid and a synthetic retinoid. Structure. 1994;2:1241–1258. doi: 10.1016/s0969-2126(94)00125-1. [DOI] [PubMed] [Google Scholar]
- Kuczynska-Wisnik D, Kedzierska S, Matuszewska E, Lund P, Taylor A, Lipinska B, Laskowska E. The Escherichia coli small heat-shock proteins IbpA and IbpB prevent the aggregation of endogenous proteins denatured in vivo during extreme heat shock. Microbiology. 2002;148:1757–1765. doi: 10.1099/00221287-148-6-1757. [DOI] [PubMed] [Google Scholar]
- Landry SJ, Gierasch LM. The chaperonin GroEL binds a polypeptide in an α-helical conformation. Biochemistry. 1991;30:7359–7362. doi: 10.1021/bi00244a001. [DOI] [PubMed] [Google Scholar]
- Langer T, Lu C, Echols H, Flanagan J, Hayer MK, Hartl FU. Successive action of DnaK, DnaJ and GroEL along the pathway of chaperone-mediated protein folding. Nature. 1992;356:683–689. doi: 10.1038/356683a0. [DOI] [PubMed] [Google Scholar]
- Liu Y, Tan YL, Zhang X, Bhabha G, Ekiert DC, Genereux JC, Cho Y, Kipnis Y, Bjelic S, Baker D, et al. Small molecule probes to quantify the functional fraction of a specific protein in a cell with minimal folding equilibrium shifts. Proc Natl Acad Sci USA. 2014;111:4449–4454. doi: 10.1073/pnas.1323268111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Makino T, Skretas G, Georgiou G. Strain engineering for improved expression of recombinant proteins in bacteria. Microb Cell Fact. 2011;10:32. doi: 10.1186/1475-2859-10-32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mayer MP, Bukau B. Hsp70 chaperones: cellular functions and molecular mechanism. Cell Mol Life Sci. 2005;62:670–684. doi: 10.1007/s00018-004-4464-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nakamoto H, Fujita K, Ohtaki A, Watanabe S, Narumi S, Maruyama T, Suenaga E, Misono TS, Kumar PK, Goloubinoff P, et al. Physical interaction between bacterial heat shock protein (Hsp) 90 and Hsp70 chaperones mediates their cooperative action to refold denatured proteins. J Biol Chem. 2014;289:6110–6119. doi: 10.1074/jbc.M113.524801. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Olzscha H, Schermann SM, Woerner AC, Pinkert S, Hecht MH, Tartaglia GG, Vendruscolo M, Hayer-Hartl M, Hartl FU, Vabulas RM. Amyloid-like aggregates sequester numerous metastable proteins with essential cellular functions. Cell. 2011;144:67–78. doi: 10.1016/j.cell.2010.11.050. [DOI] [PubMed] [Google Scholar]
- Pearl LH, Prodromou C. Structure and mechanism of the Hsp90 molecular chaperone machinery. Annu Rev Biochem. 2006;75:271–294. doi: 10.1146/annurev.biochem.75.103004.142738. [DOI] [PubMed] [Google Scholar]
- Powers ET, Balch WE. Diversity in the origins of proteostasis networks--a driver for protein function in evolution. Nat Rev Mol Cell Biol. 2013;14:237–248. doi: 10.1038/nrm3542. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Powers ET, Morimoto RI, Dillin A, Kelly JW, Balch WE. Biological and chemical approaches to diseases of proteostasis deficiency. Annu Rev Biochem. 2009;78:959–991. doi: 10.1146/annurev.biochem.052308.114844. [DOI] [PubMed] [Google Scholar]
- Powers ET, Powers DL, Gierasch LM. FoldEco: a model for proteostasis in E. coli. Cell Rep. 2012;1:265–276. doi: 10.1016/j.celrep.2012.02.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pratt WB, Toft DO. Regulation of signaling protein function and trafficking by the hsp90/hsp70-based chaperone machinery. Exp Biol Med. 2003;228:111–133. doi: 10.1177/153537020322800201. [DOI] [PubMed] [Google Scholar]
- Queitsch C, Sangster TA, Lindquist S. Hsp90 as a capacitor of phenotypic variation. Nature. 2002;417:618–624. doi: 10.1038/nature749. [DOI] [PubMed] [Google Scholar]
- Rudiger S, Germeroth L, Schneider-Mergener J, Bukau B. Substrate specificity of the DnaK chaperone determined by screening cellulose-bound peptide libraries. EMBO J. 1997;16:1501–1507. doi: 10.1093/emboj/16.7.1501.
- Sawaya MR, Kraut J. Loop and subdomain movements in the mechanism of Escherichia coli dihydrofolate reductase: crystallographic evidence. Biochemistry. 1997;36:586–603. doi: 10.1021/bi962337c.
- Sharma SK, De los Rios P, Christen P, Lustig A, Goloubinoff P. The kinetic parameters and energy cost of the Hsp70 chaperone as a polypeptide unfoldase. Nat Chem Biol. 2010;6:914–920. doi: 10.1038/nchembio.455.
- Sherman M, Goldberg AL. Involvement of the chaperonin dnaK in the rapid degradation of a mutant protein in Escherichia coli. EMBO J. 1992;11:71–77. doi: 10.1002/j.1460-2075.1992.tb05029.x.
- Squires CL, Pedersen S, Ross BM, Squires C. ClpB is the Escherichia coli heat shock protein F84.1. J Bacteriol. 1991;173:4254–4262. doi: 10.1128/jb.173.14.4254-4262.1991.
- Thomas JG, Baneyx F. Roles of the Escherichia coli small heat shock proteins IbpA and IbpB in thermal stress management: comparison with ClpA, ClpB, and HtpG in vivo. J Bacteriol. 1998;180:5165–5172. doi: 10.1128/jb.180.19.5165-5172.1998.
- Tokuriki N, Tawfik DS. Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature. 2009;459:668–673. doi: 10.1038/nature08009.
- Wang Z, Feng H, Landry SJ, Maxwell J, Gierasch LM. Basis of substrate binding by the chaperonin GroEL. Biochemistry. 1999;38:12537–12546. doi: 10.1021/bi991070p.
- Wegele H, Wandinger SK, Schmid AB, Reinstein J, Buchner J. Substrate transfer from the chaperone Hsp70 to Hsp90. J Mol Biol. 2006;356:802–811. doi: 10.1016/j.jmb.2005.12.008.
- Weibezahn J, Tessarz P, Schlieker C, Zahn R, Maglica Z, Lee S, Zentgraf H, Weber-Ban EU, Dougan DA, Tsai FT, et al. Thermotolerance requires refolding of aggregated proteins by substrate translocation through the central pore of ClpB. Cell. 2004;119:653–665. doi: 10.1016/j.cell.2004.11.027.
- Weinstock MT, Jacobsen MT, Kay MS. Synthesis and folding of a mirror-image enzyme reveals ambidextrous chaperone activity. Proc Natl Acad Sci USA. 2014;111:11679–11684. doi: 10.1073/pnas.1410900111.
- Welte MA, Tetrault JM, Dellavalle RP, Lindquist SL. A new method for manipulating transgenes: engineering heat tolerance in a complex, multicellular organism. Curr Biol. 1993;3:842–853. doi: 10.1016/0960-9822(93)90218-d.
- Wiseman RL, Powers ET, Buxbaum JN, Kelly JW, Balch WE. An adaptable standard for protein export from the endoplasmic reticulum. Cell. 2007;131:809–821. doi: 10.1016/j.cell.2007.10.025.
- Yosef I, Goren MG, Kiro R, Edgar R, Qimron U. High-temperature protein G is essential for activity of the Escherichia coli clustered regularly interspaced short palindromic repeats (CRISPR)/Cas system. Proc Natl Acad Sci USA. 2011;108:20136–20141. doi: 10.1073/pnas.1113519108.
- Yura T, Guisbert E, Poritz M, Lu CZ, Campbell E, Gross CA. Analysis of σ32 mutants defective in chaperone-mediated feedback control reveals unexpected complexity of the heat shock response. Proc Natl Acad Sci USA. 2007;104:17638–17643. doi: 10.1073/pnas.0708819104.
- Zhang X, Liu Y, Genereux JC, Nolan C, Singh M, Kelly JW. The heat shock response transcriptional program enables high yield and high quality recombinant protein production in Escherichia coli. ACS Chem Biol. 2014. doi: 10.1021/cb5004477.