Abstract
Protein folding is an essential prerequisite for proteins to execute nearly all cellular functions. There is a growing demand for a simple and robust method to investigate protein folding on a large scale under the same conditions. We previously developed a global folding assay system, in which proteins translated using an Escherichia coli‐based cell‐free translation system are centrifuged to quantitate the supernatant fractions. Although the assay is based on the assumption that the supernatants contain the folded native states, the supernatants also include nonnative unstructured proteins. In general, proteases recognize and degrade unstructured proteins, and thus we used a protease to digest the unstructured regions to monitor the folding status. The addition of Lon protease during the translation of proteins unmasked subfractions, not only in the soluble fractions but also in the aggregation‐prone fractions. We translated ∼90 E. coli proteins in the protease‐inclusion assay, in the absence and presence of chaperones. The folding assay, which sheds light on the molecular mechanisms underlying aggregate formation and the chaperone effects, can be applied to large‐scale analyses.
Keywords: protein folding; protein aggregation; cell‐free translation; protease; chaperone
Introduction
Proper protein folding is essential for protein functions to support cell viability. Protein folding is a spontaneous process, in which the amino acid sequence dictates the native structure.1 However, correct protein folding is often hampered by intermolecular aggregate formation, which usually impairs protein function.2 The risk of aggregation is increased in the crowded cellular environment, where a variety of chaperones has evolved to prevent aggregation.3
In the post‐genome era, understanding how proteins fold into the correct native structures is one of the fundamental questions in biology. Over the past half‐century, tremendous efforts have been made toward elucidating the mechanism of protein folding using a variety of approaches, including enzymatic activity, circular dichroism, NMR, protein engineering techniques, molecular dynamics simulations, and other approaches.4 This long history of protein folding research has uncovered some details of the folding mechanism. However, only a small fraction of proteins has been investigated in depth.5 In general, the proteins with the ability to fold spontaneously have been extensively studied. Therefore, a large‐scale analysis of protein folding that includes aggregation‐prone proteins is needed.
To expand the protein repertoire in folding studies, we previously conducted a global analysis of protein aggregation by using a reconstituted Escherichia coli‐based cell‐free translation system, called the PURE system.6 Since the PURE system does not contain chaperones,7 we could evaluate the inherent aggregation propensities of thousands of E. coli proteins in a translation‐coupled manner. We have extended the analysis to address a variety of topics, including the effects of chaperones on aggregation‐prone proteins,8 the solubilization of membrane proteins in the presence of liposomes,9 and the aggregation propensities of hundreds of eukaryotic proteins.10
In our previous large‐scale analyses,6, 8, 9, 10 we used the solubility, defined as the ratio of the supernatant fraction of the synthesized proteins after centrifugation to the uncentrifuged total protein, to estimate the degree of the translation‐coupled protein folding. This method directly evaluated the aggregation‐prone properties under fixed conditions but indirectly provided evidence about the folding state of the proteins in the supernatant fraction. For instance, some soluble proteins might exist in unstructured states. Indeed, we have previously shown that a fraction of proteins translated using the PURE system is soluble, but not functional.11, 12 Therefore, an alternative strategy is needed to gain deeper insight into the folding status in our assay. Measurements of protein functions, such as enzymatic activity, would provide the best evaluation of the folding status. However, this is unrealistic due to the diverse array of protein functions.
The second‐best strategy to monitor the folding status would be a protease sensitivity experiment. In general, unstructured proteins are susceptible to proteases, while stable folded structures are resistant.13, 14, 15 The protease susceptibility would provide hidden information about protein folding, when a protease is present during the translation.
Here we developed a novel assay system to evaluate the folding status after the translation of proteins, using a protease. The addition of chaperones, the GroEL and DnaK systems, to the protease‐inclusion assay revealed the differences between these chaperones, in terms of the folding completion. This method can be applied on a global scale.
Results
Lon protease can be used to assess the translation‐coupled folding status
To monitor the folding status in the translation‐coupled folding assay, we chose Lon protease16 from E. coli for the experiment after an initial screening. We first translated MetF, an extremely aggregation‐prone protein,6, 12 using the PURE system under chaperone‐free conditions in the presence of Lon [Fig. 1(a)]. When Lon was added after the translation (post‐translational addition), the total amount of MetF, which was mostly aggregated, was not significantly decreased under the tested conditions, indicating that Lon cannot degrade the protein aggregates after they are formed. In contrast, when Lon was included during the translation (co‐translational inclusion), we observed a dose‐dependent decrease in the total amount of MetF. These results show that the Lon‐mediated degradation of MetF competes with the aggregate formation.
Figure 1.

Assessment of the translation‐coupled protein folding by Lon protease. (a) The degrees of the MetF protein degradation by Lon protease under the co‐translational inclusion and post‐translational addition conditions. The upper panel shows the autoradiogram of the SDS gel after the protein synthesis by the PURE system. One‐half of each protein solution was centrifuged to collect the supernatant fraction (S). The uncentrifuged remainders (total fractions, T) and the supernatant fractions were separated by SDS‐PAGE. Since MetF has a strong tendency to form protein aggregates in the absence of chaperones, the synthesized proteins were predominantly detected in the total fraction. The lower panel shows the concentration dependence of the degradation by Lon protease in the total fraction under both conditions. (b) The assessment of the translation‐coupled protein folding by Lon protease in the absence of chaperones. The results of two model proteins, DHFR and MetK, are shown. The upper panel shows the autoradiogram of the SDS gel. The lower panel shows the standardized amounts of the synthesized proteins in the four fractions: soluble and undegradative (Sol‐Undeg, blue), soluble and degradative (Sol‐Deg, light blue), aggregation‐prone and degradative (Agg‐Deg, light pink), and aggregation‐prone and undegradative (Agg‐Undeg, red). (c) A schematic illustration of the method. The proteins were individually expressed by the PURE system with and without Lon protease during the translation, and were separated by centrifugation after the translation reaction. The amount of the synthesized protein in each fraction (−Lon Total, −Lon Sup, +Lon Total, and +Lon Sup) was quantified by autoradiography after SDS‐PAGE.
Then, the amount in the soluble and undegradative (Sol‐Undeg) fraction was calculated as the amount in the +Lon Sup fraction, and the amount in the aggregation‐prone and undegradative (Agg‐Undeg) fraction was calculated as the difference between the +Lon Total and +Lon Sup fractions. Likewise, the amount in the soluble and degradative (Sol‐Deg) fraction was calculated as the difference between the −Lon Sup and +Lon Sup fractions, and the amount in the aggregation‐prone and degradative (Agg‐Deg) fraction was calculated as the difference between (−Lon Total – −Lon Sup) and (+Lon Total – +Lon Sup). In the example shown in panel (c) (−Lon Total = 100, −Lon Sup = 50, +Lon Total = 60, and +Lon Sup = 20), the amounts are 20 for Sol‐Undeg, 60 – 20 = 40 for Agg‐Undeg, 50 – 20 = 30 for Sol‐Deg, and (100 – 50) – (60 – 20) = 10 for Agg‐Deg.
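The four‐fraction arithmetic described in this legend can be written out as a short calculation. The following Python sketch is illustrative only: the function and field names are hypothetical and are not taken from the authors' analysis, and the inputs are the four band intensities quantified from the gel.

```python
from typing import NamedTuple


class FoldingFractions(NamedTuple):
    """Amounts assigned to the four fractions (names are illustrative)."""
    sol_undeg: float  # soluble and undegradative
    sol_deg: float    # soluble and degradative
    agg_deg: float    # aggregation-prone and degradative
    agg_undeg: float  # aggregation-prone and undegradative


def split_fractions(minus_lon_total, minus_lon_sup, plus_lon_total, plus_lon_sup):
    """Apply the subtractions described in the Figure 1(c) legend."""
    sol_undeg = plus_lon_sup
    agg_undeg = plus_lon_total - plus_lon_sup
    sol_deg = minus_lon_sup - plus_lon_sup
    agg_deg = (minus_lon_total - minus_lon_sup) - (plus_lon_total - plus_lon_sup)
    return FoldingFractions(sol_undeg, sol_deg, agg_deg, agg_undeg)


# Values of the worked example in the legend.
example = split_fractions(100, 50, 60, 20)
print(example)
```

With these definitions the four amounts always add up to the −Lon Total value, so dividing each by −Lon Total gives relative amounts that sum to one.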
We translated two other proteins, dihydrofolate reductase (DHFR), with over 50% spontaneous folding, and MetK, a moderately aggregation‐prone protein [Fig. 1(b)]. The band intensity of DHFR did not significantly change even with the co‐translational inclusion of Lon, suggesting that the rapid folding of DHFR allows it to resist Lon. This result confirmed that Lon cannot degrade a folded protein. In contrast, the total amount of MetK was drastically reduced in the presence of Lon, indicating again that the Lon effect competes with the aggregate formation of MetK. After the centrifugation of the Lon‐resistant MetK, approximately half of the MetK was soluble. The centrifugation assay revealed that the undegraded MetK was composed of two subfractions: the soluble folded protein (soluble and undegradative: Sol‐Undeg) and the protein aggregate (aggregation‐prone and undegradative: Agg‐Undeg).
Based on the above results, the co‐translational inclusion of Lon monitors the folding status of synthesized proteins. We can calculate the fraction of soluble and degradative proteins (Sol‐Deg) by subtracting the Sol‐Undeg fraction from the soluble fraction. Likewise, we defined the aggregation‐prone and degradative fraction (Agg‐Deg). The Sol‐Deg fraction would be partially folded states that are sensitive to Lon. Since the proteins in the Agg‐Undeg fraction quickly form protein aggregates and thus escape from Lon, the proteins in the Agg‐Deg fraction would be digested during the transition to the aggregate, reflecting the slower rate of aggregate formation. Taken together, the translation‐coupled folding assay in the presence of Lon [Fig. 1(c)] reveals the subcategories of synthesized proteins in terms of the folding status.
Large‐scale analysis of translation‐coupled folding using the Lon‐based assay under chaperone‐free conditions
Next, we analyzed 89 randomly chosen proteins, using the chaperone‐free PURE system in the presence of Lon during the translation. Among them, 76 proteins were evaluated for their folding efficiency and relative rate of aggregate formation (Fig. 2). Most of the soluble fractions were resistant to the degradation by Lon (Sol‐Undeg) [Figs. 2, 3(a), and S1(A)], although there were some exceptions such as MetK [Fig. 1(b)]. These results suggest that the proteins in the soluble fraction obtained under the chaperone‐free conditions are mainly folded, or nearly folded, into the stable structure [Fig. 3(a)]. In contrast, the proportion of the Agg‐Undeg fraction to the total aggregation‐prone fraction was distributed widely, suggesting that the aggregate formation rates differ broadly among the proteins tested [Fig. 3(a)].
Figure 2.

Evaluation of the translation‐coupled protein folding in the absence of chaperones for 76 proteins. The standardized amounts of the 76 synthesized proteins in the four fractions, defined as in Fig. 1, are shown. The data were sorted by the solubilities evaluated in the absence of Lon.
Figure 3.

Statistical analysis of the translation‐coupled protein folding for several dozen proteins under the chaperone‐free conditions. (a) Distributions of the ratios of the Agg‐Undeg and the Sol‐Undeg fractions to the total aggregation‐prone fraction (Agg‐Undeg + Agg‐Deg) and the total soluble fraction (Sol‐Undeg + Sol‐Deg), respectively. The solid line (red for Agg and blue for Sol) indicates the distribution calculated by kernel density estimation. (b) Comparison between the apparent relative rate of the aggregate formation and the relative contents of specific amino acids. In this comparison, the ratio of the aggregation‐prone and undegradative fraction to the total aggregation‐prone fraction was used as the index of the aggregate formation rates. The value ρ indicates Spearman's rank correlation coefficient (*P < 0.05).
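The ratios plotted in panel (a) and the smoothed curve drawn over them can be illustrated with a minimal Python sketch. The Gaussian kernel, the bandwidth of 0.1, and the listed ratio values are assumptions made for illustration; the legend does not specify the kernel or bandwidth, and the original analysis was performed in R.

```python
import math


def undegraded_ratio(undeg, deg):
    """Ratio of the undegradative amount to the sum of both amounts.

    Returns None when the combined amount is not positive, because the
    ratio is then undefined.
    """
    total = undeg + deg
    if total <= 0:
        return None
    return undeg / total


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels (bandwidth assumed)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical Agg-Undeg ratios for five proteins (not measured data).
ratios = [0.15, 0.40, 0.55, 0.80, 0.95]
density = gaussian_kde(ratios, bandwidth=0.1)
print(undegraded_ratio(40, 10), density(0.5))
```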
Statistical analyses were performed to investigate which properties of the proteins determine the rate of aggregate formation. The molecular weight and the deduced isoelectric point did not correlate with the proportion of the Agg‐Undeg formation [Fig. S1(B)]. However, several differences were observed in the compositions of the amino acid residues [Fig. 3(b)]. The aromatic residue content (Phe, Tyr, and Trp) showed a negative correlation [Fig. 3(b)]. In contrast, the contents of residues with positive charges (Lys, Arg, and His) and hydrophobic residues (Val, Leu, and Ile) correlated positively with this proportion, although the correlations of these two properties were neither strong nor significant [Fig. 3(b)]. These results suggest that the amino acid composition influences the rate of aggregate formation to a certain degree.
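As a reminder of what the coefficient ρ measures, Spearman's rank correlation is the Pearson correlation computed on the ranks of the two variables. The Python sketch below uses only the standard library and hypothetical function names; it does not compute the P value and is not the authors' R code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In use, `x` would hold, for example, the aromatic residue content of each protein and `y` the corresponding Agg‐Undeg ratio.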
Lon‐based assay reveals differences in chaperone effects
Two E. coli chaperone sets, GroEL/GroES (GroEL/ES)3, 17 and DnaK/DnaJ/GrpE (DnaKJE),3, 18, 19 globally increase the solubility of aggregation‐prone proteins in the PURE system‐based folding assay.8 The increased supernatant fraction does not always mean that the protein has adopted the native state. Indeed, we previously observed that DnaKJE drastically improved the solubility of MetK, but did not increase the enzymatic activity,12 suggesting that the protein in the supernatant fraction solubilized by DnaKJE exists in a nonnative state. We applied the assay system using Lon to monitor the folding status of proteins translated by the PURE system. We translated DHFR and MetK in the presence of either GroEL/ES or DnaKJE, with or without Lon [Fig. 4(a)]. The soluble fraction of DHFR was slightly increased in the presence of either GroEL/ES or DnaKJE [Fig. 4(b)]. With the DnaKJE assistance, we found only a slight increase of the Sol‐Deg fraction for DHFR [Fig. 4(b)], suggesting that some of the soluble fraction is in a nonnative state. The subfraction in the soluble fraction, revealed by Lon, was strikingly different in the folding of MetK. GroEL/ES contributed to the folding of MetK by increasing the Sol‐Undeg fraction, while DnaKJE prevented the aggregate formation of MetK by drastically increasing the Sol‐Deg fraction [Fig. 4(b)]. For MetK, the soluble fraction obtained with the DnaKJE assistance would not adopt the native state. Instead, an unstructured form of MetK would be in the soluble fraction. These results with MetK are in good agreement with our previous observations that DnaKJE improved the solubility of MetK but could not recover its enzymatic activity, while GroEL/ES can both prevent the aggregate formation and restore its activity.12
Figure 4.

Evaluation of the translation‐coupled protein folding in the presence of the GroEL/GroES or DnaK/DnaJ/GrpE chaperone for two model proteins. (a) The autoradiogram of the gel electropherogram after the synthesis by the PURE system and subsequent centrifugation. (b) The standardized amounts of the synthesized proteins in the four fractions, defined as in Figure 1. For comparison, the data obtained under the chaperone‐free condition, which is the same as depicted in Figure 1(b), are shown together.
Chaperone effects on the translation‐coupled folding of a variety of proteins
We next evaluated the translation‐coupled chaperone‐assisted protein folding for several dozen proteins (Figs. 5 and S2). As expected, most proteins were solubilized by each chaperone in the absence of Lon [Fig. S3(A)]. In addition, the distribution of the proportion of the Sol‐Undeg fraction in the presence of GroEL/ES was biased higher, whereas the proportion in the presence of DnaKJE was distributed widely [Figs. S2 and 6(a)]. These results suggest that GroEL/ES tends to protect substrates against the degradation by Lon while facilitating their folding, whereas DnaKJE tends to assist the folding while exposing the substrates to the risk of degradation. Another plausible explanation is that the folding assistance by DnaKJE is simply slower than that by GroEL/ES.
Figure 5.

Evaluations of the translation‐coupled protein folding in the presence of the GroEL/GroES or DnaK/DnaJ/GrpE chaperone for dozens of proteins. The standardized amounts of the synthesized proteins in the four fractions under the chaperone‐free condition (−), GroEL/ES‐added condition (E), and DnaK/DnaJ/GrpE‐added condition (K) are shown. The data obtained under the chaperone‐free condition are the same as depicted in Figure 2.
Figure 6.

Statistical analysis of the translation‐coupled protein folding in the presence of the GroEL/GroES or DnaK/DnaJ/GrpE chaperone. (a) Distributions of the ratio of the Sol‐Undeg fraction to the total soluble fraction under the GroEL/GroES‐added (upper) and the DnaK/DnaJ/GrpE‐added (lower) conditions. The solid blue line indicates the distribution calculated by kernel density estimation. (b) Comparisons between the effects of each chaperone on the translation‐coupled folding and the various properties of the proteins. The ratio of the Sol‐Undeg fraction to the total soluble fraction was used as the index of the chaperone effects. The value ρ indicates Spearman's rank correlation coefficient (*P < 0.05).
To investigate the relationship between the chaperone effects and the various properties of proteins, we defined the proportion of the Sol‐Undeg fraction as the effect of the chaperones and performed statistical analyses between them. The comparison of protein sizes, defined as molecular weights, showed that the effect of DnaKJE tended to become weaker as the protein sizes increased [Fig. 6(b)]. In addition to the fact that DnaKJE tends to prefer larger proteins, as investigated in our previous report,8 our observations suggest that DnaKJE can solubilize larger proteins but tends to have a weaker effect on the folding completion, although this might be a consequence of the simple property that larger proteins are generally more difficult to fold. Similarly, the deduced isoelectric point showed a negative correlation with the effect of GroEL/ES [Fig. 6(b)], suggesting the biased preference of GroEL/ES, as mentioned in our previous report.8 In addition, the hydrophobic residue content (Val, Leu, and Ile) correlated weakly but positively with both chaperone effects (although the correlation with the effect of GroEL/ES was not significant), suggesting that the hydrophobicity is one of the key factors for the chaperone‐assisted folding by these two chaperones [Fig. 6(b)]. The higher hydrophobicity may lead to a stronger interaction with the chaperones, or facilitate the formation of the hydrophobic core for faster protein folding. The contents of other amino acids did not show such a significant correlation [Fig. S3(B)].
Discussion
To expand the examined protein repertoire to elucidate the mechanism of protein folding, we previously developed an assay system to evaluate the aggregation‐prone properties of thousands of proteins, individually translated by the PURE system.6, 8, 9, 10 In this report, we incorporated a protease to monitor the folding status during protein translation. The analysis is an extension of our previous assay,6, 8, 9, 10 in which the aggregation propensities of proteins are evaluated by centrifugation to estimate the amount in the soluble fraction. The E. coli Lon protease digests the unfolded fractions even in the supernatant, providing a new layer of information about the folding status in the test tube. Although this is a proof‐of‐concept experiment to show the feasibility of this approach for the application of the method on a global scale, the statistical analysis has implications for the rate of aggregate formation and the differences in the chaperone effects.
The present method could be compared to a “pulse‐proteolysis” experiment,13, 14, 15 which is a method to determine the global stability of proteins by exploiting the difference in the protease susceptibilities between the folded and unfolded states. In terms of using a protease to monitor the folding status, these two methods are similar but have several differences. In the pulse‐proteolysis experiment, the proteolysis is used to estimate the folded fraction by adding the protease for a short time (“pulse”). Therefore, the pulse proteolysis assay is suitable to evaluate the thermodynamic stability of proteins of interest in detail. In contrast, the method described here includes a protease during the translation. Folding and the aggregate formation compete with the protease attack, thus providing information about the folding kinetics. Therefore, these two methods are complementary, depending on the purpose of the study.
Although the current method is simple and can be extended to a large‐scale analysis, there are several limitations. First, because we conducted the analysis under the same conditions, alterations in the experimental conditions, such as the concentration of Lon and the centrifugation speed, would cause some differences. Second, although the specificity of Lon protease is low, it might have some biases in protein degradation. Related to this, a protease can digest the unstructured regions in native proteins. Therefore, although the method can in principle be applied to any protein of interest, we have to carefully consider the use of the assay in the folding of eukaryotic proteins, because eukaryotic proteomes contain a non‐negligible amount of intrinsically disordered regions.20, 21, 22
The DnaKJE and GroEL/ES chaperones globally increased the solubility of proteins by preventing the intermolecular aggregate formation. However, there was a striking difference in the effects between GroEL/ES and DnaKJE in our assay. The DnaKJE‐solubilized fractions had a large portion of the Lon protease‐sensitive fraction (Sol‐Deg), whereas in the presence of GroEL/ES almost all of the soluble proteins were resistant to the protease (Sol‐Undeg). One possible explanation is that the folding assisted by DnaKJE might require a longer time for completion, as compared to that assisted by GroEL/ES, resulting in the accumulation of protease‐susceptible fractions. Alternatively, the difference could be explained by the action mechanism of the chaperones. DnaK and DnaJ, neither of which forms higher‐order oligomers, associate with the nonnative protein in a form that is exposed to the bulk solution,3, 18, 19 where the nonnative protein is susceptible to the protease. In contrast, the tetradecameric GroEL, which has a double ring structure, can accommodate the nonnative proteins inside a cavity formed when GroES binds to GroEL in the presence of ATP.3, 17 The nonnative protein inside the cavity would be protected from an outside protease, and thus could complete the folding. The protection of translated proteins from Lon protease in the presence of GroEL/ES suggests that GroEL associates with newly translated proteins in a co‐translational manner, and then promptly accommodates the nascent proteins inside the cavity with the GroES lid. This co‐translational role of GroEL is consistent with our previous reports.12, 23
The rate of aggregate formation and the chaperone effects were weakly correlated with the content of specific amino acids, especially hydrophobic amino acids. These results suggest that the hydrophobic interaction is one of the key factors affecting aggregate formation and chaperone recognition. Our previous global analysis revealed that the structural parameters, such as the SCOP fold,24 contribute to the aggregation propensities.6 However, we cannot analyze the data in terms of such structural properties, due to the small number of proteins investigated here.
Although the objective of the current assay is to monitor the folding status in test tubes, there might be some implications for the cell. Regarding the concentration of Lon and other components, the Lon concentration relative to the ribosome concentration is expected to be much higher in this analysis than in the in vivo situation: the molar ratio between hexameric Lon and the ribosome was 1:5 in this assay and is 1:∼200 in cells.25 Moreover, several kinds of chaperones, including both GroEL/ES and DnaKJE, exist together in vivo. These facts imply that, in vivo, the folding process with or without the aid of chaperones predominates over the degradation by Lon protease. However, the situation for protein–protein interactions may also be drastically different in the cellular environment, which includes molecular crowding effects.26 Therefore, we should consider our investigation as an in vitro analysis, specialized for the translation‐coupled de novo folding.
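The comparison of molar ratios can be made explicit with a small calculation. The following Python sketch takes the 0.2 μM Lon hexamer and 1 μM ribosome concentrations stated in Materials and Methods and the approximate cellular ratio of 1:200 quoted above; the variable names are illustrative, and the resulting fold difference is only as precise as the approximate in vivo figure.

```python
# Concentrations used in the assay (from Materials and Methods), in micromolar.
lon_hexamer_uM = 0.2
ribosome_uM = 1.0

# Ribosomes per Lon hexamer in the assay (the 1:5 ratio in the text).
assay_ribosomes_per_lon = ribosome_uM / lon_hexamer_uM

# Approximate ribosomes per Lon hexamer in cells (the 1:~200 ratio in the text).
cell_ribosomes_per_lon = 200.0

# How much more Lon per ribosome the assay contains than the cell, roughly.
fold_excess_of_lon = cell_ribosomes_per_lon / assay_ribosomes_per_lon
print(assay_ribosomes_per_lon, fold_excess_of_lon)
```

Under these figures, Lon is roughly 40‐fold more abundant relative to ribosomes in the assay than in cells.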
In summary, the method described here is a unique approach to investigate protein folding, using a cell‐free translation system combined with a protease. Since the protease recognizes and then degrades unstructured proteins, the protease addition experiment provided us with information on the folding status of synthesized proteins that cannot be obtained by the centrifugation assay we previously developed. The inclusion of chaperones in this approach unveiled the hidden features of the GroEL and DnaK systems. The dataset, which was obtained under the same experimental conditions and is available at the figshare repository (https://doi.org/10.6084/m9.figshare.7996346), should be valuable for bioinformatic analyses, contributing to the elucidation of the folding mechanism of proteins coupled with translation.
Materials and Methods
Protein purification
For the purification of Lon protease, we added a 6×His‐tag sequence (CACCATCACCATCACCAT) to the pLon plasmid vector27 at the C‐terminal end of the open reading frame (ORF) of lon by QuikChange site‐directed mutagenesis. Escherichia coli BL21(DE3) cells harboring the modified pLon plasmid were grown at 37°C in 2 × YT broth with 100 μg/mL ampicillin. At log phase, the protein expression was induced by 1 mM isopropyl β‐d‐thiogalactopyranoside at 25°C. The cells were harvested and disrupted by sonication in buffer A (20 mM HEPES‐KOH [pH 7.5], 400 mM NaCl, 20% glycerol, and 3.5 mM β‐mercaptoethanol). The lysate was centrifuged at 15,700 rpm (30,000g) for 45 min with a JA‐30.50Ti rotor (Beckman Coulter). After the centrifugation, the supernatant fraction was collected and applied to Ni‐NTA Superflow resin (Qiagen). After a wash with buffer A containing 100 mM imidazole, the 6×His‐tagged proteins were eluted with buffer A containing 300 mM imidazole. After dialysis against buffer A, the proteins were concentrated by ultrafiltration with a 50 kDa cut‐off Amicon Ultra filter (Merck Millipore). The concentration of the purified protein was quantified with a Bradford protein assay kit (Bio‐Rad).
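As a quick consistency check on the tag sequence quoted above, the 18 nucleotides can be split into codons and compared with the two histidine codons of the standard genetic code (CAC and CAT). This Python sketch uses hypothetical helper names and is not part of the original protocol.

```python
# Standard genetic code: both CAC and CAT encode histidine.
HIS_CODONS = {"CAC", "CAT"}


def split_codons(dna):
    """Split an in-frame DNA string into triplets."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def count_his_codons(dna):
    """Count the codons in the sequence that encode histidine."""
    return sum(1 for codon in split_codons(dna) if codon in HIS_CODONS)


TAG = "CACCATCACCATCACCAT"  # tag sequence quoted in the text
his_count = count_his_codons(TAG)
print(split_codons(TAG), his_count)
```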
All of the factors in the PURE system, including translational factors and ribosomes, and all of the chaperones (GroEL, GroES, DnaK, DnaJ, and GrpE) were purified according to the previous reports.7, 8
Construction of the template DNA for expression by the PURE system
For the expression of the three model proteins (MetF, DHFR, and MetK), each gene (metF, folA, and metK) was cloned into the modified pET15b vector that lacks the 6×His sequence at the N‐terminus. The fragment containing each ORF was amplified by PCR with the T7p primer (CGCGAAATTAATACGACTCACTATAGGG) and the T7t primer (GCTAGTTATTGCTCAGCGG).
For the expression of the other 89 proteins, we used the ASKA plasmid set28, 29 as described in the previous report.6 The fragment containing each ORF was amplified by PCR with primer1 (GGCCTAATACGACTCACTATAGGAGAAATCATAAAAAATTTATTTGCTTTGTGAGCGG) and primer2 (GTTATTGCTCAGCGGTTAGCGGCCGCATAGGCC), as described previously.6
Protein synthesis by the PURE system and investigation of the translation‐coupled protein folding
The transcription‐translation coupled reaction by the PURE system, including [35S] methionine, was performed at 37°C for 1 h. The ribosome concentration was 1 μM. The concentration of Lon protease under the +Lon conditions was set to 0.2 μM (hexamer), unless noted otherwise. Under the post‐translational addition conditions, the translation reaction without Lon was performed at 37°C for 1 h and stopped by the addition of 0.4 mg of RNase A per 20 μL reaction. Lon was then added into the reaction mixture at each concentration and incubated at 37°C for 1 h. The concentrations of each chaperone under the chaperone‐added conditions were as follows: DnaK, DnaJ, and GrpE: 5.0 (monomer), 2.0 (monomer), and 2.0 (monomer) μM, respectively; GroEL and GroES: 0.5 (tetradecamer) and 1.0 (heptamer) μM, respectively. After the reaction, an aliquot was withdrawn as the total fraction (T), the remainder was centrifuged at 20,000g for 15 min, and the supernatant fraction (S) was collected. The total and supernatant fractions were separated by sodium dodecyl sulfate‐polyacrylamide gel electrophoresis (SDS‐PAGE), and the synthesized proteins in each fraction were quantified by autoradiography with an FLA7000 image analyzer and the Multi Gauge software (Fujifilm, Japan). The amounts of the synthesized proteins in the four fractions (Sol‐Undeg, Sol‐Deg, Agg‐Deg, and Agg‐Undeg) were calculated as described in Figure 1(c) and its legend. The proteins with extremely low intensities or abnormally high solubility (>120%) were excluded from the analysis. Similarly, the proteins for which the sum of the relative amounts in the four fractions (Sol‐Undeg, Sol‐Deg, Agg‐Deg, and Agg‐Undeg) was over 1.2 were also omitted. Finally, 76 proteins could be quantified under the chaperone‐free conditions, and 63 and 58 proteins could be quantified under the GroEL/GroES‐added and DnaK/DnaJ/GrpE‐added conditions, respectively.
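The exclusion criteria stated above can be expressed as a simple filter. The Python sketch below is a hedged illustration: the function name is hypothetical, and the minimum‐intensity cutoff is a placeholder parameter, because the text does not give a numerical threshold for "extremely low intensities".

```python
def keep_protein(total_intensity, solubility, relative_fractions,
                 min_intensity=1.0, max_solubility=1.2, max_fraction_sum=1.2):
    """Return True if a protein passes the exclusion criteria.

    total_intensity    band intensity of the -Lon total fraction
    solubility         -Lon supernatant divided by -Lon total
    relative_fractions relative amounts of the four fractions
    min_intensity      placeholder cutoff (not specified in the text)
    """
    if total_intensity < min_intensity:
        return False  # band too faint to quantify reliably
    if solubility > max_solubility:
        return False  # abnormally high solubility (>120%)
    if sum(relative_fractions) > max_fraction_sum:
        return False  # four fractions add up to more than 1.2
    return True


# Hypothetical proteins: one acceptable, one with abnormal solubility.
print(keep_protein(500.0, 0.5, [0.2, 0.3, 0.1, 0.4]))
print(keep_protein(500.0, 1.3, [0.9, 0.4, 0.0, 0.0]))
```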
The data for all of the quantified proteins were deposited in a public repository, figshare (https://doi.org/10.6084/m9.figshare.7996346).
Data analysis and repository
The molecular weight and the deduced isoelectric point were calculated as described previously.6 The relative contents of amino acids were calculated from their amino acid sequences with an in‐house script. All the statistical analyses were conducted with the R software (version 3.5.1; http://www.R-project.org). The dataset is available at figshare repository: https://doi.org/10.6084/m9.figshare.7996346
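The relative content of a residue group is the number of matching residues divided by the sequence length. The in‐house script is not described further, so the following Python sketch is only one plausible implementation; the group definitions follow the residue sets named in the Results, and the example peptide is made up.

```python
# Residue groups named in the Results (one-letter codes).
AROMATIC = "FYW"     # Phe, Tyr, Trp
POSITIVE = "KRH"     # Lys, Arg, His
HYDROPHOBIC = "VLI"  # Val, Leu, Ile


def relative_content(sequence, residues):
    """Fraction of positions in the sequence occupied by the given residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(r) for r in residues) / len(seq)


# Made-up six-residue peptide used purely for illustration.
peptide = "MKVLFW"
print(relative_content(peptide, AROMATIC))
print(relative_content(peptide, POSITIVE))
print(relative_content(peptide, HYDROPHOBIC))
```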
Supporting information
Figure S1 Statistical analysis of the translation‐coupled protein folding for several dozens of the proteins under the chaperone‐free conditions. (A) Histograms of the relative amounts of the synthesized proteins in the aggregation‐prone and undegradative (Agg‐Undeg), the aggregation‐prone and degradative (Agg‐Deg), the soluble and degradative (Sol‐Deg), and the soluble and undegradative (Sol‐Undeg) fractions. (B) Comparisons between the apparent relative rate of the aggregate formation and the molecular weight and the deduced isoelectric point. The ratio of the aggregation‐prone and undegradative (Agg‐Undeg) fraction to all aggregation‐prone fractions was used for the index of the rate of the aggregate formation. The value ρ indicates Spearman's rank correlation coefficient.
Figure S2 Evaluations of the translation‐coupled protein folding in the presence of the GroEL/GroES or DnaK/DnaJ/GrpE chaperone for dozens of proteins. The standardized amounts of the synthesized proteins in the four fractions are shown. The upper and the lower panels show the results in the presence of GroEL/GroES and DnaK/DnaJ/GrpE, respectively. The data were sorted by the solubilities evaluated in the absence of Lon.
Figure S3 Statistical analysis of the translation‐coupled protein folding in the presence of GroEL/GroES or DnaK/DnaJ/GrpE chaperone. (A) Distribution of the solubility in the absence of Lon protease under each chaperone condition. The blue solid line indicates the distribution calculated by the Kernel density estimation. (B) Comparison between the effect of each chaperone on the translation‐coupled folding and the contents of specific amino acids. The ratio of the soluble and undegradative (Sol‐Undeg) fraction to all soluble fractions was used for the index of the chaperone effects. The value ρ indicates Spearman's rank correlation coefficient.
Acknowledgments
We thank Rudi Glockshuber for providing the Lon‐expression plasmid, the Bio‐support Center at Tokyo Tech for DNA sequencing, and The National BioResource Project, E. coli (National Institute of Genetics, Japan) for providing the ASKA library clones. This work was supported by MEXT Grant‐in‐Aid for Scientific Research (Grant Nos. 26116002 to HT and 24115504 to TN).
References
- 1. Anfinsen CB (1973) Principles that govern the folding of protein chains. Science 181:223–230.
- 2. Dobson CM (2003) Protein folding and misfolding. Nature 426:884–890.
- 3. Balchin D, Hayer‐Hartl M, Hartl FU (2016) In vivo aspects of protein folding and quality control. Science 353:aac4354.
- 4. Dill KA, MacCallum JL (2012) The protein‐folding problem, 50 years on. Science 338:1042–1046.
- 5. Braselmann E, Chaney JL, Clark PL (2013) Folding the proteome. Trends Biochem Sci 38:337–344.
- 6. Niwa T, Ying BW, Saito K, Jin W, Takada S, Ueda T, Taguchi H (2009) Bimodal protein solubility distribution revealed by an aggregation analysis of the entire ensemble of Escherichia coli proteins. Proc Natl Acad Sci U S A 106:4201–4206.
- 7. Shimizu Y, Inoue A, Tomari Y, Suzuki T, Yokogawa T, Nishikawa K, Ueda T (2001) Cell‐free translation reconstituted with purified components. Nat Biotechnol 19:751–755.
- 8. Niwa T, Kanamori T, Ueda T, Taguchi H (2012) Global analysis of chaperone effects using a reconstituted cell‐free translation system. Proc Natl Acad Sci U S A 109:8937–8942.
- 9. Niwa T, Sasaki Y, Uemura E, Nakamura S, Akiyama M, Ando M, Sawada S, Mukai S, Ueda T, Taguchi H, Akiyoshi K (2015) Comprehensive study of liposome‐assisted synthesis of membrane proteins using a reconstituted cell‐free translation system. Sci Rep 5:18025.
- 10. Uemura E, Niwa T, Minami S, Takemoto K, Fukuchi S, Machida K, Imataka H, Ueda T, Ota M, Taguchi H (2018) Large‐scale aggregation analysis of eukaryotic proteins reveals an involvement of intrinsically disordered regions in protein folding. Sci Rep 8:678.
- 11. Ying BW, Taguchi H, Ueda H, Ueda T (2004) Chaperone‐assisted folding of a single‐chain antibody in a reconstituted translation system. Biochem Biophys Res Commun 320:1359–1364.
- 12. Ying BW, Taguchi H, Kondo M, Ueda T (2005) Co‐translational involvement of the chaperonin GroEL in the folding of newly translated polypeptides. J Biol Chem 280:12035–12040.
- 13. Park C, Marqusee S (2005) Pulse proteolysis: a simple method for quantitative determination of protein stability and ligand binding. Nat Methods 2:207–212.
- 14. Mallam AL, Jackson SE (2012) Knot formation in newly translated proteins is spontaneous and accelerated by chaperonins. Nat Chem Biol 8:147–153.
- 15. Jensen MK, Samelson AJ, Marqusee S, Soto RA, Cate JHD (2016) Quantitative determination of ribosome nascent chain stability. Proc Natl Acad Sci U S A 113:13402–13407.
- 16. Van Melderen L, Aertsen A (2009) Regulation and quality control by Lon‐dependent proteolysis. Res Microbiol 160:645–651.
- 17. Taguchi H (2015) Reaction cycle of chaperonin GroEL via symmetric “football” intermediate. J Mol Biol 427:2912–2918.
- 18. Mayer MP, Gierasch LM (2018) Recent advances in the structural and mechanistic aspects of Hsp70 molecular chaperones. J Biol Chem 294:2085–2097.
- 19. Genevaux P, Georgopoulos C, Kelley WL (2007) The Hsp70 chaperone machines of Escherichia coli: a paradigm for the repartition of chaperone functions. Mol Microbiol 66:840–857.
- 20. Gsponer J, Futschik ME, Teichmann SA, Babu MM (2008) Tight regulation of unstructured proteins: from transcript synthesis to protein degradation. Science 322:1365–1368.
- 21. Uversky VN (2015) Functional roles of transiently and intrinsically disordered regions within proteins. FEBS J 282:1182–1189.
- 22. Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT (2004) Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 337:635–645.
- 23. Ying BW, Taguchi H, Ueda T (2006) Co‐translational binding of GroEL to nascent polypeptides is followed by post‐translational encapsulation by GroES to mediate protein folding. J Biol Chem 281:21813–21819.
- 24. Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247:536–540.
- 25. Ishihama Y, Schmidt T, Rappsilber J, Mann M, Hartl FU, Kerner MJ, Frishman D (2008) Protein abundance profiling of the Escherichia coli cytosol. BMC Genomics 9:102.
- 26. Rivas G, Minton AP (2016) Macromolecular crowding in vitro, in vivo, and in between. Trends Biochem Sci 41:970–981.
- 27. Fischer H, Glockshuber R (1993) ATP hydrolysis is not stoichiometrically linked with proteolysis in the ATP‐dependent protease La from Escherichia coli. J Biol Chem 268:22502–22507.
- 28. Kitagawa M, Ara T, Arifuzzaman M, Ioka‐Nakamichi T, Inamoto E, Toyonaga H, Mori H (2005) Complete set of ORF clones of Escherichia coli ASKA library (a complete set of E. coli K‐12 ORF archive): unique resources for biological research. DNA Res 12:291–299.
- 29. Riley M, Abe T, Arnaud MB, Berlyn MKB, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, Mori H, Perna NT, Plunkett G III, Rudd KE, Serres MH, Thomas GH, Thomson NR, Wishart D, Wanner BL (2006) Escherichia coli K‐12: a cooperatively developed annotation snapshot‐2005. Nucleic Acids Res 34:1–9.
