Abstract
The spatial structure of proteins, largely determined by their amino acid sequences, also depends on the environmental conditions under which folding takes place. In aqueous environments, exposure of polar amino acids is the driving factor, whereas stabilization in amphipathic membranes requires exposure of hydrophobic residues. This observation can be extended to all other environmental conditions under which proteins exhibit biological activity and, most importantly, to the folding process. The fuzzy oil drop (FOD) model assumes a central location of hydrophobic residues (hydrophobic core) with exposure of polar residues towards the aqueous environment. In the modified FOD model (FOD-M) that we have developed, the influence of the aqueous environment is extended to include the contribution of non-aqueous factors, enabling the assessment of their influence on protein structuring; the environment is represented as an external force field in the form of a continuum. Accounting for environmental conditions allows the funnel model to be modified so that the localization of the energy minimum depends on external conditions expressed by the K scale, where K measures the degree to which factors other than polar water participate in the folding process.
Keywords: Funnel model, Protein folding, External force field, Membrane proteins, Chaperone, Chaperonin, Hydrophobicity, Down-hill, Fast-folding, Enzymes, in silico analysis
1. Introduction
In terms of the fundamental process of protein folding, the following question remains: Why do proteins fold the way they do? The general approach to predicting protein structures based on optimization of internal interactions (force field) focuses on the search for structures representing the minimum internal energy state. Combined efforts to apply the best procedures in the WeFold project have not yielded significant progress; the results are comparable to those obtained by individual teams [1]. Similar conclusions concerning the molecular mechanics of protein folding have been drawn from a diversity of interpretations [2], [3], [4].
A prominent development in this field is the introduction of artificial intelligence (AI) in the AlphaFold model, which has brought significant advances [5]. Protein structure prediction using available and continuously developed computer programs based on ab initio (new fold) [6] and comparative analysis [7] techniques has recently been extended to include tools based on artificial intelligence [8], [9].
The progress seen in the history of Critical Assessment of Structure Prediction (CASP), a research community collaborative [10], raises the problem of variation in the degree of correctness of structure prediction: a given tool can yield a very good result in one case and a very poor result for another protein. It can be assumed that it is the proteins themselves that vary (the CASP project distinguishes ‘easy’ and ‘hard’ proteins [11]). In the present study, we aimed to answer these questions by introducing environmental conditions that direct the folding process differently.
It is a truism that all life (mainly consisting of proteins) is inherently linked to the environment of water, the properties of which are still not fully understood (including the atypical variation in density related to temperature). In fact, only the structure of ice, that is, the solid phase of water, is known to a satisfactory degree.
An aqueous environment is a polar environment that favors interactions with polar systems. An object that is particularly ‘foreign’ to water is a hydrophobic object; thus, the mechanisms operating at the interface of phases (water/hydrophobic surface in particular) are of special interest [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24].
In the fuzzy oil drop (FOD) model, water provides a specific external force field as a continuum [25]. This assumes that the folding process tends towards the maximum possible isolation of hydrophobic residues by concentrating them at the center of the protein molecule with the simultaneous exposure of polar residues. This produces entropically and enthalpically favorable contact with polar water. This phenomenon occurs when micelles composed of bipolar molecules are formed. The mixture of several bipolar molecules leads to the formation of co-micelles. The arrangement of the micelles is highly ordered, producing full surface coverage with polar groups and making the product highly soluble in aqueous environments. These structures lack specific interactions. However, the presence of proteins with a highly ordered distribution of hydrophobicity—based on the micelle-like model—as well as proteins with high solubility and no activity, is important and even critical for the life of some organisms.
The 20 amino acids can be treated as a set of 20 bipolar molecules with varying ratios of polar to hydrophobic parts. However, the presence of the peptide bond restricts free mobility and thus significantly reduces the degrees of freedom, which makes the formation of an optimal micelle-like arrangement difficult in a polypeptide chain. Nevertheless, micelle-like protein structures have been identified, confirming the validity of the proposed model.
2. Model development
2.1. Expressing the role of water as a basic environment for protein activity
Micelle-like structures can be expressed by a 3D Gaussian function representing the distribution of hydrophobicity, with a maximum at the center and a polarity at the surface.
The protein molecule is encapsulated in an ellipsoid of a particular size and shape (Fig. 1).
Fig. 1.
Protein molecule encapsulation in 3D Gauss function. The graphic representation of σ parameters is shown for each axis. The values of σ parameters follow the three sigma rule: the longest distance (position of particular effective atom – average position of atoms belonging to certain residue) is expressed as 3σ. The polar surface is shown here in red and the hydrophobic core in blue and green.
The size is expressed by the Gaussian function parameter σ, defined separately for the x, y, and z directions (axes) (Fig. 1). The value of the 3D Gaussian function encapsulating the body of the protein (proteins are predominantly globular) therefore gives the level of idealized hydrophobicity at any point in the protein, in particular at the positions of the effective atoms (the averaged positions of the atoms belonging to each residue):
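$$T_i = \frac{1}{T_{sum}} \exp\left(-\frac{x_i^2}{2\sigma_X^2}\right) \exp\left(-\frac{y_i^2}{2\sigma_Y^2}\right) \exp\left(-\frac{z_i^2}{2\sigma_Z^2}\right) \tag{1}$$
where xi, yi, and zi are the coordinates of the i-th effective atom, measured from the center of the ellipsoid. The size and shape of the protein are expressed by appropriately chosen values of the parameters σX, σY, and σZ (Fig. 1). Tsum is the sum of the unnormalized Gaussian values over all N effective atoms, where N denotes the number of amino acids in the protein.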
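The idealized distribution can be illustrated with a minimal Python sketch. It assumes that the effective-atom coordinates are already available as (x, y, z) tuples, centers them on their geometric mean, and takes σ along each axis as one third of the largest extent, following the three-sigma rule in the caption of Fig. 1. The function name `theoretical_distribution` is hypothetical, and the sketch skips any reorientation of the molecule, so it is an illustration under these assumptions rather than the implementation used in this study.

```python
import math

def theoretical_distribution(coords):
    """Return normalized T_i values for a list of (x, y, z) effective atoms."""
    n = len(coords)
    # Assumption: the ellipsoid center is the geometric mean of the effective atoms.
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    shifted = [[c[k] - center[k] for k in range(3)] for c in coords]
    # Three-sigma rule: the largest extent along each axis is taken as 3 * sigma.
    # A zero extent falls back to 1.0 only to avoid division by zero.
    sigmas = [(max(abs(p[k]) for p in shifted) / 3.0) or 1.0 for k in range(3)]
    raw = [
        math.exp(-sum(p[k] ** 2 / (2.0 * sigmas[k] ** 2) for k in range(3)))
        for p in shifted
    ]
    total = sum(raw)  # plays the role of the normalizing first factor
    return [v / total for v in raw]
```

With this construction, an effective atom at the center receives the highest T value and atoms at the 3σ boundary receive values close to zero.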
The value of the 3D Gaussian function (Ti) assigned to the position of the effective atom expresses the expected level, assuming an idealized micelle-like distribution. However, the actual hydrophobicity distribution may differ depending on the intrinsic hydrophobicity of each amino acid and the distance between interacting residues. This interaction can be expressed by the following model [26]:
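$$O_i = \frac{1}{O_{sum}} \sum_{j} \left(H_i^r + H_j^r\right) \cdot \begin{cases} 1 - \dfrac{1}{2}\left(7\left(\dfrac{r_{ij}}{c}\right)^2 - 9\left(\dfrac{r_{ij}}{c}\right)^4 + 5\left(\dfrac{r_{ij}}{c}\right)^6 - \left(\dfrac{r_{ij}}{c}\right)^8\right) & \text{for } r_{ij} \le c \\ 0 & \text{for } r_{ij} > c \end{cases} \tag{2}$$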
where rij is the distance between the positions of the i-th and j-th effective atoms, c = 9 Å is the cutoff distance, and Hr is the intrinsic hydrophobicity (any scale can be applied). This function, introduced by M. Levitt [26], is empirical; that is, no theoretical justification of its form is given. To compare the distributions T and O, both must be normalized (the sums of all Ti and of all Oi must each equal 1.0). This normalization is represented by the first factor in Eqs. (1) and (2). After this operation, the two distributions can be compared using divergence entropy [27].
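The observed distribution O of Eq. 2 can be sketched in the same way. The polynomial used below is the commonly quoted form of Levitt's distance-dependent function and is an assumption here, as are the function names `levitt_term` and `observed_distribution`, the inclusion of the j = i term, and the requirement that the chosen hydrophobicity scale is non-negative.

```python
import math

def levitt_term(r, cutoff=9.0):
    """Distance-dependent weight: 1 at r = 0, falling to 0 at the cutoff."""
    if r > cutoff:
        return 0.0
    q = (r / cutoff) ** 2
    return 1.0 - 0.5 * (7.0 * q - 9.0 * q**2 + 5.0 * q**3 - q**4)

def observed_distribution(coords, hydrophobicity, cutoff=9.0):
    """Return normalized O_i from effective-atom coordinates and intrinsic
    hydrophobicity values (assumed non-negative, any scale)."""
    n = len(coords)
    raw = []
    for i in range(n):
        total_i = 0.0
        for j in range(n):
            r_ij = math.dist(coords[i], coords[j])
            total_i += (hydrophobicity[i] + hydrophobicity[j]) * levitt_term(r_ij, cutoff)
        raw.append(total_i)
    o_sum = sum(raw)  # normalizing first factor
    return [v / o_sum for v in raw]
```

Residues farther apart than the 9 Å cutoff do not contribute to each other's O value, so an isolated residue keeps only its own contribution.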
$$D_{KL}(O|T) = \sum_{i=1}^{N} O_i \log_2 \frac{O_i}{T_i} \tag{3}$$
The DKL value (the index KL stands for Kullback-Leibler) is an entropy and cannot be interpreted on its own. A comparative analysis becomes possible with the introduction of a reference distribution opposite to the one with a central concentration of hydrophobicity (hydrophobic core): a uniform distribution of hydrophobicity throughout the protein. In this reference distribution, all effective atoms are assigned an equal level of hydrophobicity, Ri = 1/N, where N is the number of amino acids in the polypeptide chain.
Determining the DKL values for the O|T and O|R relationships allows for an assessment of the degree of similarity of the O distribution to either the T or R distribution. The relationship between them can be expressed by relative distance (RD) in the following manner:
$$RD = \frac{D_{KL}(O|T)}{D_{KL}(O|T) + D_{KL}(O|R)} \tag{4}$$
A value of RD < 0.5 indicates a distribution close to that of a hydrophobic core. Otherwise, the protein structure is characterized as lacking the presence of a hydrophobic core. The RD parameter takes real values in the interval [0,1] continuously. Thus, comparative assessment of a set of proteins from the perspective of hydrophobicity distribution within the protein body is possible.
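The comparison of the O distribution with the T and R references can be sketched as follows. The sketch assumes that O and T are already normalized lists of equal length, that the logarithm is taken to base 2, and that RD is the share of DKL(O|T) in the sum of the two divergences; the function names are hypothetical.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence of p from q; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def relative_distance(observed, theoretical):
    """RD close to 0: O resembles T (hydrophobic core present).
    RD close to 1: O resembles the uniform distribution R."""
    n = len(observed)
    uniform = [1.0 / n] * n  # R_i = 1/N
    d_ot = kl_divergence(observed, theoretical)
    d_or = kl_divergence(observed, uniform)
    return d_ot / (d_ot + d_or)

# Usage: an O profile that is slightly flatter than T still gives RD < 0.5.
t_profile = [0.1, 0.2, 0.4, 0.2, 0.1]
o_profile = [0.12, 0.2, 0.36, 0.2, 0.12]
print(relative_distance(o_profile, t_profile) < 0.5)
```

The ratio is undefined when O coincides with both T and R at once, which can only happen if T itself is uniform.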
The RD parameter can be determined for a single chain—as well as for a complex containing any number of chains—by spanning a 3D Gaussian function over the entire complex. It is also possible to determine the status of a selected fragment of a structure, such as a chain within a given complex (or a domain as part of a single chain). In such cases, the selected fragment described by the values Ti, Oi, and Ri after normalizing these values for the selected section, can be described by the value RD for the specific fragment. This value determines the extent to which the selected fragment participates locally in the generation of a hydrophobic core or to what extent it locally disrupts this type of ordering.
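The fragment-level assessment described above amounts to renormalizing T and O over the selected residues before computing RD. The following is a minimal sketch under the assumption that the whole-molecule T and O profiles are given as lists and the fragment as a collection of residue indices; `fragment_rd` is a hypothetical name.

```python
import math

def fragment_rd(observed, theoretical, indices):
    """RD of a selected fragment (chain, domain, or any residue subset)."""
    indices = list(indices)
    o_sum = sum(observed[i] for i in indices)
    t_sum = sum(theoretical[i] for i in indices)
    # Renormalize so that each distribution sums to 1.0 within the fragment.
    o = [observed[i] / o_sum for i in indices]
    t = [theoretical[i] / t_sum for i in indices]
    r = [1.0 / len(indices)] * len(indices)

    def kl(p, q):
        return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0.0)

    d_ot, d_or = kl(o, t), kl(o, r)
    return d_ot / (d_ot + d_or)
```

A fragment whose O values follow T exactly gives RD = 0 even when the rest of the molecule departs from the micelle-like arrangement.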
Ti, Oi, and Ri express the levels of hydrophobicity attributed to the ith residue. The symbols T, O, and R denote the types of distributions under consideration.
The RD parameter can also be determined by stepwise elimination of selected amino acids; for example, amino acids showing a high divergence between Ti and Oi. By eliminating such residues (e.g., in the case of RD > 0.5) until an RD < 0.5 is obtained, it is possible to identify the part of the protein that represents an ordering that matches the micelle-like arrangement. This is important, for example, in determining protein solubility. Residues with a large difference between Ti and Oi can be treated as local disorder and eliminated. The next question concerns the cause and effect of this disorder. A local disturbance of micelle-like ordering carries encoded information. Indeed, residues with local hydrophobicity deficits are often catalytic residues located in the substrate-binding cavities [28]. Local excess hydrophobicity may define the site for a potential interaction leading to the formation of a complex and indicates the components of the interface. Permanent and nonpermanent complexes should be distinguished. Permanent complexes are structured as described by the model, whereas nonpermanent complexes represent the ordering expressed by higher RD values for monomers as well as for the interface. This is observed, for example, in tubulin, the folding of which requires a lower level of water influence on its structuralization [29].
The elimination of well-defined residues (the source of high RD values, RD > 0.5) lowers the RD value. This procedure allows for the identification of a part of the protein ordered according to the micelle-like organization responsible for protein solubility. Intrinsically disordered proteins identified as lacking secondary structure order appear surprisingly highly ordered based on their hydrophobic distribution [30]. In basic biochemistry, the presence of a hydrophobic core is considered a factor in tertiary structure stabilization. The stability of these proteins is dependent on their surrounding environment. Changes in the external conditions make them mobile and dynamic. It is likely that their biological activities require such specificity.
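The stepwise elimination can be sketched as a greedy loop. The sketch assumes that the residue with the largest absolute difference between Ti and Oi (taken from the whole-molecule profiles) is removed at each step and that RD is recomputed on the remaining residues after renormalization; this selection rule and the function names are illustrative assumptions, not necessarily the procedure used in this study.

```python
import math

def _kl(p, q):
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0.0)

def rd_on(observed, theoretical, kept):
    """RD for the residues listed in `kept`, after renormalizing O and T on them."""
    o_sum = sum(observed[i] for i in kept)
    t_sum = sum(theoretical[i] for i in kept)
    o = [observed[i] / o_sum for i in kept]
    t = [theoretical[i] / t_sum for i in kept]
    r = [1.0 / len(kept)] * len(kept)
    d_ot, d_or = _kl(o, t), _kl(o, r)
    return d_ot / (d_ot + d_or)

def eliminate_until_ordered(observed, theoretical, threshold=0.5):
    """Remove the most discordant residues until RD drops below the threshold.
    Returns the removed residue indices and the final RD."""
    kept = list(range(len(observed)))
    removed = []
    rd = rd_on(observed, theoretical, kept)
    while rd >= threshold and len(kept) > 2:
        worst = max(kept, key=lambda i: abs(theoretical[i] - observed[i]))
        kept.remove(worst)
        removed.append(worst)
        rd = rd_on(observed, theoretical, kept)
    return removed, rd
```

The returned list of removed indices corresponds to the residues treated as local disorder; the remaining residues form the part of the protein that follows the micelle-like arrangement.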
2.2. Expressing the role of membrane environment influencing the structuralization of membrane proteins
The cell membrane is a separate environment for protein activity. In contrast to the aqueous environment, in the cell membrane the exposure of hydrophobic residues is expected to stabilize the molecule through interaction with the hydrophobic membrane. Thus, the distribution of hydrophobicity (especially in the presence of a channel within the transmembrane protein) may be represented by a function complementary to the 3D Gaussian function in the form of 1-Ti. In practice, the following function is used:
$$M_i = T_{MAX} - T_i \tag{5}$$
where TMAX is the maximal value found in the T distribution. The analysis of many transmembrane proteins suggests that the distribution of hydrophobicity can be expressed as follows:
$$M_i = \left[\, T_i + K \cdot \left[ T_{MAX} - T_i \right]_n \,\right]_n \tag{6}$$
where parameter K determines the extent to which the 3D Gaussian function (Ti) is modified by a function complementary to the 3D Gaussian function (Eq. 5), which represents the influence of the hydrophobicity factor. The index n denotes normalization. The M function expresses this type of modified T distribution, representing the effect of factors other than polar water (particularly hydrophobic factors). The M distribution denotes the external force field, which is reproduced by the folding process of the protein adapting to the external water conditions modified by other factors. Eq. 6 forms the basis of the FOD-M model (modified FOD model) [28]. As can be seen in Fig. 2, the Gaussian function changes gradually with an increase in K.
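A minimal sketch of Eq. 6 follows, assuming that T is a normalized list with at least two distinct values and that the index n means division by the sum of the bracketed values; the function names are hypothetical.

```python
def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def modified_distribution(theoretical, k):
    """M_i = [T_i + K * [T_MAX - T_i]_n]_n for a given K >= 0."""
    t_max = max(theoretical)
    # Complementary function (Eq. 5), normalized before it is weighted by K.
    complement = normalize([t_max - t for t in theoretical])
    return normalize([t + k * c for t, c in zip(theoretical, complement)])

# Usage: K = 0 reproduces T; larger K flattens the profile.
t_profile = [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05]
for k in (0.0, 0.5, 1.0, 3.0):
    m = modified_distribution(t_profile, k)
    print(k, round(max(m), 3), round(min(m), 3))
```

The inner normalization keeps K on a comparable scale across proteins of different sizes, and the outer normalization makes M directly comparable with O.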
Fig. 2.
The idealized Gaussian distribution (blue) gradually changes with increasing K value (as shown in the legend). The hydrophobicity of the core decreases (upper arrow), with a simultaneous increase of hydrophobicity on the surface (lower arrow).
The degree of this influence varies. Proteins with K = 0 have well-defined hydrophobic cores. K values in the range 0 < K < 0.4 also describe the hydrophobicity distribution of water-soluble proteins; in their structures, a local mismatch with the micelle-like distribution (the distribution according to the 3D Gaussian function) can be identified.
In summary, the RD parameter determines the degree to which micelle-like ordering is reproduced in a given protein. The value of the K parameter, on the other hand, determines the extent to which an environment different from polar water contributes to the protein folding process by influencing the structure of the polypeptide chain (Fig. 3).
Fig. 3.
Graphical representation of the principles of the fuzzy oil drop (FOD) model and modified FOD (FOD-M). A) examples of the T, O, and R distributions. B) value of RD (0.84 for this example) in the absence of hydrophobic core. C) determination of the value of parameter K as the value for which the DKL for (O|M) relation takes the minimum value, which is the closest approximation of the O distribution to the M distribution. D) the T, O and M distributions for K= 0.6.
The relevant K value is determined by the iterative procedure shown in Fig. 3C. The purpose of this procedure is to find the form of the M function that gives the minimal DKL(O|M) value. The M function obtained in this way is the one most similar to the O distribution. In other words, the O distribution can be regarded as the result of folding directed by an external field of the form M.
3. Characterization of different proteins
Based on the principle of determining the values of RD and K, it is possible to present a detailed description of different proteins based on these parameters, particularly the K parameter.
3.1. Proteins characterized by values close to K= 0.0
K values close to or equal to 0 indicate proteins with structures that achieve a micelle-like distribution of hydrophobicity. Such proteins are characterized by high solubility with minimal possibility of any specific interactions. Antifreeze type II proteins belong to this group. These small globular proteins covered by a polar surface influence the ordering of water molecules, thus preventing ice-ordering in water below 0 °C. The antifreeze protein thus plays a role similar to that of salt applied during the winter season to prevent icing on the road.
Additionally, there are downhill, fast-folding, and ultra-fast-folding proteins in this group [31]. Notably, the vast majority of protein domains belong to this group. The domain treated as an individual structural unit (3D Gaussian function spanned over the domain) shows an ordering of hydrophobicity very close to a micelle-like distribution [32] (see Supplementary Materials).
This group of proteins is represented here by the antifreeze protein from Zoarces elongatus (PDB ID – 2LX2) [33]. The structure of this protein represents an ideal hydrophobicity distribution based on the 3D Gaussian function. This implies that the surface is covered by polar residues, which influence the ordering and structuralization of neighboring water molecules in a form adequate for charge distribution on the protein surface. Consequently, the water molecules undergo structuralization other than that occurring in ice. The source of protein structure stabilization is the well-defined hydrophobic core localized in the center of the molecule (Fig. 4).
Fig. 4.
Characteristics of an antifreeze protein (PDB ID – 2LX2). A) Profiles of T, O, and M for K= 0.0 as they appear in antifreeze protein. The value of K= 0.0 causes the ideal covering of the O (red, nonvisible) profile by the M profile. On the horizontal axis are the residues in the sequence; on the vertical axis the hydrophobicity level. B) The 3D structure with hydrophobic core residues (red) and hydrophilic residues on the surface (blue), showing the localization of the residues.
3.2. Proteins in the range 0.0 < K ≤ 0.5
This group of proteins mainly comprises single-chain and single-domain enzymes. The value of K in this range represents a local mismatch between the O and T distributions. For example, the RD value for lysozyme only marginally exceeds 0.5, and the elimination of three residues, including two catalytic residues, results in an RD value of < 0.5. This group also includes enzymes such as peptidylprolyl isomerase and others that are abundantly listed in [34]. The residues responsible for this local disorder are catalytic residues, which allows the identification of catalytic centers in enzymes [35].
A common feature of proteins belonging to this group is the well-defined localization of residues with a mismatch between Oi and Ti levels. A chain with an evolutionarily defined sequence cannot fully form micelle-like structures. The large proportion of micelle-like ordering, except for clearly defined loci with mismatched distributions, implies the pursuit of micelle-like structuring.
In studies of protein structure, the basic principle that the amino acid sequence in a chain determines the 3D structure can be elaborated and refined as follows: the amino acid sequence in a polypeptide chain determines how structuring is mismatched with a micelle-like form. In this way, the protein can be described as an “intelligent micelle”, due to the “intentionally” encoded presence of a local mismatch with micelle-like structuring [28] (see also Supplementary Materials).
Another example of this group of proteins is the antifreeze protein from Brachyopsis rostratus (PDB ID: 2ZIB) [36]. Here, the K value was found to be slightly higher (K = 0.4); however, the molecule still represents a micelle-like organization (Fig. 5).
Fig. 5.
Characteristics of the antifreeze protein PDB ID–2ZIB. A) Profiles of T, O, and M for K= 0.4, revealing a micelle-like organization of hydrophobicity in the protein body. Horizontal axis: residues in the sequence; vertical axis: hydrophobicity. B) 3D structure showing the localization of hydrophobic (red) and hydrophilic (blue) residues.
3.3. Proteins in the range 0.5 < K < 1.0
Examples of this group of proteins include structured proteins such as actin (PDB ID: 1D4X) and tubulin (PDB ID: 1FFX) [29]. Relatively high values of the K parameter (0.7 for actin and 0.6 for tubulins) indicate the involvement of non-aqueous factors. These two proteins represent a group in which prefoldins are involved in folding. Prefoldin is a chaperone protein that provides an external force field to prevent micelle-like structuring by isolating the folding chain from the aqueous environment. The RD values in the range 0.57–0.67 indicate a mismatch state dispersed along the entire chain. The absence of a well-defined hydrophobic core contributes to reduced stability and structural flexibility. Such a state of slight instability seems to make a protein more ready to interact with another protein without forming a permanent structure with it, as is the case in dystrophin, where the outcome is primarily a complex that does not allow destabilization under external stresses in which the protein functions [29], [37].
Research has shown that there are numerous enzymes whose structuring corresponds to this range of the parameters RD and K [35] (see Supplementary Materials).
An acetyltransferase (E.C. 2.3.1) from Pseudomonas amygdali (PDB ID 1J4J) [38] was selected to represent the proteins characterized by K in the discussed range. The structure of this protein is described by RD = 0.592 and K = 0.6. Elimination of the catalytic residue (E103) lowers the RD value to 0.589. The catalytic residue is localized in the cavity. Ignoring the residues constructing this cavity (87–91, 103–109, and 141–142), the rest of the protein structure is characterized by a micelle-like organization (RD = 0.499) (Fig. 6). This is an example of a protein that lacks micelle-like ordering; however, the source of this discordance is very well localized and limited to a few residues.
Fig. 6.
Characteristics of acetyltransferase (PDB ID 1J4J). A) The profiles T, O, and M for K= 0.6 allow the identification of residues causing an RD value above 0.5. The residues causing RD > 0.5, are indicated as: red, catalytic residue and blue, residues creating the cavity. The horizontal axis represents residues in the sequence and the vertical axis, hydrophobicity. B) The 3D structure shows catalytic residues (red) and cavity-generating residues (blue).
A protein of similar status (RD = 0.662) with K= 0.8 represents a different organization of hydrophobicity. The residues representing different statuses of Oi with respect to Ti are distributed all over the protein molecule (Fig. 7), making it impossible to identify any well-defined activity centers. Actin from Plasmodium berghei (PDB ID 7A0H-A) [39] is responsible for filament construction. Therefore, its structure would be expected to be less rigid to allow adaptation to different external conditions, including multicomplex construction. Therefore, the hydrophobic core of actin is not as evident as in the case of the enzyme discussed above.
Fig. 7.
Characteristics of actin (PDB ID 7A0H-A). A) Profiles T, O, and M for K= 0.8, as found for actin. Residues colored blue represent large differences between Ti and Oi; their positions are distributed along the whole chain (A) and throughout the protein body (B). The horizontal axis represents residues in the sequence and the vertical axis the hydrophobicity. B) 3D structure showing the residues whose elimination gives RD < 0.5 (distinguished in blue, as on the horizontal axis in panel A).
The residues that were eliminated to lower the RD value (the blue dots on the horizontal axis in Fig. 7) are widely distributed throughout the polypeptide chain, and no well-defined hydrophobic core is present, which makes this molecule more elastic. Only the elimination of a large number of residues yielded RD < 0.5, which suggests that the general system was not oriented toward a well-defined hydrophobic core construction.
3.4. Proteins with K value range of 1.0–1.5
Proteins in this group are mainly membrane proteins, such as rhodopsin (K = 1.3) as shown in Fig. 8 and in a previous study [28]. The contribution of the membrane environment is significant according to the assumptions introduced in the FOD-M model by the introduction of the TMAX – Ti function, modifying the specificity of the aqueous environment described by the 3D Gaussian function.
Fig. 8.
Characteristics of rhodopsin (PDB ID 3QDC). A) Profiles of T, O, and M for K= 1.3, as found for rhodopsin. In A, the horizontal axis represents the residues in the sequence and the vertical axis the hydrophobicity. B) 3D structure with residues representing excess hydrophobicity (red) and hydrophobicity deficiency (blue).
Proteins in this group also include long-chain and multi-domain enzymes, including some lyases, oxygenases, and glycosidases [35]. Large proteins with a hydrophobicity distribution different from that of the micelle-like arrangement are considered to provide an external force field for the active center. In the presence of an appropriate force field, the specificity of which is expressed by the K parameter (and RD), the energy barrier of many catalyzed processes is significantly modified. Alkaline phosphatase (E.C. 3.1.3.1; PDB ID: 6PSI) [40] folds with the participation of a chaperone; on the basis of its structure, this observation can be generalized: proteins with high K values fold with the participation and support of a chaperone.
Proteins folded with the participation of a chaperone (GroEL) (PDB ID: 7LUP [41]) tend to have K values in this range.
A common feature of proteins whose K values fall within the range discussed here is the dispersed location of residues showing high variability in Oi versus Ti. It is difficult to locate a specific section of a polypeptide chain with such characteristics (which is often the case for chains with a status of 0.0 < K < 0.5).
A characteristic feature of these proteins is the profile of the M distribution, which takes a form similar to the R distribution, where there is no variation in hydrophobicity levels across the protein (or the variation is minimal). Such a state in a given protein means folding in a kind of ‘water vacuum,’ without standard water influence on the direction of the folding process. This is achieved through the involvement of chaperone proteins such as GroEL, which completely insulates the folding protein from the water environment and simultaneously introduces a specific field that acts as an external force field for the folding process (see also Supplementary Materials).
The membrane protein selected for analysis was the sensory protein rhodopsin from Natronomonas pharaonis (PDB ID 3QDC) [42]. This is a helical transmembrane protein with a high K value (K = 1.3). The source of the relatively large discrepancy between the Ti and Oi levels of hydrophobicity is the presence of a channel in the central part of the molecule. The channel is mainly lined with polar residues that allow transport. The surface, in contrast to that of water-soluble proteins, is covered with hydrophobic residues that provide stability in contact with the hydrophobic membrane, as illustrated in Fig. 8. The whole spectrum of different anchoring systems of membrane proteins is discussed in detail in a previous study [43].
Another transmembrane protein representing the beta-barrel structure selected for analysis is porin, isolated from Pseudomonas aeruginosa (PDB ID 4FSO) [44]. The channel of the protein is well defined by fragments of significantly lower levels of Oi compared with Ti (Fig. 9). The exposure of hydrophobic residues—enabling anchoring of the molecule in the hydrophobic membrane—is different than in previous examples, as there is a broad spectrum of anchoring systems [43].
Fig. 9.
Characteristics of porin (PDB ID 4FSO). A) Profiles of T, O, and M for K= 1.9, as found for porin from Pseudomonas aeruginosa. The horizontal axis represents residues in the sequence and the vertical axis hydrophobicity. B) 3D structure with residues representing excess hydrophobicity (red) and hydrophobicity deficiency (blue).
3.5. Proteins in the range 1.0 < K < 4.0
Membrane proteins that act as ion channels, including the mechanosensitive channel of small conductance HpMscS and translocase, belong to this group [43].
Here, the active contribution of the hydrophobic environment of the membrane to protein structuring, especially with regard to the domains forming the transmembrane part of the complex, is very strong and clear. The multi-chain structure of the transmembrane domain shows a structural form strongly dominated by a hydrophobic membrane environment.
Analysis of transmembrane proteins showed significant variation in the distribution of surface residues, exhibiting high levels of Oi. This phenomenon can be linked to the need for mobility of the membrane protein, which, when anchored, for example, on one side only, can move to a limited degree according to the form of the anchorage [43].
Among the proteins characterized by high K values are those whose structuralization can be categorized as unfolded or partially unfolded. The protein cyanovirin-N from Nostoc ellipsosporum (PDB ID 4J4C) [45] is an example of this class of proteins (Fig. 10).
Fig. 10.
Characteristics of cyanovirin-N (PDB ID – 4J4C). A) Profiles of T, O, and M. The horizontal axis represents residues in the sequence, and the vertical axis hydrophobicity. B) 3D structure with residues representing hydrophobicity excess (red) and hydrophobicity deficiency (blue).
The structure of this protein is treated as a domain-swapped oligomer, suggesting “trapped” folding intermediates. Structures of this type require a “permanent chaperone” to maintain their structure. This role is usually played by the second chain in the domain-swapping system [46]. In other words, the individual chain cannot exist in the presented form.
3.6. The chaperonin structure delivering the external force field K > 4.0
The highest values of K (thus far) have been found for chaperonins delivering extremely different folding conditions with respect to the aquatic environment. The K= 7.0 parameter was found to describe the specificity of the external force field generated by E. coli co-chaperonin GroES (PDB ID - 7VWX) [41], [47] (see Supplementary Materials).
Analysis of T and O values (Fig. 11) shows an almost constant level for the O profile, representing either excess or deficiency of hydrophobicity in nearly all positions. The M profile takes the form of a line parallel to the x-axis. This means that the construction of the entire system generates a structure whose organization differs significantly from that expected for the water environment. The interpretation of the K value in this case is quite different: it does not express the status of a molecule (complex) resulting from environmental influences. Instead, the GroEL:ES construction itself represents the environment, delivering an external force field to the proteins folded in its chambers.
Fig. 11.
Characteristics of chaperonin GroEL:ES2. A) The T, O, and M profiles (K = 7.0). Because of the large number of residues (9146 aa), only a limited fragment (the first 400 aa in the PDB file) of the chains participating in the construction is presented. This fragment is distinguished in red in B. The horizontal axis represents residues in the sequence and the vertical axis represents hydrophobicity. B) 3D structure of GroEL:ES2 with the fragment shown in A distinguished in red.
Our analysis of the folding process of chains participating in this construction is focused on the status of each individual chain.
3.7. Predictability of structures using AlphaFold2 (AF2)
The model presented in this study is closely related to the problem of protein folding in silico. Predicting the 3D structure of a given amino acid sequence has been a challenge for years [48]. The CASP project, initiated in 1994, monitors progress in the discipline of protein folding simulation. The similarity of the models delivered by the participants to the native structure of the protein under consideration is expressed by the GDT_TS value (the higher the value, the better the model). Many factors are taken into consideration in calculating its value, mostly focusing on geometric comparison. The similarity can also be estimated using factors such as the hydrophobicity distribution [49].
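The geometric character of GDT_TS can be illustrated with a simplified sketch. It assumes that the model has already been superposed on the native structure once and that the per-residue Cα distances (in Å) are given; the score is then the average percentage of residues within 1, 2, 4, and 8 Å. The full CASP procedure searches over many superpositions for each threshold, so this single-superposition version is only an approximation, and the function name `gdt_ts` is hypothetical.

```python
def gdt_ts(distances):
    """Simplified GDT_TS (0-100) from per-residue model-to-native distances."""
    thresholds = (1.0, 2.0, 4.0, 8.0)  # distance cutoffs in angstroms
    n = len(distances)
    within = sum(sum(1 for d in distances if d <= t) for t in thresholds)
    return 100.0 * within / (len(thresholds) * n)

# Usage: five residues at increasing deviation from the native structure.
print(gdt_ts([0.5, 1.5, 3.0, 7.0, 10.0]))
```

A hydrophobicity-based comparison, in contrast, would compare the T, O, and M profiles of the model and the native form rather than atomic distances.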
The recently introduced method based on artificial intelligence, AlphaFold2, delivers highly accurate models [50], [51]. The FOD-M model was used to assess the accuracy of the hydrophobicity distribution in the models versus the native forms of the target molecules.
Three proteins representing low, medium, and high K values were selected for this comparative analysis. The low-K example is a domain of beta-spectrin (PDB ID 1AA2, UniProt Q01082 [52], [53]), described by K = 0.1; the medium-K example is the tubulin gamma-1 chain (PDB ID 7SJ9, UniProt Q13509 [54], [55]), described by K = 0.6; and the high-K example (K = 2.2) is the human CD45 extracellular region, domains d1-d4 (PDB ID 5FMV, UniProt P08575 [56], [57]) (Fig. 12, Fig. 13, Fig. 14).
Fig. 12.
Profiles expressing the status of a domain in beta-spectrin (PDB ID 1AA2, UniProt Q01082): AF2 model (blue lines) and WT form (red lines). The profiles are T (panel A), O (B), and M (C); in each, the horizontal axis represents residues in the sequence and the vertical axis represents hydrophobicity. The K values are given in the legend. D) 3D structure of the complete molecule as delivered by AF2, with the domain under consideration shown in red. E) 3D structure of the domain under comparative analysis.
Fig. 13.
Profiles calculated for the AF2 model (blue lines) and the WT form (red lines) available in PDB for the tubulin gamma-1 chain (PDB ID 7SJ9, UniProt Q13509). The profiles are T (panel A), O (B), and M (C); in each, the horizontal axis represents residues in the sequence and the vertical axis represents hydrophobicity. The K values are given in the legends. D) 3D structure of the protein under consideration.
Fig. 14.
Profiles calculated for the AF2 model (blue lines) and the WT form (red lines) available in PDB for the human CD45 extracellular region, domains d1-d4 (PDB ID 5FMV, UniProt P08575). The profiles are T (panel A), O (B), and M (C); in each, the horizontal axis represents residues in the sequence and the vertical axis represents hydrophobicity. The K values are given in the legends. D) 3D structure of the protein under consideration.
Regardless of the K value, all three models show a high degree of concordance with the experimentally determined structures of the given proteins.
Structures were generated using the server described in [58].
The AF2 model for beta-spectrin (UniProt ID Q01082) covers the complete protein. The analysis was limited to the domain (residues 173–280 according to UniProt numbering). Predicting the structure of this small domain in the context of a very large complete molecule (2364 aa) is a challenge, and the result can be characterized as a spectacular success. The agreement of all profiles (T, O, and M) was very high, with an equal value of K = 0.1 for both forms (Fig. 12).
An example of a structure described by K = 0.6 is the tubulin gamma-1 chain (PDB ID 7SJ9, UniProt Q13509). A comparison of the T, O, and M profiles is presented in Fig. 13, demonstrating an accurate reconstruction of the hydrophobicity distribution of the protein.
The examples analyzed show that the AF2 structure predictions agree with the WT structures regardless of the external force field under which the proteins were generated in native conditions.
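The agreement between model and native profiles described above is assessed visually in Fig. 12, Fig. 13, and Fig. 14. As a hedged illustration of how such agreement might be summarized by a single number, the sketch below computes the Pearson correlation coefficient between two per-residue profiles. This is not a measure used in the paper; it is an assumption introduced here for illustration, and the function name `profile_correlation` and the sample values are hypothetical.

```python
import math


def profile_correlation(profile_a, profile_b):
    """Pearson correlation between two per-residue hydrophobicity profiles.

    Both profiles must cover the same residues in the same order.
    Values close to 1.0 indicate closely matching profile shapes.
    """
    n = len(profile_a)
    if n != len(profile_b) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical O profiles for a model and its native counterpart.
o_model = [0.02, 0.05, 0.09, 0.06, 0.03]
o_native = [0.02, 0.06, 0.08, 0.06, 0.03]
print(round(profile_correlation(o_model, o_native), 3))
```

A correlation coefficient captures only the similarity of profile shapes; comparing the fitted K values of the two forms, as done in the text, remains the model-specific check.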
4. Discussion
4.1. Specificity of external force field
The involvement of an external force field in the protein folding process warrants assessment of the characteristics of the field itself [28].
Analysis of protein structures draws attention to those categorized as intrinsically disordered proteins, in which complex formation is largely based on matching hydrophobicity distributions. Structuring then results from the formation of a common hydrophobic core, irrespective of the lack of ordering in the sense of secondary structure [28].
In contrast, the external force field in the form of a chaperone or chaperonin composed of proteins can be characterized using the RD and K parameters.
The external force field of a chaperone involved in the folding process of alkaline phosphatase (E.C. 3.1.3.1) is characterized by K = 1.1. The protein folded with its participation has the same value, K = 1.1.
Chaperonin (GroEL), which supports the folding of, for example, reovirus mu1/sigma3 of K = 1.3, shows by its own structure a value of K > 4.0 [40]. This reflects an extremely large contribution from the TMAX − Ti field. It implies that GroEL delivers a local field that is completely devoid of the influence of polar water, delivering an external force field similar to that of the R distribution. The force field in a protein molecule is treated as a function of two variables: internal and external (Fig. 15).
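The role of the TMAX − Ti contribution can be made concrete with a short sketch. It assumes the FOD-M formulation as described in the authors' cited work, in which the modified distribution is M = [T + K·(TMAX − T)n]n (the subscript n denoting normalization to unit sum) and K is taken as the value minimizing the Kullback-Leibler divergence [27] between the observed profile O and M. This section does not restate the formula, so the exact form should be checked against the Methods; all function names and the grid-search settings below are hypothetical.

```python
import math


def normalize(profile):
    """Scale a non-negative profile so that its values sum to 1."""
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must have a positive sum")
    return [x / total for x in profile]


def modified_profile(t, k):
    """Assumed FOD-M form: M = normalize(T + K * normalize(Tmax - T))."""
    t = normalize(t)
    t_max = max(t)
    inverse = [t_max - x for x in t]
    if sum(inverse) == 0:  # uniform T: the inverse field vanishes
        return t
    inverse = normalize(inverse)
    return normalize([a + k * b for a, b in zip(t, inverse)])


def kl_divergence(o, m, eps=1e-12):
    """Kullback-Leibler divergence D(O|M) in bits."""
    if len(o) != len(m):
        raise ValueError("profiles must have equal length")
    o, m = normalize(o), normalize(m)
    return sum(oi * math.log2(oi / max(mi, eps)) for oi, mi in zip(o, m) if oi > 0)


def fit_k(t, o, k_max=5.0, step=0.1):
    """Grid search for the K that brings M closest to the observed profile."""
    best_k, best_d = 0.0, float("inf")
    for i in range(int(round(k_max / step)) + 1):
        k = i * step
        d = kl_divergence(o, modified_profile(t, k))
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d


# Toy check: an "observed" profile generated with K = 1.0 is recovered.
t_toy = [0.05, 0.15, 0.40, 0.25, 0.15]
o_toy = modified_profile(t_toy, 1.0)
print(fit_k(t_toy, o_toy)[0])
```

Under this assumed form, K = 0 returns the pure T distribution (the water environment), while a large K lets the inverted field dominate, which is consistent with the interpretation of K > 4.0 given above.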
Fig. 15.
A 3D presentation of the energy surface as a function of two variables: the internal force field (inter-molecular interaction) and the external force field expressed by the K parameter. Four minima are shown for four different K values. The internal force field depends on the set of inter-atomic distances [rij].
4.2. Protein structure prediction
The CASP project mentioned in the Introduction, in which participants use tools generated for the 3D structure prediction of proteins, reveals the dependence of the structure on the environment. The use of a single procedure (however well developed) cannot reproduce such a diverse world of proteins, with this diversity arising from the contribution of an environmental factor. Therefore, no model based on any form of ‘averaging’ the force field used in a protein prediction procedure will pass the test [59].
To conclude the presented material, a summary is shown in Table 1.
Table 1.
Variability in the spatial structure of proteins under different environmental conditions.
| | K = 0.0 | 0.0 < K < 0.5 | 0.5 < K < 0.9 | 0.9 < K < 1.5 | 1.5 < K < 3.0 | K > 3.0 |
|---|---|---|---|---|---|---|
| ENVIRONMENT | WATER | WATER | CHAPERONE | MEMBRANE | | |
| AA < 100 | DOWN-HILL | FAST-FOLDING ANTIFREEZE | | | | |
| 100 < AA < 200 | ENZYMES One domain | ENZYMES Multi-domains | | | | |
| 200 < AA < 300 | ENZYMES Multi-domains | MEMBRANE PROTEINS | CHAPERONE MEMBRANE PROTEINS | | | |
| AA > 300 | MEMBRANE PROTEINS | CHAPERONE | CHAPERONIN |
A short description of the applicability of the FOD-M model is supported by a large set of results characterizing proteins using the RD and K parameters presented in the Supplementary Materials. A summary of these results is presented in Fig. 16.
Fig. 16.
The K values as ranges characterizing the selected groups of proteins. Category “fast” represents fast-folding, ultra-fast-folding, down-hill, and antifreeze type II proteins. Category “chain” represents proteins of < 200 aa chain length. Category “complex” represents IV-order structures; “membrane” denotes proteins anchored in the membrane, including beta-barrel as well as helical forms; “large enzymes” are one-domain proteins of > 200 aa; “chaperone” denotes supporting proteins, including prefoldin; “chaperonins” are GroEL-GroES constructions. The detailed values for all examples are given in the Supplementary Materials.
The 3D structure is determined by the amino acid sequence (Anfinsen dogma [60]). It is shown here that the environment in which the polypeptide chain folds also influences this process. Thus, two factors determine the 3D structure. The proposed model introduces the quantitative participation of external factors, which together with the amino acid sequence produce the functional structure of the protein. As can be seen, there are no rigid limits for particular groups of proteins. Fig. 16 visualizes the ranges of K values expressing the external conditions which, when applied in a simulation of the folding process, may deliver the proper structure. This hypothesis remains to be verified by extensive testing.
5. Conclusions
The analysis revealed the importance of the external force field as a factor influencing mechanisms of protein folding and amyloid transformation in particular. This was demonstrated by analyzing the results of protein structure prediction as part of the CASP projects 13, 14, and 15, where the use of uniform force-field parameterization resulted in a variation in the correctness of the obtained models, also for programs assessed as highly reliable [49], [59].
The general conclusion allows the modification of the funnel model [61], [62], [63], [64], [65], [66], [67], [68], [69], [70] by introducing a quantitative scale on the horizontal axis (Fig. 17). This suggests that the optimal status of the protein depends on the balance between the internal and external force fields (Fig. 15).
Fig. 17.
Funnel model expressing the determinants resulting from the environmental specificity expressed by the value of the K parameter.
All presented results can be reproduced (recalculated) using the freely available software at https://hphob.sano.science. To date, more than 300 proteins have been analyzed (Supplementary Materials). However, large-scale calculations were required.
The vertical axis in Fig. 17 indicates the energy levels that are difficult to compare in terms of the forms created in a variable environment. It cannot be ruled out that a polypeptide folded in an environment of, for example, polar water (K=0.0), can obtain an energy state lower than that in the native form with K > 0.0. However, if the goal is to obtain a biologically active form, the structure representing the energetically lower state is not important because 1) it is not possible to achieve this in a specific environment, and 2) the energetically lower state is irrelevant from the perspective of biological activity [49], [59], [71]. The overall structural stability of the protein appeared to be gradual. The construction of a well-defined hydrophobic core may introduce an almost rigid stability [37], whereas the distributed hydrophobic interaction makes elastic forms available, as observed in cytoskeletal constructions [72], [73], [74]. Construction of a hydrophobic core is critical for biological activity. Such differential stability of proteins during the folding process can be achieved under different environmental conditions. It can be expressed in the form of an external force field defined mathematically, thus allowing the simulation of the folding process in silico.
Significant progress in protein structure prediction introduced by the artificial intelligence approaches expressed by AlphaFold and AlphaFold2 has significant implications in pharmacology, especially in drug design. There has been significant expansion in access to the 3D structures of proteins, including applications in clinical medicine. The CASP project aims to deliver 3D structures in an accurate form to reach a solution. The model presented in this paper focuses on the search for an answer to the question: Why do proteins fold the way they do? This question also has a practical implication: Why does the optimal force field (approach) that delivers perfect structures for one protein fail for other proteins? The answer is perhaps simple: The structures are generated according to different scenarios, depending on the environment in which they are generated. The FOD-M model attempts to express the participation of external conditions in a mathematical form, allowing its quantitative measurement.
In conclusion, a multiple-criteria optimization approach is suggested:
$$F(r_{ij}) = F\left[F_{\mathrm{INT}}(r_{ij}),\ F_{\mathrm{EXT}}(r_{ij})\right]$$
where $F_{\mathrm{INT}}(r_{ij})$ represents internal energy optimization and $F_{\mathrm{EXT}}(r_{ij})$ represents the external force field as proposed in the presented FOD-M model. The final force field in proteins is a consensus of internal interactions and external influence.
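The text does not specify how the two criteria are combined into F. One standard way to treat a two-criteria problem without choosing weights is to keep only the Pareto-optimal candidates, that is, conformations for which no other candidate is at least as good on both criteria and strictly better on one. The sketch below illustrates this under the assumption that each candidate conformation already has an internal score and an external score, with lower values being better; the function name `pareto_front`, the tuple layout, and the sample scores are all hypothetical.

```python
def pareto_front(candidates):
    """Return the non-dominated candidates.

    candidates: list of (label, f_int, f_ext) tuples, where lower values
    are better for both criteria (an assumption of this sketch).
    """
    front = []
    for label, f_int, f_ext in candidates:
        dominated = any(
            other_int <= f_int
            and other_ext <= f_ext
            and (other_int < f_int or other_ext < f_ext)
            for _, other_int, other_ext in candidates
        )
        if not dominated:
            front.append((label, f_int, f_ext))
    return front


# Hypothetical scores for four candidate conformations.
conformations = [
    ("a", 1.0, 5.0),
    ("b", 2.0, 2.0),
    ("c", 3.0, 3.0),  # worse than "b" on both criteria
    ("d", 5.0, 1.0),
]
print([label for label, _, _ in pareto_front(conformations)])  # ['a', 'b', 'd']
```

A consensus structure in the sense described above would then be selected from this front, for example by the balance between the two terms that corresponds to the K value of the given environment.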
Funding
Jagiellonian University, Medical College: Grant # N41/DBS/001127.
CRediT authorship contribution statement
Mateusz Slupina: Software, Data curation. Irena Roterman: Funding acquisition, Formal analysis, Conceptualization. Leszek Konieczny: Supervision.
Declaration of Competing Interest
We declare that the paper entitled "Protein folding: funnel model revised", prepared by Irena Roterman, Mateusz Slupina, and Leszek Konieczny, has not been published or presented elsewhere in part or in entirety and is not under consideration by another journal. We have read and understood your journal’s policies, and we believe that neither the manuscript nor the study violates any of these. There are no conflicts of interest to declare.
Acknowledgements
Many thanks to Anna Śmietańska and Zdzisław Wiśniowski for technical support. This research was performed within the project of MSHE "Support for the activity of Centers of Excellence established in Poland under Horizon 2020" on the basis of contract number MEiN/2023/DIR/3796. This project received funding from the EU's H2020 Research and Innovation Program under grant agreement No. 857533. This publication was supported by the Sano project, which was conducted within the International Research Agendas program of FNP and co-financed by the EU under the European Regional Development Fund. Many thanks to Piotr Nowakowski and Krzysztof Gądek for making the program hphob available in the network at https://hphob.sano.science.
Footnotes
Supplementary data associated with this article can be found in the online version at doi:10.1016/j.csbj.2024.10.030.
Contributor Information
Irena Roterman, Email: myroterm@cyf-kr.edu.pl.
Leszek Konieczny, Email: mbkoniec@cyf-kr.edu.pl.
Data availability
The datasets used and/or analyzed in the current study are available from the corresponding author upon reasonable request.
References
- 1. Keasar C., McGuffin L.J., Wallner B., Chopra G., Adhikari B., et al. An analysis and evaluation of the WeFold collaborative for protein structure prediction and its pipelines in CASP11 and CASP12. Sci Rep. 2018;8:9939. doi: 10.1038/s41598-018-26812-8.
- 2. Roterman I.K., Lambert M.H., Gibson K.D., Scheraga H.A. A comparison of the CHARMM, AMBER and ECEPP potentials for peptides. II. Phi-psi maps for N-acetyl alanine N′-methyl amide: Comparisons, contrasts and simple experimental tests. J Biomol Struct Dyn. 1989;7:421–453. doi: 10.1080/07391102.1989.10508503.
- 3. Collier T.A., Piggot T.J., Allison J.R. Molecular dynamics simulation of proteins. Methods Mol Biol. 2020;2073:311–327. doi: 10.1007/978-1-4939-9869-2_17.
- 4. Zhou R. Free energy landscape of protein folding in water: explicit vs. implicit solvent. Proteins. 2003;53:148–161. doi: 10.1002/prot.10483.
- 5. Kryshtafovych A., Schwede T., Topf M., Fidelis K., Moult J. Critical assessment of methods of protein structure prediction (CASP)—Round XIII. Proteins. 2019;87:1011–1020. doi: 10.1002/prot.25823.
- 6. Osguthorpe D.J. Ab initio protein folding. Curr Opin Struct Biol. 2000;10:146–152. doi: 10.1016/s0959-440x(00)00067-1.
- 7. Kinch L.N., Li W., Schaeffer R.D., Dunbrack R.L., Monastyrskyy B., et al. CASP 11 target classification. Proteins. 2016;84(Suppl 1):20–33. doi: 10.1002/prot.24982.
- 8. Senior A.W., Evans R., Jumper J., Kirkpatrick J., Sifre L., et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577:706–710. doi: 10.1038/s41586-019-1923-7.
- 9. Cheng J., Choe M.H., Elofsson A., Han K.S., Hou J., et al. Estimation of model accuracy in CASP13. Proteins. 2019;87:1361–1377. doi: 10.1002/prot.25767.
- 10. Pereira J., Simpkin A.J., Hartmann M.D., Rigden D.J., Keegan R.M., et al. High-accuracy protein structure prediction in CASP14. Proteins. 2021;89:1687–1699. doi: 10.1002/prot.26171.
- 11. CASP (Critical Assessment of Structure Prediction): https://predictioncenter.org/.
- 12. Huang J., Chen W., Liang J., Yang Q., Fan Y., et al. α-keto Acids as triggers and partners for the synthesis of quinazolinones, quinoxalinones, benzooxazinones, and benzothiazoles in water. J Org Chem. 2021;86:14866–14882. doi: 10.1021/acs.joc.1c01497.
- 13. Widmer-Cooper A., Perry H., Harrowell P., Reichman D.R. Localized soft modes and the supercooled liquid’s irreversible passage through its configuration space. J Chem Phys. 2009;131. doi: 10.1063/1.3265983.
- 14. Zhao S., Li K., Sun X., Zha Z., Wang Z. Copper-catalyzed stereoselective [4 + 2] cycloaddition of β,γ-unsaturated α-keto esters and 2-vinylpyrroles in water. Org Lett. 2022;24:4224–4228. doi: 10.1021/acs.orglett.2c01544.
- 15. Li Z., Fu C.-F., Chen Z., Tong T., Hu J., et al. Electron-induced synthesis of dimethyl ether in the liquid–vapor interface of methanol. J Phys Chem Lett. 2022;13:5220–5225. doi: 10.1021/acs.jpclett.2c00787.
- 16. Sugimoto Y. Seeing how ice breaks the rule. Science. 2022;377:264–265. doi: 10.1126/science.add0841.
- 17. Eremin D.B., Fokin V.V. On-water selectivity switch in microdroplets in the 1,2,3-triazole synthesis from Bromoethenesulfonyl fluoride. J Am Chem Soc. 2021;143:18374–18379. doi: 10.1021/jacs.1c08879.
- 18. Romero-Montalvo E., DiLabio G.A. Computational study of hydrogen bond interactions in water cluster–organic molecule complexes. J Phys Chem A. 2021;125:3369–3377. doi: 10.1021/acs.jpca.1c01377.
- 19. Hazra S., Kaur G., Handa S. Reactivity of styrenes in micelles: Safe, selective, and sustainable functionalization with azides and carboxylic acids. ACS Sustain Chem Eng. 2021;9:5513–5518.
- 20. Nguyen D., Casillas S., Vang H., Garcia A., Mizuno H., et al. Catalytic mechanism of interfacial water in the cycloaddition of quadricyclane and diethyl azodicarboxylate. J Phys Chem Lett. 2021;12:3026–3030. doi: 10.1021/acs.jpclett.1c00565.
- 21. Sarathkumar S., Kavala V., Yao C.-F. A water-soluble rhenium (I) Org Lett. 2021;23:1960–1965. doi: 10.1021/acs.orglett.0c04068.
- 22. Duong U., Ansari T.N., Parmar S., Sharma S., Kozlowski P.M., et al. Nanochannels in photoactive polymeric Cu(I) compatible for efficient micellar catalysis: sustainable aerobic oxidations of alcohols in water. ACS Sustain Chem Eng. 2021;9:2854–2860.
- 23. Park S.J., Hwang I.-S., Chang Y.J., Song C.E. Bio-inspired water-driven catalytic enantioselective protonation. J Am Chem Soc. 2021;143:2552–2557. doi: 10.1021/jacs.0c11815.
- 24. Zuo Y.-J., Qu J. How does aqueous solubility of organic reactant affect a water-promoted reaction? J Org Chem. 2014;79:6832–6839. doi: 10.1021/jo500733v.
- 25. Konieczny L., Brylinski M., Roterman I. Gauss-function-Based model of hydrophobicity density in proteins. Silico Biol. 2006;6:15–22.
- 26. Levitt M.A. A simplified representation of protein conformations for rapid simulation of protein folding. J Mol Biol. 1976;104:59–107. doi: 10.1016/0022-2836(76)90004-8.
- 27. Kullback S., Leibler R.A. On information and sufficiency. Ann Math Stat. 1951;22:79–86.
- 28. Roterman I., Konieczny L. Protein is an intelligent micelle. Entropy. 2023;25:850. doi: 10.3390/e25060850.
- 29. Roterman I., Stapor K., Konieczny L. Model of the external force field for the protein folding process-the role of prefoldin. Front Chem. 2024;12. doi: 10.3389/fchem.2024.1342434.
- 30. Roterman I., Stapor K., Fabian P., Konieczny L. New insights into disordered proteins and regions according to the FOD-M model. PLOS ONE. 2022;17. doi: 10.1371/journal.pone.0275300.
- 31. Banach M., Stapor K., Konieczny L., Fabian P., Roterman I. Downhill, ultrafast and fast folding proteins revised. Int J Mol Sci. 2020;21:7632. doi: 10.3390/ijms21207632.
- 32. Sałapa K., Kalinowska B., Jadczyk T., Roterman I. Measurement of hydrophobicity distribution in proteins – Non-redundant protein data bank. Bio Algor Med Syst. 2012;8:327–338. ISSN 1895-9091, e-ISSN 1896-530X.
- 33. Kumeta H., Ogura K., Nishimiya Y., Miura A., Inagaki F., et al. NMR structure note: a defective isoform and its activity-improved variant of a type III antifreeze protein from Zoarces elongates Kner. J Biomol NMR. 2013;55:225–230. doi: 10.1007/s10858-012-9703-9.
- 34. Roterman I., Konieczny L., Stapor K., Słupina M. Hydrophobicity-based force field in enzymes. ACS Omega. 2024;9:8188–8203. doi: 10.1021/acsomega.3c08728.
- 35. Roterman I., Stapor K., Konieczny L. New insights on the catalytic center of proteins from peptidylprolyl isomerase group based on the FOD-M model. J Cell Biochem. 2023;124:818–835. doi: 10.1002/jcb.30407.
- 36. Nishimiya Y., Kondo H., Takamichi M., Sugimoto H., Suzuki M., et al. Crystal structure and mutational analysis of Ca2+-independent type II antifreeze protein from longsnout poacher, Brachyopsis rostratus. J Mol Biol. 2008;382:734–746. doi: 10.1016/j.jmb.2008.07.042.
- 37. Dygut J., Kalinowska B., Banach M., Piwowar M., Konieczny L., et al. Structural interface forms and their involvement in stabilization of multidomain proteins or protein complexes. Int J Mol Sci. 2016;17:1741. doi: 10.3390/ijms17101741.
- 38. He H., Ding Y., Bartlam M., Sun F., Le Y., et al. Crystal structure of tabtoxin resistance protein complexed with acetyl coenzyme A reveals the mechanism for beta-lactam acetylation. J Mol Biol. 2003;325:1019–1030. doi: 10.1016/s0022-2836(02)01284-6.
- 39. Bendes Á.Á., Kursula P., Kursula I. Structure and function of an atypical homodimeric actin capping protein from the malaria parasite. Cell Mol Life Sci. 2022;79:125. doi: 10.1007/s00018-021-04032-0.
- 40. Roterman I., Stapor K., Konieczny L. Role of environmental specificity in CASP results. BMC Bioinforma. 2023;24:425. doi: 10.1186/s12859-023-05559-8.
- 41. Knowlton J.J., Gestaut D., Ma B., Taylor G., Seven A.B., et al. Structural and functional dissection of reovirus capsid folding and assembly by the prefoldin-TRiC/CCT chaperone network. Proc Natl Acad Sci USA. 2021;118. doi: 10.1073/pnas.2018127118.
- 42. Gushchin I., Reshetnyak A., Borshchevskiy V., Ishchenko A., Round E., et al. Active state of sensory rhodopsin II: structural determinants for signal transfer and proton pumping. J Mol Biol. 2011;412:591–600. doi: 10.1016/j.jmb.2011.07.022.
- 43. Roterman I., Stapor K., Konieczny L. Transmembrane proteins – Different anchoring systems. Proteins. 2024;92:593–609. doi: 10.1002/prot.26646.
- 44. Eren E., van den Berg B. Crystal structures of occk subfamily proteins. PDB: 4FSO.
- 45. Koharudin L.M.I., Liu L., Gronenborn A.M. Different 3D domain-swapped oligomeric cyanovirin-N structures suggest trapped folding intermediates. Proc Natl Acad Sci USA. 2013;110:7702–7707. doi: 10.1073/pnas.1300327110.
- 46. Roterman I., Stapor K., Dułak D., Konieczny L. Domain swapping – Mathematical model for quantitative assessment of structural effects. FEBS Open Bio. (in press).
- 47. Kim H., Park J., Lim S., Jun S.-H., Jung M., et al. Cryo-EM structures of GroEL:ES2 with RuBisCO visualize molecular contacts of encapsulated substrates in a double-cage chaperonin. iScience. 2022;25. doi: 10.1016/j.isci.2021.103704.
- 48. Dill K.A., MacCallum J.L. The protein-folding problem, 50 years on. Science. 2012;338:1042–1046. doi: 10.1126/science.1219021.
- 49. Gadzała M., Kalinowska B., Banach M., Konieczny L., Roterman I. Determining protein similarity by comparing hydrophobic core structure. Heliyon. 2017;3. doi: 10.1016/j.heliyon.2017.e00235.
- 50. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
- 51. Tunyasuvunakool K., Adler J., Wu Z., Green T., Zielinski M., et al. Highly accurate protein structure prediction for the human proteome. Nature. 2021;596:590–596. doi: 10.1038/s41586-021-03828-1.
- 52. https://www.uniprot.org/uniprotkb/Q01082/entry.
- 53. Djinovic Carugo K.D., Bañuelos S., Saraste M. Crystal structure of a calponin homology domain. Nat Struct Biol. 1997;4:175–179. doi: 10.1038/nsb0397-175.
- 54. https://www.uniprot.org/uniprotkb/Q13509/entry.
- 55. LaFrance B.J., Roostalu J., Henkin G., Greber B.J., Zhang R., et al. Structural transitions in the GTP cap visualized by cryo-electron microscopy of catalytically inactive microtubules. Proc Natl Acad Sci USA. 2022;119. doi: 10.1073/pnas.2114994119.
- 56. https://www.uniprot.org/uniprotkb/P08575/entry.
- 57. Chang V.T., Fernandes R.A., Ganzinger K.A., Lee S.F., Siebold C., et al. Initiation of T cell signaling by CD45 segregation at “close contacts”. Nat Immunol. 2016;17:574–582. doi: 10.1038/ni.3392.
- 58. https://alphafold.ebi.ac.uk.
- 59. Roterman I., Stapor K., Dułak D., Konieczny L. External force field for protein folding in chaperonines – Potential application in silico protein folding. ACS Omega. 2024;9:18412–18428. doi: 10.1021/acsomega.4c00409.
- 60. Anfinsen C.B. Principles that govern the folding of protein chains. Science. 1973;181(4096):223–230. doi: 10.1126/science.181.4096.223.
- 61. Giri Rao V.V.H., Gosavi S. Using the folding landscapes of proteins to understand protein function. Curr Opin Struct Biol. 2016;36:67–74. doi: 10.1016/j.sbi.2016.01.001.
- 62. Kozak J.J., Gray H.B., Garza-López R.A. Funneled angle landscapes for helical proteins. J Inorg Biochem. 2020;208. doi: 10.1016/j.jinorgbio.2020.111091.
- 63. Nassar R., Dignon G.L., Razban R.M., Dill K.A. The protein folding problem: the role of theory. J Mol Biol. 2021;433. doi: 10.1016/j.jmb.2021.167126.
- 64. Fain B., Levitt M. Funnel sculpting for in silico assembly of secondary structure elements of proteins. Proc Natl Acad Sci USA. 2003;100:10700–10705. doi: 10.1073/pnas.1732312100.
- 65. Bastida A., Zúñiga J., Requena A., Cerezo J. Energetic self-folding mechanism in α-helices. J Phys Chem B. 2019;123:8186–8194. doi: 10.1021/acs.jpcb.9b05860.
- 66. Smeller L. Folding superfunnel to describe cooperative folding of interacting proteins. Proteins. 2016;84:1009–1016. doi: 10.1002/prot.25051.
- 67. Lewinson O., Lee A.T., Rees D.C. The funnel approach to the precrystallization production of membrane proteins. J Mol Biol. 2008;377:62–73. doi: 10.1016/j.jmb.2007.12.059.
- 68. Janković B.G., Polović N.D. Protein Fold Probl Biol Serbica. 2017;39:105–111.
- 69. Hartl F.U., Bracher A., Hayer-Hartl M. Molecular chaperones in protein folding and proteostasis. Nature. 2011;475:324–332. doi: 10.1038/nature10317.
- 70. Leppert A., Poska H., Landreh M., Abelein A., Chen G., et al. A new kid in the folding funnel: molecular chaperone activities of the BRICHOS domain. Protein Sci. 2023;32. doi: 10.1002/pro.4645.
- 71. Roterman I., Konieczny L. Environment conditions influencing the protein folding process. (submitted).
- 72. Carman P.J., Barrie K.R., Rebowski G., Dominguez R. Structures of the free and capped ends of the actin filament. Science. 2023;380:1287–1292. doi: 10.1126/science.adg6812.
- 73. Korobova F., Ramabhadran V., Higgs H.N. An actin-dependent step in mitochondrial fission mediated by the ER-associated Formin INF2. Science. 2013;339:464–467. doi: 10.1126/science.1228360.
- 74. Oda T., Iwasa M., Aihara T., Maéda Y., Narita A. The nature of the globular- to fibrous-actin transition. Nature. 2009;457:441–445. doi: 10.1038/nature07685.