Skip to main content
Bioinformation logoLink to Bioinformation
. 2005 Sep 2;1(2):42–49. doi: 10.6026/97320630001042

Structural features differentiate the mechanisms between 2S (2 state) and 3S (3 state) folding homodimers

Lei Li 1, Kannan Gunasekaran 2, Jacob Gah-Kok Gan 1, Cui Zhanhua 1, Paul Shapshak 3, Meena Kishore Sakharkar 1, Pandjassarame Kangueane 1,*
PMCID: PMC1891634  PMID: 17597851

Abstract

The formation of homodimer complexes for interface stability, catalysis and regulation is intriguing. The mechanisms of homodimer complexations are even more interesting. Some homodimers form without intermediates (two-state (2S)) and others through the formation of stable intermediates (three-state (3S)). Here, we analyze 41 homodimer (25 `2S` and 16 `3S`) structures determined by X-ray crystallography to estimate structural differences between them. The analysis suggests that a combination of structural properties such as monomer length, subunit interface area, ratio of interface to interior hydrophobicity can predominately distinguish 2S and 3S homodimers. These findings are useful in the prediction of homodimer folding and binding mechanisms using structural data.

Keywords: Homodimer, structural difference, 2 state, 3 state, stable intermediate, folding mechanism

Abbreviations

2S = 2 state homodimer; 3S = 3 state homodimer; 3SMI = 3 state homodimer with monomer intermediate; 3SDI = 3 state homodimer with dimer intermediate; ML = monomer length; B/2 = subunit interface area; Fhp = fraction of interface to interior hydrophobicity

Background

Equilibrium denatruation experiments (using temperature and chemical agents) are employed to analyze the unfolding of proteins. These studies are useful in understanding monomeric protein folding. Recently, such techniques have been used to study the mechanism of homodimer formation. [1] Dimer folding involves both intra­molecular and inter­molecular interaction, unlike monomer folding that involves only intra­molecular interaction. It is known that some dimers denature from native dimer to unfolded monomers with no thermodynamically stable intermediates, whereas others have folded intermediates during the process. [123] Based on the unfolding patterns, homodimers are known to exist in three different states. They are (1) two­state (2S), (2) three­state with dimeric intermediate (3SDI) and (3) three­state with monomeric intermediate (3SMI). 2S refers to N2 ↡ 2U mechanism, 3SDI refers to N2 ↡ I2 ↡ 2U and 3SMI refers to N2 ↡ 2I ↡ 2U, where N2 is the native dimer state, I is the intermediate monomeric species, I2 is the intermediate dimeric species, and U is unfolded monomeric state. 3SDI and 3SMI are commonly considered as three­state (3S). It is found that 2S interfaces are similar to protein cores and 3SMI interfaces resemble the monomer surfaces. [4 ] 2S and 3SMI dimerization were also studied by following the evolution of two identical 20­letter residue chains within the framework of a lattice model, using Monte Carlo simulations. [5 ] It is found that folding of 2S sequences depend on a significantly larger number of conserved amino acids than 3SMI sequences. The effects of the monomer and interface geometry on 2S and 3S association mechanism were also studied by the energetically minimally frustrated Gō model. [6] It is found that the native protein 3D structure is the major factor that governs the choice of binding mechanism.

Mei and colleagues investigated the importance of 2S and 3S dimers using structural and folding data. [2] Apiyo and colleagues proposed (using 13 obligomers (multimers with permanent interfaces)) that small obligomers (molecular mass < 20 kDa) unfold through 2S. [7] On the other hand, large obligomers (molecular mass > 35 kDa) unfold through oligomeric intermediate (3SDI) and those with intermediate size unfold through monomeric intermediate (3SMI). Moreover, Levy and colleagues proposed (using 21 homodimers) that 2S and 3SMI dimers can be effectively classified based on the ratio of intra­molecular/inter­molecular contacts and interface hydrophobicity. [6] Here, we created an extended dataset of 41 homodimers (2S: 25; 3SDI: 6; and 3SMI: 10) to design a methodology for the discrimination of 2S, 3SDI and 3SMI dimers using 3D structural properties.

Methods

Dataset creation

We created a dataset consisting of 41 homodimer complex structures (2S: 25; 3SDI: 5; and 3SMI: 10) from Protein Databank (PDB). [8] The unfolding pathways for these dimers observed using thermodynamic experiments were obtained from literature (Table 1). The selected homodimers are at least 40 residues per monomer.

Table 1. Dataset of homodimeric proteins divided into three groups according to their unfolding pathways.

Hydrophobicity
PDB ID Chain Protein name Cofactors Source ML (aa) B/2(Å 2) H int H inf H surf Reference
2S (25)
2cpg A&B transcriptional repressor CopG - Streptococcus agalactiae 45 1632 0.37 0.68 0.23 [ 14]
1arr A&B arc repressor - Bacteriophage P22 53 1962 0.47 0.58 0.23 [ 15]
1rop (Sym) repressor of protein Rop - E. coli 63 1345 0.41 0.51 0.24 [ 16]
5cro A&C Cro repressor - Bacteriophage lambda 66 648 0.49 0.73 0.29 [ 17]
1bfm A&B Histone B - Metdanotdermus fervidus 69 1593 0.50 0.72 0.30 [ 18]
1a7g (Sym) E2 DNA­binding domain - HPV strain 16E2 82 918 0.6 0.52 0.29 [ 19]
1vqb (Sym) gene V protein - Bacteriophage f1 87 850 0.58 0.66 0.31 [ 20]
1b8z A&B histone­like protein HU - Thermotoga maritima 90 1894 0.26 0.67 0.23 [ 21]
1ety A&B FIS protein - E. coli 98 2079 0.49 0.5 0.25 [ 22]
1y7q A&B SCAN domain of ZNF 174 - Homo sapiens 98 1508 0.67 0.54 0.26 [ 23]
1a8g A&B HIV­1 protease - HIV type 1 99 1785 0.63 0.49 0.33 [ 24]
1siv A&B SIV protease - SIV 99 1684 0.59 0.53 0.33 [ 24]
1vub A&B CcdB E. coli 101 1074 0.51 0.36 0.33 [ 25]
1cmb A&B Met repressor - E. coli 104 1813 0.35 0.54 0.27 [ 26]
3ssi (Sym) subtilisn inhibitor - Streptomyces albogriseolus 108 866 0.51 0.46 0.32 [ 27]
1wrp (Sym) trp repressor - E. coli 108 2243 0.69 0.54 0.29 [ 28]
1bet (Sym) .­nerve growth factor - Mus musculus 107 1366 0.47 0.47 0.31 [ 29]
1buo (Sym) Btb domain from PLZF protein - Homo sapiens 121 1972 0.56 0.41 0.28 [ 30]
1oh0 A&B ketosteroid isomerase - Pseudomonas putida 131 1036 0.49 0.41 0.31 [ 31]
2gsr A&B class π glutathione s­transferase - Sus scrofa 207 1331 0.5 0.3 0.27 [ 32]
1gsd A&B glutathione transferase A1­1 - Homo sapiens 208 1477 0.57 0.4 0.27 [ 33]
1gta (Sym) glutathione transferase - Schistosoma japonica 218 1505 0.5 0.42 0.26 [ 34]
2bqp A&B pea lectin Mn & Ca ion Garden pea 234 955 0.49 0.37 0.27 [ 35]
1hti A&B triosephosphate isomerase - Homo sapiens 248 1685 0.54 0.35 0.27 [ 36]
1ee1 A&B Nh(3)­dependent Nad(+) syntdetase - Bacillus subtilis 271 2507 0.51 0.43 0.24 [ 37]
3SDI (6)
1mul (Sym) histone­like protein hu­α - E. coli 90 1739 0.47 0.61 0.3 [ 38]
1hqo A&B Ure2 Protein - Saccharomyces cerevisiae 258 1656 0.54 0.44 0.3 [ 39]
1psc A&B paratdion hydrolase Cd ion Brevundimonas diminuta 329 1353 0.5 0.3 0.3 [ 40]
1cm7 A&B 3­isopropylmalate dehydrogenase - E. coli 363 2317 0.5 0.47 0.28 [ 41]
1aoz A&C ascorbate oxidase Cu ion Green zucchini 552 1817 0.43 0.47 0.29 [ 42]
1nl3 A&B SecA - Mycobacterium tuberculosis 835 1351 0.46 0.64 0.28 [ 43]
3SMI (10)
1a43 (Sym) C­terminal domain of HIV­1 capsid protein - HIV type 1 72 921 0.47 0.42 0.26 [ 44]
1qll A&B lysine­49 phospholipase A2 - Bothrops jararacussu 121 432 0.38 0.17 0.27 [ 45]
1dfx (Sym) desulfoferrodoxin Fe & Ca ion Desulfovibrio desulfuricans 125 1472 0.44 0.44 0.29 [ 47]
1yai B&C cu,zn superoxide dismutase Cu & Zn ion Photobacterium leiognathi 151 309 0.48 0.41 0.28 [ 46]
1spd A&B cu,zn superoxide dismutase Cu & Zn ion Homo sapiens 154 658 0.49 0.4 0.28 [ 47]
1run A&B cAMP receptor protein - E. coli 197 1542 0.66 0.47 0.28 [ 48]
11gs A&B glutathione­s­transferase - Homo sapiens 209 1197 0.5 0.28 0.3 [ 49]
1tya (Sym) tyrosyl­tRNA synthetase - Bacillus stearothermophilus 319 1513 0.48 0.55 0.26 [ 50]
1nd5 A&B prostatic acid phosphatase - Homo sapiens 354 1512 0.43 0.44 0.27 [ 51]
2crk (Sym) creatine kinase - Oryctolagus cuniculus 381 1119 0.46 0.18 0.25 [ 52]
Average for 2S 125 1509 0.51 0.50 0.28
SD 65 475 0.10 0.12 0.03
Average for 3SDI 405 1705 0.48 0.49 0.29
SD 259 358 0.04 0.14 0.01
Average for 3SMI 208 1067 0.48 0.38 0.27
SD 107 468 0.07 0.13 0.02

ML=monomer length; B/2 = subunit interface area. 2S=two-state; 3SDI=three-state with dimeric intermediate; 3SMI=three-state with monomeric intermediate. SIV = Simian immunodeficiency virus; HIV=Human immunodeficiency virus; HPV=Human papillomavirus; Ccdb = controller of cell division or death B protein; PLZF=promyelocytic leukemia zinc finger protein; FIS=factor for inversion stimulation. (sym) indicates that the dimer is generated from a single chain in the PDB by Protein Quaternary Structure Server (PQS) [53]. Interior hydrophobicity (Hint), interface hydrophobicity (Hinf) and surface hydrophobicity (Hsurf) for each dimer were calculated, separately, by the single equation Σnihi/Σni, where ni is the number of residue type i and hi is ASA hydrophobicity factor (based on SES (solvent excluded surface) & SAS (solvent accessible surface)) of residue type i from Pacios. [12]

Analyses of 2S and 3S homodimers

Interface area

The solvent accessible surface area (ASA) was computed using the program NACCESS. [9 ] The dimeric interface area (B) was calculated as ΔASA (change in ASA upon complex formation from monomer to dimer state). [10] We then calculated subunit interface area (B/2), due to the two­fold symmetry of homodimer complexes.

Interior, interface and exterior residues

Homodimer residues were classified into three categories (interior, interface and exterior) based on relative ASA. The percentage relative ASA was obtained by dividing the accessible surface area by the total surface area of a side­chain in an extended conformation in the tripeptide GXG. Exterior residues were defined as having a relative ASA > 5%, interior residues were defined as having a relative ASA < 5% and interface residues were defined satisfying the conditions ΔASA > 1Å2 & relative ASA > 5%. The 5% cut­off was optimized elsewhere by Miller et al. [11]

Fraction of interface to interior Hydrophobicity (Fhp)

Fhp (Fraction of interface to interior hydrophobicity) was defined by the equation (Hinf­Hext) /(Hint­Hext), where Hint is interior hydrophobicity, Hinf is interface hydrophobicity and Hext is exterior hydrophobicity. The individual hydrophobicity values were calculated using the equation Σnihi/Σni, where ni is the number of residue type i and hi is hydrophobicity value (based on SES (solvent excluded surface) & SAS (solvent accessible surface)) of type I, as described elsewhere. [12]

Small and large homodimers

By definition, small homodimers were defined as those with ML (monomer length) less than the dataset mean length (185 residues). By definition, large homodimers were defined as those with ML larger than the dataset mean length (185 residues).

Homodimers with small and large B/2

By definition, homodimers with small B/2 were defined as those whose B/2 is less than the dataset mean B/2 (1424 Å2). By definition, homodimers with large B/2 were defined as those whose B/2 is larger than the dataset mean B/2 (1424 Å2).

Results

Distribution of 2S and 3S in a Cartesian plane of monomer length and subunit interface area

Figure 1 shows the distribution of 2S and 3S in the Cartesian plane consisting of ML (monomer length) and B/2 (subunit interface area). It shows that 76% of small proteins form 2S and 60% of large proteins form 3S homodimers. Figure 1 also shows that 68% of 2S have large interface area and 45% of 3S have small interface area. 2S have ML in the range of 45­270 residues and 3S have ML in the range of 70­850 residues. However, 3SMI lie within 90­380 residues and 3SDI lie within 70­850 residues. 2S and 3S dimers have significantly different ML range (p = 0.05 in F test). Nonetheless, 2S and 3SMI have similar ML range (p = 0.05 in F test). The dataset mean ML is 185 residues. This lies between 2S mean (125 residues) and 3S mean (282 residues). Data also show that 2S and 3S ML means are different (p < 0.05). The mean ML for 3SDI is 405 and this is much greater than the mean ML for 2S (125) and 3SMI (208).

Figure 1.

Figure 1

Correlation between monomer length (ML) and subunit interface area (B/2) for three groups of homodimers. 2S: two­state; 3SDI: three­state with dimeric intermediate; 3SMI: three­state with monomeric intermediate. The two dash lines through 185 aa and 1424Å2 represent mean monomer length and mean B/2 for all homodimers, respectively. They classify the dimers into four regions (G1, G2, G3 and G4). The distributions of 2S, 3SDI and 3SMI dimers are given for each region. The value within parentheses is hydrophobicity factor (Fhp), calculated by the equation (Hinf ­ Hsurf) /(Hint ­Hsurf), where Hinf is interface hydrophobicity, Hint is interior hydrophobicity and Hsurf is surface hydrophobicity.

The B/2 range for 2S (650 ­ 2500 Å2) and 3S (300 ­ 2317 Å2) are overlapping and are not significantly different (p = 0.21). However, 3SMI and 3SDI are distinguished by the B/2 range (p < 0.05). 3SMI having small B/2 range (300­1550 Å2) and 3SDI having large B/2 range (1350­2317 Å2) are distinguished from each other. The dataset mean for B/2 is 1424Å2, which lies between 2S mean (1509 Å2) and 3S mean (1239 Å2). Interestingly, the 3SMI mean (1068 Å2) is close to 3S mean B/2 (p = 0.25) and 3SDI mean (1705 Å2) is close to 2S mean B/2 (p = 0.35).

In Figure 1, the distribution of 2S and 3S were divided into four regions (G1 to G4) based on the dataset mean of ML and B/2. Entries in G1 are small proteins with large B/2 and entries in G4 are large proteins with small B/2 (refer to methodology section for definition of small and large proteins). However, entries in G2 are small proteins with small B/2 and those in G3 are large proteins with large B/2. This grouping shows 84% of homodimers in G1 are 2S and 66% of homodimers in G4 are 3S. Nevertheless, homodimers in G3 there are 44% 2S and 56% 3S. Homodimers in G2 have 67% 2S and 33% 3S. It should be observed that 3S in G2 are solely 3SMI. The results show that 2S and 3S are distinctly and prevalently distinguished in G1 and G4 but not as much in G2 and G3. The distribution of 2S and 3S in regions G1 to G4 provide insight to their structural preference in terms of ML and B/2.

Exterior, interior and interface hydrophobicity in 2S and 3S

Table 1 gives the hydrophobicity of interior, interface and exterior residues for 2S, 3SDI and 3SMI. It also gives the mean hydrophobicity of interior, interface and exterior residues for 2S, 3SDI and 3SMI in the dataset. Very small 2S (≤ 90 residues) have greater interface hydrophobicity compared to interior hydrophobicity. However, this is not true with very large 2S (> 90 residues). It is also interesting to observe that majority of 3SMI have less interface hydrophobicity compared to interior hydrophobicity. Nonetheless, this is not true with a majority of 3SDI. Table 1 shows that the mean interface hydrophobicity values satisfy a condition (2S > 3SDI > 3SMI). However, the mean interior hydrophobicity satisfy a different condition (2S > (3SDI = = 3SMI)). The ratio of interface to interior hydrophobicity is ˜1 for 2S and 3SDI, while it is < 1 for 3SMI.

Fhp (Factor of interface to interior hydrophobicity) value in 2S and 3S

Figure 1, shows that 92% of entries in G1 have high Fhp value (> 0.5) and 83% of entries in G4 have low Fhp value (< 0.5). It also shows that 3S in G1 have high Fhp value and 2S in G4 have low Fhp value. Interestingly, 75% of entries in G2 have high Fhp value and 78% of entries in G3 have high Fhp value. Moreover, Figure 1 show that 91% 2S in G1 have high Fhp value and 75% 3S in G4 have low Fhp value. However, 100% 3S (2 entries) in G1 have high Fhp value and 100% 2S (2 entries) in G4 have low Fhp value. In G2, 75% of 2S have high Fhp value and 67% of 3S have high Fhp value. Nonetheless, 100% 3S have high Fhp value and 50% of 2S have high Fhp value in G3. The mean Fhp value for 2S and 3SDI is 1, while it is 0.5 for 3SMI. Thus, the distribution of 2S and 3S in the G1 to G4 regions is described.

Discussion

The mechanism of homodimer folding and binding has been investigated using denaturation experiments.[14­13 ] 3 dimensional structures are also available for many of these homodimers with known folding and binding mechanisms (Table 1). The folding and binding homodimer data collected from the literature is classified into three 2S, 3SMI and 3SDI. The study of homodimer folding and binding using energy models is computational intensive and time consuming. Alternatively, study on their folding and binding using structural data is found useful. [2] Recently, Mei and collegues documented the differences between 2S, 3SMI and 3SDI homodimers using 3S structure data. [2 ] The study provided structural insight to the mechanism of 2S and 3S folding. However, the analysis did not document parameters to differentiate 2S, 3SMI and 3SDI homodimers using structural data. In this study, we study an extended dataset of homodimer complexes to distinguish 2S and 3S homodimers using structural features. Results show that 76% of small proteins are 2S homodimers and 60% of large proteins are 3S homodimers. Thus, protein size plays an important role in determining the pathways of homodimer folding and binding. The result also shows that 68% of 2S have large subunit interface area and 45% of 3S have small subunit interface area. These observations suggest the importance of protein size and subunit interface area in determining the mechanism of homodimer formation.

The distribution of 2S and 3S in the G1 and G4 regions of Figure 1 show difference between them based on protein size, subunit interface area and Fhp. In G1, 84% dimers are 2S and 92% of dimers have high Fhp (> 0.5). Thus, entries with high Fhp are grouped in G1 and this region represents small proteins with large subunit interface area. Moreover, 91% of 2S in G1 have high Fhp. This implies that a majority of small proteins with large subunit interface area and high Fhp are 2S. 3S in G1 have high Fhp and this explains the presence of exceptional 3S entries in G1. Similarly, 66% of dimers are 3S and 83% of dimers have low Fhp (< 0.5) in G4. Thus, entries with low Fhp are grouped into G4 and this region represents large proteins with small subunit interface area. Furthermore, 75% 3S in G4 have low Fhp. 2S in G4 have low Fhp and this explains the presence of unusual 2S entries in G4. Entries in G2 and G3 have a mixture of 2S and 3S with low and high Fhp values. This is different to the distribution in G1 and G4. 100% 3S and 50% 2S in G3 have high Fhp and thus dimers in G3 are not distinguished by their folding mechanisms using structural parameters. The mean Fhp for 2S and 3SDI is 1, while it is 0.5 for 3SMI. The similarity between 2S and 3SDI in Fhp is interesting. It implies that binding after folding displayed by 3SMI resembles the association of protein­protein complexes. [13] However, the cooperative folding­binding displayed by 2S and 3SDI resembles a single­chain folding.

Thus, we show that small homodimers with large interface area and high Fhp are prevalently 2S. Similarly, large homodimers with small interface area and low Fhp are prevalently 3S. Hence, it is possible to distinguish 2S and 3S dimers using 3D structural data. However, small homodimers with small interface area and large homodimers with large interface area are not significantly distinguished into 2S and 3S using structural parameters ML, B/2 and Fhp. It should be noted that the conclusion made in the report are based on a limited set of homodimers given in Table 1.

Conclusions

The mechanisms of homodimer complexations have implications in drug discovery. However, elucidation of homodimer mechanism using unfolding experiments is difficult. Prediction of homodimer folding and binding using structural data has application in target validation. Here, we show that small proteins with large interface area and high Fhp form 2S. We also show that large proteins with small interface area and low Fhp form 3S. Therefore, it is feasible to differentiate 2S and 3S homodimers using structural data.

Acknowledgments

Lei Li acknowledges Nanyang Technological University, Singapore for support

Footnotes

Citation:Li et al., Bioinformation 1(2): 42-49 (2005)

References


Articles from Bioinformation are provided here courtesy of Biomedical Informatics Publishing Group

RESOURCES