Reassessing buried surface areas in protein–protein complexes

Devlina Chakravarty; Mainak Guharoy; Charles H Robert; Pinak Chakrabarti; Joël Janin

Protein Sci. 2013 Aug 12;22(10):1453–1457. doi: 10.1002/pro.2330

PMCID: PMC3795504 PMID: 23934783

Abstract

The buried surface area (BSA), which measures the size of the interface in a protein–protein complex, may differ from the accessible surface area (ASA) lost upon association (which we call DSA) if conformation changes take place. To evaluate the DSA, we measure the ASA of the interface atoms in the bound and unbound states of the components of 144 protein–protein complexes taken from the Protein–Protein Interaction Affinity Database of Kastritis et al. (2011). We observe differences exceeding 20%, and a systematic bias in the distribution. On average, the ASA calculated in the bound state of the components is 3.3% greater than in their unbound state, and the BSA is 7% greater than the DSA. The bias is observed even in complexes where the conformation changes are small. An examination of the bound and unbound structures points to a possible origin: local movements optimize contacts with the other component at the cost of internal contacts and, presumably, of binding free energy.

Keywords: protein–protein interaction, solvent accessible surface, conformation changes, binding free energy

Introduction

Protein–protein recognition is essential to all aspects of life. Its physicochemical basis has long been known to reside in desolvated atoms that form non-covalent electrostatic and van der Waals interactions at molecular interfaces,¹⁻⁶ which the many entries of the Protein Data Bank (PDB⁷) reporting structures of protein–protein complexes illustrate. Yet, structure-based models of binding thermodynamics and kinetics are still far from quantitative, and they have little predictive value,⁸,⁹ in part because they take into consideration the structure of the complex but not that of its components, and thus, they ignore conformation changes which contribute to the mechanism and affect the free energy balance of the reaction.

Here, we examine the effect of conformation changes on a geometric quantity, the area of the protein surface that becomes buried upon association. The buried surface area (BSA), commonly used to estimate the size of the interface between two macromolecules, is calculated on atomic coordinates of the complex alone.¹ We introduce a novel quantity, the decrease in surface area (DSA), derived from solvent accessible surface areas (ASA) measured on both the bound (in the complex) and the unbound (the free protein) states of the components. In a set of 281 bound/unbound structure pairs taken from the Protein–Protein Interaction Affinity Database (PPIAD),⁹ we observe differences between the two values that exceed 20% as the result of conformation changes between the two states. Even in complexes that show small conformation changes, the ASA of interface atoms tends to be greater in the bound than in the unbound state, and thus, the BSA tends to be larger than the DSA, due to local movements of interface atoms that augment their contacts with the partner molecule. These findings have implications for the mechanism of protein–protein recognition and the thermodynamics of the association reaction.

Results

Dataset and definitions

The PPIAD comprises 144 binary protein–protein complexes and their free components.⁹ As the antibody moiety of seven antibody–antigen complexes is missing, this yields 281 B (bound in the complex) and 281 U (unbound, free) component structures (2 × 144 − 7 = 281). ASA values are measured separately in U, in B, and in the complex C.

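Atom i contributes to the BSA:

$$\mathrm{BSA}_i = \mathrm{ASA}_i(B) - \mathrm{ASA}_i(C)$$

Its contribution to the DSA is the ASA lost upon association:

$$\mathrm{DSA}_i = \mathrm{ASA}_i(U) - \mathrm{ASA}_i(C)$$

This differs from the contribution to the BSA by:

$$\Delta\mathrm{ASA}_i = \mathrm{BSA}_i - \mathrm{DSA}_i = \mathrm{ASA}_i(B) - \mathrm{ASA}_i(U)$$

Summing over all the interface atoms of a component yields ASA(B), ASA(U), BSA, DSA, and ΔASA values for each protein in the dataset.

We also consider the fractional excess ASA of the interface atoms in B vs. U state:

$$\delta A = \frac{\mathrm{ASA}(B) - \mathrm{ASA}(U)}{\mathrm{ASA}(B)}$$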
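The bookkeeping behind these quantities can be illustrated with a minimal Python sketch. It assumes the definitions read as BSA = ASA(B) − ASA(C), DSA = ASA(U) − ASA(C), ΔASA = ASA(B) − ASA(U), and δA = ΔASA/ASA(B), each summed over the interface atoms of one component. The function name, the dictionary layout, and the toy ASA values are hypothetical and serve only as an illustration; they are not taken from the dataset.

```python
from typing import Dict, Iterable

def surface_area_terms(
    asa_b: Dict[str, float],
    asa_u: Dict[str, float],
    asa_c: Dict[str, float],
    interface: Iterable[str],
) -> Dict[str, float]:
    """Sum per-atom ASA values (in Å²) over the interface atoms of one component.

    asa_b, asa_u, asa_c map an atom identifier to its ASA in the bound
    component (B), the unbound component (U), and the complex (C).
    """
    atoms = list(interface)
    sum_b = sum(asa_b[a] for a in atoms)
    sum_u = sum(asa_u[a] for a in atoms)
    sum_c = sum(asa_c[a] for a in atoms)
    bsa = sum_b - sum_c        # buried surface area, from B and C only
    dsa = sum_u - sum_c        # ASA actually lost on going from U to C
    delta_asa = bsa - dsa      # equals sum_b - sum_u
    return {
        "ASA(B)": sum_b,
        "ASA(U)": sum_u,
        "BSA": bsa,
        "DSA": dsa,
        "dASA": delta_asa,
        "deltaA": delta_asa / sum_b,   # fractional excess ASA in B vs. U
    }

# Toy example with two interface atoms (invented numbers).
toy_b = {"a1": 30.0, "a2": 20.0}
toy_u = {"a1": 25.0, "a2": 20.0}
toy_c = {"a1": 5.0, "a2": 0.0}
terms = surface_area_terms(toy_b, toy_u, toy_c, ["a1", "a2"])
print(terms)
```

In this toy case the interface atoms are slightly more exposed in B than in U, so the BSA exceeds the DSA and δA is positive.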

Those calculations involve mapping atom i of B to the corresponding atom in U, a non-trivial process when B and U are from different PDB entries. Although they represent the same protein, B and U may be different genetic constructs, or have disordered segments with missing atoms. Local alignments reveal that 134 of the 281 U/B pairs have sequence identity >98%; the rest are in the range 90–98%, with identity computed relative to the shorter sequence. In addition, 36 interface residues (0.5% of all interface residues) belonging to 22 components are absent or different in the U and B sequences, and are therefore excluded from the analysis.

Ambiguous labels, such as OD1/OD2 in Asp, occur in all PDB entries. If an interface atom is marked OD1 in B and OD2 in U, a spurious ΔASA will result. We circumvent the problem by taking both to be part of the interface. While this increases the number of the interface atoms by 8%, the BSA is unchanged, because the added atoms do not contribute to it. On the other hand, 4% of the interface atoms, belonging to 95 of the 281 components, have no counterpart in U. They are excluded from the summation yielding ASA(B) to make it consistent with ASA(U).
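The effect of an ambiguous label, and of atoms that lack a counterpart in U, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the helper names and the ASA values are invented, and the Asp example simply assumes that the two carboxylate oxygens carry swapped labels in the U entry while the geometry is identical.

```python
from typing import Dict, Iterable

def group_delta_asa(
    asa_b: Dict[str, float], asa_u: Dict[str, float], labels: Iterable[str]
) -> float:
    """ASA(B) - ASA(U) summed over a group of interchangeable atom labels."""
    names = list(labels)
    return sum(asa_b[n] for n in names) - sum(asa_u[n] for n in names)

def asa_b_with_counterpart(
    asa_b: Dict[str, float], asa_u: Dict[str, float], interface: Iterable[str]
) -> float:
    """Sum ASA(B) only over interface atoms that also exist in U."""
    return sum(asa_b[a] for a in interface if a in asa_u)

# Toy Asp side chain: same geometry in B and U, but OD1 and OD2 are
# labeled in the opposite order in the U entry (invented values, in Å²).
asp_b = {"OD1": 12.0, "OD2": 3.0}
asp_u = {"OD1": 3.0, "OD2": 12.0}

spurious = asp_b["OD1"] - asp_u["OD1"]                  # OD1 taken alone
paired = group_delta_asa(asp_b, asp_u, ("OD1", "OD2"))  # both atoms together

# Toy component in which atom "OXT" has no counterpart in U.
comp_b = {"CA": 5.0, "CB": 10.0, "OXT": 8.0}
comp_u = {"CA": 4.0, "CB": 11.0}
kept = asa_b_with_counterpart(comp_b, comp_u, comp_b.keys())

print(spurious, paired, kept)
```

Taking OD1 alone produces a non-zero ΔASA that reflects only the labeling, whereas the sum over both atoms of the pair does not depend on which oxygen is called OD1.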

The ASA of interface atoms in the bound and unbound states

Figure 1 shows a histogram of the δA values observed in the database component proteins. Individual values and overall statistics are reported in Supporting Information. δA ranges from −50% to +50%, ΔASA, from −340 Å² to 940 Å². However, the proteins yielding such extreme values appear as grossly different constructs in B and U, or contain disordered segments in U, which leads to abnormal values of ASA(U). After excluding five proteins for those reasons, the range of ΔASA is still −250 Å² to 610 Å², that of δA −21% to 28%, which proves that the interface solvent accessibility can be very different in the U and B states, and therefore, the BSA very different from the DSA. The difference can be of either sign, but ΔASA is positive, and the BSA greater than the DSA, in 69% of cases. As a result, the average δA is positive: 3.3 ± 7.2% (mean ± standard deviation), and the BSA, which represents about 61% of ASA(B), exceeds the DSA by 7.2% on average.

Figure 1. Distribution of δA values. δA is the fractional excess ASA of the interface atoms in B vs. U state. The histogram represents its distribution over the 281 protein components of the Protein–Protein Interaction Affinity Database. Dark columns represent the 91 components of the core set where conformation changes are small. The average δA is the same (3.3%) as in the whole sample, but the standard deviation is less (4.9% vs. 7.2%). Five components (empty columns) are excluded from those statistics for reasons explained in the text.

These differences are only marginally related to the chemical and amino acid composition of the interfaces. When interface groups are divided into polar (N, O, and S containing) and non-polar (C-containing), δA averages 2.2% for polar and 3.6% for non-polar groups. Polar groups contribute 42%, non-polar groups 58% of the BSA in protein–protein complexes.⁴ Moreover, the average δA calculated for each of the 20 amino acid types is positive, in the range 1–7%, for all types except histidine (δA = −1.1%), and it exceeds 4% for all non-polar residue types (Fig. 2).

Figure 2. Average δA per residue type. Here, δA is the ratio [ASA(B) − ASA(U)]/ASA(B), where the ASA of interface atoms in the B or U state is summed over all the residues of a given type present in the data set, which contains 7206 interface residues in total.

Missing atoms and different constructs in U and B state entries certainly contribute artifacts to ΔASA in some of the proteins of the dataset. To test the possibility that they introduce a bias toward positive values, we check the ASA of surface residues, which shows no significant excess in the bound state (δA = 0.9 ± 6.0%). We also consider a core set of 91 component proteins that have sequences at least 99% identical in B and U, no more than two missing interface atoms in U, and an RMSD less than 1.5 Å (see Supporting Information). The histogram of their δA values is shown alongside that of the larger set in Figure 1. It is narrower, but the average is the same (3.3 ± 4.9%) and only 25 out of 91 values are negative, a proportion that would have P < 10⁻⁵ if the two signs were equally probable.
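The significance of the sign imbalance can be checked with a simple sign test. The sketch below assumes a one-sided exact binomial test with equal probability for each sign; the text does not state which test was used, so the function name and the choice of a one-sided tail are assumptions.

```python
from math import comb

def sign_test_one_sided(n_negative: int, n_total: int) -> float:
    """P(X <= n_negative) for X ~ Binomial(n_total, 0.5).

    This is the probability of observing at most n_negative negative
    values if positive and negative signs were equally likely.
    """
    favorable = sum(comb(n_total, k) for k in range(n_negative + 1))
    return favorable / 2 ** n_total

# 25 negative δA values among the 91 components of the core set.
p_value = sign_test_one_sided(25, 91)
print(f"one-sided P = {p_value:.1e}")
```

Under these assumptions the probability comes out on the order of 10⁻⁵; a two-sided test would double it.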

Local conformation changes make interface atoms more accessible in the bound state

In the core set, B and U contain the same atoms, and the different accessibilities of the interface atoms in the two states must result from small conformation changes. PPIAD derives from a docking benchmark in which the complexes that display an RMSD less than 1.5 Å are described as undergoing rigid-body association.¹⁰ No disorder–order transition, secondary structure change, or domain rotation occurs in the core set, but side chain rotations and local movements of the polypeptide chain do, and Figure 3 shows how they can affect the accessibility of interface atoms. When ferredoxin binds to ferredoxin-NADP reductase,¹¹ its conformation remains unchanged (the RMSD is only 0.8 Å), but the movement of a tyrosine side chain allows its phenol group to make an interface H-bond and optimize contacts with atoms of the reductase [Fig. 3(A)]. In doing so, the group loses some contacts with neighboring ferredoxin atoms, and this causes ASA(B) to be larger than ASA(U). When glycoprotein IB-α binds to the von Willebrand factor,¹² a loop bearing interface residues 235–236 shifts by about 5 Å toward the partner protein [Fig. 3(B)], which accounts for most of the 1.7 Å RMSD. The shift increases the accessibility of the loop residues, and also that of Phe 199, part of the interface but not involved in the main chain movement.

Figure 3. Local movements that make interface residues more accessible in bound state. A: Tyr 25 of ferredoxin in U state (1CZP) and B state (the complex with ferredoxin-NADP reductase,¹¹ 1EWY); the interface atoms (lightly shaded) see their ASA increase by 17 Å² in B state. B: A surface loop of glycoprotein IB-α shifts position when it binds to von Willebrand factor domain A1 (cyan). Interface residues Asp 235, Val 236, and Phe 199 are colored red and the main chain green in B state (the complex,¹² 1M10), blue with the main chain pink in the free glycoprotein (1M0Z). Their ASA increases by 45 to 75 Å² from U to B, which accounts for most of the ΔASA seen in the complex.

Discussion

Whereas the BSA calculation uses only the coordinates of the complex, computing the DSA also requires the coordinates of the free components, which come from separate X-ray or NMR studies. Then, experimental errors and random changes in atomic positions lead to different ASA values in the B and U states, although the conformation changes may be insignificant. The difference can be of either sign, and it could be expected to cancel after summing over all the atoms of an interface (about 100 per component⁴). The present study shows that this is not the case. Even when conformation changes are small, the interface atoms tend to have a larger ASA in the B than in the U state. The excess is only a few percent, but the bias toward positive values is highly significant. Greater differences of either sign are observed when the conformation changes are large, and their average is also positive.

Figure 3 suggests a possible explanation for the bias toward a positive ΔASA. In a complex, the interface atoms tend to make fewer contacts within their component as they interact with the other component. As a consequence, some internal energy must be spent going from the U to the B state, and therefore, any model calculation that is done on the complex assuming the unbound components to have the same structure will overestimate the contribution of atom–atom interactions to the binding free energy. To a first approximation, we may take this contribution to scale linearly with the number of atom pairs at the interface, itself a linear function of the BSA. Experimental values of the Gibbs free energy of dissociation (ΔG_d) are available for all the complexes of the PPIAD.⁹ They display no correlation with the BSA in the whole dataset, but the correlation is significant (R = 0.54) for a subset of 70 complexes with an RMSD < 1 Å, and it remains the same after replacing the BSA with the DSA. The two components of 29 of those complexes also belong to the core set of the present study. If the linear correlation reported in Ref. 9 is applied to their ΔASA, the cost of the U to B transition evaluates to only between −0.2 and 0.7 kcal mol⁻¹, but significant contributions are expected in complexes that undergo larger conformation changes.

Methods

Atomic coordinates were obtained from the PDB for the 144 complexes of the PPIAD (http://bmm.cancerresearchuk.org/~bmmadmin/Affinity/), and for 281 of their free components. In 17 entries that report NMR structures, only the first model was retained. Ubiquitin entry 1yj1, which contains a d-glutamine, was replaced by 1ubq with the natural residue. Modified residues were converted to the original (e.g., selenomethionine to methionine) reported in the sequence. All other HETATM lines were omitted.

Sequence alignments used the Smith–Waterman algorithm and the EMBOSS software.¹³ Conformation changes were estimated by the root-mean-square distance (RMSD) between equivalent Cα atoms after least-squares superposition with PROFIT.¹⁴ Solvent ASA were calculated with NACCESS,¹⁵ which implements the Lee and Richards algorithm.¹⁶ Any atom losing more than 0.1 Å² ASA between states B and C was considered as part of the interface. If this concerned an atom with an ambiguous label, the other atom in the pair was made part of the interface: Asp OD1/OD2, Glu OE1/OE2, Phe and Tyr CD1/CD2 and CE1/CE2, and also Arg NH1/NH2, Asn OD1/ND2, Gln OE1/NE2, His ND1/CD2 and CE1/NE2, which are difficult to distinguish in X-ray structures.
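The interface definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the atom key layout, the function name, and the toy ASA values are assumptions, and the partner table uses standard PDB atom names (Gln NE2; His ND1/CD2 and CE1/NE2). In practice the per-atom ASA values would come from a program such as NACCESS.

```python
from typing import Dict, Set, Tuple

# Hypothetical atom key: (chain, residue number, residue name, atom name).
AtomKey = Tuple[str, int, str, str]

# Label pairs treated as interchangeable, following the list in Methods.
AMBIGUOUS_PAIRS = {
    "ASP": [("OD1", "OD2")],
    "GLU": [("OE1", "OE2")],
    "PHE": [("CD1", "CD2"), ("CE1", "CE2")],
    "TYR": [("CD1", "CD2"), ("CE1", "CE2")],
    "ARG": [("NH1", "NH2")],
    "ASN": [("OD1", "ND2")],
    "GLN": [("OE1", "NE2")],
    "HIS": [("ND1", "CD2"), ("CE1", "NE2")],
}

def interface_atoms(
    asa_b: Dict[AtomKey, float],
    asa_c: Dict[AtomKey, float],
    threshold: float = 0.1,
) -> Set[AtomKey]:
    """Atoms losing more than `threshold` Å² of ASA between B and C,
    plus the partner of any selected atom that has an ambiguous label."""
    selected = {a for a in asa_b if asa_b[a] - asa_c[a] > threshold}
    expanded = set(selected)
    for chain, resnum, resname, atom in selected:
        for first, second in AMBIGUOUS_PAIRS.get(resname, []):
            if atom == first:
                partner = second
            elif atom == second:
                partner = first
            else:
                continue
            key = (chain, resnum, resname, partner)
            if key in asa_b:          # only add atoms present in the entry
                expanded.add(key)
    return expanded

# Toy residue (invented values): OD1 is buried by the partner, OD2 is not.
toy_b = {("A", 10, "ASP", "OD1"): 15.0,
         ("A", 10, "ASP", "OD2"): 4.0,
         ("A", 10, "ASP", "CA"): 1.0}
toy_c = {("A", 10, "ASP", "OD1"): 2.0,
         ("A", 10, "ASP", "OD2"): 4.0,
         ("A", 10, "ASP", "CA"): 0.95}
selected_atoms = interface_atoms(toy_b, toy_c)
print(sorted(selected_atoms))
```

Here OD2 loses no ASA on complex formation, so adding it to the interface leaves the BSA unchanged while making the sum over the carboxylate independent of the labeling.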

Supplementary material

Additional Supporting Information may be found in the online version of this article.

pro0022-1453-SD1.pdf (106.2 KB, pdf)

pro0022-1453-SD2.txt (12.7 KB, txt)

References

1. Chothia C, Janin J. Principles of protein–protein recognition. Nature. 1975;256:705–708. doi: 10.1038/256705a0.
2. Jones S, Thornton JM. Principles of protein–protein interactions. Proc Natl Acad Sci USA. 1996;93:13–20. doi: 10.1073/pnas.93.1.13.
3. Lo Conte L, Chothia C, Janin J. The atomic structure of protein–protein recognition sites. J Mol Biol. 1999;285:2177–2198. doi: 10.1006/jmbi.1998.2439.
4. Chakrabarti P, Janin J. Dissecting protein–protein recognition sites. Proteins. 2002;47:334–343. doi: 10.1002/prot.10085.
5. Nooren IM, Thornton JM. Diversity of protein–protein interactions. EMBO J. 2003;22:3486–3492. doi: 10.1093/emboj/cdg359.
6. Janin J, Bahadur RP, Chakrabarti P. Protein–protein interaction and quaternary structure. Q Rev Biophys. 2008;41:133–180. doi: 10.1017/S0033583508004708.
7. Berman H, Henrick K, Nakamura H, Markley JL. The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res. 2007;35:D301–D303. doi: 10.1093/nar/gkl971.
8. Kastritis PL, Bonvin AM. Are scoring functions in protein–protein docking ready to predict interactomes? Clues from a novel binding affinity benchmark. J Proteome Res. 2010;9:2216–2225. doi: 10.1021/pr9009854. Erratum in: J Proteome Res (2011) 10:921–922.
9. Kastritis PL, Moal IH, Hwang H, Weng Z, Bates PA, Bonvin AM, Janin J. A structure-based benchmark for protein–protein binding affinity. Protein Sci. 2011;20:482–491. doi: 10.1002/pro.580.
10. Hwang H, Vreven T, Janin J, Weng Z. Protein–protein docking benchmark version 4.0. Proteins. 2010;78:3111–3114. doi: 10.1002/prot.22830.
11. Morales R, Kachalova G, Vellieux F, Charon MH, Frey M. Crystallographic studies of the interaction between the ferredoxin-NADP+ reductase and ferredoxin from the cyanobacterium Anabaena: looking for the elusive ferredoxin molecule. Acta Crystallogr D. 2000;56:1408–1412. doi: 10.1107/s0907444900010052.
12. Huizinga EG, Tsuji S, Romijn RA, Schiphorst ME, de Groot PG, Sixma JJ, Gros P. Structures of glycoprotein Ibalpha and its complex with von Willebrand factor A1 domain. Science. 2002;297:1176–1179. doi: 10.1126/science.107355.
13. Rice P, Longden I, Bleasby A. EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 2000;16:276–277. doi: 10.1016/s0168-9525(00)02024-2.
14. McLachlan AD. Rapid comparison of protein structures. Acta Crystallogr A. 1982;38:871–873.
15. Hubbard SJ, Thornton JM. “NACCESS”, computer program. Department of Biochemistry and Molecular Biology, University College London; 1993.
16. Lee B, Richards FM. The interpretation of protein structures: estimation of static accessibility. J Mol Biol. 1971;55:379–400. doi: 10.1016/0022-2836(71)90324-x.
