Summary
A new web server, InterProSurf, predicts the amino acid residues of a protein that are most likely to interact with other proteins, given the 3D structures of the subunits of a protein complex. The prediction method is based on the solvent accessible surface area of residues in the isolated subunits, a propensity scale for interface residues and a clustering algorithm that identifies surface regions containing residues of high interface propensity. Here we illustrate the application of InterProSurf to determine which areas of Bacillus anthracis toxins and measles virus hemagglutinin protein interact with their respective cell surface receptors. The computationally predicted regions overlap with regions previously identified as interface regions by sequence analysis and mutagenesis experiments.
1 INTRODUCTION
As protein–protein interactions are fundamental to all biological processes, several attempts have been made recently to understand the specificity of the contacting residues (Bock and Gough, 2001; Caffrey et al., 2004; Glaser et al., 2003; Hoskins et al., 2006; Jones and Thornton, 1997; Miguel, 2004; Neuvirth et al., 2004). Studies investigating the role of hydrogen bond formation, hydrophobic residues and overall electrostatics (Gao et al., 2004; Janin and Chothia, 1990; Jones and Thornton, 1996) have not revealed any unique pattern that could be used to predict potential protein–protein interaction sites (DeLano, 2002). Hence, a combination of different types of information is needed to accurately predict the areas of proteins involved in interactions. Over the past few years, substantial progress has been made towards predicting 3D structures of protein complexes by docking known structures of the individual unbound subunits, as demonstrated in CAPRI competitions (Janin, 2005; Mendez et al., 2005). However, the quality of the models still depends on additional biochemical data or homologous structures being available.
We implemented a new method that can help guide docking calculations by locating potential binding sites on protein surfaces. The InterProSurf website can also be used to analyze interacting sites in 3D structures of known protein complexes. In practice, InterProSurf is most effective when combined with evolutionary information on protein sequences (Glaser et al., 2003; Innis et al., 2000; Res and Lichtarge, 2005; Schein et al., 2005) and data from mutagenesis experiments to locate functionally important sites on the protein surface. We have used this methodology to guide mutagenesis experiments on the E1 envelope protein of the Venezuelan Equine Encephalitis Virus and to design entry-sensitive mutants (Negi et al., 2006). We illustrate here the use of InterProSurf for finding potential interacting regions of the Bacillus anthracis toxins with the protective antigen membrane transport protein, and potential receptor binding sites of the measles virus hemagglutinin (MV H).
2 PROGRAM FEATURES
2.1 Computational method
We calculated the interface propensities of amino acid residues from 72 protein complexes (Negi and Braun, 2007), which include protease- and proteinase-inhibitor, enzyme, antibody–antigen, hormone–receptor, G-protein and viral protein complexes, among others. A clustering algorithm was then used to locate regions on the protein surface with high interface propensities. Each cluster of surface residues was ranked by a scoring function defined as the average propensity of the cluster weighted by the accessible surface area (ASA). The number of high-ranking clusters predicted as interface regions was determined empirically to achieve an optimal balance between sensitivity and precision. The overall accuracy of the method is ~70% for a test data set of 21 protein complexes not used in deriving the interface propensity scale.
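The cluster ranking step can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the InterProSurf implementation: the score is assumed here to be the ASA-weighted mean propensity of the residues in a cluster, and the names `Residue`, `cluster_score` and `rank_clusters`, as well as all numerical values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    name: str          # residue label (made up for this example)
    asa: float         # accessible surface area in the isolated subunit, in A^2
    propensity: float  # interface propensity (made-up value, not the published scale)


def cluster_score(cluster):
    """ASA-weighted mean propensity of a cluster (assumed form of the score)."""
    total_asa = sum(r.asa for r in cluster)
    if total_asa == 0:
        return 0.0
    return sum(r.asa * r.propensity for r in cluster) / total_asa


def rank_clusters(clusters, top_n):
    """Return the top_n clusters, highest score first."""
    return sorted(clusters, key=cluster_score, reverse=True)[:top_n]


# Three toy clusters of surface residues.
clusters = [
    [Residue("A1", 50.0, 1.0), Residue("A2", 50.0, 3.0)],
    [Residue("B1", 100.0, 0.5)],
    [Residue("C1", 25.0, 4.0), Residue("C2", 75.0, 0.0)],
]
best = rank_clusters(clusters, top_n=2)
```

In this sketch, `top_n` plays the role of the empirically chosen number of high-ranking clusters: raising it would report more residues as interface candidates, which generally trades precision for sensitivity.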
2.2 InterProSurf user interface
InterProSurf can be used to: (1) predict interacting residues on a protein subunit and (2) locate interface residues in either a protein complex available from the Protein Data Bank (PDB) or a user-defined protein complex. To predict functional residues in a protein subunit, InterProSurf returns a list of the amino acid residues that, based on their ASA and propensities, are most likely to be responsible for protein interaction. To analyze the interface within a protein complex, users can enter the PDB code or upload the co-ordinate file of the complex. InterProSurf analyzes each chain within the complex and prints out the interface residues, the interface area of each residue and the change in the surface area of each residue upon complex formation. Input files must be in standard PDB format, and the predicted residues on the protein surface are visualized with Jmol (http://jmol.sourceforge.net/).
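The interface analysis of a complex can be illustrated with a minimal sketch: a residue is treated as an interface residue when its ASA in the complex is smaller than in the isolated chain. The per-residue ASA values are assumed to have been computed beforehand by a separate surface area program. The function name `interface_residues`, the cutoff `min_delta`, the residue labels and all numbers are hypothetical and are not taken from InterProSurf.

```python
def interface_residues(asa_isolated, asa_complex, min_delta=1.0):
    """Map residue id -> ASA lost on complex formation (A^2).

    asa_isolated and asa_complex are dicts of per-residue ASA for the same
    chain, computed for the isolated chain and within the complex.
    min_delta is a hypothetical cutoff that ignores negligible changes.
    """
    deltas = {}
    for res_id, free_asa in asa_isolated.items():
        # A residue missing from the complex table is treated as unchanged.
        delta = free_asa - asa_complex.get(res_id, free_asa)
        if delta >= min_delta:
            deltas[res_id] = delta
    return deltas


# Made-up ASA values for three residues of a chain "A".
asa_isolated = {"A45": 80.0, "A46": 60.0, "A90": 40.0}
asa_complex = {"A45": 20.0, "A46": 15.0, "A90": 40.0}

buried = interface_residues(asa_isolated, asa_complex)
interface_area = sum(buried.values())  # total area buried for this chain
```

Here `buried` holds the per-residue change in surface area and `interface_area` the total for the chain, mirroring the three quantities the server reports for each chain.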
3 APPLICATIONS
3.1 Interacting sites of the Anthrax toxin complex
One of the catalytic toxins of B. anthracis, lethal factor (LF), enters cells by binding to the protective antigen (PA) toxin. This interaction is mediated by the protective antigen binding domain (PABD), the N-terminal domain of LF. We used InterProSurf to predict the residue clusters on the surface of the PABD of LF (Pannifer et al., 2001) (PDB: 1J7N; N-terminal domain) that are most likely to interact with other proteins. The result indicated a conserved ridge in the PABD as the most likely area of binding to PA (Fig. 1A). This area coincides with a region found previously by extensive point mutagenesis of surface exposed residues (Lacy et al., 2005). Two of the residues whose mutation reduces or eliminates binding to PA (D187, Y236; magenta) are in the ridge of residues identified computationally, while others (L188, Y223, H229, L235, D182; red) lie immediately adjacent. Recent docking calculations and complementary charge reversal mutations demonstrated (Lacy et al., 2005) that D187 and the charged residues E135 and E142 in this area (in yellow) form ion pairs with specific residues of PA.
Fig. 1.
Interface predictions by InterProSurf for the PA-binding domain of LF (A) and for MV H (B). Residues predicted by InterProSurf and confirmed by experimental results are shown in magenta, additional predicted residues in blue, and additional residues important for binding in red and yellow (see text).
3.2 Measles virus hemagglutinin-binding sites for two receptors
Measles virus (MV) infection leads to immune suppression, and secondary infections cause more than 600 000 deaths worldwide, especially among children in developing countries. MV enters the host cell by binding, via the MV H protein, to the immune-cell-specific protein SLAM or the ubiquitous receptor protein CD46 (Dorig et al., 1993; Tatsuo et al., 2000). We have modeled the 3D structure of the MV H protein based on the X-ray crystal structure of the Newcastle disease virus (NDV) hemagglutinin-neuraminidase (HN) (sequence identity of 14%). Mutagenesis experiments guided by this model identified two separate areas important for SLAM or CD46 binding (Vongpunsawad et al., 2004). Here, we analyze the predictions of InterProSurf for interface residues of MV H (Fig. 1B). The InterProSurf predictions correctly identified residues shown by previous mutagenesis studies to be important for CD46-induced fusion, such as A428, F431, L464 and Y481, and for SLAM binding, such as F552, Y553 and P554 (in magenta). Additional predicted residues (in blue) are near these two interacting sites. Further experimental mutagenesis studies and computational docking calculations should lead to a more precise determination of the two interaction sites of MV H with its receptors.
Acknowledgments
This project is supported by NIH grants R21 AI055746 and R01 AI064913 to W.B. and NIH grant (5UO1-AI053858-03; Johnny Peterson, PI) to C.H.S.
Footnotes
Supplementary information: Other test examples are available as Supplementary Material at Bioinformatics online.
Conflict of Interest: none declared.
References
- Bock JR, Gough DA. Predicting protein-protein interactions from primary structure. Bioinformatics. 2001;17:455–460. doi: 10.1093/bioinformatics/17.5.455.
- Caffrey DR, et al. Are protein-protein interfaces more conserved in sequence than the rest of the protein surface? Protein Sci. 2004;13:190–202. doi: 10.1110/ps.03323604.
- DeLano WL. Unraveling hot spots in binding interfaces: progress and challenges. Curr Opin Struct Biol. 2002;12:14–20. doi: 10.1016/s0959-440x(02)00283-x.
- Dorig RE, et al. The human Cd46 molecule is a receptor for measles-virus (Edmonston strain). Cell. 1993;75:295–305. doi: 10.1016/0092-8674(93)80071-l.
- Gao Y, et al. Structure-based method for analyzing protein-protein interfaces. J Mol Model. 2004;10:44–54. doi: 10.1007/s00894-003-0168-3.
- Glaser F, et al. ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics. 2003;19:163–164. doi: 10.1093/bioinformatics/19.1.163.
- Hoskins J, et al. An algorithm for predicting protein-protein interaction sites: abnormally exposed amino acid residues and secondary structure elements. Protein Sci. 2006;15:1017–1029. doi: 10.1110/ps.051589106.
- Innis CA, et al. Evolutionary trace analysis of TGF-beta and related growth factors: implications for site-directed mutagenesis. Protein Eng. 2000;13:839–847. doi: 10.1093/protein/13.12.839.
- Janin J. Assessing predictions of protein-protein interaction: the CAPRI experiment. Protein Sci. 2005;14:278–283. doi: 10.1110/ps.041081905.
- Janin J, Chothia C. The structure of protein-protein recognition sites. J Biol Chem. 1990;265:16027–16030.
- Jones S, Thornton JM. Principles of protein-protein interactions. Proc Natl Acad Sci USA. 1996;93:13–20. doi: 10.1073/pnas.93.1.13.
- Jones S, Thornton JM. Analysis of protein-protein interaction sites using surface patches. J Mol Biol. 1997;272:121–132. doi: 10.1006/jmbi.1997.1234.
- Lacy DB, et al. A model of anthrax toxin lethal factor bound to protective antigen. Proc Natl Acad Sci USA. 2005;102:16409–16414. doi: 10.1073/pnas.0508259102.
- Mendez R, et al. Assessment of CAPRI predictions in rounds 3–5 shows progress in docking procedures. Proteins Struct Funct Bioinformatics. 2005;60:150–169. doi: 10.1002/prot.20551.
- Miguel RN. Sequence patterns derived from the automated prediction of functional residues in structurally-aligned homologous protein families. Bioinformatics. 2004;20:2380–2389. doi: 10.1093/bioinformatics/bth255.
- Negi SS, Braun W. Statistical analysis of physical-chemical properties and prediction of protein-protein interfaces. J Mol Model. 2007; in press. doi: 10.1007/s00894-007-0237-0.
- Negi SS, et al. Determining functionally important amino acid residues of the E1 protein of Venezuelan equine encephalitis virus. J Mol Model. 2006;12:921–929. doi: 10.1007/s00894-006-0101-7.
- Neuvirth H, et al. ProMate: a structure based prediction program to identify the location of protein-protein binding sites. J Mol Biol. 2004;338:181–199. doi: 10.1016/j.jmb.2004.02.040.
- Pannifer AD, et al. Crystal structure of the anthrax lethal factor. Nature. 2001;414:229–233. doi: 10.1038/n35101998.
- Res I, Lichtarge O. Character and evolution of protein-protein interfaces. Phys Biol. 2005;2:S36–S43. doi: 10.1088/1478-3975/2/2/S04.
- Schein CH, et al. Molego-based definition of the architecture and specificity of metal-binding sites. Proteins Struct Funct Bioinformatics. 2005;58:200–210. doi: 10.1002/prot.20253.
- Tatsuo H, et al. SLAM (CDw150) is a cellular receptor for measles virus. Nature. 2000;406:893–897. doi: 10.1038/35022579.
- Vongpunsawad S, et al. Selectively receptor-blind measles viruses: identification of residues necessary for SLAM- or CD46-induced fusion and their localization on a new hemagglutinin structural model. J Virol. 2004;78:302–313. doi: 10.1128/JVI.78.1.302-313.2004.