PLoS ONE. 2011 Feb 28;6(2):e16774. doi: 10.1371/journal.pone.0016774

Predictions of Hot Spot Residues at Protein-Protein Interfaces Using Support Vector Machines

Stefano Lise 1, Daniel Buchan 1, Massimiliano Pontil 1, David T Jones 1,*
Editor: Collin Stultz 2
PMCID: PMC3046169  PMID: 21386962

Abstract

Protein-protein interactions are critically dependent on just a few ‘hot spot’ residues at the interface. Hot spots make a dominant contribution to the free energy of binding and they can disrupt the interaction if mutated to alanine. Here, we present HSPred, a support vector machine (SVM)-based method to predict hot spot residues, given the structure of a complex. HSPred represents an improvement over a previously described approach (Lise et al, BMC Bioinformatics 2009, 10:365). It achieves higher accuracy by treating separately predictions involving either an arginine or a glutamic acid residue, the amino acid types on which the original model did not perform well. For this purpose we have developed two additional SVM classifiers, specifically optimised for these cases. HSPred reaches an overall precision of 61% and recall of 69%, which roughly corresponds to a 10% improvement over the original model. An implementation of the described method is available as a web server at http://bioinf.cs.ucl.ac.uk/hspred. It is free to non-commercial users.

Introduction

Alanine scanning mutagenesis is a powerful experimental methodology for investigating the structural and energetic characteristics of protein complexes [1]. Individual amino acids are systematically mutated to alanine and the changes in free energy of binding (ΔΔG) are measured. As alanine has no side-chain beyond the β-carbon, this procedure in effect tests the importance of individual side-chain groups for complex formation, providing a map of the so-called functional epitope. Results from a number of experiments indicate that only a small subset of contact residues contribute significantly to the binding free energy. These residues have been termed ‘hot spots’ and if mutated they can disrupt the interaction. For the majority of interface residues, by contrast, the effect of an alanine mutation is minimal [2].

Hot spots are typically defined as those residues for which Inline graphic. In recent years, several computational approaches have been developed to identify them at protein-protein interfaces [3]–[16]. Accurate predictive models provide a valuable complement to experimental studies and add to our understanding of the factors that influence affinity and specificity in protein-protein interfaces. In addition, they can have important applications in the field of drug discovery. A number of recent studies have been successful in developing (drug-like) small molecules that bind at hot spots and inhibit complex formation [17]. Reliable hot spot predictions could therefore represent the first step in rational drug design projects [18].

In previous work, we presented a machine learning strategy to identify hot spot residues in protein-protein interfaces, given the structure of the complex [12]. We considered the basic energetic terms that contribute to hot spot interactions, i.e. van der Waals potentials, solvation energy, hydrogen bonds and Coulomb electrostatics, and treated them as input features of a Support Vector Machine (SVM) classifier. We found that the method could predict hot spots with good overall accuracy, comparing favourably to other available approaches. However, by grouping mutations according to the amino acid type, we observed that in some cases the SVM model performed poorly, for example on predictions involving arginine or glutamic acid residues.

In this paper, we report the development of HSPred, a hot spot prediction method that aims to overcome the limitations highlighted above. For this purpose, we have integrated the original approach with two additional SVM classifiers, specifically built for mutations involving Arg and Glu residues. The two additional models are trained on the same data set as the ‘general’ model but are biased to perform well on Arg and Glu due to a different choice of input features. Employing a strict cross-validation scheme, we show that this strategy leads to a significant improvement over the previous version of the method. We further validate the results by applying HSPred to an external test case, which is not part of the original data set.

Results and Discussion

The problem we have investigated is the prediction of hot spot residues at a protein-protein interface using a machine learning approach. As input variables, we have considered basic energy terms (van der Waals, hydrogen bond, electrostatic and desolvation potentials) calculated from the complex structure. We have distinguished contributions from different structural regions in the complex, leading to three distinct types of interactions: side-chain inter-molecular, environment inter-molecular and side-chain intra-molecular (see Figure 1). To each of them, we have associated four input features, corresponding to the energy terms above. In total therefore there are 12 input features, but some of them have not been included in our models because they are scarcely informative (see Materials and Methods for more details). Support Vector Machines (SVMs) have then been used to learn from a training set to classify residues as hot spots Inline graphic or non hot spots Inline graphic.

Figure 1. Schematic overview of protein structural regions which define the different energy contributions.

The red filled area, (a), corresponds to side-chain atoms of the mutated residue; the red and blue striped regions, (b) and (c) respectively, correspond to atoms within Inline graphic of the Inline graphic of the mutated residue. We distinguish three types of interactions: side-chain inter-molecular between (a) and (c), environment inter-molecular between (b) and (c), and side-chain intra-molecular between (a) and (b).

We have built a classifier, SVMX, based on the following seven features: van der Waals, hydrogen bond and solvation side-chain inter-molecular energies; van der Waals, hydrogen bond and solvation environment inter-molecular energies; van der Waals side-chain intra-molecular energy. A summary of the results is reported in Table 1 according to various performance measures. The precision (P) is the fraction of true hot spots among the set of residues predicted to be hot spots; the recall (R) is the fraction of correctly identified hot spots relative to all those present in the data set; the F1 score is the harmonic mean of precision and recall; the Matthews correlation coefficient (MCC) is a commonly used measure of the quality of binary classifications (see Methods section for more details). SVMX is very similar in its design and performance to the model described in [12]. Unlike the latter, SVMX does not rely on any electrostatic term but it includes the van der Waals side-chain intra-molecular energy. We report in Table 2 the weight of each energy term in the linear scoring function.

Table 1. Summary of results.

Model Precision Recall F1 score MCC
SVMX Inline graphic Inline graphic Inline graphic Inline graphic
HSPred Inline graphic Inline graphic Inline graphic Inline graphic

Cross-validated estimates of performance for SVMX and HSPred. MCC is the Matthews correlation coefficient (see Methods section for definition of the various performance measures).

Table 2. Weight of energy terms in the scoring functions.

Feature (energy term) SVMInline graphic SVMInline graphic SVMInline graphic
Side-chain inter-molecular
van der Waals Inline graphic Inline graphic Inline graphic
hydrogen bond Inline graphic Inline graphic Inline graphic
electrostatics Inline graphic Inline graphic Inline graphic
desolvation Inline graphic Inline graphic Inline graphic
Environment inter-molecular
van der Waals Inline graphic Inline graphic Inline graphic
hydrogen bond Inline graphic Inline graphic Inline graphic
electrostatics Inline graphic Inline graphic Inline graphic
desolvation Inline graphic Inline graphic Inline graphic
Side-chain intra-molecular
van der Waals Inline graphic Inline graphic Inline graphic
hydrogen bond Inline graphic Inline graphic Inline graphic
electrostatics Inline graphic Inline graphic Inline graphic
desolvation Inline graphic Inline graphic Inline graphic
Threshold Inline graphic Inline graphic Inline graphic

We report the absolute value of the weight associated with each feature in the scoring functions, together with the threshold that defines the decision boundary. Energy terms which are not included in the scoring function are denoted with the Inline graphic symbol.

We have analysed the SVMX predictions by grouping mutations according to the amino acid type. In Figure 2(a) we report the results for the most frequent amino acids in the database. SVMX has good accuracy over most amino acid types and is not biased toward any specific amino acid property (e.g. hydrophobic or charged residues). At the same time, however, it does not perform well on mutations involving Arg and Glu. To tackle this problem, we have developed two additional classifiers, respectively SVMInline graphic and SVMInline graphic, specifically optimised for these amino acids. SVMInline graphic and SVMInline graphic have been trained using the whole data set but differ from SVMX in the choice of input features and the associated weights (see Table 2).

Figure 2. Prediction results for different amino acids.

Only the most frequent amino acids in the database are reported. In (a) are the results for SVMX, in (b) for HSPred, which includes SVMInline graphic and SVMInline graphic.

As can be seen in Figure 2(b), SVMInline graphic and SVMInline graphic achieve significantly improved results on Glu and Arg predictions. A further confirmation of the improvement comes from analysing the correlation coefficients Inline graphic between the classifiers' output scores and the observed ΔΔG values. For Glu residues, Inline graphic increases from Inline graphic for SVMX to Inline graphic for SVMInline graphic; for Arg, Inline graphic increases from Inline graphic for SVMX to Inline graphic for SVMInline graphic. This suggests that SVMInline graphic and SVMInline graphic are indeed more effective than SVMX in describing mutations involving Glu and Arg residues, respectively, and that the observed improvement is genuine and not due to over-fitting.

We have combined SVMX, SVMInline graphic and SVMInline graphic into a single classifier, HSPred. SVMInline graphic and SVMInline graphic act respectively on Glu and Arg amino acids, SVMX on all other amino acid types. We report a summary of the results for HSPred in Table 1. HSPred performs significantly better than SVMX, reflecting the inclusion of SVMInline graphic and SVMInline graphic. As can be seen from Figure 2(b), predictions on Arg and Glu are roughly as accurate as for the other residues. HSPred therefore successfully overcomes the major limitations of the previously proposed method [12]. Most notable is the improvement on Glu predictions.
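The routing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names, the weights and the thresholds below are placeholders (not the values of Table 2), the function names are hypothetical, and the score is assumed to take the form w·x minus a threshold, with a positive score read as a predicted hot spot.

```python
from typing import Dict, Mapping, Tuple

# A linear model: (feature name -> weight, decision threshold).
# All numbers used below are placeholders, NOT the values of Table 2.
LinearModel = Tuple[Dict[str, float], float]


def linear_score(model: LinearModel, features: Mapping[str, float]) -> float:
    """Return w . x - threshold, using only the features the model includes."""
    weights, threshold = model
    return sum(w * features[name] for name, w in weights.items()) - threshold


def hspred_like_predict(residue: str,
                        features: Mapping[str, float],
                        general: LinearModel,
                        specialised: Mapping[str, LinearModel]) -> Tuple[float, bool]:
    """Route a residue to its specialised model if one exists, else to the general one."""
    model = specialised.get(residue.upper(), general)
    score = linear_score(model, features)
    return score, score > 0.0  # assumed convention: positive score = predicted hot spot


# Toy models and a toy feature vector (hypothetical names and values).
general_model: LinearModel = ({"sc_inter_vdw": 1.0, "env_hbond": 0.5}, 1.0)
glu_model: LinearModel = ({"sc_inter_hbond": 2.0}, 0.5)
toy_features = {"sc_inter_vdw": 0.4, "env_hbond": 0.2, "sc_inter_hbond": 0.6}

glu_result = hspred_like_predict("GLU", toy_features, general_model, {"GLU": glu_model})
tyr_result = hspred_like_predict("TYR", toy_features, general_model, {"GLU": glu_model})
```

Keeping the residue-specific models in a mapping means that the general classifier remains the default for every amino acid type without a dedicated model, which mirrors the division of labour described in the text.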

To further validate HSPred, we have applied it to the protein-protein complex Ras/RalGDS (PDB code: 1LFD). The Ras/RalGDS complex is not homologous to any of the complexes in the original data set and can therefore be regarded as an independent external test case. Experimental ΔΔG values are available in [19], from which we have taken the data corresponding to Inline graphic interface alanine mutations (Inline graphic on Ras and Inline graphic on RalGDS). HSPred correctly identifies Inline graphic hot spots (true positives) and Inline graphic non hot spot residues (true negatives). However, Inline graphic residues are wrongly predicted as hot spots (false positives). The predictions are illustrated in Figure 3. These results are in line with the cross-validated estimates in Table 1 and confirm the accuracy of HSPred.

Figure 3. Ras/RalGDS complex.

Mapping of HSPred predictions onto the complex (PDB code: 1LFD). The monomers have been rotated to display the interface. Red residues are correctly predicted hot spots (true positives); blue residues are correctly predicted non hot spots (true negatives); yellow residues are non hot spots erroneously predicted as hot spots (false positives).

We have implemented HSPred as a fully automatic web server, available at http://bioinf.cs.ucl.ac.uk/hspred. As input it requires a PDB formatted file containing the structure of the protein-protein complex. The user needs to define the interface to analyse by specifying the chain identifiers for each protein on either side of the interface. The output consists of two components: (i) a Jmol applet to visualise and explore the predictions using the protein structures and (ii) a table listing HSPred scores for each interface amino acid. The output page for an illustrative example is reported in Figure 4. The complex tested is Interleukin 4 (IL-4) bound to its receptor α chain (IL-4Rα) (PDB code: 1IAR). Alanine mutational data from experiments are available for this complex [20], [21]. Out of Inline graphic interface mutations, HSPred predicts Inline graphic true positives, Inline graphic true negatives, Inline graphic false positive and Inline graphic false negatives. These results further validate the predictive accuracy of HSPred.

Figure 4. Sample output for the HSPred server.

Screenshot of the results page for the IL-4/IL-4Rα complex (PDB code: 1IAR). On top, predictions are visualised using a Jmol applet. On the left is IL-4 (chain A), on the right IL-4Rα (chain B). Predicted hot spots are in red, non hot spots in white. Residues not part of the interface are in blue. Below, prediction scores for each interface residue (excluding Pro and Gly amino acids) are reported (note that only the first few residues are displayed here). Scores greater than zero correspond to predicted hot spots.

To conclude, in this paper we have described HSPred, an accurate and reliable computational method to predict hot spot residues at protein-protein interfaces, given the structure of a complex. HSPred is available as a web server and it is free for non-commercial users. We believe that HSPred predictions will be useful in guiding biomedical experiments. In particular, we are currently testing its capacity to identify druggable binding sites at protein-protein interfaces [22].

Materials and Methods

Data sets

In our study, we have used the same data set as in [12]. It consists of Inline graphic protein complex structures for which alanine mutational data are available. Only protein-protein interactions involving an extended interface are included (we have therefore ignored protein-peptide complexes). Following previous publications [23], we define hot spots as those alanine mutations for which Inline graphic (ΔΔG is the change in binding free energy). Only mutations occurring at the complex interface are retained. In total the data set comprises Inline graphic mutations, of which Inline graphic correspond to hot spots. For cross-validation purposes, we have grouped homologous complexes and formed Inline graphic non-homologous clusters. Accordingly, we have implemented an Inline graphic-fold cross-validation strategy. A detailed description of the data set, individual mutations and clustering criteria can be found in [12].

In addition, we have applied HSPred to the Ras/RalGDS protein-protein complex (PDB code: 1LFD) for which experimental ΔΔG values are available [19]. From the original reference, we have taken the data corresponding to Inline graphic interface alanine mutations. As the Ras/RalGDS complex is not homologous to any of the complexes in the original data set, it can be regarded as an independent external test case. A similar data set had been used previously in [13] for validation purposes.

As a further illustrative example we have applied HSPred to Interleukin-4 (IL-4) bound to its receptor α chain (IL-4Rα) (PDB code: 1IAR). Experimental ΔΔG values are available for this complex too [20], [21]. The IL-4/IL-4Rα complex is likely a remote homologue of the complex between human growth hormone (hGH) and its binding protein (hGHbp), which is part of our training set (PDB code: 1A22). IL-4 and hGH share only Inline graphic sequence identity by optimal structural alignment but belong to the same homologous superfamily group (H-level) according to the CATH database [24]. Similarly, the sequence identity between IL-4Rα and hGHbp is only Inline graphic but structural similarity suggests a homology relationship. It has however been pointed out that the IL-4/IL-4Rα complex differs in several important functional and structural aspects from the hGH/hGHbp complex [20], [21], [25]. It could therefore in effect be regarded as an additional independent test case.

Input features

As input features for the Support Vector Machines we have used basic energy terms that have been found to be important for the stability of protein complexes. These are van der Waals potential, hydrogen bonds, Coulomb electrostatics and desolvation energy. We distinguish contributions from three different structural regions (schematised in Figure 1):

  • Side-chain inter-molecular energies: interaction energies between side-chain atoms of the mutated residue and atoms in the partner protein (respectively atoms in the red filled area and blue striped area in Figure 1).

  • Environment inter-molecular energies: interaction energies between atoms in the two proteins that are within Inline graphic of the Inline graphic of the mutated residue (respectively atoms in the red striped area and blue striped area in Figure 1). We do not include the contribution from the mutated side-chain in this term.

  • Side-chain intra-molecular energies: interaction energies between side-chain atoms of the mutated residue and other atoms in the same protein (respectively atoms in the red filled area and red striped area in Figure 1).

In total therefore there are 12 input features (3 regions × 4 energy terms), although not all of them have been used to build our SVM models (we discuss our feature selection below). A detailed description of how energy components are calculated from the PDB structures is reported in [12].

Support Vector Machines models

We have used the program package SVMlight [26], which is available at the website http://svmlight.joachims.org/. As in [12], we have opted for a linear kernel and implemented a nested-loop cross-validation scheme. The latter consists of two nested cross-validation loops: an outer one for testing, an inner one for choosing hyper-parameters. In the inner cycle, the hyper-parameters are optimised by applying a grid search and the model performance is assessed by means of the F1 score. The nested-loop cross-validation scheme also allows statistical errors on performance measures to be estimated (see [12] for details).
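The structure of a nested-loop cross-validation can be sketched as follows. This is a generic illustration, not the authors' code: the function name `nested_cv`, the callables `fit` and `score`, and the use of leave-one-cluster-out splitting in the inner loop are all assumptions made for the example (the exact inner-loop splitting is described in [12], not here). At least three clusters are needed so that the inner training set is never empty.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Example = TypeVar("Example")
Model = TypeVar("Model")
Param = TypeVar("Param")


def nested_cv(clusters: Sequence[Sequence[Example]],
              grid: Sequence[Param],
              fit: Callable[[List[Example], Param], Model],
              score: Callable[[Model, Sequence[Example]], float],
              ) -> List[Tuple[Param, float]]:
    """Return (chosen hyper-parameter, test score) for each outer fold.

    Outer loop: each cluster is held out once for testing.
    Inner loop: the hyper-parameter is chosen by cross-validation on the
    remaining clusters only, so the test cluster never influences the choice.
    """
    results = []
    for i, test_fold in enumerate(clusters):
        outer_train = [c for j, c in enumerate(clusters) if j != i]

        best_param, best_mean = grid[0], float("-inf")
        for param in grid:  # grid search over candidate hyper-parameters
            inner_scores = []
            for k, validation_fold in enumerate(outer_train):
                inner_train = [x for m, c in enumerate(outer_train) if m != k for x in c]
                inner_scores.append(score(fit(inner_train, param), validation_fold))
            mean = sum(inner_scores) / len(inner_scores)
            if mean > best_mean:
                best_param, best_mean = param, mean

        # Refit on all outer-training data with the selected hyper-parameter.
        final_model = fit([x for c in outer_train for x in c], best_param)
        results.append((best_param, score(final_model, test_fold)))
    return results
```

In the setting of the paper, `score` would be the F1 score and `fit` a call to a linear SVM trainer; the spread of the per-fold test scores gives a rough handle on the statistical error of the performance estimate.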

Models construction and feature selection

We have analysed the correlation coefficients Inline graphic between energy features and the observed ΔΔG values (see Table 3). We have then built a ‘baseline’ model, SVMX, including only the seven features for which Inline graphic. These are: van der Waals, hydrogen bond and solvation side-chain inter-molecular energies, van der Waals, hydrogen bond and solvation environment inter-molecular energies, and van der Waals side-chain intra-molecular energy. Note that the values of the correlation coefficients do not vary appreciably across the Inline graphic different training sets, implying that this choice of features is robust.
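A correlation-based filter of this kind can be sketched as follows. The sketch assumes a Pearson correlation coefficient and a cut-off of 0.2 on its absolute value; the cut-off is inferred from the caption of Table 3 (which highlights values greater than 0.2) and should be treated as an assumption, as should the function names and the toy data.

```python
import math
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_features(columns: Dict[str, Sequence[float]],
                    ddg: Sequence[float],
                    min_abs_r: float = 0.2) -> List[str]:
    """Keep the features whose |r| with the observed ddG exceeds min_abs_r (assumed cut-off)."""
    return [name for name, values in columns.items()
            if abs(pearson(values, ddg)) > min_abs_r]


# Toy data: two informative features and one uncorrelated feature.
toy_columns = {
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [1.0, -1.0, -1.0, 1.0],
    "feature_c": [4.0, 3.0, 2.0, 1.0],
}
toy_ddg = [0.5, 1.0, 1.5, 2.0]
selected = select_features(toy_columns, toy_ddg)
```

Repeating the selection on each training set of the cross-validation and comparing the resulting lists is one simple way to check the robustness mentioned in the text.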

Table 3. Correlation of energy terms with observed ΔΔG values.

Feature (energy term) Inline graphic
Side-chain inter-molecular
van der Waals Inline graphic
hydrogen bond Inline graphic
electrostatics Inline graphic
desolvation Inline graphic
Environment inter-molecular
van der Waals Inline graphic
hydrogen bond Inline graphic
electrostatics Inline graphic
desolvation Inline graphic
Side-chain intra-molecular
van der Waals Inline graphic
hydrogen bond Inline graphic
electrostatics Inline graphic
desolvation Inline graphic

We report the absolute values of the correlation coefficients Inline graphic between energy features and the observed ΔΔG (values greater than 0.2 are in bold).

We have analysed the predictions of SVMX by grouping mutations according to the amino acid type. In particular we have focused on the most frequent amino acids in our data set, i.e. those occurring more than Inline graphic times with at least Inline graphic hot spot examples. The list comprises the following seven amino acid types: Arg, Asn, Asp, Glu, Lys, Trp and Tyr. We observe a good performance for all amino acids except Arg and Glu, for which Inline graphic (see Figure 2). To overcome these limitations, we have built two separate SVM classifiers, SVMInline graphic and SVMInline graphic, for mutations involving respectively Arg and Glu.

In theory, one could use the amino acid identity as input feature or build a model using only, e.g., Glu mutations. In practice, at present this is not feasible as there are not enough mutational data. We have reasoned instead that SVMInline graphic and SVMInline graphic should not be completely different from SVMX, rather they should differ only marginally from the latter. In this spirit, we have trained several different but related models. All models are trained using the whole data set (comprising therefore mutations from all amino acid types) but each of them corresponds to a different choice of input features. Within this ensemble of classifiers we have selected those that best perform on Arg and Glu.

Our strategy has been to bias the classifiers to perform well on Arg and Glu by selecting a specific subset of features. This reflects the observation that some energy features appear to be more important for some amino acids than for others, i.e. for some amino acids they correlate better with the observed ΔΔG values. Note that the hyper-parameters in each of the models in the ensemble are optimised over all the mutations in the training set. The identity of the amino acid of interest enters only when selecting the best model within the ensemble. We find this to be a robust strategy, i.e. it is not too sensitive to small modifications in the training set.

Given the starting 12 features, there is a very large number of possible combinations that can be selected and it is clearly not feasible to test them all. To simplify the problem, we have considered only combinations with Inline graphic or Inline graphic features, taken from the seven features used for SVMX. We have further constrained the selection by excluding pairs of highly correlated features, i.e. features for which Inline graphic, because they would be redundant. For example, only one of the van der Waals and solvation side-chain inter-molecular energies can be included. Similarly, only one of the three environment energies can be chosen. With these constraints, there are a total of Inline graphic different feature combinations (Inline graphic combinations having Inline graphic features and Inline graphic having Inline graphic features). We have built a classifier for each of them and then selected the one performing best on, e.g., Glu. In the case of Arg, the intra-molecular Coulomb term also appears to be important (correlation coefficient with observed ΔΔG Inline graphic). We have therefore tested Inline graphic additional combinations, which are obtained by adding the intra-molecular Coulomb term to the set above.
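The constrained enumeration described above can be sketched as follows. The feature labels are hypothetical, the two exclusion groups encode only the two examples given in the text (the full list of highly correlated pairs is not reproduced here), and the combination sizes are left as a parameter because the sizes used in the paper are not recoverable from this copy. The example call uses size 3 only because the optimal Glu combination reported in the text has three features.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

# Hypothetical labels for the seven features used by the baseline model.
FEATURES = [
    "sc_inter_vdw", "sc_inter_hbond", "sc_inter_solv",
    "env_vdw", "env_hbond", "env_solv",
    "sc_intra_vdw",
]

# At most one feature may be taken from each group (assumed from the two
# examples in the text: vdW vs solvation side-chain inter-molecular terms,
# and the three environment terms).
EXCLUSIVE_GROUPS = [
    {"sc_inter_vdw", "sc_inter_solv"},
    {"env_vdw", "env_hbond", "env_solv"},
]


def allowed(combo: Sequence[str]) -> bool:
    """True if the combination takes at most one feature from each exclusive group."""
    chosen = set(combo)
    return all(len(group & chosen) <= 1 for group in EXCLUSIVE_GROUPS)


def feature_combinations(sizes: Sequence[int]) -> List[Tuple[str, ...]]:
    """Enumerate all allowed feature combinations of the requested sizes."""
    return [combo for k in sizes
            for combo in combinations(FEATURES, k)
            if allowed(combo)]


three_feature_combos = feature_combinations([3])
```

Each tuple returned by `feature_combinations` would correspond to one candidate classifier in the ensemble; the model retained for a given amino acid is then the candidate that scores best on that amino acid within the training data.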

It is important to underline that when assessing the results for SVMInline graphic and SVMInline graphic by cross-validation, the choice of the best model (feature combination) is performed within the inner loop of the nested-loop cross-validation scheme (i.e. using the training set only), similarly to the choice of hyper-parameters. This ensures that the optimal feature combination for either Arg or Glu is selected without ever considering the performance on the test set. It is worth noting that for both Arg and Glu the feature combination that gives the best results is consistent across the Inline graphic different training sets. For example, for Glu the optimal feature combination is always hydrogen bond side-chain inter-molecular, hydrogen bond environment and van der Waals side-chain intra-molecular. It is also worth mentioning that Glu and Arg can be singled out based on the performance of SVMX in the training sets, therefore complying with the cross-validation scheme. We have not explicitly stated it above to keep the discussion as simple as possible.

Measures of prediction performance

We primarily assess the prediction performance of our method using the F1 score. Let TP, FP and FN refer to the number of true positives, false positives and false negatives, respectively. Precision (P, the positive predictive value, here also termed specificity) and recall (R, also called sensitivity) are defined as

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN} \qquad (1)$$

The F1 score is the harmonic mean of precision and recall

$$F1 = \frac{2 P R}{P + R} \qquad (2)$$

We also calculate the Matthews correlation coefficient (MCC), given by

$$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (3)$$

where TN is the number of true negatives and TP, FP and FN are as above.
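The performance measures defined in this section can be computed directly from the four confusion-matrix counts. The short sketch below follows the standard definitions of precision, recall, F1 and MCC; the function names, the convention of returning 0.0 when a denominator vanishes, and the example counts are assumptions made for illustration (the counts are not results from the paper).

```python
import math


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted hot spots that are true hot spots."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of true hot spots that are correctly predicted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2.0 * p * r / (p + r) if (p + r) else 0.0


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient; 0.0 if any marginal count is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up counts, for illustration only.
example = {"tp": 6, "fp": 2, "fn": 3, "tn": 9}
example_f1 = f1_score(example["tp"], example["fp"], example["fn"])
example_mcc = mcc(example["tp"], example["fp"], example["fn"], example["tn"])
```

Unlike precision, recall and F1, the MCC also uses the number of true negatives, which makes it informative when hot spots are a minority of the interface residues.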

Acknowledgments

SL, DB, MP and DTJ acknowledge financial support from the BBSRC.

Footnotes

Competing Interests: The authors have declared that no competing interests exist.

Funding: This work was funded by grant reference BB/E017452/1, from the Biotechnology and Biological Sciences Research Council (BBSRC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Cunningham BC, Wells JA. High-resolution epitope mapping of hGH-receptor interactions by alanine-scanning mutagenesis. Science. 1989;244:1081–1085. doi: 10.1126/science.2471267.
  • 2. Moreira IS, Fernandes PA, Ramos MJ. Hot spots–a review of the protein-protein interface determinant amino-acid residues. Proteins. 2007;68:803–812. doi: 10.1002/prot.21396.
  • 3. Kortemme T, Baker D. A simple physical model for binding energy hot spots in protein-protein complexes. Proc Natl Acad Sci U S A. 2002;99:14116–14121. doi: 10.1073/pnas.202485799.
  • 4. Guerois R, Nielsen JE, Serrano L. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol. 2002;320:369–387. doi: 10.1016/S0022-2836(02)00442-4.
  • 5. Gao Y, Wang R, Lai L. Structure-based method for analyzing protein-protein interfaces. J Mol Model. 2004;10:44–54. doi: 10.1007/s00894-003-0168-3.
  • 6. Li L, Zhao B, Cui Z, Gan J, Sakharkar MK, et al. Identification of hot spot residues at protein-protein interface. Bioinformation. 2006;1:121–126. doi: 10.6026/97320630001121.
  • 7. Ofran Y, Rost B. Protein-protein interaction hotspots carved into sequences. PLoS Comput Biol. 2007;3:e119. doi: 10.1371/journal.pcbi.0030119.
  • 8. Darnell SJ, Page D, Mitchell JC. An automated decision-tree approach to predicting protein interaction hot spots. Proteins. 2007;68:813–823. doi: 10.1002/prot.21474.
  • 9. Grosdidier S, Fernandez-Recio J. Identification of hot-spot residues in protein-protein interactions by computational docking. BMC Bioinformatics. 2008;9:447. doi: 10.1186/1471-2105-9-447.
  • 10. Benedix A, Becker CM, de Groot BL, Caflisch A, Böckmann RA. Predicting free energy changes using structural ensembles. Nat Methods. 2009;6:3–4. doi: 10.1038/nmeth0109-3.
  • 11. Cho Ki, Kim D, Lee D. A feature-based approach to modeling protein-protein interaction hot spots. Nucleic Acids Res. 2009;37:2672–2687. doi: 10.1093/nar/gkp132.
  • 12. Lise S, Archambeau C, Pontil M, Jones DT. Prediction of hot spot residues at protein-protein interfaces by combining machine learning and energy-based methods. BMC Bioinformatics. 2009;10:365. doi: 10.1186/1471-2105-10-365.
  • 13. Krüger DM, Gohlke H. DrugScorePPI webserver: fast and accurate in silico alanine scanning for scoring protein-protein interactions. Nucleic Acids Res. 2010;38(Suppl):W480–W486. doi: 10.1093/nar/gkq471.
  • 14. Tuncbag N, Keskin O, Gursoy A. HotPoint: hot spot prediction server for protein interfaces. Nucleic Acids Res. 2010;38(Suppl):W402–W406. doi: 10.1093/nar/gkq323.
  • 15. Meireles LMC, Dömling AS, Camacho CJ. ANCHOR: a web server and database for analysis of protein-protein interaction binding pockets for drug discovery. Nucleic Acids Res. 2010;38(Suppl):W407–W411. doi: 10.1093/nar/gkq502.
  • 16. Xia JF, Zhao XM, Song J, Huang DS. APIS: accurate prediction of hot spots in protein interfaces by combining protrusion index with solvent accessibility. BMC Bioinformatics. 2010;11:174. doi: 10.1186/1471-2105-11-174.
  • 17. Wells JA, McClendon CL. Reaching for high-hanging fruit in drug discovery at protein-protein interfaces. Nature. 2007;450:1001–1009. doi: 10.1038/nature06526.
  • 18. González-Ruiz D, Gohlke H. Targeting protein-protein interactions with small molecules: Challenges and perspectives for computational binding epitope detection and ligand finding. Curr Med Chem. 2006;13:2607–2625. doi: 10.2174/092986706778201530.
  • 19. Kiel C, Serrano L, Herrmann C. A detailed thermodynamic analysis of ras/effector complex interfaces. J Mol Biol. 2004;340:1039–1058. doi: 10.1016/j.jmb.2004.05.050.
  • 20. Wang Y, Shen BJ, Sebald W. A mixed-charge pair in human interleukin 4 dominates high-affinity interaction with the receptor alpha chain. Proc Natl Acad Sci U S A. 1997;94:1657–1662. doi: 10.1073/pnas.94.5.1657.
  • 21. Zhang JL, Simeonowa I, Wang Y, Sebald W. The high-affinity interaction of human IL-4 and the receptor alpha chain is constituted by two independent binding clusters. J Mol Biol. 2002;315:399–407. doi: 10.1006/jmbi.2001.5243.
  • 22. Lise S, Jones DT. Predicting druggable binding sites at protein-protein interfaces by computational alanine scanning. In preparation.
  • 23. Bogan AA, Thorn KS. Anatomy of hot spots in protein interfaces. J Mol Biol. 1998;280:1–9. doi: 10.1006/jmbi.1998.1843.
  • 24. Cuff AL, Sillitoe I, Lewis T, Redfern OC, Garratt R, et al. The CATH classification revisited–architectures reviewed and new ways to characterize structural divergence in superfamilies. Nucleic Acids Res. 2009;37:D310–D314. doi: 10.1093/nar/gkn877.
  • 25. Hage T, Sebald W, Reinemer P. Crystal structure of the interleukin-4/receptor alpha chain complex reveals a mosaic binding interface. Cell. 1999;97:271–281. doi: 10.1016/s0092-8674(00)80736-9.
  • 26. Joachims T. Making large-scale SVM learning practical. In: Schölkopf B, Burges C, Smola AJ, editors. Advances in Kernel Methods - Support Vector Learning. MIT Press; 1999.
