Bioinformatics. 2013 Aug 19;29(21):2806–2807. doi: 10.1093/bioinformatics/btt483

HippDB: a database of readily targeted helical protein–protein interactions

Christina M Bergey 1, Andrew M Watkins 2, Paramjit S Arora 2,*
PMCID: PMC3799476  PMID: 23958730

Abstract

Summary: HippDB catalogs every protein–protein interaction whose structure is available in the Protein Data Bank and which exhibits one or more helices at the interface. The Web site accepts queries on variables such as helix length and sequence, and it provides computational alanine scanning and change in solvent-accessible surface area values for every interfacial residue. HippDB is intended to serve as a starting point for structure-based small molecule and peptidomimetic drug development.

Availability and implementation: HippDB is freely available on the web at http://www.nyu.edu/projects/arora/hippdb. The Web site is implemented in PHP, MySQL and Apache. Source code, implemented in Perl and supported on Linux, is freely available for download at http://code.google.com/p/helidb.

Contact: arora@nyu.edu

1 INTRODUCTION

Protein–protein interactions (PPIs) mediate fundamental signaling pathways and cellular processes. Although PPIs are highly promising pharmaceutical targets, they are not preferred targets in conventional drug development because of their extended flat interfaces. In particular, compound libraries for high-throughput screening that offer attractive lead compounds for enzymatic targets lack the topological and functional complexity necessary for PPI inhibition (Hajduk and Greer, 2007; Raj et al., 2013; Wells and McClendon, 2007). One successful method to inhibit PPIs is the mimicry of secondary structure motifs that contribute to complex formation (Azzarito et al., 2013; Boersma et al., 2012; Jochim and Arora, 2010; Moellering et al., 2009; Patgiri et al., 2011).

Often a subset of the residues at a protein–protein interface contributes most of the binding energy (Clackson and Wells, 1995). Because solubility and specificity are persistent problems in drug design, it is advantageous to identify and prioritize the most important residues, leaving less important positions free for fine-tuning (Bullock et al., 2011; Jochim and Arora, 2010).

Conventional computational methods to predict important residues include alanine scanning (Jochim and Arora, 2010; Kortemme and Baker, 2002; Kortemme et al., 2004) and analysis of the change in solvent-accessible surface area (ΔSASA) (Koes and Camacho, 2012). Alanine scanning estimates the change in binding free energy (ΔΔG) that results when a contact residue is mutated to alanine, while a ΔSASA value describes how much of the residue is buried from solvent on binding. We have previously developed a scoring strategy to rank protein interfaces by their promise for synthetic inhibition (Bullock et al., 2011; Jochim and Arora, 2010) and designed inhibitors of formerly ‘undruggable’ PPIs (Patgiri et al., 2011).
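The two quantities can be written compactly as follows. This is a sketch of the usual definitions, not a formula taken from the source: the sign convention (positive ΔΔG meaning that the alanine mutation weakens binding) is an assumption consistent with the hotspot threshold used later, and the subscripts are introduced here for illustration.

```latex
\begin{align}
  \Delta\Delta G_i &= \Delta G_{\mathrm{bind}}^{\,i \to \mathrm{Ala}} - \Delta G_{\mathrm{bind}}^{\,\mathrm{wt}} \\
  \Delta \mathrm{SASA}_i &= \mathrm{SASA}_i^{\,\mathrm{unbound}} - \mathrm{SASA}_i^{\,\mathrm{complex}}
\end{align}
```

Here i indexes an interfacial residue, so both values are reported per residue.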

To follow up on this work, we sought to derive a readily accessible resource for the chemical biology community. Research groups with potent small-molecule scaffolds might be interested in small interfaces with hotspot residues in two consecutive positions or in the i and i+4 positions, whereas those developing peptoid or beta-peptide foldamers might be more interested in long interfaces with high total ΔΔG. HippDB, a database of helical interfaces in PPIs, lists all the helical PPIs in the Protein Data Bank (PDB) and catalogs computational alanine scanning results (ΔΔG in Rosetta energy units) and ΔSASA for each interfacial residue. We expect this dataset will be a useful resource for PPI inhibition.
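To make the spacing criterion concrete, the sketch below checks whether a set of hotspot positions contains a consecutive pair or an i, i+4 pair. The function names and the example residue numbers are hypothetical and are not part of HippDB.

```python
from itertools import combinations


def hotspot_offsets(positions):
    """Return the sorted set of sequence separations between all hotspot pairs."""
    return sorted({abs(a - b) for a, b in combinations(positions, 2)})


def has_spacing(positions, offset):
    """True if any two hotspots are exactly `offset` residues apart."""
    return offset in hotspot_offsets(positions)


# Illustrative residue numbers only (an assumption, not database output).
example_positions = [19, 23, 26]
consecutive_pair = has_spacing(example_positions, 1)  # i, i+1
one_turn_pair = has_spacing(example_positions, 4)     # i, i+4
```

A search tuned for small-molecule scaffolds could keep helices where either flag is set, whereas a foldamer search would rank by total ΔΔG instead.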

Figure 1 depicts a typical workflow in HippDB. The user might first search for interface helices found in humans by constraining the organism name. Next, the user might trim the results for complexes with exactly three hotspot residues, then for helices <10 residues long, then for a ΔΔG average >2. By clicking on the PDB codes that result, the user can view any of the five complexes fitting these criteria in JMol, with their hotspot residues displayed in wireframe.

Fig. 1.

A typical HippDB query. The five resulting complexes are depicted with the qualifying chain in green and the partner chain in orange. 1YCR is the native p53/mdm2 complex; 3FDO and 3JZO are complexes of p53-like synthetic peptides with mdm4, and 3G03 and 3JZR are complexes of synthetic peptides with mdm2.
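The successive filters in this workflow can be sketched offline as below. The record type, its field names, the organism string and the sample rows are all hypothetical stand-ins, since the source does not describe the database schema; only the four filter criteria come from the text.

```python
from dataclasses import dataclass


@dataclass
class HelixRecord:
    # Hypothetical field names; the real HippDB schema may differ.
    pdb_code: str
    organism: str
    n_hotspots: int
    helix_length: int
    avg_ddg: float


def typical_query(records):
    """Apply the four filters from the workflow, one after another."""
    hits = [r for r in records if r.organism == "Homo sapiens"]
    hits = [r for r in hits if r.n_hotspots == 3]    # exactly three hotspots
    hits = [r for r in hits if r.helix_length < 10]  # helices under 10 residues
    hits = [r for r in hits if r.avg_ddg > 2.0]      # average ddG above 2
    return hits


# Made-up rows with placeholder codes, for illustration only.
sample = [
    HelixRecord("AAAA", "Homo sapiens", 3, 9, 2.4),
    HelixRecord("BBBB", "Homo sapiens", 3, 14, 2.6),
    HelixRecord("CCCC", "Mus musculus", 3, 8, 2.9),
    HelixRecord("DDDD", "Homo sapiens", 2, 7, 3.1),
]
matching_codes = [r.pdb_code for r in typical_query(sample)]
```

Applying the filters in sequence mirrors how a user narrows the result table step by step on the Web site.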

2 METHODS

Asymmetric-unit structures of multi-entity protein complexes were obtained from the PDB (Berman et al., 2000). We identified all interacting interface chains within each PDB file and created a new PDB file for each chain and each pair of interacting chains. If the original PDB file contained more than one model, only the lowest-scoring model was used (according to Rosetta’s ‘Relax’ protocol).

Each qualifying pair of chains was analyzed using the RosettaScripts AlaScan filter, averaging 100 runs (Baker and Sali, 2001; Fleishman et al., 2011). Following alanine scanning, we isolated all interface helices containing two or more hotspot residues (ΔΔG >1.0 Rosetta energy units, which scale approximately as 1 kcal/mol) and computed ΔSASA using NACCESS (Hubbard and Thornton, 1993). Interface helices were required to possess at least four consecutive residues, each assigned as helical by DSSP (Dictionary of Secondary Structure of Proteins), acquired from the Center for Molecular and Biomolecular Informatics Web site (Kabsch and Sander, 1983). For each interface helix, the database records parameters including average and total ΔΔG and ΔSASA; the percentage of the complex’s ΔΔG and ΔSASA contributed by the helix; the helix length, hotspot distance and sequence; and the organism of origin.
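The helix-selection criteria can be illustrated with a minimal sketch that operates on a per-residue DSSP string and a parallel list of ΔΔG values. It is not the authors' Perl pipeline. Treating only the DSSP code 'H' as helical is an assumption (the source does not say which codes were accepted), and the function names and input layout are hypothetical.

```python
from itertools import groupby

HOTSPOT_THRESHOLD = 1.0  # ddG cutoff in Rosetta energy units (from the text)
MIN_HELIX_LENGTH = 4     # minimum consecutive helical residues (from the text)
MIN_HOTSPOTS = 2         # minimum hotspots per helix (from the text)


def helical_segments(dssp, min_len=MIN_HELIX_LENGTH, helix_codes=("H",)):
    """Return (start, end) index pairs, end exclusive, for runs of helical codes."""
    segments = []
    index = 0
    for is_helix, run in groupby(dssp, key=lambda code: code in helix_codes):
        length = len(list(run))
        if is_helix and length >= min_len:
            segments.append((index, index + length))
        index += length
    return segments


def qualifying_helices(dssp, ddg):
    """Keep helical segments that contain at least MIN_HOTSPOTS hotspot residues."""
    result = []
    for start, end in helical_segments(dssp):
        hotspots = [i for i in range(start, end) if ddg[i] > HOTSPOT_THRESHOLD]
        if len(hotspots) >= MIN_HOTSPOTS:
            result.append((start, end, hotspots))
    return result


# Toy input: a 6-residue helix, a 3-residue helix (too short), a 4-residue helix.
toy_dssp = "--HHHHHH--HHH-HHHH"
toy_ddg = [0.0] * len(toy_dssp)
toy_ddg[3], toy_ddg[7], toy_ddg[15] = 1.5, 2.2, 1.2
toy_hits = qualifying_helices(toy_dssp, toy_ddg)
```

In the toy input, only the first helix passes: the second is shorter than four residues and the third has a single hotspot.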

The Web site interface uses original JavaScript for constructing queries, a standard AJAX protocol to execute the queries and a jQuery extension (DataTables) to format the query results (Table 1).

Table 1.

A selection of the fields found in HippDB^a

Field name                 Description
Average ΔΔG, helix         Average ΔΔG contributed by a residue in the helix
Percentage ΔΔG, helix      Percentage of the chain’s total ΔΔG due to the helix
Percentage ΔSASA, helix    Percentage of the chain’s total ΔSASA due to the helix
Helix sequence             Sequence of the interface helix
Hotspot IDs                List of hotspot residues with residue type and ΔΔG
MimeticScore               Sum of the top three hotspot ΔΔG values

^a HippDB includes standard search fields, such as the PDB code and organism, along with the specific fields listed above. The fields are searchable and sortable.
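As an illustration of how three of these fields relate to per-residue values, the sketch below derives them from a list of ΔΔG values for one helix. It rests on assumptions not stated in the source: the average is taken over all residues supplied, hotspots are residues with ΔΔG above 1.0, and the function and key names are hypothetical.

```python
def helix_summary(helix_ddg, chain_total_ddg, hotspot_threshold=1.0):
    """Derive helix-level fields from per-residue ddG values (illustrative only)."""
    total = sum(helix_ddg)
    # Hotspot values sorted from largest to smallest.
    hotspots = sorted((v for v in helix_ddg if v > hotspot_threshold), reverse=True)
    return {
        "average_ddg": total / len(helix_ddg),
        "percentage_ddg": 100.0 * total / chain_total_ddg,
        "mimetic_score": sum(hotspots[:3]),  # sum of the top three hotspot values
    }


# Made-up per-residue values for a six-residue helix.
summary = helix_summary([0.5, 1.5, 0.0, 2.0, 1.0, 3.0], chain_total_ddg=16.0)
```

A helix with fewer than three hotspots simply contributes the sum of those it has, which is one reasonable reading of the MimeticScore description.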

3 RESULTS

From 11 818 multiprotein entries in the PDB, 379 877 files of two protein chains were produced and subjected to alanine scanning. Of these interfaces, we found 7308 helices of four residues or longer with the two hotspots necessary to qualify for the database. A qualifying alpha helix is, on average, 13.2 residues long and contains 2.7 hotspots. The end-to-end distance separating these hotspots is, on average, 7.3 residues. On average, the three best hotspot residues sum to a ΔΔG of 3.9 Rosetta energy units, and the helix overall contributes 48% of the chain’s total ΔΔG and 37% of its ΔSASA.

The rational design of PPI inhibitors involves a systematic analysis of native interactions. By cataloging the results of this analysis for every known structure, and by describing second-order metrics to help prioritize design efforts, this database will eliminate often-duplicated effort and greatly accelerate the design process. In this way, HippDB complements existing resources such as PocketQuery and HotSprint: the former catalogs regions of high solvent burial, and the latter highlights evolutionary conservation to evaluate the functional or structural role of individual hotspots (Camacho and Koes, 2012; Guney et al., 2008).

ACKNOWLEDGEMENTS

The authors thank Andrea Jochim for her work on the project that inspired HippDB.

Funding: This work was supported by the National Institutes of Health [R01 GM073943]. A.M.W. thanks the NYU Chemistry Department for a Kramer Fellowship. C.M.B. thanks the National Science Foundation for a Graduate Research Fellowship.

Conflict of Interest: none declared.

REFERENCES

  1. Azzarito V, et al. Inhibition of α-helix-mediated protein-protein interactions using designed molecules. Nat. Chem. 2013;5:161–173. doi: 10.1038/nchem.1568.
  2. Baker D, Sali A. Protein structure prediction and structural genomics. Science. 2001;294:93–96. doi: 10.1126/science.1065659.
  3. Berman HM, et al. The Protein Data Bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
  4. Boersma MD, et al. Evaluation of diverse alpha/beta-backbone patterns for functional alpha-helix mimicry: analogues of the Bim BH3 domain. J. Am. Chem. Soc. 2012;134:315–323. doi: 10.1021/ja207148m.
  5. Bullock BN, et al. Assessing helical protein interfaces for inhibitor design. J. Am. Chem. Soc. 2011;133:14220–14223. doi: 10.1021/ja206074j.
  6. Camacho CJ, Koes DR. PocketQuery: protein–protein interaction inhibitor starting points from protein–protein interaction structure. Nucleic Acids Res. 2012;40:W387–W392. doi: 10.1093/nar/gks336.
  7. Clackson T, Wells JA. A hot-spot of binding-energy in a hormone-receptor interface. Science. 1995;267:383–386. doi: 10.1126/science.7529940.
  8. Fleishman SJ, et al. RosettaScripts: a scripting language interface to the Rosetta macromolecular modeling suite. PLoS One. 2011;6:e20161. doi: 10.1371/journal.pone.0020161.
  9. Guney E, et al. HotSprint: database of computational hot spots in protein interfaces. Nucleic Acids Res. 2008;36:D662–D666. doi: 10.1093/nar/gkm813.
  10. Hajduk PJ, Greer J. A decade of fragment-based drug design: strategic advances and lessons learned. Nat. Rev. Drug Discov. 2007;6:211–219. doi: 10.1038/nrd2220.
  11. Hubbard SJ, Thornton JM. NACCESS. Department of Biochemistry and Molecular Biology. London: University College; 1993.
  12. Jochim AL, Arora PS. Systematic analysis of helical protein interfaces reveals targets for synthetic inhibitors. ACS Chem. Biol. 2010;5:919–923. doi: 10.1021/cb1001747.
  13. Kabsch W, Sander C. Dictionary of protein secondary structure. Biopolymers. 1983;22:2577–2637. doi: 10.1002/bip.360221211.
  14. Koes DR, Camacho CJ. Small-molecule inhibitor starting points learned from protein-protein interaction inhibitor structure. Bioinformatics. 2012;28:784–791. doi: 10.1093/bioinformatics/btr717.
  15. Kortemme T, Baker D. A simple physical model for binding energy hot spots in protein-protein complexes. Proc. Natl Acad. Sci. USA. 2002;99:14116–14121. doi: 10.1073/pnas.202485799.
  16. Kortemme T, et al. Computational alanine scanning of protein-protein interfaces. Sci. STKE. 2004;2004:pl2. doi: 10.1126/stke.2192004pl2.
  17. Moellering RE, et al. Direct inhibition of the NOTCH transcription factor complex. Nature. 2009;462:182–188. doi: 10.1038/nature08543.
  18. Patgiri A, et al. An orthosteric inhibitor of the Ras-Sos interaction. Nat. Chem. Biol. 2011;7:585–587. doi: 10.1038/nchembio.612.
  19. Raj M, et al. Plucking the high hanging fruit: a systematic approach for targeting protein-protein interactions. Bioorg. Med. Chem. 2013;21:4051–4057. doi: 10.1016/j.bmc.2012.11.023.
  20. Wells JA, McClendon CL. Reaching for high-hanging fruit in drug discovery at protein-protein interfaces. Nature. 2007;450:1001–1009. doi: 10.1038/nature06526.

Articles from Bioinformatics are provided here courtesy of Oxford University Press
