Evaluating template-based and template-free protein–protein complex structure prediction

Thom Vreven; Howook Hwang; Brian G Pierce; Zhiping Weng

Brief Bioinform. 2013 Jul 1;15(2):169–176. doi: 10.1093/bib/bbt047

PMCID: PMC3956070 PMID: 23818491

Abstract

We compared the performance of template-free (docking) and template-based methods for the prediction of protein–protein complex structures. We found similar performance for a template-based method based on threading (COTH) and another template-based method based on structural alignment (PRISM). The template-based methods showed similar performance to a docking method (ZDOCK) when the latter was allowed one prediction for each complex, but when the same number of predictions was allowed for each method, the docking approach outperformed template-based approaches. We identified strengths and weaknesses in each method. Template-based approaches were better able to handle complexes that involved conformational changes upon binding. Furthermore, the threading-based and docking methods were better than the structural-alignment-based method for enzyme–inhibitor complex prediction. Finally, we show that the near-native (correct) predictions were generally not shared by the various approaches, suggesting that integrating their results could be the superior strategy.

Keywords: protein–protein structure, template-based prediction, protein–protein docking, ZDOCK, PRISM, COTH

INTRODUCTION

The interaction between pairs of proteins is critical in many biological processes, including enzyme inhibition, signaling pathways and the immune response. Although experiments have shown that most proteins interact with at least one other protein, the determination of atomic resolution structures of protein–protein complexes is laborious and not always successful. As an alternative to experimental approaches, computational algorithms have been developed to predict the bound structures of protein–protein complexes. These computational approaches can be divided into two main classes of algorithms—template-free or docking [1–12] and template-based [13–28]. The various approaches for predicting protein–protein complex structures were recently reviewed by Tuncbag et al. [29]. The docking approaches start with the unbound structures of the component proteins, which are typically obtained using x-ray crystallography or nuclear magnetic resonance (NMR), but can also be built using homology modeling. The translational and rotational space is then searched for favorable binding orientations. Searching the 6-dimensional space is computationally expensive, and often carried out with rapidly computable scoring functions and efficient grid-based search algorithms such as fast Fourier transform (FFT) [1, 4–6] or geometric hashing [3]. In contrast with docking approaches, which are based on the physical properties of the component proteins for the prediction, template-based algorithms use similarities with known complex structures for generating the prediction. A wide spectrum of template-based methods has been introduced in recent years, often with components adopted from monomer protein structure prediction approaches. The methods differ in the way similarity is defined, which can be based predominantly on sequence identity [19, 20], sequence-structure ‘threading’ [13, 26] or structural alignments, the last often for the interfacial regions only [22, 25, 27, 28].

Each approach has its strengths and weaknesses, with template-based approaches critically depending on template availability, and docking approaches being sensitive to conformational changes upon binding. In this work, we examined the strengths and weaknesses of the respective approaches, and identified aspects that affect the likelihood of success. Specifically, we compared the ZDOCK [7, 9, 30] algorithm for protein–protein docking that was developed in our lab with two template-based algorithms: COTH by Mukherjee and Zhang [26] and PRISM by Gursoy, Keskin and co-workers [25, 27]. COTH and PRISM represent two main approaches in template-based complex structure prediction, with COTH requiring sequences only and using threading to build the predictions, and PRISM relying on structural alignments of the interface regions to select the templates.

Finally, we assess the availability of templates in the Protein Data Bank (PDB) [31] for protein–protein complexes. The relationship between sequence identity and binding modes was addressed earlier by Aloy et al. [32]. Similar work on template availability, but with a focus on the structural space of the PDB, has recently been published by others [33, 34]. We developed a protocol (ZTEM) that uses sequence alignment and structural alignment to determine templates that match native complexes. ZTEM and COTH have in common that the binding partners are aligned (or threaded) globally, whereas PRISM performs local alignments.

METHODS

Dataset

For testing the algorithms, we used the complexes from a protein–protein docking benchmark developed earlier by our lab [35]. The latest version of the benchmark contains the bound and unbound structures of 176 protein–protein complexes, and is nonredundant at the SCOP family level [36]. We classify the complexes based on biochemical function and docking difficulty. According to biochemical function, we have 52 enzyme–inhibitor complexes, 25 antibody–antigen complexes and 99 other complexes (referred to as the ‘others’ category). Judged by docking difficulty, 121 are rigid-body, 30 are of medium difficulty and 25 are difficult.

Template-based methods are not suitable for antibody–antigen complexes. Multiple antibodies, which differ only in their complementarity-determining loops, can recognize a variety of epitopes on an antigen; thus, template-based approaches would result in false positives. Therefore, we excluded the antibody–antigen complexes from the analyses in this work.

COTH

COTH takes sequences as input and generates predictions for the complex structures based on threading a sequence onto a template structure [26]. COTH follows a two-stage procedure. In the first step, the sequences of both component proteins are threaded using a library of nonredundant complex templates. This yields a selection, typically 10, of templates that describe potential binding modes. In the second step, the sequences of the monomers are threaded separately, using a library of monomer templates. This yields a prediction for each monomer, which is then superposed onto the complex templates. To generate the COTH predictions, we used the Web server described in [26].

PRISM

PRISM takes structures of the unbound component proteins as input, and performs structural alignments of the surfaces of the monomers with a library of binding-interface templates. The library is constructed from the PDB complex structures and is nonredundant. To improve predictive accuracy, the alignment results are subjected to various filters, such as a threshold for the alignment root mean square deviation (RMSD), a minimum number of matching residues and residue pairs between the template and the predicted structures, clashing thresholds and a matching template hotspot residue in the prediction. After the alignment and filtering, the predictions are refined and scored using FiberDock [37]. To generate the PRISM results, we used the collection of scripts provided by the authors [25], in combination with the required external programs [15, 37–39]. We used the larger of the two available template libraries, which contains 7922 interfaces.

ZDOCK

For the docking algorithm, we used ZDOCK 3.0 [30], which was developed in our lab and includes the IFACE statistical pair potential [30]. ZDOCK is a grid-based rigid-body approach that uses FFT, and samples the three Euler angles with 6° or 15° spacing and the three translational degrees of freedom with a 1.2 Å spacing. For each set of rotational angles, only the best scoring translation is retained, which results in 3600 or 54 000 predictions for 15° or 6° rotational sampling, respectively. The predictions are subsequently ranked according to the ZDOCK scoring function. In the current work, we used 6° sampling, and we varied the total number of predictions considered per test case to match those of template-based methods, as described in the text below.
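The retention and ranking step described above can be illustrated with a minimal Python sketch. It is not ZDOCK's implementation: the function names, the `toy_score` function and the small rotation and translation lists are hypothetical placeholders, and the exhaustive loop stands in for the FFT-accelerated translational scan. The sketch only shows that keeping the best-scoring translation per rotation yields one prediction per rotation, after which the predictions are ranked by score.

```python
from typing import Callable, List, Sequence, Tuple

Rotation = Tuple[float, float, float]      # three Euler angles (degrees)
Translation = Tuple[float, float, float]   # grid offsets (Å)


def best_pose_per_rotation(
    rotations: Sequence[Rotation],
    translations: Sequence[Translation],
    score: Callable[[Rotation, Translation], float],
) -> List[Tuple[float, Rotation, Translation]]:
    """Keep the best-scoring translation for each rotation, then rank."""
    poses = []
    for rot in rotations:
        # Exhaustive scan; a stand-in for the FFT-based translational search.
        best_t = max(translations, key=lambda t: score(rot, t))
        poses.append((score(rot, best_t), rot, best_t))
    # Higher score is better in this sketch, so sort in descending order.
    poses.sort(key=lambda p: p[0], reverse=True)
    return poses


def toy_score(rot: Rotation, t: Translation) -> float:
    """Hypothetical score that peaks at rotation (30, 0, 0), translation (1.2, 0, 0)."""
    rot_penalty = sum((a - b) ** 2 for a, b in zip(rot, (30.0, 0.0, 0.0)))
    trans_penalty = sum((a - b) ** 2 for a, b in zip(t, (1.2, 0.0, 0.0)))
    return -(rot_penalty + trans_penalty)


rotations = [(0.0, 0.0, 0.0), (15.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
translations = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (2.4, 0.0, 0.0)]
ranked = best_pose_per_rotation(rotations, translations, toy_score)
top_n = ranked[:2]  # consider only the top N predictions for a test case
```

Truncating the ranked list (`top_n`) corresponds to varying the number of predictions considered per test case.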

ZTEM

We developed the ZTEM (Zlab TEMplates) protocol to investigate the availability of templates in the PDB for protein–protein complexes. We applied ZTEM to the 151 enzyme–inhibitor and ‘other’ cases from the Benchmark. A BLAST [40] sequence alignment was used to search for matches to the sequences of protein structures in the PDB (downloaded Oct 4, 2012). Complex templates from the PDB were retained if they showed a sequence alignment within the BLAST significance threshold (E-value ≤ 10.0) for each chain of the query. Finally, the FAST structural alignment program [41] was used for superposing the query proteins onto the complex templates. Note that ZTEM only performs sequence alignments to find candidate templates. Recently, it was shown that complexes can possess structural similarity without sequence similarity [32, 33], and a structural alignment approach would result in more candidate templates. However, because the purpose of ZTEM is to provide a baseline using the simplest way of finding templates, we only considered sequence alignments in this work.
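The sequence-based selection of candidate templates can be sketched as follows. This is a simplified illustration, not the ZTEM code: it assumes the BLAST output has already been parsed into records, and the `BlastHit` type, its field names and the template identifiers are hypothetical. Only the E-value threshold is taken from the text, and the structural superposition with FAST is not covered.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set

E_VALUE_MAX = 10.0  # BLAST significance threshold quoted in the text


class BlastHit(NamedTuple):
    query_chain: str   # chain of the query complex (hypothetical label)
    template_id: str   # PDB entry containing the matched chain
    e_value: float


def candidate_templates(hits: Iterable[BlastHit], query_chains: Set[str]) -> Set[str]:
    """Return template entries with a significant hit for every query chain."""
    matched: Dict[str, Set[str]] = defaultdict(set)
    for hit in hits:
        if hit.e_value <= E_VALUE_MAX:
            matched[hit.template_id].add(hit.query_chain)
    # A template is retained only if all query chains are covered.
    return {tid for tid, chains in matched.items() if chains >= query_chains}


hits = [
    BlastHit("receptor", "tmplA", 1e-30),
    BlastHit("ligand", "tmplA", 0.5),
    BlastHit("receptor", "tmplB", 1e-10),   # no ligand match: rejected
    BlastHit("ligand", "tmplC", 50.0),      # above the E-value threshold
]
templates = candidate_templates(hits, {"receptor", "ligand"})
```

The sketch does not check that the two query chains match different chains of the template, which a real protocol would need to handle.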

Scoring

We note that each of the methods that were compared uses its own scoring function, and the quality of the function can affect the performance of the method. Unfortunately, it is not straightforward to remove the scoring component of the method, or to use a single scoring function for all approaches. However, because the scoring functions presumably optimize the performance of the respective methods, we have assumed that the scoring function is an integral part of each method and that we could compare the overall performance of the methods.

RESULTS AND DISCUSSION

COTH

We could not test COTH using all the cases from our Benchmark for two reasons. First, COTH can only predict complexes formed by single-chain monomers. Second, we wanted to exclude predictions with both monomers having sequence identities >95% with the complex template (a template was allowed only if at least one of the monomers had <95% sequence identity to the target). The latter poses a problem, as the COTH Web server makes a fixed set of 10 predictions for each case, and does not allow sequence identity cutoffs between the input monomers and the templates to be specified. As a result, most cases yielded <10 COTH predictions that were considered valid (<95% sequence identity for at least one monomer) in our analysis. To avoid bias against cases that have few valid predictions, we only retained the Benchmark cases that had eight or more valid predictions, and for each of these cases included only the top eight valid predictions, ensuring that all retained Benchmark cases had the same number of predictions. Applying these filters, we retained 111 test cases (Table 1), of which 42 and 69 were of the enzyme–inhibitor and ‘other’ complex types, respectively. Seventy cases were rigid-body, and 23 and 18 cases were of the medium and difficult categories, respectively. Although this filter excluded a relatively large number of rigid-body cases and complexes of the ‘other’ type, the remaining numbers are large enough to consider the test set well balanced.
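The case selection described above can be expressed as a short Python sketch. The `Prediction` record, its field names and the toy cases are hypothetical, and treating an identity of exactly 95% as valid is an assumption, because the text only specifies the >95% and <95% sides of the cutoff.

```python
from typing import Dict, List, NamedTuple

IDENTITY_CUTOFF = 95.0  # percent sequence identity to the complex template
N_REQUIRED = 8          # valid predictions kept per retained case


class Prediction(NamedTuple):
    rank: int            # rank assigned by the server (1 = best)
    identity_a: float    # % identity of monomer A to the template
    identity_b: float    # % identity of monomer B to the template


def is_valid(p: Prediction) -> bool:
    """A prediction is invalid only when both monomers exceed the cutoff."""
    return not (p.identity_a > IDENTITY_CUTOFF and p.identity_b > IDENTITY_CUTOFF)


def retain_cases(cases: Dict[str, List[Prediction]]) -> Dict[str, List[Prediction]]:
    """Keep cases with at least N_REQUIRED valid predictions, truncated to the top N_REQUIRED."""
    retained = {}
    for case_id, preds in cases.items():
        valid = sorted((p for p in preds if is_valid(p)), key=lambda p: p.rank)
        if len(valid) >= N_REQUIRED:
            retained[case_id] = valid[:N_REQUIRED]
    return retained


# Toy input: ten ranked predictions per case.
cases = {
    "caseA": [Prediction(r, 99.0 if r == 1 else 40.0, 98.0) for r in range(1, 11)],
    "caseB": [Prediction(r, 99.0 if r <= 3 else 40.0, 98.0) for r in range(1, 11)],
}
retained = retain_cases(cases)
```

In this toy input, `caseA` loses one prediction and is retained with ranks 2 to 9, whereas `caseB` has only seven valid predictions and is dropped.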

Table 1:

Summary of the number of hits found for COTH and ZDOCK (using 5 Å IRMSD cutoff for hits)

Complex type	All cases	COTH cases with hits	ZDOCK(1)^a cases with hits	ZDOCK(8)^a cases with hits
Enzyme–inhibitor	42	13	13	18
‘Other’	69	6	5	14
Rigid-body	70	14	15	25
Medium difficulty	23	3	3	6
Difficult	18	2	0	1
Total cases	111	19	18	32
Cases with hits shared with COTH hits			4	7

^aNumber of ZDOCK predictions considered for each case in parentheses.

We define a hit as a prediction with interface root mean square deviation (IRMSD) of ≤5 Å. When assessing docking approaches, we typically use a 2.5 Å cutoff, but template-based prediction has no or limited sampling in the conformational space and a looser cutoff is appropriate (for a discussion of complex prediction metrics see [42]). Furthermore, COTH predictions may not contain all the residues specified in the input, and therefore we require a hit to have at least 50% of the native interface residues present in the structure of each binding partner. Of the 111 test cases, 19 cases had at least one hit. Thirteen were enzyme–inhibitor, and six of the ‘other’ type. Thus, COTH has a much higher success rate (percentage of cases with hits; 31%) for enzyme–inhibitor complexes than for the ‘other’ category (9%), suggesting this is a particular strength of COTH.
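The hit definition above amounts to a simple predicate, sketched here in Python. The `Scored` record and its field names are hypothetical, and the IRMSD and interface coverage values are assumed to have been computed beforehand; only the two thresholds come from the text.

```python
from typing import Iterable, NamedTuple

IRMSD_CUTOFF = 5.0    # Å, cutoff used for hits in this work
MIN_COVERAGE = 0.5    # fraction of native interface residues required per partner


class Scored(NamedTuple):
    irmsd: float        # interface RMSD to the native complex (Å)
    coverage_a: float   # fraction of native interface residues present, partner A
    coverage_b: float   # same quantity for partner B


def is_hit(p: Scored) -> bool:
    """Apply the IRMSD cutoff and the per-partner interface coverage requirement."""
    return (p.irmsd <= IRMSD_CUTOFF
            and p.coverage_a >= MIN_COVERAGE
            and p.coverage_b >= MIN_COVERAGE)


def case_has_hit(predictions: Iterable[Scored]) -> bool:
    """A case counts as successful if any of its predictions is a hit."""
    return any(is_hit(p) for p in predictions)


predictions = [
    Scored(4.8, 0.9, 0.4),   # low IRMSD, but partner B covers too little interface
    Scored(6.2, 1.0, 1.0),   # full coverage, but IRMSD above the cutoff
    Scored(3.1, 0.8, 0.6),   # satisfies both criteria
]
flags = [is_hit(p) for p in predictions]
```

The coverage requirement matters mainly for threading-based models, which may omit residues of the input sequence.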

When we consider the highest ranked hit for each case, we see that for 13 cases a hit was within the top 3 (out of 8) ranked predictions, and that for nearly half (9 out of 19) of the test cases one of the monomers had a >95% sequence identity with the complex template that was used for the prediction. These results show that, despite the COTH approach being based on threading, sequence identity is an important factor in the prediction of near-native complex structure, and that COTH’s ranking is able to distinguish true positives from false positives. Based on these findings, it seems that considering more predictions per test case would only moderately improve COTH’s performance.

Comparison of COTH with ZDOCK

For comparing COTH with ZDOCK, we used the 111 Benchmark cases retained in the COTH analysis. The top ranked ZDOCK prediction was a hit for 18 cases (again using the 5 Å IRMSD cutoff for hit definition); thus, the overall performance of COTH with eight predictions per test case (19 cases with hits) is similar to ZDOCK performance with one prediction per case. Of the Benchmark cases for which ZDOCK found hits, 15 are of the rigid category, and 3 of the medium difficulty category (for COTH, we found 14 rigid, 3 medium and 2 difficult). Furthermore, 13 of the ZDOCK hits are enzyme–inhibitor cases, and 5 are of the ‘other’ type (for COTH, we found hits for 13 enzyme–inhibitor cases and 6 of the ‘other’ type). Thus, COTH with eight predictions per test case has similar performance to ZDOCK with the top ranked prediction considered; moreover, the two methods show similar patterns regarding the complex type and expected docking difficulty. The most notable difference is that ZDOCK produced no hits for cases of the difficult category while COTH predicted hits for two difficult cases. This agrees with the observation that rigid-body docking algorithms generally do not perform well when there are large conformational changes on forming the complex, whereas conformational changes should have a smaller impact on template-based approaches. Of the 19 COTH and 18 ZDOCK cases with hits, only 4 are shared. This suggests that ‘pooling’ the predictions of COTH and ZDOCK could yield a higher hit-to-prediction ratio than either of the approaches alone. When we allowed ZDOCK to make the same number of predictions as COTH for each test case (eight), we obtained 32 cases with hits, representing a success rate >50% higher than that of COTH. The number of cases with hits shared between COTH and ZDOCK increases from 4 to 7, but is still small compared with the 19 cases where COTH has hits, indicating that pooling approaches can still be beneficial when larger numbers of ZDOCK predictions are considered.
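The potential benefit of pooling follows from the counts in Table 1 by inclusion–exclusion, as the short Python calculation below shows. The pooled totals are derived here and are not reported in the text; they assume that all predictions of both methods are kept for each case, so the pooled sets contain more predictions per case than either method alone.

```python
def pooled_cases(hits_a: int, hits_b: int, shared: int) -> int:
    """Number of cases with a hit from at least one of two methods."""
    return hits_a + hits_b - shared


TOTAL_CASES = 111   # Benchmark cases retained in the COTH analysis
COTH_HITS = 19      # COTH cases with hits (eight predictions per case)

# label -> (ZDOCK cases with hits, cases with hits shared with COTH), from Table 1
zdock = {"ZDOCK(1)": (18, 4), "ZDOCK(8)": (32, 7)}

for label, (hits, shared) in zdock.items():
    pooled = pooled_cases(COTH_HITS, hits, shared)
    pct = 100 * pooled / TOTAL_CASES
    print(f"COTH + {label}: {pooled} of {TOTAL_CASES} cases ({pct:.0f}%)")
```

Under these assumptions, pooling gives 33 cases with hits for COTH plus the top ZDOCK prediction, and 44 cases for COTH plus eight ZDOCK predictions.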

PRISM

In contrast with COTH, PRISM is based on structural alignment and is not limited to single-chain monomers. Therefore, we used all enzyme–inhibitor and ‘other’ test cases that contain single and multi-chain component proteins for the analysis. As suggested in [27], we relaxed some of the filters used in PRISM to increase the number of hits. First, we lifted the requirement that at least one predicted hotspot from the template has an equivalent residue in the prediction. Second, we reduced the number of residues that are required to match between the template and prediction to 12 (default is 15). Note that these changes are similar but not identical to those suggested in [27] (despite discussion with the authors of PRISM, we were not able to determine the relaxed filter settings to achieve the reported performance).
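The two relaxed settings can be pictured as a small configuration change, sketched below in Python. This is not the PRISM code or its configuration format: the `FilterSettings` and `Alignment` types and the `passes` function are hypothetical, and the RMSD and clash filters are omitted because their thresholds are not given in the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FilterSettings:
    min_matched_residues: int = 15      # default quoted in the text
    require_hotspot_match: bool = True  # default: a template hotspot must be matched


@dataclass(frozen=True)
class Alignment:
    matched_residues: int   # residues matched between template and prediction
    hotspot_matched: bool   # whether any predicted template hotspot has an equivalent


def passes(a: Alignment, s: FilterSettings) -> bool:
    """Apply the two filters that were relaxed in this work."""
    if a.matched_residues < s.min_matched_residues:
        return False
    if s.require_hotspot_match and not a.hotspot_matched:
        return False
    return True


DEFAULT = FilterSettings()
RELAXED = replace(DEFAULT, min_matched_residues=12, require_hotspot_match=False)

candidate = Alignment(matched_residues=13, hotspot_matched=False)
print(passes(candidate, DEFAULT), passes(candidate, RELAXED))
```

An alignment with 13 matched residues and no hotspot match is rejected under the default settings but accepted under the relaxed ones, which is how relaxing the filters increases the number of predictions.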

After removing the predictions for which both monomers have a sequence identity >95% with the complex template, we retained an average of 33 predictions for each case. There is, however, a large variation; e.g. PRISM produced 600 predictions for 1N2C. Using the 5 Å IRMSD cutoff, PRISM yielded hits for 26 cases, with 11 enzyme–inhibitor, and 15 of the ‘other’ type (Table 2). Separated by expected docking difficulty, there were 21 rigid-body cases and 5 with medium or high difficulty. Of the Benchmark cases with hits, nearly half (11 out of 26) only had hits with a sequence identity >95% for one of the monomers. This shows that, despite PRISM being based on structural alignment, the sequence identity often is a determining factor for the identification of near-native complex structures.

Table 2:

Summary of the number of hits found for PRISM and ZDOCK (using 5 Å IRMSD cutoff for hits)

Complex type	All cases	PRISM cases with hits	ZDOCK(1)^a cases with hits	ZDOCK(33)^a cases with hits
Enzyme–inhibitor	52	11	14	25
‘Other’	99	15	11	32
Rigid-body	99	21	22	44
Medium difficulty	29	3	3	9
Difficult	23	2	0	4
Total	151	26	25	57
Cases with hits shared with PRISM hits			8	15

^aNumber of ZDOCK predictions considered for each case in parentheses.

We note that the template library used in PRISM is designed based on the structure space of binding interfaces. When we exclude predictions based on sequence identity (both monomers having a sequence identity >95% with the complex template), we may effectively reduce the structure space of the template library that is relevant for the solution. This is illustrated by the following findings. Based on the 95% sequence identity cutoff, we excluded at least one prediction for 63 Benchmark cases. PRISM identified hits for 10% of these 63 cases. For the remaining 88 cases, for which we did not need to exclude any predictions, PRISM reported hits for 23% of the cases. This difference results from the template library being constructed to be nonredundant in structure space, so that a template removed by the sequence identity filter may have no structurally similar replacement in the library.

Comparison of PRISM with ZDOCK and COTH

Among the set of Benchmark cases used for PRISM, the top ranked ZDOCK prediction was a hit for 25 cases, compared with 26 cases with hits obtained by PRISM. The difference in success rates for the enzyme–inhibitor and ‘other’ complex types is smaller for PRISM (21%−15% = 6%) than for ZDOCK (27%−11% = 16%). Thus, the performance of PRISM depends less on the type of complex than that of ZDOCK. Because ZDOCK and COTH showed similar success rates for the different complex types, PRISM’s performance also depends less on the complex type than that of COTH. Of the PRISM and ZDOCK cases with hits, only eight are shared, which again suggests that pooling the predictions from the two methods may be a successful way to obtain the optimum hit-to-predictions ratio. When we allow ZDOCK to make 33 predictions for each case, the same number that PRISM makes on average, the number of cases with hits more than doubles (from 25 to 57). However, the PRISM program returns many predictions that are identical, and 33 predictions is an upper limit we used for this comparison. The number of hits that are shared by PRISM and ZDOCK with 33 predictions is still moderate (15 cases), and a pooling approach is still promising.

PRISM and COTH showed similar overall performance, and both found hits for cases in the ‘difficult’ docking category, whereas the top ranked ZDOCK prediction did not (Tables 1 and 2). This is as expected, as template-based methods are less sensitive to conformational changes than rigid-body docking approaches.

ZTEM for determining template availability

To provide a baseline of template-based prediction methods using sequence alignment, we investigated the availability of templates in the PDB. As with the other template-based methods, we excluded templates that had >95% sequence identity with both monomers from the Benchmark case (here we calculated the identity for the chains separately, and used the largest value when a monomer had multiple chains).

The results obtained with ZTEM are summarized in Table 3. For 53 cases, we found at least one template onto which the monomers could be superimposed to produce a near-native complex structure. This represents 35% of the Benchmark cases considered, which is considerably higher than the roughly 20% found with PRISM and COTH. This suggests that template-based methods may be improved further. Because the template search is entirely based on sequence identity, it again shows that sequence identity is a determining factor for the chance of a template resulting in a hit. Such information can be used to develop a confidence measure for template-based docking.

Table 3:

Template availability using ZTEM (using 5 Å IRMSD cutoff for defining near-native structures)

Complex type	All cases	Number of cases with near-native templates
Enzyme–inhibitor	52	24
‘Other’	99	29
Rigid-body	99	39
Medium difficulty	29	10
Difficult	23	4
Total	151	53

When we considered the complex types, we observed a large difference in potential to predict enzyme–inhibitor complexes and complexes of the ‘other’ type (46%−29% = 17%). This is comparable with COTH and ZDOCK. Thus, only PRISM’s performance depends less on the complex type, possibly because, of the approaches we considered, it is the one that relies the most on structural alignment. As with COTH and PRISM, the ZTEM results suggest that medium and difficult Benchmark cases may be better suited for template-based approaches than rigid-body docking. Eventually protein flexibility analysis may be incorporated to assess the confidence level of the various approaches [43].
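The dependence on complex type discussed for the different methods can be recomputed from Tables 1 to 3 with a few lines of Python. The counts are copied from the tables; the dictionary layout and function names are only illustrative, and the gap is taken between rounded percentages, as in the text.

```python
from typing import Dict, Tuple

# method -> (enzyme–inhibitor hits, enzyme–inhibitor cases, 'other' hits, 'other' cases)
COUNTS: Dict[str, Tuple[int, int, int, int]] = {
    "COTH (Table 1)": (13, 42, 6, 69),
    "PRISM (Table 2)": (11, 52, 15, 99),
    "ZDOCK(1) (Table 2)": (14, 52, 11, 99),
    "ZTEM (Table 3)": (24, 52, 29, 99),
}


def rate(hits: int, cases: int) -> int:
    """Success rate as a rounded percentage of cases with hits."""
    return round(100 * hits / cases)


def type_gap(ei_hits: int, ei_cases: int, other_hits: int, other_cases: int) -> int:
    """Difference between the enzyme–inhibitor and 'other' success rates."""
    return rate(ei_hits, ei_cases) - rate(other_hits, other_cases)


gaps = {method: type_gap(*counts) for method, counts in COUNTS.items()}
for method, gap in gaps.items():
    ei_hits, ei_cases, other_hits, other_cases = COUNTS[method]
    print(f"{method}: {rate(ei_hits, ei_cases)}% vs "
          f"{rate(other_hits, other_cases)}% (gap {gap} points)")
```

The resulting gaps are 22 points for COTH, 6 for PRISM, 16 for ZDOCK with one prediction and 17 for ZTEM, which makes the smaller type dependence of PRISM explicit.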

CONCLUSIONS

The performance we obtained for the template-based method based on threading is comparable with that of the template-based method based on structural alignment. In addition, the template-based methods showed similar performance to the docking method when the latter was allowed a single prediction for each complex, but when the same number of predictions was allowed for each method, the docking approach outperformed template-based approaches. With the test cases separated by expected docking difficulty, template-based approaches were better able to handle complexes that involved conformational changes upon binding. When we separated the test cases by complex type, we observed that threading and docking approaches were somewhat better than structural-alignment-based prediction for enzyme–inhibitor complexes, whereas the reverse held for the other test cases.

Most importantly, the set of correct predictions from one method only moderately overlapped with the set of correct predictions of another method. This suggests that integrating their results could be the superior strategy for obtaining useful predictions in practical situations. For such an approach to be successful, it would be essential to develop scoring or other confidence metrics that can compare complex structure predictions from different sources.

Finally, we want to stress that in the current work, we used the methods in their standard form, even though performance improvements could possibly be gained by additional computation. For example, rigid-body docking results are often re-ranked or refined using more accurate but slower-to-compute algorithms. In previous work, we developed the ZRANK and IRAD functions for re-ranking initial-stage docking predictions, which increased the chance of finding a near-native structure [44, 45]. Structural refinement generally leads to more accurate predictions [37, 46], and improving the performance of docking approaches is a continuing effort in our lab [43, 47]. For template-based methods, improvements can be achieved via algorithmic development or by extending the library of templates. For example, the PRISM interface dataset was constructed in 2006, and since then the number of entries in the PDB has more than doubled. Although the increase of the template dataset is likely <2-fold owing to structural redundancy, the addition of any templates should improve the performance.

Key points.

Success rates of template-based and template-free methods for protein–protein complex structure prediction are similar.
Correct predictions are often not shared between the two types of approaches; thus, their results are complementary.
Each method has its strengths and weaknesses.

ACKNOWLEDGEMENTS

We thank Srayanta Mukherjee (University of Kansas Medical Center), Nurcan Tuncbag (M.I.T.) and Guray Kuzu (Koc University, Istanbul) for helpful discussions.

Biographies

Thom Vreven is a research assistant professor in Zhiping Weng’s lab, and works on the analysis and prediction of protein–protein interactions.

Howook Hwang studied protein–protein interactions as a post-doc in Zhiping Weng’s lab, and recently joined Barry Honig’s lab at Columbia University.

Brian G. Pierce is a research assistant professor in Zhiping Weng’s lab, and works on the analysis, prediction and design of protein–protein interactions.

Zhiping Weng is Director and Professor of Program in Bioinformatics and Integrative Biology at the University of Massachusetts Medical School. Her lab develops and applies computational methods to study protein docking, gene regulation and small silencing RNAs.

FUNDING

National Institutes of Health [R01 GM084884] (to Z.W.).

References

1. Katchalski-Katzir E, Shariv I, Eisenstein M, et al. Molecular-surface recognition - determination of geometric fit between proteins and their ligands by correlation techniques. Proc Natl Acad Sci USA. 1992;89:2195–9. doi: 10.1073/pnas.89.6.2195.
2. Vakser IA. Protein docking for low-resolution structures. Protein Eng. 1995;8:371–7. doi: 10.1093/protein/8.4.371.
3. Norel R, Lin SL, Wolfson HJ, et al. Molecular-surface complementarity at protein-protein interfaces–the critical role played by surface normals at well places, sparse, points in docking. J Mol Biol. 1995;252:263–73. doi: 10.1006/jmbi.1995.0493.
4. Gabb HA, Jackson RM, Sternberg MJE. Modelling protein docking using shape complementarity, electrostatics and biochemical information. J Mol Biol. 1997;272:106–20. doi: 10.1006/jmbi.1997.1203.
5. Vakser IA, Matar OG, Lam CF. A systematic study of low-resolution recognition in protein-protein complexes. Proc Natl Acad Sci USA. 1999;96:8477–82. doi: 10.1073/pnas.96.15.8477.
6. Mandell JG, Roberts VA, Pique ME, et al. Protein docking using continuum electrostatics and geometric fit. Protein Eng. 2001;14:105–13. doi: 10.1093/protein/14.2.105.
7. Chen R, Weng Z. Docking unbound proteins using shape complementarity, desolvation, and electrostatics. Proteins. 2002;47:281–94. doi: 10.1002/prot.10092.
8. Dominguez C, Boelens R, Bonvin A. HADDOCK: a protein-protein docking approach based on biochemical or biophysical information. J Am Chem Soc. 2003;125:1731–7. doi: 10.1021/ja026939x.
9. Chen R, Li L, Weng Z. ZDOCK: an initial-stage protein-docking algorithm. Proteins. 2003;52:80–7. doi: 10.1002/prot.10389.
10. Andrusier N, Mashiach E, Nussinov R, et al. Principles of flexible protein-protein docking. Proteins. 2008;73:271–89. doi: 10.1002/prot.22170.
11. Lyskov S, Gray JJ. The RosettaDock server for local protein-protein docking. Nucleic Acids Res. 2008;36:W233–8. doi: 10.1093/nar/gkn216.
12. Ritchie DW, Kozakov D, Vajda S. Accelerating and focusing protein-protein docking correlations using multi-dimensional rotational FFT generating functions. Bioinformatics. 2008;24:1865–73. doi: 10.1093/bioinformatics/btn334.
13. Lu L, Lu H, Skolnick J. MULTIPROSPECTOR: an algorithm for the prediction of protein-protein interactions by multimeric threading. Proteins. 2002;49:350–64. doi: 10.1002/prot.10222.
14. Aloy P, Russell RB. Interrogating protein interaction networks through structural biology. Proc Natl Acad Sci USA. 2002;99:5896–901. doi: 10.1073/pnas.092147999.
15. Shatsky M, Nussinov R, Wolfson HJ. A method for simultaneous alignment of multiple protein structures. Proteins. 2004;56:143–56. doi: 10.1002/prot.10628.
16. Aloy P, Bottcher B, Ceulemans H, et al. Structure-based assembly of protein complexes in yeast. Science. 2004;303:2026–9. doi: 10.1126/science.1092645.
17. Aytuna AS, Gursoy A, Keskin O. Prediction of protein-protein interactions by combining structure and sequence conservation in protein interfaces. Bioinformatics. 2005;21:2850–5. doi: 10.1093/bioinformatics/bti443.
18. Gunther S, May P, Hoppe A, et al. Docking without docking: ISEARCH-Prediction of interactions using known interfaces. Proteins. 2007;69:839–44. doi: 10.1002/prot.21746.
19. Launay G, Simonson T. Homology modelling of protein-protein complexes: a simple method and its possibilities and limitations. BMC Bioinformatics. 2008;9:427. doi: 10.1186/1471-2105-9-427.
20. Kundrotas PJ, Lensink MF, Alexov E. Homology-based modeling of 3D structures of protein-protein complexes using alignments of modified sequence profiles. Int J Biol Macromol. 2008;43:198–208. doi: 10.1016/j.ijbiomac.2008.05.004.
21. Chen HL, Skolnick J. M-TASSER: an algorithm for protein quaternary structure prediction. Biophys J. 2008;94:918–28. doi: 10.1529/biophysj.107.114280.
22. Sinha R, Kundrotas PJ, Vakser IA. Docking by structural similarity at protein-protein interfaces. Proteins. 2010;78:3235–41. doi: 10.1002/prot.22812.
23. Kundrotas PJ, Vakser IA. Accuracy of protein-protein binding sites in high-throughput template-based modeling. Plos Comput Biol. 2010;6:e1000727. doi: 10.1371/journal.pcbi.1000727.
24. Ghoorah AW, Devignes MD, Smail-Tabbone M, et al. Spatial clustering of protein binding sites for template based protein docking. Bioinformatics. 2011;27:2820–7. doi: 10.1093/bioinformatics/btr493.
25. Tuncbag N, Gursoy A, Nussinov R, et al. Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM. Nat Protoc. 2011;6:1341–54. doi: 10.1038/nprot.2011.367.
26. Mukherjee S, Zhang Y. Protein-protein complex structure predictions by multimeric threading and template recombination. Structure. 2011;19:955–66. doi: 10.1016/j.str.2011.04.006.
27. Tuncbag N, Keskin O, Nussinov R, et al. Fast and accurate modeling of protein-protein interactions by combining template-interface-based docking with flexible refinement. Proteins. 2012;80:1239–49. doi: 10.1002/prot.24022.
28. Sinha R, Kundrotas PJ, Vakser IA. Protein docking by the interface structure similarity: how much structure is needed? Plos One. 2012;7:e31349. doi: 10.1371/journal.pone.0031349.
29. Tuncbag N, Gursoy A, Keskin O. Prediction of protein-protein interactions: unifying evolution and structure at protein interfaces. Phys Biol. 2011;8:035006. doi: 10.1088/1478-3975/8/3/035006.
30. Mintseris J, Pierce B, Wiehe K, et al. Integrating statistical pair potentials into protein complex prediction. Proteins. 2007;69:511–20. doi: 10.1002/prot.21502.
31. Berman HM, Westbrook J, Feng Z, et al. The protein data bank. Nucleic Acids Res. 2000;28:235–42. doi: 10.1093/nar/28.1.235.
32. Aloy P, Ceulemans H, Stark A, et al. The relationship between sequence and interaction divergence in proteins. J Mol Biol. 2003;332:989–98. doi: 10.1016/j.jmb.2003.07.006.
33. Garma L, Mukherjee S, Mitra P, et al. How many protein-protein interactions types exist in nature? Plos One. 2012;7:e38913. doi: 10.1371/journal.pone.0038913.
34. Kundrotas PJ, Zhu ZW, Janin J, et al. Templates are available to model nearly all complexes of structurally characterized proteins. Proc Natl Acad Sci USA. 2012;109:9438–41. doi: 10.1073/pnas.1200678109.
35. Hwang H, Vreven T, Janin J, et al. Protein-protein docking benchmark version 4.0. Proteins. 2010;78:3111–14. doi: 10.1002/prot.22830.
36. Murzin AG, Brenner SE, Hubbard T, et al. SCOP–a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol. 1995;247:536–40. doi: 10.1006/jmbi.1995.0159.
37. Mashiach E, Nussinov R, Wolfson HJ. FiberDock: a web server for flexible induced-fit backbone refinement in molecular docking. Nucleic Acids Res. 2010;38:W457–61. doi: 10.1093/nar/gkq373.
38. Pearson WR, Lipman DJ. Improved tools for biological sequence comparison. Proc Natl Acad Sci USA. 1988;85:2444–8. doi: 10.1073/pnas.85.8.2444.
39.Hubbard SJ, Thornton JM. ‘NACCESS computer program’, 1993. Department of Biochemistry and Molecular Biology, University College of London, UK. Program is available at: http://www.bioinf.manchester.ac.uk/naccess/ (11 May 2004, date last accessed) [Google Scholar]
40.Altschul SF, Gish W, Miller W, et al. Basic local alignment search tool. J Mol Biol. 1990;215:403–10. doi: 10.1016/S0022-2836(05)80360-2. [DOI] [PubMed] [Google Scholar]
41.Zhu JH, Weng Z. FAST: a novel protein structure alignment algorithm. Proteins. 2005;58:618–27. doi: 10.1002/prot.20331. [DOI] [PubMed] [Google Scholar]
42.Gao M, Skolnick J. New benchmark metrics for protein-protein docking methods. Proteins. 2011;79:1623–34. doi: 10.1002/prot.22987. [DOI] [PMC free article] [PubMed] [Google Scholar]
43.Hwang H, Vreven T, Whitfield TW, et al. A machine learning approach for the prediction of protein surface loop flexibility. Proteins. 2011;79:2467–74. doi: 10.1002/prot.23070. [DOI] [PMC free article] [PubMed] [Google Scholar]
44.Pierce B, Weng Z. ZRANK: reranking protein docking predictions with an optimized energy function. Proteins. 2007;67:1078–86. doi: 10.1002/prot.21373. [DOI] [PubMed] [Google Scholar]
45.Vreven T, Hwang H, Weng Z. Integrating atom-based and residue-based scoring functions for protein-protein docking. Protein Sci. 2011;20:1576–86. doi: 10.1002/pro.687. [DOI] [PMC free article] [PubMed] [Google Scholar]
46.Pierce B, Weng Z. A combination of rescoring and refinement significantly improves protein docking performance. Proteins. 2008;72:270–9. doi: 10.1002/prot.21920. [DOI] [PMC free article] [PubMed] [Google Scholar]
47.Vreven T, Hwang H, Weng Z. Exploring angular distance in protein-protein docking algorithms. PLoS ONE. 2013;8:e56645. doi: 10.1371/journal.pone.0056645. [DOI] [PMC free article] [PubMed] [Google Scholar]

