Table 4.
Computational methods for predicting protein-protein interactions.
Technique | Algorithms | Strengths | Weaknesses | Organism | Reference |
---|---|---|---|---|---|
Phylogenetic | Cluster analysis, maximum likelihood, maximum parsimony, Bayesian inference | Provides information on selective environmental pressure | Divergence of proteins is difficult to estimate | H. pylori, P. falciparum | Ratmann et al., 2007 |
Machine learning | Random forest, decision tree, k-nearest neighbors, Bayesian, neural networks, support vector machine | Simple to understand, accurate | Dependent on parameter settings and features; black-box predictor; requires a large data set for training | Vibrio cholerae, P. aeruginosa | Nanni et al., 2012; Ehrenberger et al., 2015 |
Data mining | Named entity recognition, ID3, computational natural language processing, C4.5 | Fast; processes large volumes of information; good for producing focused lists | Sensitive to noise; requires manual curation | H. pylori, Campylobacter jejuni | Bock and Gough, 2003 |
Topological | Power-law degree distribution, clustering coefficient | Common topological characteristics among species (small-world), comparison with random networks | False positives proportional to the size of the network, configuration of protein modules may vary | E. coli | Butland et al., 2005; Wuchty, 2006; Sharan et al., 2007 |
Structure | Shape complementarity, rigid-body docking, heuristic potential | Accurate; good availability of data for primary and secondary structure | Slow development of high-throughput methodologies | E. coli, S. typhimurium and T. maritima | Matsuzaki et al., 2014 |
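
As an illustration of the topological measures named in Table 4, the following minimal sketch computes the degree distribution and the local clustering coefficient of an undirected interaction network. The edge list and protein labels (`A` to `E`) are hypothetical placeholders, not data from any of the cited studies, and the function names are chosen for this example only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy interaction list; labels are placeholders, not real proteins.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; 0 by convention here
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def degree_distribution(adj):
    """Count how many nodes have each degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


adj = build_adjacency(EDGES)
average_clustering = sum(local_clustering(adj, n) for n in adj) / len(adj)

print("degree distribution:", dict(degree_distribution(adj)))
print("average clustering coefficient:", round(average_clustering, 3))
```

On a real network, the degree counts would typically be compared against those of randomized networks of the same size, which is the comparison the Topological row refers to; that step is not shown here.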