2021 Nov 27;23(1):bbab476. doi: 10.1093/bib/bbab476

Table 1.

Classic ML methods to predict protein–ligand binding sites

SN | Approach | Techniques | Features | Database used | Year
1 | Oriented Shell Model [73] | Support vector machine | Developed an oriented shell model using distance and angular position distributions | Self-curated | 2005
2 | SitePredict [74] | Random forest | Predicted small ligand-binding sites using backbone structure | Self-curated | 2008
3 | LIBRUS [75] | Support vector machine | Combined ML and homology information for sequence-based ligand-binding residue prediction | Self-curated + FINDSITE’s database | 2009
4 | Qiu and Wang’s method [76] | Random forest | Used eight structural properties to train random forest classifiers, later combined to predict binding residues | Q-SiteFinder’s dataset | 2011
5 | Wong et al.’s method [77] | Support vector machine + differential evolution | Classified grid points to find the locations most likely to contain bound ligands | LigASite | 2012
6 | DoGSiteScorer [78] | Support vector machine | Web server for binding site prediction, analysis and druggability assessment | Self-curated | 2012
7 | Wong et al.’s method [79] | Support vector machine | Used SVM to cluster the most probable ligand-binding pockets using protein properties | LigASite + self-curated | 2013
8 | TargetS [80] | Support vector machine + modified AdaBoost | Designed template-free predictor with classifier ensemble and spatial clustering | BioLip | 2013
9 | Wang et al.’s method [81] | Support vector machine + statistical depth function | SVM model integrating sequence and structural information | PDBbind | 2013
10 | LigandRFs [82] | Random forest | Applied random forest ensemble to identify ligand-binding residues from sequence information alone | CASP9 targets + CASP8 targets | 2014
11 | Suresh et al.’s method [83] | Naive Bayes classifier | Trained Naive Bayes classifier using only sequence-based information | Self-curated | 2015
12 | OSML [84] | Support vector machine | Proposed dynamic learning framework for constructing query-driven prediction models | BioLip + CASP9 targets | 2015
13 | PRANK [7] | Random forest | Developed mechanism to prioritize the predicted putative pockets | Astex Diverse set + self-curated | 2015
14 | UTProt Galaxy [85] | Support vector machine + neural network + random forest | Developed pipeline for protein–ligand binding site predictive tools using multiomics big data | Self-curated | 2015
15 | Chen et al.’s method [86] | Random forest | Proposed dynamic ensemble approach to identify protein–ligand binding residues using sequence information | ccPDB + CASP9 targets + CASP8 targets | 2016
16 | Chen et al.’s method [87] | Random forest | Predicted allosteric and functional sites on proteins | PDBbind + allosteric DB + CATH DB | 2016
17 | TargetCom [88] | Support vector machine + modified AdaBoost algorithm | Designed ligand-specific methods to predict protein–ligand binding sites with an ensemble classifier | BioLip | 2016
18 | P2Rank 2.1 [89] | Bayesian optimization | Improved version of P2Rank | Self-curated | 2017
19 | P2Rank [90] | Random forest | Built stand-alone template-free tool for prediction of ligand-binding sites | Self-curated | 2018
20 | PrankWeb [91] | Random forest | Online resource providing an interface to P2Rank | Self-curated | 2019