. 2007 Jan 12;3(4):405–411. doi: 10.1016/j.ddtec.2006.12.002

Table 1.

Overview of popular regression and learning algorithms used for virtual screening

| Method | Classification | Regression | Variable selection | Explanatory | Virtual screening |
| --- | --- | --- | --- | --- | --- |
| SOM | Yes, useful for visualizing global data trends | No | Various techniques show success extracting pertinent dependent variables by grouping compounds in the same target class together | Yes, when the pertinent dependent variables are optimized | Identified purinergic receptor antagonists from a virtual combinatorial library [25] |
| Binary QSAR | Yes | No | No | No | Showed superior enrichment rates when compared to Bayesian Classifiers and PLS [26] |
| Bayesian Classifier | Yes | Yes | Descriptors are weighted based on how well each divides the training data | Yes, if the significance of each descriptor can be extracted | Performed poorly compared to SVM, kNN, ANN and Decision trees [27] |
| Decision trees | Yes | No | Descriptors that best divide one class from another are used to separate the data | Variables used in the tree(s) suggest activity dependency | Slightly outperformed a Bayes Classifier in a comparison study [27] |
| PLS variants | Yes | Yes | Variable selection techniques are commonly added above PLS model building | Yes, when a variable selection technique is incorporated | Ligands for various GPCR targets were successfully enriched from a test database [19] |
| ANN | Yes | Yes | Performed internally | No | Comparable enrichment rates in a direct comparison to SVM and kNN [27] |
| SVM | Yes | Yes | Performed internally | Yes, if the weights of each descriptor are explicitly solved | Identified previously characterized Dopamine D1 Inhibitors and suggested new hits [29] |
| kNN | Yes | Yes | Commonly a genetic algorithm or simulated annealing is used | Descriptors selected by multiple models imply relevance to the target property | Identified several anticonvulsant compounds that were experimentally confirmed [28] |
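
Several entries in the "Virtual screening" column compare methods by enrichment rate. The table does not define the metric, so the following is only a minimal sketch that assumes the commonly used enrichment factor: the fraction of actives among the top-ranked compounds divided by the fraction of actives in the whole library. The function name `enrichment_factor`, the `top_fraction` parameter, and the toy scores and labels are hypothetical and are not taken from the cited studies.

```python
def enrichment_factor(scores, labels, top_fraction=0.1):
    """Enrichment factor for the top-scoring fraction of a ranked library.

    scores: model outputs, higher means "more likely active" (assumption).
    labels: 1 for a known active, 0 for an inactive or decoy.
    """
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")

    # Number of compounds selected from the top of the ranked list.
    n_selected = max(1, int(round(n_total * top_fraction)))

    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_selected])

    # Hit rate in the selection relative to the hit rate in the full library.
    return (hits / n_selected) / (n_actives / n_total)


# Toy library of 10 compounds with 2 actives (made-up values).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]

# Top 20%: 1 of the 2 selected compounds is active, versus 2 of 10 overall.
ef = enrichment_factor(scores, labels, top_fraction=0.2)
print(f"EF at 20%: {ef:.2f}")
```

A value above 1 means the model ranks actives ahead of what random selection would give. Published comparisons may use different cutoffs or related metrics, so values computed this way are not directly comparable to the cited results.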