2018 Jul 31;20(5):1878–1912. doi: 10.1093/bib/bby061

Table 9.

Predictive performance results and comparison reported in various deep learning-based VS studies

Columns: Article | Evaluation metric | Source of the test data sets and data set statistics (where available) | Predictive performance results (proposed DNN methodᵃ, followed by the compared methods)
Dahl et al. [238]
Methods (proposed method first): Multi-task DNNᵇ | Single-task DNN | RFᵇ | Decision Tree Ensembles
AUCᵇ, PubChem (B: 19; AC: 69 396; IC: 70 331): 0.825 | 0.793 | 0.783 | 0.795

Ma et al. [164]
Methods (proposed method first): Multi-task DNN | RF
Pearson correlation coefficient, Merck in-house set and other data sets (Merck: T: 15, C: 164 024; Other: T: 15, C: 974 795): 0.496 | 0.423

Unterthiner et al. [243]
Methods (proposed method first): Multi-task DNN | SVMᵇ | BKDᵇ | LRᵇ | k-nearest neighbour | Parzen-Rosenblatt | Bayesian Classifier | Similarity Ensemble
AUC, ChEMBL (T: 5069; C: 743 336; I: 2 103 018): 0.83 | 0.816 | 0.803 | 0.796 | 0.775 | 0.73 | 0.755 | 0.699

Ramsundar et al. [242]
Methods (proposed method first): Pyramidal multi-task NNᵇ (PMTNN) | LR | RF | Single-Task NN (STNN) | Pyramidal Single-Task NN (PSTNN) | Max{LR, RF, STNN, PSTNN} | Multi-task NN (MTNN)
AUC, PubChem Bioassay (B: 128; I: ∼282 000): 0.873 | 0.801 | 0.800 | 0.795 | 0.809 | 0.824 | 0.842
AUC, MUV (B: 17; I: ∼15 000): 0.841 | 0.752 | 0.774 | 0.732 | 0.745 | 0.781 | 0.797
AUC, Tox21 (T: 12; I: ∼6000): 0.818 | 0.738 | 0.790 | 0.714 | 0.74 | 0.79 | 0.785

Koutsoukas et al. [164]
Methods (proposed method first): DNN | SVM (rbf kernel) | SVM (linear kernel) | RF | k-nearest neighbour | NBᵇ
Mean MCC, ChEMBL (T: 7; AC: 7218; IC: 72 082): 0.912 | 0.904 | 0.861 | 0.892 | 0.821 | 0.764

Wang et al. [244]
Methods (proposed method first): PINNs | Bipartite Local Model | CS and PDᵇ
AUC, sc-PDB (T: 836; C: 2710; I: 6830): 0.959 | 0.799 | 0.858

Wan et al. [245]
Methods (proposed method first): DNN | RF
AUC, DrugBank (D: 2868; T: 3314; I: 9349): 0.792 | 0.686
AUC, ChEMBL (AI: 156 083; II: 39 857): 0.880 | 0.879
AUC, Binding DB (AI: 418 577; II: 117 210): 0.875 | 0.855
AUC, PDB-Bind (AI: 2188; II: 578): 0.880 | 0.763

Lenselink et al. [246]
Methods (proposed method first): DNN PCMᵇ | DNN QSAR | DNN Multi Class | LR QSAR | SVM QSAR | NB QSAR | RF QSAR | RF Multi Class | RF PCM
AUC, ChEMBL (T: 1227; C: 204 085; I: 314 767): 0.894 | 0.879 | 0.89 | 0.858 | 0.858 | 0.679 | 0.868 | 0.502 | 0.845
MCC, same data set: 0.610 | 0.600 | 0.63 | 0.572 | 0.572 | 0.380 | 0.630 | 0.010 | 0.670

Wen et al. [85]
Methods (proposed method first): Deep Belief Network | Bernoulli NB | Decision Trees | RF
AUC, DrugBank (D: 1412; T: 1520; AI: 6262; II: 6262): 0.916 | 0.754 | 0.768 | 0.910

Wang et al. [248]
Methods (proposed method first): RBMs | Logic-based approach
AUC, MATADOR and STITCH (MATADOR: D: 784, T: 2431, I: 13 064; STITCH: D: 598, T: 671, I: 3296): 0.987 | 0.921
AUC (precision vs. recall), same data sets: 0.896 | 0.816

Wallach et al. [165]
Methods (proposed method first): Conv. DNN (AtomNet) | Smina
AUC, ChEMBL-20 PMD: 0.781 | 0.552
AUC, ChEMBL-20 inactives (a subset of: T: 290; AC: 78 904; IC: 2 367 120): 0.745 | 0.607
AUC, DUD-E-30: 0.855 | 0.700
AUC, DUD-E-102 (a subset of: T: 102; AC: 22 886): 0.895 | 0.696

Gonczarek et al. [249]
Methods (proposed method first): Graph Conv. DNN | Neural fingerprints | AutoDock Vina | Smina
AUC, DUD-E (T: 102; AC: 22 886; IC: ∼1 million): 0.567 | 0.704 | 0.633 | 0.642
AUC, MUV (B: 17): 0.474 | 0.575 | 0.503 | 0.503

Kearnes et al. [251]
Methods (proposed method first): Graph Conv. DNN | MaxSim | LR | RF | Pyramidal multi-task NN
AUC, PubChem Bioassay (B: 128; I: ∼282 000): 0.908 | 0.754 | 0.838 | 0.804 | 0.905
AUC, MUV (B: 17; I: ∼15 000): 0.858 | 0.638 | 0.736 | 0.655 | 0.869
AUC, Tox21 (T: 12; I: ∼6000): 0.867 | 0.728 | 0.789 | 0.802 | 0.854

Altae-Tran et al. [166]
Methods (proposed method first): Iterative refinement LSTMᵇ | Graph Conv. DNN | Siamese one-shot learning | Attention LSTM | RF
AUC, Tox21 (B: 12): 0.823 | 0.648 | 0.820 | 0.801 | 0.586
AUC, SIDER (27 side effects): 0.669 | 0.483 | 0.687 | 0.553 | 0.535
AUC, MUV (B: 17): 0.499 | 0.568 | 0.601 | 0.504 | 0.754

ᵃ Where a study proposes multiple DNN methods, one of them is shown under the proposed method column and the rest are listed among the compared methods.

ᵇ Abbreviations: T: target, D: drug, C: compound, AC: active compound, IC: inactive compound, I: interaction, B: bioassay, AI: active interaction, II: inactive interaction, DNN: deep neural network, RF: random forest, AUC: area under the ROC curve, SVM: support vector machine, BKD: binary kernel discrimination, LR: logistic regression, NB: naive Bayes, CS and PD: chemical substructures and protein domains, PCM: proteochemometric modelling, NN: neural net, LSTM: long short-term memory.
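
As a reading aid for the two most common metrics in the table, the following is a minimal sketch of how an AUC and an MCC (Matthews correlation coefficient) value can be computed for a binary active/inactive classifier. It is not taken from any of the cited studies: the function names `roc_auc` and `mcc`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from math import sqrt


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (active, inactive) pairs in which the active compound receives
    the higher score; tied scores count as half a win."""
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = sum((a > i) + 0.5 * (a == i) for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))


def mcc(labels, predictions):
    """Matthews correlation coefficient from binary labels and
    binary predictions (1 = active, 0 = inactive)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: 1 = active, 0 = inactive.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
# Assumed decision threshold of 0.5 to turn scores into class calls.
predictions = [1 if s >= 0.5 else 0 for s in scores]

print("AUC:", roc_auc(labels, scores))
print("MCC:", mcc(labels, predictions))
```

AUC depends only on how the scores rank actives against inactives, whereas MCC depends on the chosen threshold, which is one reason the two metrics can order the same set of methods differently.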