Table 4.
Features of the baseline methods.
Characteristics | 1. KronRLS | 2. SimBoost | 3. DeepDTA | 4. WideDTA | 5. PADME |
---|---|---|---|---|---|
Datasets | Davis, Metz | Davis, Metz, KIBA | Davis, KIBA | Davis, KIBA | Davis, Metz, KIBA, ToxCast |
ML/DL | AI/ML | AI/ML | DL | DL | DL |
Similarity- or feature-based method | Similarity-based | Similarity- and feature-based | Feature-based | Feature-based | Feature-based |
Drug representation (or features) | PubChem Sim, chemical kernels | PubChem Sim + statistical and network features | SMILES | SMILES + LMCS | SMILES / ECFP |
Protein representation (or features) | SW sim score, Normalized SW sim score | SW sim score | aaseq | aaseq + PDM | PSC |
NN type for feature learning | | | CNN | two 1D-CNN | GCNN |
NN type for prediction | | | 3 FC layers | FC layer | Feedforward NN |
Regressor (or activation function) | KronRLS model | Gradient boosting model | ReLU | ReLU | ReLU |
Validation setting | S1, S2, S3 | S1 | S1 | S1 | S1, S2, S3 |
Cross-validation | Repeated 10-fold CV, nested CV, LDO-CV, LTO-CV | 10 times 5-fold CV, LDO-CV, LTO-CV | 5-fold CV | 6-fold CV | 5-fold CV, LDO-CV, LTO-CV |
Performance metrics | CI, MSE | CI, RMSE | CI, MSE, PCC | CI, MSE, PCC | CI, RMSE, R² |
Classification/Regression | Both | Both | Regression | Regression | Both |
Year | 2014 | 2017 | 2018 | 2019 | 2018 |
ML, machine learning; DL, deep learning; Sim, similarity; SW, Smith-Waterman; aaseq, amino-acid sequence; SPS, structural property sequence; PSC, protein sequence composition; PDM, protein domain and motif; ECFP, extended-connectivity fingerprint; LMCS, ligand maximum common substructure; KronRLS, Kronecker regularized least squares; CNN, convolutional neural network; GCNN, graph convolutional neural network; RNN, recurrent neural network; FC, fully connected; ReLU, rectified linear unit; CV, cross-validation; LDO, leave-one-drug-out; LTO, leave-one-target-out; MSE, mean squared error; RMSE, root mean squared error; CI, concordance index; PCC, Pearson correlation coefficient.
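
The performance metrics listed in the table (CI, MSE, RMSE, PCC) can be illustrated with a minimal sketch. The function names and the toy affinity values below are assumptions made for illustration only and are not taken from any of the five baseline implementations. The CI shown here is the common pairwise definition, in which pairs with equal true affinity are skipped and tied predictions count as 0.5; individual tools may handle ties differently.

```python
import math
from itertools import combinations


def mse(y_true, y_pred):
    """Mean squared error between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error (square root of the MSE)."""
    return math.sqrt(mse(y_true, y_pred))


def pcc(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predictions are correctly ordered.

    Pairs with identical true affinity are not comparable and are skipped.
    A tie in the predictions contributes 0.5.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue
        comparable += 1
        # Orient the difference so it is positive when the pair is ordered correctly.
        diff = (p_i - p_j) if t_i > t_j else (p_j - p_i)
        if diff > 0:
            concordant += 1.0
        elif diff == 0:
            concordant += 0.5
    return concordant / comparable


# Toy example (hypothetical values, not from any benchmark dataset).
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 5.5, 7.5, 7.0]
print(mse(y_true, y_pred))                # 0.4375
print(concordance_index(y_true, y_pred))  # 0.75
```

Because CI depends only on the ordering of predictions while MSE and RMSE depend on their magnitude, a method can rank well on one and poorly on the other, which is one reason the baselines report more than one metric.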