Table 3.
Classifier parameters fixed for each of the three scenarios used to determine the similarity between drugs, proteins, and their embedding vectors.
| Classifiers | KGE | KGE-ProtBERT | Molecular fingerprint and protein characteristics |
|---|---|---|---|
| ETC | n-estimators = trees, random-state = 1357 | n-estimators = trees, random-state = 1357 | n-estimators = trees, random-state = 1357 |
| DT | random-state = 1357 | random-state = 1357 | random-state = 1357 |
| MLP | solver = lbfgs, alpha = 1e−5, hidden-layer-sizes = (5, 2), random-state = 1 | solver = lbfgs, alpha = 1e−5, hidden-layer-sizes = (240, 96), random-state = 1 | solver = lbfgs, alpha = 1e−5, hidden-layer-sizes = (240, 96), random-state = 1 |
| SGD | loss = log, penalty = l2, max-iter = 5 | loss = log, penalty = l2, max-iter = 2 | loss = log, penalty = l2, max-iter = 2 |
| Gaussian-NB | | | |
| Gradient Boosting | n-estimators = 100, learning-rate = 1.0, max-depth = 1, random-state = 0 | n-estimators = 100, learning-rate = 1.0, max-depth = 2, random-state = 0 | n-estimators = 100, learning-rate = 1.0, max-depth = 2, random-state = 0 |
| Bagging Classifier | KNeighborsClassifier(), max-samples = 0.5, max-features = 0.5 | KNeighborsClassifier(n-neighbors = 1), max-samples = 1, max-features = 1 | KNeighborsClassifier(n-neighbors = 1), max-samples = 1, max-features = 1 |
| K-Neighbors | n-neighbors = 7 | n-neighbors = 2 | n-neighbors = 2 |
| RF | n-estimators = trees, n-jobs = 6, criterion = c, class-weight = balanced, random-state = 1357 | n-estimators = trees, n-jobs = 6, criterion = c, class-weight = balanced, random-state = 1357 | n-estimators = trees, n-jobs = 6, criterion = c, class-weight = balanced, random-state = 1357 |
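
The parameter names in Table 3 match scikit-learn keyword arguments once hyphens are replaced by underscores, so the KGE column can plausibly be reproduced as below. This is a minimal sketch that assumes scikit-learn is the library in use, which the table does not state. The helper names `kge_params` and `build_classifiers` are hypothetical, and the defaults chosen for `trees` and `criterion` are placeholders for the table's unspecified symbols "trees" and "c".

```python
# Sketch only: Table 3's first (KGE) column expressed as scikit-learn keyword
# arguments. Hyphens in the table become underscores.

def kge_params(trees=100, criterion="gini"):
    """Return the KGE-column settings as plain dicts.

    `trees` and `criterion` stand in for the table's unspecified symbols
    "trees" and "c"; the defaults here are placeholders, not source values.
    """
    return {
        "ETC": {"n_estimators": trees, "random_state": 1357},
        "DT": {"random_state": 1357},
        "MLP": {"solver": "lbfgs", "alpha": 1e-5,
                "hidden_layer_sizes": (5, 2), "random_state": 1},
        # The table's loss = log is spelled "log_loss" in scikit-learn >= 1.1.
        "SGD": {"loss": "log_loss", "penalty": "l2", "max_iter": 5},
        # Empty row in the table; assumed here to mean library defaults.
        "Gaussian-NB": {},
        "Gradient Boosting": {"n_estimators": 100, "learning_rate": 1.0,
                              "max_depth": 1, "random_state": 0},
        "Bagging Classifier": {"max_samples": 0.5, "max_features": 0.5},
        "K-Neighbors": {"n_neighbors": 7},
        "RF": {"n_estimators": trees, "n_jobs": 6, "criterion": criterion,
               "class_weight": "balanced", "random_state": 1357},
    }


def build_classifiers(params):
    """Instantiate one estimator per table row (imports scikit-learn lazily)."""
    from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                                  GradientBoostingClassifier,
                                  RandomForestClassifier)
    from sklearn.linear_model import SGDClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier

    classes = {
        "ETC": ExtraTreesClassifier, "DT": DecisionTreeClassifier,
        "MLP": MLPClassifier, "SGD": SGDClassifier, "Gaussian-NB": GaussianNB,
        "Gradient Boosting": GradientBoostingClassifier,
        "K-Neighbors": KNeighborsClassifier, "RF": RandomForestClassifier,
    }
    models = {name: cls(**params[name]) for name, cls in classes.items()}
    # Bagging wraps a default KNeighborsClassifier(), as in the KGE column.
    models["Bagging Classifier"] = BaggingClassifier(
        KNeighborsClassifier(), **params["Bagging Classifier"])
    return models
```

Calling `build_classifiers(kge_params())` would return one unfitted estimator per row. The other two columns differ only in the MLP layer sizes, the SGD iteration cap, the gradient-boosting depth, and the neighbor counts. Their bagging entries list max-samples = 1 and max-features = 1, and in scikit-learn an integer `1` means a single sample or feature while a float `1.0` means all of them, so that setting would need to be checked against the original code before being reproduced.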