2021 Jun 3;12:3307. doi: 10.1038/s41467-021-23165-1

Table 1.

Model classes, compound and kinase descriptors, and training data used by the Round 2 top-performing teams and the baseline model (ref. 17).

| Team | Algorithm type | Algorithm name | Combined models | Training strategy |
|---|---|---|---|---|
| DMIS_DK | Deep learning, multi-target learning | Multi-task graph convolutional neural networks | 12 | Train test split |
| AI Winter is Coming | Gradient boosting decision trees | XGBoost | 5 per target | K-fold nested cross validation, boosting |
| Q.E.D | Kernel learning | CGKronRLS | 440 | Boosting |
| Gregory Koytiger | Deep learning, artificial neural network | Not applicable | 6 | Fixed hold out |
| Olivier Labayle | Ridge regression | Not applicable | Not applicable | K-fold cross validation |
| Baseline | Kernel learning | CGKronRLS | 1 | K-fold nested cross validation |

| Team | Training data sources | Compound-protein pairs | Bioactivity types | Protein features | Chemical features |
|---|---|---|---|---|---|
| DMIS_DK | DrugTargetCommons, BindingDB | 953521 | Kd, Ki, IC50 | None | Molecular graphs |
| AI Winter is Coming | DrugTargetCommons, ChEMBL | 600000 | Kd, Ki, IC50, EC50, %inh, %activity | None | ECFP5, ECFP7, ECFP9, ECFP11 |
| Q.E.D | DrugTargetCommons, ChEMBL, UniProt | 60462 | Kd, Ki, EC50 | Amino acid sequences | ECFP4, ECFP6 |
| Gregory Koytiger | ChEMBL | 250000 | Kd, Ki, IC50 | Amino acid sequences | SMILES strings |
| Olivier Labayle | DrugTargetCommons, ChEMBL, UniProt | 18200 | Kd | K-mer counting | ECFP |
| Baseline | DrugTargetCommons | 44186 | Kd | Amino acid sequences | Path-based fingerprints |

Teams that combined predictions from multiple models still had to submit a single prediction per compound-kinase pair, which was scored against the measured activities. Supplementary Table 1 provides further details of all submitted models, together with the method surveys and model performances in Round 2.
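
As an illustration of how several models' outputs can be reduced to one prediction per compound-kinase pair, the following minimal sketch takes an unweighted mean across models. The averaging rule, the function name `combine_predictions`, and the compound identifiers and activity values are assumptions made for this example; the table does not state how each team actually combined its models.

```python
from collections import defaultdict
from statistics import mean


def combine_predictions(model_predictions):
    """Average per-model predictions into one value per (compound, kinase) pair.

    model_predictions: list of dicts mapping (compound, kinase) -> predicted activity.
    The unweighted mean is an assumption for illustration only.
    """
    pooled = defaultdict(list)
    for predictions in model_predictions:
        for pair, value in predictions.items():
            pooled[pair].append(value)
    # One combined value per pair, as required for a single submission
    return {pair: mean(values) for pair, values in pooled.items()}


# Hypothetical predictions from two models (identifiers and values are made up)
model_a = {("CMPD-1", "EGFR"): 7.0, ("CMPD-1", "ABL1"): 5.0}
model_b = {("CMPD-1", "EGFR"): 8.0, ("CMPD-1", "ABL1"): 6.0}

submission = combine_predictions([model_a, model_b])
```

Keying the dictionary on the (compound, kinase) tuple guarantees that the result holds exactly one value per pair, whatever number of models contributed to it.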