Table 1.
Model classes, compound and kinase descriptors, and training data used by the Round 2 top-performing teams and the baseline model (ref. 17).
| Team | Algorithm type | Algorithm name | Combined models | Training strategy |
|---|---|---|---|---|
| DMIS_DK | Deep learning, multi-target learning | Multi-task graph convolutional neural networks | 12 | Train-test split |
| AI Winter is Coming | Gradient boosting decision trees | XGBoost | 5 per target | K-fold nested cross-validation, boosting |
| Q.E.D | Kernel learning | CGKronRLS | 440 | Boosting |
| Gregory Koytiger | Deep learning, artificial neural network | Not applicable | 6 | Fixed hold-out |
| Olivier Labayle | Ridge regression | Not applicable | Not applicable | K-fold cross-validation |
| Baseline | Kernel learning | CGKronRLS | 1 | K-fold nested cross-validation |
| Team | Training data sources | Compound-protein pairs | Bioactivity types | Protein features | Chemical features |
|---|---|---|---|---|---|
| DMIS_DK | DrugTargetCommons, BindingDB | 953521 | Kd, Ki, IC50 | None | Molecular graphs |
| AI Winter is Coming | DrugTargetCommons, ChEMBL | 600000 | Kd, Ki, IC50, EC50, %inh, %activity | None | ECFP5, ECFP7, ECFP9, ECFP11 |
| Q.E.D | DrugTargetCommons, ChEMBL, UniProt | 60462 | Kd, Ki, EC50 | Amino acid sequences | ECFP4, ECFP6 |
| Gregory Koytiger | ChEMBL | 250000 | Kd, Ki, IC50 | Amino acid sequences | SMILES strings |
| Olivier Labayle | DrugTargetCommons, ChEMBL, UniProt | 18200 | Kd | K-mer counting | ECFP |
| Baseline | DrugTargetCommons | 44186 | Kd | Amino acid sequences | Path-based fingerprints |
Teams that combined predictions from multiple models still had to submit a single prediction per compound-kinase pair for scoring against the measured activities. Supplementary Table 1 provides further details of all submitted models, together with the method surveys and model performances in Round 2.
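As an illustration of reducing several models' outputs to one prediction per compound-kinase pair, the minimal sketch below takes an unweighted mean across models. The table does not describe how each team actually combined its models, so the averaging rule, the function name `combine_predictions`, and the compound, kinase and pKd values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def combine_predictions(model_predictions):
    """Average per-model predictions into one value per (compound, kinase) pair.

    Assumption: an unweighted mean; the teams' actual combination rules
    are not given in the table.
    """
    pooled = defaultdict(list)
    for predictions in model_predictions:
        for pair, value in predictions.items():
            pooled[pair].append(value)
    return {pair: mean(values) for pair, values in pooled.items()}


# Hypothetical pKd predictions from three models (illustrative values only).
model_a = {("cmpd1", "EGFR"): 7.0, ("cmpd1", "ABL1"): 5.0}
model_b = {("cmpd1", "EGFR"): 8.0, ("cmpd1", "ABL1"): 6.0}
model_c = {("cmpd1", "EGFR"): 9.0, ("cmpd1", "ABL1"): 7.0}

# One combined prediction per compound-kinase pair, ready for submission.
submission = combine_predictions([model_a, model_b, model_c])
```

A weighted mean or a stacked meta-model would fit the same interface. The unweighted mean is used here only because it requires no additional assumptions about model quality.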