Table 2.
Study | Feature types | Dataset | Model | Design for model testing | Performance |
---|---|---|---|---|---|
study 1 [19] | Morgan FP; individual genotypes | GDSC, CTRPv2, PDX samples | Neural network | Across cell line-drug pairs | Median Spearman’s rho = 0.37 |
study 2 [21] | Gene expression; genomic mutation; protein interaction network | Colorectal and bladder cancer patients | Ridge regression | Across organoids | R² = 0.89/0.98 |
study 3 [22] | Gene expression | GDSC, CCLE, LINCS | Ensemble learning | Cross validation within dataset | MSE = 2.0–4.8 |
study 4 [23] | Gene expression | Three clinical datasets of cancer patients | Transfer learning | Cross validation within dataset | Mean AUC = 0.758 |
study 5 [24] | Gene expression | GDSC, clinical trial data | Neural network | Cross validation within dataset | Difference in predicted IC50 values |
study 6 [25] | Gene expression; genomic mutation; CNV | GDSC, CCLE | Rotation forest | Cross validation within dataset | MSE = 3.14 on GDSC and 0.404 on CCLE |
study 7 [26] | Gene expression; DNA methylation; genomic mutation; CNV | 265 anti-cancer drugs in 961 cell lines | SVM and elastic net regression | Cross validation within dataset | Pearson’s correlation = 0.3–0.5 |
study 8 [27] | Gene expression | CTRPv2, LINCS | Semi-supervised autoencoder | Across cell lines | AUROC = ~0.7 |
study 9 [28] | Gene expression; protein targets of drugs and pathways | GDSC | Bayesian model, MTL | Within and across cell lines and drugs | Pearson’s correlation = 0.30–0.93 |
study 10 [29] | Structure-based drug similarity; cell line similarity | GDSC, CCLE | A heterogeneous network | Across cell lines | Pearson’s correlation = ~0.8 on CCLE and ~0.45 on GDSC |
study 11 [30] | ECFPs; drug response similarity | CMap of 2.9 million compound pairs | Neural network | Across compound pairs | Pearson’s correlation = 0.518 |
study 12 [31] | Gene expression | GDSC | LASSO | Across tumor samples | P-values on response differences |
study 13 [32] | Gene expression | The NeoALTTO clinical trial dataset | Gene expression similarity | Leave-one-out cross-validation across samples | Concordance index > 0.8 |
study 14 [33] | Chemoinformatic features and FPs; multiomic data | GDSC, CCLE | Logistic regression | Across drug-cell line pairs | AUROC = ~0.7 on GDSC |
study 15 [34] | Cell line mutations; protein–protein interaction network | GDSC, CCLE | A link prediction approach | Leave-one-out cross-validation | AUROC = 0.8474 |
study 16 [35] | Gene expression | GDSC, clinical trials of two drugs | Kernelized rank learning | Cross validation within dataset | Precision = 23%–36% |
study 17 [36] | Chemoinformatic features and FPs; genomic data | NCI-ALMANAC | Neural network | Cross validation within dataset | Pearson’s correlation = 0.97 |
study 18 [37] | Gene expression | Pan-cancer TCGA | Random forest | Across tumor samples | Accuracy = 86% and AUC = 0.71 |
study 19 [38] | Molecular FPs; gene expression | GDSC, CCLE | Neural network | Cross validation within dataset | AUROC = 0.89 on GDSC and 0.95 on CCLE |
study 20 [39] | Gene expression | Clinical trial data from TCGA | SVM | Leave-one-out cross-validation | Accuracy > 80% |
study 21 [40] | Proteomic, phosphoproteomic and transcriptomic data | Multiple cancer cell lines | Multiple regression models | Across cell lines | MSE < 0.1 and Spearman’s correlation = 0.7 |
study 22 [10] | Molecular graphs; genomic data | GDSC | GNN | Across cell lines, drugs, and cell line-drug pairs | Pearson’s correlation = 0.9310 and RMSE = 0.0243 across pairs |
study 23 [6] | Omic data; monotherapy; gene–gene interaction network | GDSC, CCLE, AZSDC | Random forest | Across drug–drug pairs | Pearson’s correlation = 0.47 |
study 24 [4] | Monotherapy; genomic mutation; CNV; gene expression | AZSDC | Random forest | Across drug–drug pairs | Pearson’s correlation = 0.53 |
study 25 [41] | Monotherapy; omic data | GDSC, COSMIC, AZSDC, PDX | Ensemble models | Across drug–drug pairs | Pearson’s correlation = 0.24 and ANOVA –log10(p) = 12.6 |
study 26 [42] | Chemoinformatic features, SMILES and FPs; genomic data | GDSC | Neural network | Across cell lines | Pearson’s correlation = 0.79 and RMSE = 0.97 |
study 27 [43] | Molecular FPs; sequence variation | GDSC, COSMIC | Neural network | Within cancer types | Coefficient of determination = 0.843 and RMSE = 1.069 |
study 28 [44] | SMILES; gene expression; protein–protein interaction network | GDSC | Neural network | Across cell lines, drugs, and cell line-drug pairs | Pearson’s correlation = 0.928 and RMSE = 0.887 across pairs |
study 29 [45] | Gene expression; genomic mutation | CCLE, CTD2, UCSC TumorMap | Neural network | Across cell line-drug pairs | Pearson’s correlation = 0.70–0.96 |
study 30 [46] | SMILES and FPs; gene expression data | GDSC | Neural network | Across cell lines and drugs | RMSE = 0.110 ± 0.008 |
study 31 [17] | Canonical SMILES; mutation state; CNV | GDSC | Neural network | Across cell line-drug pairs | Pearson’s correlation = 0.909 and RMSE = 0.027 |
study 32 [47] | Graph representation; genomic mutation; CNV; DNA methylation | GDSC, CCLE, TCGA | GNN | Across cell lines, drugs, and cell line-drug pairs | Pearson’s correlation = 0.923 across pairs on TCGA |
study 33 [48] | Molecular FPs | NCI-ALMANAC | Neural network | Across drug–drug pairs | Pearson’s correlation = 0.95–0.98 |
study 34 [49] | Chemoinformatic features and FPs; gene expression | Multiple cancer cell lines | Neural network | Across drug–drug pairs | Pearson’s correlation = 0.73 |
study 35 [50] | Chemoinformatic features and FPs | NCI-ALMANAC | Random forest, XGBoost | Across drug–drug pairs | Pearson’s correlation = 0.43–0.86 |
study 36 [51] | Drug target; gene expression | AZSDC, GDSC, NCI-ALMANAC | Multitask learning | Across cell lines | Pearson’s correlation = 0.23 breast/0.36 colon/0.17 lung |
study 37 [52] | Molecular FPs and SMILES; gene expression; monotherapy | Multiple drug synergy databases | Neural network | Across drug–drug pairs | AUROC = 0.9577 and MSE = 174.3 |
study 38 [53] | Drug similarity and protein similarity; drug target | Multiple drug synergy databases | Multitask learning | Across drug–drug pairs | AUROC = 0.8658/0.8715/0.8791 |
study 39 [54] | Drug similarity; gene expression similarity | NCI-DREAM Drug Synergy data | Logistic regression | Across drug–drug pairs | AUROC = 0.43–0.74 and Pearson’s correlation = 0.42–0.74 |
study 40 [55] | Drug target pathways; monotherapy | Drug Combination Database, literature | A manifold ranking algorithm | In vitro validation | Probability concordance = 0.78 |
CTRP, Cancer Therapeutics Response Portal; TCGA, The Cancer Genome Atlas; PDX, Patient-Derived Xenograft; AZSDC, AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge; NCI, National Cancer Institute; AUROC, Area Under Receiver Operating Characteristic curve; SVM, Support Vector Machine; MSE, Mean Squared Error; RMSE, Root Mean Squared Error; ALMANAC, A Large Matrix of Anti-Neoplastic Agent Combinations.
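
The Performance column mixes several metrics (Pearson's and Spearman's correlation, MSE, RMSE, AUROC), which are not directly comparable across studies. As a reading aid, the following is a minimal sketch of how these metrics are conventionally computed from paired observed and predicted values. It is not code from any of the listed studies; the function names and the toy numbers are illustrative assumptions only.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the response."""
    return math.sqrt(mse(observed, predicted))


def auroc(labels, scores):
    """AUROC as the probability that a random positive (label 1) outscores
    a random negative (label 0); ties count as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from any study in the table).
observed = [1.0, 2.0, 3.0, 4.0]      # e.g. measured drug response
predicted = [1.5, 1.8, 3.2, 4.1]     # e.g. model output
responder = [0, 0, 1, 1]             # binary labels for a classification setting
score = [0.1, 0.4, 0.35, 0.8]        # classifier scores

print("Pearson:", pearson(observed, predicted))
print("Spearman:", spearman(observed, predicted))
print("MSE:", mse(observed, predicted), "RMSE:", rmse(observed, predicted))
print("AUROC:", auroc(responder, score))
```

MSE and RMSE depend on the scale of the response variable (for example, raw versus normalized IC50), so values from different rows are only comparable when the studies use the same scale, whereas correlation coefficients and AUROC are scale-free.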