Abstract
The prediction of drug combinations is of great clinical significance. In many diseases, such as high blood pressure, diabetes, and stomach ulcers, the simultaneous use of two or more drugs has shown clear efficacy and has greatly slowed the development of drug resistance. This review presents the latest applications of methods for predicting the effects of drug combinations and the bioactivity databases commonly used in drug combination prediction. These studies have played a significant role in developing precision therapy. We first describe the concept of synergy. We then survey publicly available databases for drug combination prediction tasks. Next, we introduce five classes of algorithms applied to drug combination prediction: traditional machine learning methods, deep learning methods, mathematical methods, systems biology methods, and search algorithms. Finally, we summarize the difficulties encountered in prediction models.
Keywords: drug combination, synergistic effect, drug resistance, side effects, machine learning, deep learning
1. Introduction
As a new interdisciplinary field, pharmacogenomics mainly studies how genomic changes affect drug response and explains the role of drugs in clinical treatment. So far, it is still challenging to use transcriptome data to predict drug responses in tumors because of the heterogeneity between cell lines and tumor cells [1]. Cell lines are cells grown in the laboratory from tumor tissue, while tumor cells are cells in real tumor tissue. Genomic and epigenetic differences exist between cell lines and tumor cells, including mutation, copy number variation, DNA methylation, and histone modification. As a result, cell lines and tumor cells differ in cell state, metabolic characteristics, and cell signaling pathways, which makes it difficult to transfer predictions from cell-line transcriptome data to tumors. In predicting the sensitivity of melanoma to drugs, Barretina J. et al. [2] found that classifiers built using entire cell line datasets performed poorly when applied exclusively to melanoma-derived cell lines, with a true positive rate of only about 0.6 at a false positive rate of 0.2. Models built using only melanoma cell lines performed better on the receiver operating characteristic (ROC) curve, with a true positive rate of 0.8 at a false positive rate of 0.2.
Similarly, there is considerable inter-tumor and intra-tumor heterogeneity. Even patients with the same cancer type may have different prognoses under the same clinical treatment. Tumor heterogeneity makes tumor drug resistance an urgent problem. Drug resistance is mainly caused by mechanisms such as mutation of the drug target [3], increased efflux of drugs [4], and amplification of an alternate pathway [5]. The main strategy to overcome tumor drug resistance is combination therapy. Combination therapy uses several drugs during treatment, which not only reduces drug intake and side effects but also improves the therapeutic effect by targeting multiple genes and pathways at the same time [6].
Traditional methods for drug combination discovery rely mainly on in vivo animal experiments and in vitro drug screening for "case-by-case" identification, but these methods are often tedious, expensive, and labor intensive [7]. In the past few decades, efficient approaches based on microarrays, next-generation sequencing, and multi-omics data have been developed to address this problem [8,9]. As the number of candidate drugs increases, the number of potential drug and dose combinations increases exponentially. Thus, systematically screening all possible drug combinations is not feasible [10]. Appropriate computational methods are therefore urgently needed to explore the search space of drug combinations.
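To make the growth of the search space concrete, the following minimal sketch counts unordered drug sets and the number of dose conditions they imply. The library size (100 drugs) and the number of dose levels per drug (5) are illustrative assumptions, not values taken from any of the cited studies.

```python
from math import comb


def n_drug_sets(n_drugs: int, k: int) -> int:
    """Number of unordered k-drug combinations drawn from n_drugs candidates."""
    return comb(n_drugs, k)


def n_dose_conditions(n_drugs: int, k: int, n_doses: int) -> int:
    """Conditions to test when every drug in a k-drug set is tried at n_doses levels."""
    return comb(n_drugs, k) * n_doses ** k


# Hypothetical library of 100 drugs, 5 dose levels per drug.
for k in (2, 3, 4):
    print(k, n_drug_sets(100, k), n_dose_conditions(100, k, 5))
```

Under these assumptions, pairs already require more than one hundred thousand conditions per cell line, and each additional drug in the combination multiplies the count further.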
As shown in Figure 1, the synergy scoring system is broken down into four sections according to the flow of the prediction model. Section A explains the concepts of synergism and antagonism between drugs in a combination and takes the query combinations as input. Section B provides information on associated databases, such as gene expression and synergistic drug combination databases. Section C introduces several useful computational methods, including traditional machine learning, deep learning (DL), mathematical methods, systems biology methods, and search algorithms. Finally, experimental validation is required, as illustrated in Section D.
2. Quantification of Synergistic Effect
The problem of predicting drug combinations is usually defined as a classification or regression task. By definition, combined effects can be divided into synergistic, additive, and antagonistic effects, according to whether the effect of the combination is superior to, equal to, or inferior to the sum of the effects of each drug, respectively [11]. In the regression task, the basic assumptions for quantifying the synergistic or antagonistic effects of drug combinations differ between models [12]. The most widely used models in in vivo and in vitro studies are the Loewe additivity model (Loewe) [13] and the Bliss independence model (Bliss) [14]. The overall structure of the drug combination response is shown in Figure 2.
2.1. Loewe Additivity Model
The Loewe additivity model defines the expected effect of a drug combination under the assumption that the drugs do not interact. Synergistic and antagonistic effects are defined by Loewe as deviations from strict additive behavior. Let $d_1$ and $d_2$ be the doses of drugs 1 and 2 in the combination, and let $D_1$ and $D_2$ represent the doses of drugs 1 and 2 required to achieve the combination effect when used alone. Loewe [15] argues that where no interaction occurs between drug 1 and drug 2, the additive behavior of the drug combination takes the form of:
$$\frac{d_1}{D_1} + \frac{d_2}{D_2} = 1 \tag{1}$$
Similarly, when there are N non-interacting drugs, the Loewe additivity model for this combination is defined in this manner:
$$\sum_{i=1}^{N} \frac{d_i}{D_i} = 1 \tag{2}$$
where $d_i$ is the dosage of drug $i$ in the combination, and $D_i$ is the equivalent dosage of drug $i$ that achieves the same effect when used alone.
When a combination of drugs has a synergistic effect, the same therapeutic effect can be achieved with smaller doses than with a single drug. Therefore, synergism is defined as:
$$\sum_{i=1}^{N} \frac{d_i}{D_i} < 1 \tag{3}$$
In contrast, in situations where a single drug is more effective than a combination of drugs, antagonism is defined as:
$$\sum_{i=1}^{N} \frac{d_i}{D_i} > 1 \tag{4}$$
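The Loewe criterion can be sketched in a few lines of code: sum, over the drugs in the combination, the ratio of the dose used in the combination to the dose that achieves the same effect alone, and compare the result with 1. The function names, the example doses, and the tolerance band `tol` around 1 are illustrative assumptions; in practice the single-agent equivalent doses would come from fitted dose-response curves.

```python
def loewe_index(combo_doses, single_equivalent_doses):
    """Sum of (dose in combination) / (dose giving the same effect alone)."""
    if len(combo_doses) != len(single_equivalent_doses):
        raise ValueError("one equivalent single-agent dose is needed per drug")
    return sum(d / big_d for d, big_d in zip(combo_doses, single_equivalent_doses))


def classify_loewe(index, tol=0.05):
    """Label a Loewe index; tol is an assumed band for experimental noise."""
    if index < 1 - tol:
        return "synergistic"
    if index > 1 + tol:
        return "antagonistic"
    return "additive"


# Hypothetical two-drug example: 2 and 5 units in combination reach an effect
# that needs 10 units of drug 1 alone or 20 units of drug 2 alone.
index = loewe_index([2.0, 5.0], [10.0, 20.0])
print(round(index, 3), classify_loewe(index))
```

Because the index is a plain sum, the same function covers both the two-drug case and the N-drug case.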
To illustrate the Loewe additivity model, Figure 3 shows a graph in Cartesian coordinates that represents the dose-response relationship between the two drugs in a combination based on isolines (isoboles). Each isoline comprises the dose pairs of a particular drug pair that achieve a specific drug response effect, so the graph can clearly distinguish the effects of different drug combinations at different doses.
2.2. Bliss Independence Model
The Bliss independence model, a classic approach to quantifying drug combination effects, assumes that the two drugs in a combination act in a probabilistically independent manner. Therefore, the expected effect of a combination can be derived from the rule for independent events in probability theory [16]. According to Bliss, the response to the combination can be predicted from the response to each component at its concentration. Let $E_{12}$ be the effect of the combination, and let $E_1$ and $E_2$ be the fractional effects (between 0 and 1) produced by drug 1 and drug 2 separately. Thus, the expected effect of drugs 1 and 2 when combined can be formulated as:
$$E_{12} = E_1 + E_2 - E_1 E_2 \tag{5}$$
A synergistic effect occurs when the observed combination effect is greater than the right side of the equation. Conversely, there is antagonism when the observed combination effect is less than the right side of the equation.
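As a small worked example of the Bliss criterion, the sketch below computes the effect expected under independence from two fractional single-drug effects and reports the excess of an observed combination effect over that expectation. The function names and the numerical effects are hypothetical; a positive excess indicates synergy and a negative excess indicates antagonism.

```python
def bliss_expected(e1: float, e2: float) -> float:
    """Expected fractional effect of two independently acting drugs."""
    for e in (e1, e2):
        if not 0.0 <= e <= 1.0:
            raise ValueError("fractional effects must lie between 0 and 1")
    # Equivalent to 1 - (1 - e1) * (1 - e2): the chance that at least one drug acts.
    return e1 + e2 - e1 * e2


def bliss_excess(observed: float, e1: float, e2: float) -> float:
    """Observed combination effect minus the Bliss expectation."""
    return observed - bliss_expected(e1, e2)


# Hypothetical measurements: 40% and 50% inhibition alone, 85% in combination.
print(round(bliss_expected(0.4, 0.5), 3), round(bliss_excess(0.85, 0.4, 0.5), 3))
```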
These two fundamentally different models, Bliss and Loewe, have been commonly used in co-exposure experiments. Nonetheless, they can sometimes give markedly different results, depending on the dose-response behavior of the single drugs. Many studies have compared the Bliss independence model with the Loewe additivity model [17,18,19]. In these comparisons, the overall biological plausibility of the Loewe additivity model makes it slightly preferable. Specifically, Loewe is expected to describe the combined action better when two drugs act on the same pathway or target, while the Bliss independence model is intended for non-interacting drug combinations. It is still unclear which model is most suitable for studying the combined effects of drugs, and model selection remains a major issue. Bliss may misjudge synergism, while Loewe may overemphasize antagonistic effects. One of the most significant obstacles in the field remains the lack of consensus among researchers regarding the precise quantification and definition of synergistic and antagonistic relationships [20].
3. Databases in Drug Combination Prediction
Numerous omics databases have been created in response to the growth of systems biology and molecular biology. Table 1 lists some important databases of six types pertinent to identifying effective drug combinations: synergistic drug combination, bioactivity, gene expression, toxicity/off-target effects, pathway, and interaction resources. These omics data are typically generated on various cell lines, using large numbers of single drugs or drug combinations, and have received robust experimental validation. Using these omics data, researchers can develop more efficient computational models of drug combinations to accelerate the development of clinical therapies.
Table 1.
Data Type | Database | URL | Latest Update | Description |
---|---|---|---|---|
Synergistic Drug Combination | DrugComb [21] | https://drugcomb.fimm.fi/ (accessed on 13 July 2023) | 2021 | Synergistic Drug Combination mainly contains data on the response of cancer cell lines to single drugs or combinations of drugs. |
Synergistic Drug Combination | DrugCombDB [22] | http://drugcombdb.denglab.org/ (accessed on 13 July 2023) | 2019 | |
Synergistic Drug Combination | NCI-ALMANAC [23] | https://dtp.cancer.gov/ncialmanac (accessed on 13 July 2023) | 2017 | |
Synergistic Drug Combination | SYNERGxDB [24] | https://www.synergxdb.ca/ (accessed on 13 July 2023) | 2019 | |
Bioactivity resources | ChEMBL [25] | https://www.ebi.ac.uk/chembl/ (accessed on 13 July 2023) | 2023 | Bioactivity resources mainly contain the biological activity data of small molecules, drug targets, enzymes and proteins. |
Bioactivity resources | DrugBank [26] | https://www.drugbank.com (accessed on 14 July 2023) | 2023 | |
Bioactivity resources | PubChem [27] | https://pubchem.ncbi.nlm.nih.gov (accessed on 14 July 2023) | 2023 | |
Gene expression | GEO [28] | https://www.ncbi.nlm.nih.gov/geo/ (accessed on 14 July 2023) | 2013 | It mainly includes gene expression data with or without perturbation, gene methylation data and some interaction data. |
Gene expression | CMap [29] | https://clue.io (accessed on 14 July 2023) | 2021 | |
Gene expression | LINCS | https://lincsproject.org/ (accessed on 14 July 2023) | 2022 | |
Toxicity effects resources | SIDER [30] | http://sideeffects.embl.de/ (accessed on 15 July 2023) | 2015 | Toxicity effects resources mainly contain information about drugs and targets, as well as information on side effects. |
Toxicity effects resources | TOXRIC [31] | https://toxric.bioinforai.tech/ (accessed on 15 July 2023) | 2022 | |
Toxicity effects resources | Tox21BodyMap [32] | https://sandbox.ntp.niehs.nih.gov/bodymap/ (accessed on 15 July 2023) | 2020 | |
Pathways resources | Reactome [33] | https://reactome.org/ (accessed on 15 July 2023) | 2023 | Pathways resources mainly contain various biological pathways that facilitate drug combination prediction. |
Pathways resources | Pathbank [34] | https://pathbank.org/ (accessed on 15 July 2023) | 2020 | |
Pathways resources | KEGG Pathways [35] | https://www.kegg.jp/kegg/pathway.html (accessed on 15 July 2023) | 2023 | |
Interactions resources | TTD [36] | https://db.idrblab.net/ttd/ (accessed on 16 July 2023) | 2023 | It mainly contains information about drug targets and targeted drugs that interact with them. |
Interactions resources | BindingDB [37] | http://www.bindingdb.org (accessed on 16 July 2023) | 2023 | |
Interactions resources | HPRD [38] | http://www.hprd.org/ (accessed on 16 July 2023) | 2008 | |
Interactions resources | STRING [39] | https://string-db.org/ (accessed on 16 July 2023) | 2021 | |
Interactions resources | STITCH [40] | http://stitch.embl.de/ (accessed on 16 July 2023) | 2016 | |
3.1. Drug Combination Resources
Drug combination resources, which mainly include DrugComb, DrugCombDB, SYNERGxDB, and NCI-ALMANAC, collect information about anti-cancer drug combinations. There is significant overlap among the first three databases because some of their data derive from NCI-ALMANAC.
DrugComb [21] is a community-driven data portal for storing and analyzing drug combination and monotherapy screening data. It offers network modeling tools to visualize the mechanism of action of a drug or combination of drugs for a specific disease sample. The DrugComb database contains 8397 unique drugs, 2320 cell lines representing 33 tissues, and over 750 thousand unique drug combinations obtained from 37 studies. In addition, DrugComb provides five different types of synergy scores for these combinations: Bliss, HSA, Loewe, ZIP, and S scores. MatchMaker, a model proposed by Kuru, H.I. et al. [41], uses the synergy scoring data provided by DrugComb. The model consists of three subnetworks: two drug-specific subnetworks (DSNs) and one synergy prediction network (SPN).
DrugCombDB [22] is another web-based drug combination database that integrates multiple data sources, including high-throughput screening assays of drug combinations, external databases, and manual curation of the PubMed literature. This database contains more than 6.8 million experimental data points with quantitative dose-response measurements and concentrations of drug combinations, encompassing 2 thousand drugs and 124 human cancer cell lines.
The NCI-ALMANAC [23] database is a large matrix of combinations of antineoplastic agents. It covers over five thousand combinations of 104 approved drugs tested against 60 cancer cell lines, resulting in more than 290 thousand synergy scores. Sidorov P. et al. [42] built models on a dataset provided by NCI-ALMANAC to predict synergy scores for each NCI-60 cell line. They used the Random Forest (RF) algorithm and the Extreme Gradient Boosting (XGBoost) algorithm to build two separate models for each cell line.
SYNERGxDB [24] is a cloud-based pharmacogenomics portal that identifies synergies by integrating numerous high-throughput drug combination studies with the molecular and pharmacological profiles of a large panel of cancer cell lines. Additionally, it provides analytical tools for predicting biomarkers across cancers and identifying successful treatment combinations.
3.2. Bioactivity Resources
ChEMBL [25] is a manually curated chemical database of bioactive molecules with drug-like properties. The current version of the ChEMBL database contains more than 2.3 million distinct compounds, 15 thousand protein targets, and 20 million bioactivity measurements. Ye, Z. et al. [43] proposed ScaffComb, a deep learning framework that can be applied to ChEMBL for virtual screening of drug combinations in large chemical databases.
DrugBank [26] is a web-enabled database that combines detailed drug information with thorough drug target information. DrugBank includes data on 2358 small molecule and biotechnology drugs, 4563 drug targets, 497 drug metabolizing enzymes and drug transporters, and 2242 compound drug-target binding constants. Ke, J. et al. [44] retrieved candidate compounds and aspirin target information from DrugBank to find drug combinations with antiplatelet effects. They finally verified the synergistic effects of Ginkgo biloba extract.
PubChem [27] contains more than 115 million unique chemical structures, 306 million chemical entities, 304 million biological activity data points and 204 thousand interactions between chemicals, genes, and proteins. Unlike DrugBank, which has detailed drug information, PubChem is more like ChEMBL, which focuses more on chemical information.
3.3. Gene Expression Resources
In drug synergy studies, the Gene Expression Omnibus (GEO) [28] can be queried for gene expression data, including expression microarray data, genome methylation data, and genome-protein interaction data. Lv, Y. et al. [45] used the GEO database to collect relevant gene expression and clinical data for osteosarcoma and para-cancerous tissues while investigating drug response prediction for osteosarcoma.
The Library of Integrated Network-Based Cellular Signatures (LINCS) is a database that allows comparison of cell expression profiles or other cellular processes before and after perturbation by various methods. It mainly includes the CMap-based L1000 data, the Drug Toxicity Signature Generation Center (DToxS), etc. Aissa, A.F. et al. [46] used an established preclinical model of non-small-cell lung carcinoma (NSCLC) and analyzed the identified markers with LINCS to predict and validate the function of small molecules.
Connectivity Map (CMap) [29] is a database created by the LINCS Center for Transcriptomics at the Broad Institute using the L1000 assay. It is primarily used to reveal functional connections between genes, disease states, and small molecule compounds. To reduce cost, only 978 representative landmark genes were measured directly, and the expression levels of the remaining 11,350 genes were inferred computationally. Jin L. et al. [47] proposed a CMap-based scoring framework for predicting new indications for drug combinations. In this framework, CMap provides a data-driven way to identify relationships between genes, diseases, and drugs.
3.4. Toxicity Effects Resources
The Side Effect Resource (SIDER) [30] is a database that integrates information on drugs, targets, and drug side effects. It provides a platform for users to fully understand the effects of drugs and their adverse reactions. It also provides relevant information about drug indications. Prinz, J. et al. [48] proposed a novel machine-learning approach that combines data from the SIDER and GWASdb databases into a joint matrix. The model could be used to develop treatments with fewer side effects and to test new indications for existing drugs.
TOXRIC [31] provides toxicological and feature data, machine learning (ML)-ready sub-datasets, visualization of multiple benchmarks, etc. More than 113 thousand compounds, 13 toxicity datasets, and 39 feature types are included in TOXRIC.
Tox21BodyMap [32] is an intuitive web tool that supports rapid chemical toxicity assessment and mechanism hypothesis generation. It visualizes the mapping of Tox21/ToxCast assay targets to regions of the human body, so that chemical-biological activity patterns are displayed by organ system.
3.5. Pathways Resources
KEGG Pathways [35] is a compilation of reactions and biological pathways that can be used in drug combination prediction. Like KEGG, Reactome [33] is a peer-reviewed, expert-authored database of reactions and biological pathways in the human body. Compared to KEGG, it offers improved search and data mining tools that simplify searching and studying data related to biological pathways. The library currently covers pathways in 19 species, including classical metabolic pathways, signal transduction, gene transcription regulation, and disease. In addition, it links to more than one hundred distinct online bioinformatics resources, such as NCBI, Ensembl, and UniProt. To reveal the synergistic mechanism of natural products and anti-tumor drugs in cancer therapy, Chamberlin, S.R. et al. [49] considered pathways in the Reactome database targeted by natural products. They found that coverage of Reactome pathways increased significantly relative to the pathways covered by FDA-approved cancer drugs collected in other databases, such as Cancer Targetome. Moreover, as an interactive database, Pathbank [34] provides information on associated organelles, chemical structures, subcellular compartments, protein complex quaternary structures, and more.
3.6. Interactions Resources
The therapeutic target database (TTD) [36] contains a large amount of drug-related information on drug targets and natural product sources. Currently, TTD has a collection of more than 3 thousand targets and 30 thousand targeted binding drugs. Li, P. et al. [50] used the TTD database to develop a comprehensive model that can be used to study the mechanism of the compound Danshen formula (CDF).
BindingDB [37] is a publicly accessible database that primarily collects binding affinities between drug target proteins and small drug-like molecules. Its data are collected from US patents, scientific publications, and other databases such as PubChem and ChEMBL. The Human Protein Reference Database (HPRD) [38] is the largest database of human protein interactions. The STRING database [39] incorporates known and predicted associations between proteins. This database contains more than 14 thousand organisms, 67.6 million proteins, and 20 billion interactions.
The Search Tool for Interacting Chemicals (STITCH) [40] can be used to predict interactions between chemicals and genes. It is cross-linked with databases such as BindingDB and shares protein data with STRING, a gene-association database developed by the same team. STITCH collects data from manually curated databases, including DrugBank, TTD, KEGG, Reactome, and ChEMBL. Wang T. et al. [51] used a variety of advanced computational methods to build effective predictive models. They extracted topological features for each drug combination from a drug network built from STITCH.
4. Methods in Drug Combination Prediction
Over the past several years, computational techniques have been broadly used to predict drug combinations, including traditional machine learning methods, deep learning methods, mathematical methods, systems biology methods, and search algorithms. A brief description of each method reviewed is listed in Table 2. Traditional machine learning can be applied to various feature types and achieves high prediction accuracy on databases of different scales. For a long time, traditional machine learning has been applied to improve and optimize drug discovery and design processes and has been integrated with other computational methods [52,53]. Deep learning methods can learn the complex nonlinear relationships between input attribute data (such as genomics) and the associated output (such as synergy score) [54]. Owing to their multiple processing layers, the accuracy of deep learning models improves markedly as the amount of input data grows, particularly with very large databases [55]. The key step in building a mathematical model is to collect the necessary kinetic parameters from the literature or experiments. When cellular pathways and parameterization are available, mathematical simulations can be highly accurate for combinatorial drug discovery [56]. Systems biology methods analyze the therapeutic effects of drug combinations through various biological networks, which requires extensive biological knowledge [57]. Search algorithms explore feature spaces by using the high performance of computers to purposefully enumerate some or all possible scenarios in a problem-solving space [15].
Table 2.
Methods | Algorithms | URL | Characteristics | Reference |
---|---|---|---|---|
Traditional machine learning | Support vector machine | - | Good at identifying subtle patterns in complex data sets; poor interpretability; runs slowly on large data sets. | [58] |
Traditional machine learning | Decision tree | https://github.com/Lianlian-Wu/ForSyn (accessed on 18 July 2023) | Can be displayed visually; prone to overfitting; accuracy may decrease when processing data with complex relationships. | [59] |
Traditional machine learning | Gradient boosting | - | Good at handling nonlinear relationships and high-dimensional data; prone to overfitting; hyperparameter tuning is complex. | [60] |
Deep learning methods | Feedforward neural network | - | Good at handling nonlinear relationships and high-dimensional data; prone to overfitting; poor interpretability; processing large data takes a long time. | [61] |
Deep learning methods | Autoencoder | https://github.com/qiaoliuhub/drug_combination (accessed on 19 July 2023) | Strong feature learning ability; poor interpretability. | [62] |
Deep learning methods | Graph convolutional network | https://github.com/Sinwang404/DeepDDS/tree/master (accessed on 19 July 2023) | Able to capture the relationship and topological information between the nodes in the graph; poor interpretability; robustness is of concern. | [63] |
Deep learning methods | Deep belief network | - | Performs well in supervised learning; prone to overfitting; poor interpretability. | [64] |
Mathematical methods | Network analysis | - | Able to capture complex interactions; good interpretability. | [65] |
Mathematical methods | Dynamic mathematical model | - | Able to simulate drug reactions more accurately; poor interpretability. | [15] |
Search algorithms | Breadth first search algorithm | - | Able to consider a large number of potential drug-target interactions; robustness is of concern. | [66] |
Systems biology methods | Signature-based model | https://tanlab.ucdenver.edu/kMap (accessed on 20 July 2023) | Gives a more comprehensive understanding of drug action mechanisms and influencing factors; high requirements on data quality. | [67] |
4.1. Application of Traditional Machine Learning in Drug Combination Prediction
Traditional ML methods include the Support vector machine (SVM), Decision tree (DT), and Gradient boosting (GB). SVM is often used for classification tasks, where the goal is to find a hyperplane that separates positive cases from negative cases. A Decision tree is a tree-shaped predictive model that evaluates the possible outcomes based on known probabilities of occurrence. Unlike the previous two methods, Gradient boosting is an ensemble algorithm. It trains a series of base classifiers in sequence, in some variants on subsamples of the training set, and combines them into a single predictor. These algorithms are good classifiers for identifying whether drug combinations are synergistic or antagonistic.
Support vector machine [58] is a sparse and robust classifier. SVM is very good at identifying subtle patterns in complex data sets. SVM also uses kernel functions to handle nonlinear problems efficiently. In addition, because SVM classifies by maximizing the margin, it is robust to noise and outliers. As mentioned above, the predictive model constructed by Wang T. et al. [51] uses SVM to identify the best combination of features. Their results show that the best SVM classifier they built is significantly better than one that uses only individual features, with a prediction accuracy (ACC) of 0.903 and a Matthews correlation coefficient (MCC) of 0.806. This makes sense because their classifier combines several kinds of topological information about the drugs. The model is expected to be a helpful method for predicting new drug combinations that did not exist in the training set. On large-scale data sets, however, SVM training is very slow and takes up a lot of computing resources. Moreover, SVM is poorly interpretable because its decision boundaries are determined by the support vectors rather than all training samples [68]. In general, SVM has relatively high accuracy and strong generalization ability when dealing with nonlinear data, and it can capture complex relationships in the data.
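For readers less familiar with the two metrics quoted above, the following sketch computes ACC and MCC from the four cells of a binary confusion matrix using their standard definitions. The counts in the example are hypothetical and are not taken from the cited study.

```python
from math import sqrt


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; 1 is perfect, 0 is chance level, -1 is inverted."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical test set: 100 synergistic and 100 non-synergistic combinations.
tp, tn, fp, fn = 90, 91, 9, 10
print(round(accuracy(tp, tn, fp, fn), 3), round(matthews_corrcoef(tp, tn, fp, fn), 3))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it more informative when synergistic combinations are much rarer than non-synergistic ones.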
Decision tree [59] is easy to handle and implement. Random Forest is a bagging ensemble algorithm composed of decision trees. The results of a decision tree can be displayed through its tree structure, which makes them easy to understand and visualize. Wu, L. et al. [69] built an advanced deep forest-based model, ForSyn. ForSyn is a multi-layer cascade structure with two new forest types embedded in each cascade as units. Comparing ForSyn with other advanced algorithms on several datasets, their results show that ForSyn performs better, with an area under the precision-recall curve (AUPR) of 0.591 and a recall of 0.537. Unlike traditional machine learning methods, this model addresses the problems of class imbalance and high-dimensional features. However, the model is still limited by the scale of the input data. Moreover, its ability to generalize to new anti-cancer drugs or cancer cell lines is insufficient. These intrinsic problems of drug combination prediction remain unresolved. Decision trees are prone to overfitting the training data, especially when the tree is deep or has too many leaf nodes [70]. The accuracy of decision trees mainly depends on data quality, feature selection, tree structure, and parameters. For complex datasets with complex interactions between features, the accuracy of a decision tree may decline.
Gradient boosting [60] can improve the accuracy of a given weak learning algorithm. It shows high predictive accuracy in many machine learning tasks and is particularly good at dealing with nonlinear relationships and high-dimensional data. The decision-making process behind a prediction can be examined by looking at the importance of each weak learner. Xu, Q. et al. [71] introduced a new model based on the stochastic gradient boosting algorithm called PDC-SGB. The model constructs 732-dimensional feature vectors containing biological, chemical, and pharmacological information for each drug combination. This study integrated six types of features to describe drug combinations, including molecular two-dimensional structure, structural similarity, anatomical therapeutic similarity, protein-protein interactions, chemical-chemical interactions, and disease pathways. Compared with other advanced models, this model shows better performance and feature prediction ability, with an AUC of up to 0.9775. However, the performance of the biological part of the model is relatively low, which may be due to incomplete molecular networks or biological pathways and the oversimplified characterization of biological features. In addition, gradient boosting algorithms are prone to overfitting the training set, especially when there are many weak learners. The algorithm also involves more hyperparameters than other methods, so hyperparameter tuning is more complex [72].
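The core idea of gradient boosting, fitting each new weak learner to the residual errors of the current ensemble, can be sketched without any external library. The example below is plain (non-stochastic) gradient boosting for squared-error regression with single-threshold decision stumps on one feature; it is an illustration of the general technique, not the PDC-SGB model, and the data, number of rounds, and learning rate are hypothetical choices.

```python
def fit_stump(xs, residuals):
    """Find the threshold on a 1-D feature that minimizes squared error of the residuals."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    return best[1:]  # (threshold, value on the left, value on the right)


def predict_stump(stump, x):
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of squared error
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical data: one drug-pair feature and a synergy score per sample.
XS = [1, 2, 3, 4, 5, 6, 7, 8]
YS = [-5.0, -3.0, -4.0, 0.0, 6.0, 8.0, 7.0, 10.0]
baseline = fit_boosting(XS, YS, n_rounds=0)
boosted = fit_boosting(XS, YS, n_rounds=50)
print(mse(baseline, XS, YS), mse(boosted, XS, YS))
```

The number of rounds and the learning rate are exactly the kind of hyperparameters mentioned above: more rounds keep lowering the training error, which is also why the method can overfit.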
4.2. Application of Deep Learning Methods in Drug Combination Prediction
Deep learning models mainly include the Feedforward neural network (FNN), Autoencoder (AE), Graph neural network (GNN), and Deep belief network (DBN). In a Feedforward neural network, neurons are organized in layers, and each neuron is connected only to the neurons of the previous layer. It is often used as a baseline for deep learning methods. Unlike the FNN, the autoencoder is more complex; it is an artificial neural network trained in a semi-supervised or unsupervised manner for dimensionality reduction and anomaly detection. The graph neural network uses deep learning to learn directly from graph structure by extracting and discovering its characteristics. Deep belief networks can be used not only to recognize features and classify data but also to generate data.
A feedforward neural network [61] receives the outputs of the past layer and results it to the following layer without feedback. The advantage of this model is that it has strong nonlinear modeling ability and can automatically learn complex relationships between input features. In addition, it can also improve the performance of the model by increasing the number of hidden layers and neurons. Tsai, P.L. et al. [73] proposed a multi-layer FNN with two hidden layers, which was used to predict the treatment outcome of antidepressant therapy in patients with initial treatment and first diagnosis of major depressive disorder (MDD) patients during the severe depressive stage. The first layer of the neural network is the input layer, where each unit receives a one-dimensional data vector containing patient characteristics. The final layer is the output layer that performs the classification. The evaluation results show that the model has an Area Under Curve (AUC) range from 0.7 to 0.8 and can use clinical features and peripheral biochemical characteristics to predict the outcome of antidepressant therapy. The drawback is that during the model training process, they used a small sample size and could not carry out a more detailed analysis. Deep neural networks still have insufficient mechanisms to explain the interactions between variables. Besides, it requires many training samples and a complex network structure, which is easy to overfit.
Furthermore, the training process of feedforward neural networks takes a long time, especially when dealing with large data sets [74]. The results of feedforward neural networks often lack interpretability. The accuracy of this method is affected by many factors, including data quality, feature selection, network structure and hyperparameter selection.
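To make the layer-by-layer computation concrete, the following minimal Python sketch runs a forward pass through a network with two hidden layers and a single output unit, matching the general architecture described above. It is not the model of Tsai et al. [73]: the weights in PARAMS, the three input features and all function names are hypothetical and chosen only for illustration.

```python
import math


def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, x) for x in values]


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer; weights[j] holds the input weights of unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def fnn_forward(features, params):
    """Forward pass: two hidden ReLU layers, then one sigmoid output unit."""
    h1 = relu(dense(features, params["W1"], params["b1"]))
    h2 = relu(dense(h1, params["W2"], params["b2"]))
    logit = dense(h2, params["W3"], params["b3"])[0]
    return sigmoid(logit)


# Hypothetical, hand-picked weights for three input features (illustration only).
PARAMS = {
    "W1": [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.0]],
    "b1": [0.0, 0.1],
    "W2": [[1.0, -1.0], [0.5, 0.5]],
    "b2": [0.0, 0.0],
    "W3": [[1.0, -0.5]],
    "b3": [0.0],
}

probability = fnn_forward([1.0, 2.0, 3.0], PARAMS)
print(round(probability, 4))
```

With these illustrative weights the output is a probability of about 0.39. In a real application the weights would be learned by back-propagation on labeled data rather than set by hand.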
An autoencoder [62] consists of an encoder and a decoder and is a representation learning algorithm in the general sense. It has a strong feature learning ability and can extract useful features from drug response data through unsupervised learning without the need for manually labeled information [75]. Liu, Q. et al. [76] constructed a knowledge-enabled and self-attention transformer-boosted deep learning model, TranSynergy. It includes three major components: (1) an input dimension reduction component, (2) a self-attention transformer component, and (3) an output fully connected component. Their model evaluations showed that TranSynergy outperformed state-of-the-art approaches, with the AUC and the area under the precision-recall curve (AUPR) reaching 0.908 and 0.625, respectively. As with traditional computational models, TranSynergy selected only a small number of cancer-related genes, covering drug targets and annotations, because of limited training data. In addition, too many feature dimensions can expose the model to the curse of dimensionality, resulting in overfitting. Autoencoders risk overfitting when dealing with large-scale drug response data, especially when the training set is small. Because the training process of the autoencoder is unsupervised, the features it extracts are often difficult to interpret [75,77].
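The encoder-decoder idea can be sketched with a deliberately small example. The code below trains a linear autoencoder with a single latent unit by gradient descent on the reconstruction error. It is an illustrative assumption and not the dimension reduction component of TranSynergy: the toy data, the learning rate and the function names are hypothetical.

```python
def reconstruct(x, enc, dec):
    """Encode a sample into one latent value z, then decode it back."""
    z = sum(e * xi for e, xi in zip(enc, x))
    return z, [d * z for d in dec]


def mse(data, enc, dec):
    """Mean over samples of the summed squared reconstruction error."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x, enc, dec)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)


def train(data, enc, dec, lr=0.01, epochs=500):
    """Full-batch gradient descent on the reconstruction error."""
    enc, dec = list(enc), list(dec)
    n = len(data)
    for _ in range(epochs):
        g_enc = [0.0] * len(enc)
        g_dec = [0.0] * len(dec)
        for x in data:
            z, x_hat = reconstruct(x, enc, dec)
            err = [a - b for a, b in zip(x_hat, x)]
            # Gradient of the squared error with respect to the decoder weights.
            for j in range(len(dec)):
                g_dec[j] += 2.0 * err[j] * z
            # Back-propagate through the latent value to the encoder weights.
            back = sum(2.0 * err[j] * dec[j] for j in range(len(dec)))
            for i in range(len(enc)):
                g_enc[i] += back * x[i]
        enc = [e - lr * g / n for e, g in zip(enc, g_enc)]
        dec = [d - lr * g / n for d, g in zip(dec, g_dec)]
    return enc, dec


# Toy three-feature samples that all lie along one direction, so a single
# latent unit is enough to reconstruct them (hypothetical data).
DATA = [[t * 1.0, t * 2.0, t * -1.0] for t in (-1.0, -0.5, 0.5, 1.0)]
INIT_ENC = [0.1, 0.1, 0.1]
INIT_DEC = [0.1, 0.1, 0.1]

loss_before = mse(DATA, INIT_ENC, INIT_DEC)
trained_enc, trained_dec = train(DATA, INIT_ENC, INIT_DEC)
loss_after = mse(DATA, trained_enc, trained_dec)
print(loss_before, loss_after)
```

No labels are used at any point: the only training signal is how well the input is reproduced, which is why the learned latent value is useful as a compressed feature but hard to interpret directly.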
The graph neural network [63] is a recently emerged framework. Its advantage is that it can capture the relationships and topological information between the nodes in a graph and transform the data into a low-dimensional and more discriminative feature space. In addition, graph neural networks can automatically learn the feature representation of nodes and edges [78]. Wang J. et al. [79] proposed DeepDDS, a model based on a graph neural network (GNN) and an attention mechanism. In this model, the chemical structure of the drug is represented by a graph, and the drug embeddings are calculated by these two deep learning components. By integrating genomic and drug signatures, DeepDDS can capture important information from drug chemical structures and gene expression patterns to identify synergistic drug combinations that target specific cancer cell lines. They compared DeepDDS with other deep learning methods and traditional machine learning methods on a benchmark dataset. The results show that DeepDDS performs better than the other models, with AUC, AUPR and accuracy reaching 0.93, 0.93 and 0.85, respectively. Nevertheless, DeepDDS still did not show satisfactory predictive accuracy on independent test sets, for the same reason described earlier. The main disadvantages of graph neural networks are as follows: (1) Due to the complex structure of a GNN, its training process is relatively difficult. (2) A GNN is also a black box, which makes it difficult to explain its decision-making process. (3) A GNN is vulnerable to adversarial attacks, and its robustness needs to be improved [80].
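As a rough illustration of how a graph neural network turns a molecular graph into an embedding, the sketch below performs one round of neighborhood aggregation followed by a sum readout. It is a simplified assumption: it uses plain mean aggregation with no learnable weights or attention, so it is not the DeepDDS architecture, and the three-atom graph and one-hot atom features are hypothetical.

```python
def message_passing(features, neighbors):
    """One aggregation round: average each node's vector with its neighbors'."""
    updated = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in neighbors[node]]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated


def readout(features):
    """Sum all node vectors into a single graph-level embedding."""
    return [sum(col) for col in zip(*features.values())]


# Heavy atoms of a C-C-O chain; each feature vector is one-hot [is_carbon, is_oxygen].
ATOM_FEATURES = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
BONDS = {0: [1], 1: [0, 2], 2: [1]}

node_embeddings = message_passing(ATOM_FEATURES, BONDS)
graph_embedding = readout(node_embeddings)
print(graph_embedding)
```

After one round, the middle carbon already carries information about its oxygen neighbor. Stacking several rounds, and replacing the plain average with learned weights, is what lets a trained GNN encode larger substructures.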
A deep belief network [64] trains the weights between its neurons so that the whole network generates the training data with maximum probability. Moreover, a DBN can automatically learn high-level abstract features from data through unsupervised learning and can then be fine-tuned by back-propagation through supervised learning. When only some of the data are labeled, a DBN performs well at both supervised training and feature extraction [81]. Chen, G. et al. [82] introduced a model of stacked restricted Boltzmann machines (RBMs) that can predict the response of drug combinations from gene expression, pathways, and ontology fingerprints. In their model, the training data are used in a pre-training stage, before supervised learning, to optimize the input weights by contrastive divergence. Their evaluation of the model showed an accuracy of 71.5%, a recall of 60.2%, and an F score of 65.4%. Overall, the model performed better than the groups in the DREAM competition. The RBM model also faces the problems of incomplete data and a lack of experimental data, which may degrade model performance. Moreover, a DBN is prone to overfitting when dealing with small samples, so regularization methods are needed to alleviate it [81]. A DBN can achieve high accuracy in drug response prediction, but the training data and hyperparameter selection need to be carefully considered in practical applications.
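The contrastive divergence step mentioned above can be sketched for a single restricted Boltzmann machine, the building block that a DBN stacks. The code below is a minimal illustration under stated assumptions and not the model of Chen et al. [82]: it uses one Gibbs step (CD-1), reconstruction probabilities rather than sampled visible states, and hypothetical toy data, layer sizes and learning rate.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v0, W, b, c, rng, lr=0.1):
    """One CD-1 step on a single binary sample; updates W, b and c in place."""
    ph0 = hidden_probs(v0, W, c)
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]  # sample hidden states
    pv1 = visible_probs(h0, W, b)                         # reconstruction
    ph1 = hidden_probs(pv1, W, c)
    for i in range(len(b)):
        for j in range(len(c)):
            # Positive phase (data) minus negative phase (reconstruction).
            W[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
    for i in range(len(b)):
        b[i] += lr * (v0[i] - pv1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])


rng = random.Random(0)  # fixed seed so the run is repeatable
N_VISIBLE, N_HIDDEN = 4, 2
W = [[rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)] for _ in range(N_VISIBLE)]
b = [0.0] * N_VISIBLE
c = [0.0] * N_HIDDEN

# Two hypothetical binary patterns standing in for binarized input features.
DATA = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
for _ in range(200):
    for sample in DATA:
        cd1_update(sample, W, b, c, rng)

recon = visible_probs(hidden_probs(DATA[0], W, c), W, b)
print([round(p, 2) for p in recon])
```

In a DBN, the hidden activations of one trained RBM become the input of the next, and the stacked weights then initialize a network that is fine-tuned with labeled data.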
4.3. Application of Mathematical Methods in Drug Combination Prediction
Mathematical methods include network analysis and dynamic mathematical models. Network medicine uses a systems-network perspective to understand disease mechanisms [65]. Dynamic mathematical models, in turn, study how drug combinations affect the concentrations of potential target proteins and how combination therapies can effectively control the progression of disease states [15].
The human body is composed of a rich variety of biological units, and with the advancement of bio-measurement technology, various types of disease networks have been established. Network analysis can capture the complex interactions between drugs and proteins, including physical interactions, metabolic pathways and signal transduction. It can also help interpret the predicted results, for example by analyzing critical paths and node importance in the network [83,84]. Yin N. et al. [85] explained the connection between network topology and the effect of drug combinations by displaying the interactions of drug combinations and their targets in the network. They found that the effect of drug combinations depends heavily on the network topology, and they identified motifs that could serve as a useful catalog for the rational design of drug combinations in enzyme systems. Unlike most studies on drug synergy, they considered both antagonism and synergy. Their model provides a rational and easy-to-apply approach to designing synergistic drug combinations. However, its results are based only on enzyme networks, and other types of biological networks still need to be explored. Moreover, this method requires a large amount of drug and protein interaction data, whose acquisition and collation are challenging [83,84]. As with many other methods, the accuracy of network analysis is affected by several factors, including data quality, network construction methods and prediction algorithms.
The dynamic mathematical model captures important dynamic aspects of disease treatment. It can describe the processes of drug absorption, distribution, metabolism, and excretion in organisms and thereby simulate drug responses more accurately. It also has strong predictive ability: it can predict the effect of drugs under different doses and dosing schemes and support the individualization of drug therapy [86,87]. Geva-Zatorsky, N. et al. [88] used a dynamic proteomics method to accurately track the concentrations of different proteins under different drugs. They found that the dynamics of the protein response to drug pairs can be accurately described by the responses to the individual drugs. However, dynamic mathematical models are usually built from a series of mathematical equations, and their complexity may limit their interpretability [86,87]. These models typically have high accuracy but are data-intensive, requiring more data and careful optimization of the model parameters.
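The kind of interpretation that network analysis offers, such as node importance, can be illustrated with a small example. The sketch below builds an undirected drug-protein network, computes degree centrality, and lists the proteins that two drugs have in common. The node names and edges are hypothetical, and degree centrality is only one simple importance measure, not the method of Yin et al. [85].

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


# Hypothetical drug-target and protein-protein interactions.
EDGES = [
    ("drug_A", "P1"), ("drug_A", "P2"),
    ("drug_B", "P2"), ("drug_B", "P3"),
    ("P1", "P2"), ("P2", "P3"),
]

network = build_network(EDGES)
centrality = degree_centrality(network)
shared_targets = network["drug_A"] & network["drug_B"]

print(max(centrality, key=centrality.get))
print(shared_targets)
```

In this toy network, P2 is connected to every other node and is also the only target shared by both drugs, which is the sort of topological observation that can be traced back to the input network when explaining a prediction.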
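A dynamic model of this kind is, at its simplest, an ordinary differential equation integrated over time. The sketch below simulates first-order elimination of a drug from a single compartment, dC/dt = -k C, with the explicit Euler method and compares the result with the closed-form solution C(t) = C0 exp(-k t). The one-compartment form, the parameter values and the function name are assumptions for illustration and do not come from the studies cited above.

```python
import math


def simulate_concentration(c0, k_elim, t_end, dt=0.001):
    """Integrate dC/dt = -k_elim * C from t = 0 to t_end with explicit Euler steps."""
    concentration = c0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        concentration += dt * (-k_elim * concentration)
    return concentration


# Hypothetical parameters: initial concentration 10 (arbitrary units),
# elimination rate constant 0.5 per hour, simulated for 4 hours.
C0, K_ELIM, T_END = 10.0, 0.5, 4.0

numeric = simulate_concentration(C0, K_ELIM, T_END)
analytic = C0 * math.exp(-K_ELIM * T_END)
print(numeric, analytic)
```

Realistic models add absorption and distribution terms, further compartments, and drug-drug interaction terms. Each addition introduces parameters that must be estimated from data, which is why such models are data-intensive.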
4.4. Application of Search Algorithms in Drug Combination Prediction
The breadth-first search algorithm always expands outward across the frontier between discovered and undiscovered vertices. Its applications, especially on drug structures, include finding the shortest path and the minimum distance between two points.
As one of the simplest graph search algorithms, breadth-first search is the basis of numerous graph algorithms. It can take a large number of potential drug-target interactions into account. Ji, L.S. et al. [66] investigated the immunomodulatory mechanisms of Bushen formula (BSF) combined with entecavir (ETV) in treatment-naive patients with chronic hepatitis B (CHB) and in CHB patients with a partial virological response to ETV. They demonstrated that the combination contributed to HBsAg reduction in patients with a partial response to ETV and is a potential treatment for these patients. However, the immunomodulatory mechanisms underlying BSF treatment of CHB patients remain to be explored in their study. This method is not robust enough. Moreover, because its predictions are often based on the similarity between the drug and the target, it does not consider other factors such as drug metabolism and drug delivery [89,90]. The accuracy of the model depends decisively on the feature selection, parameter selection and tuning, and the evaluation and verification methods of the search algorithm.
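The frontier-by-frontier expansion of breadth-first search, and its use for finding a shortest path, can be sketched as follows. The small drug-target network and its node names are hypothetical and serve only to illustrate the algorithm.

```python
from collections import deque


def bfs_shortest_path(adjacency, start, goal):
    """Return a shortest path (fewest edges) from start to goal, or None."""
    if start == goal:
        return [start]
    parents = {start: None}  # also serves as the set of discovered vertices
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == goal:
                # Walk back through the parents to rebuild the path.
                path = []
                current = neighbor
                while current is not None:
                    path.append(current)
                    current = parents[current]
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical undirected interaction network between two drugs.
NETWORK = {
    "drug_A": ["target_1"],
    "target_1": ["drug_A", "protein_X", "protein_Y"],
    "protein_X": ["target_1", "target_2"],
    "protein_Y": ["target_1"],
    "target_2": ["protein_X", "drug_B"],
    "drug_B": ["target_2"],
}

path = bfs_shortest_path(NETWORK, "drug_A", "drug_B")
print(path)
```

Because vertices are visited in order of increasing distance from the start, the first time the goal is discovered the path to it is guaranteed to use the fewest edges. The length of that path can serve as a simple distance between two drugs in the network.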
4.5. Application of Systems Biology Methods in Drug Combination Prediction
Systems biology methods hypothesize that drugs that are effective for a specific disease can serve as candidates for other diseases that show similar changes in gene expression. The model is suitable for rapid drug combination prediction followed by experimental verification.
The cMap database provides gene expression profiles of numerous small molecules against different cancer cell lines, which provides rich data for the signature-based model. In addition, systems biology methods can integrate large-scale biological data at different levels, such as gene expression, protein interactions and metabolic pathways, to gain a more comprehensive understanding of drug action mechanisms and influencing factors [91]. Kim J. et al. [67] developed a web-based program called K-Map. The program can uncover the complex interactions between protein kinases and their inhibitors and provide a basis for rational clinical drug use. The model links kinases to drugs using quantitative signatures of kinase inhibitor activity. In addition, it is up to date and practical, and its data are updated quarterly, which makes it even more valuable. However, this method still has some disadvantages: its accuracy is highly dependent on the quality of the input data. Moreover, the method usually relies on complex network models and algorithms, which makes the results difficult to interpret [91]. Systems biology approaches construct complex network models by integrating multiple data sources to achieve high accuracy.
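The signature-matching idea behind these methods can be sketched as a ranking of compounds by how similar their expression signature is to a query signature. The code below uses cosine similarity as one simple choice. It is an assumption for illustration, not the scoring scheme of cMap or K-Map, and the three-gene signatures and compound names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length signature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, library):
    """Sort library entries from most to least similar to the query signature."""
    scored = [(name, cosine_similarity(query, sig)) for name, sig in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical log fold changes for three genes.
QUERY_SIGNATURE = [1.0, -1.0, 0.5]
LIBRARY = {
    "compound_X": [0.9, -1.1, 0.4],   # nearly the same direction as the query
    "compound_Y": [-1.0, 1.0, -0.5],  # exactly opposite to the query
    "compound_Z": [0.0, 0.2, 1.0],    # weakly related
}

ranking = rank_by_similarity(QUERY_SIGNATURE, LIBRARY)
for name, score in ranking:
    print(name, round(score, 3))
```

Compounds at the top of the ranking share the query's pattern of expression change, while a strongly negative score marks a signature that reverses it. Which end of the ranking is of interest depends on whether the query describes a drug effect or a disease state.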
5. Discussion
In clinical therapy, personalized cancer treatment requires models that can effectively predict the effect of a drug combination and provide reasonable explanations in the face of complex molecular characteristics and noisy pharmacogenomic data. In recent years, thousands of bioactivity databases and various computational methods have been developed. This review focuses on the databases and methods used to predict the response of cell lines to drug combinations, as well as the definition of synergy, with the aim of providing cancer patients with individualized, precise treatment regimens, which may improve survival rates and survival time and help achieve precision cancer therapy.
Although many studies have already achieved strong predictive performance, many challenges remain in this research direction. Appropriate databases are crucial for better predictive performance. For instance, in some large publicly available databases, the number of cancer cell lines and drugs is insufficient to train models that generalize well. Moreover, most of these models have difficulty predicting the response of novel drugs or novel cell lines that did not appear in the training set. As mentioned in this article, the problem of class imbalance also needs to be solved, as it is another important reason for the low generalization ability of the models. Most of the models mentioned in this paper use structural, physical and chemical information about drugs, together with the expression information of cell lines, as features to predict the effect of drug combinations. The study of drug synergy should pay more attention to the biological links between drug combinations and cell lines rather than to the characteristics of each. Therefore, more omics features should be considered in the prediction process. Another issue worth exploring is whether a drug combination is synergistic or antagonistic, which is generally related to the drug dose: drug combinations are often found to be synergistic in one dose range and antagonistic in another. However, accounting for different doses when studying the effects of drug combinations is a very difficult problem. Although some studies [92] have considered the effect of drug dosage, the issue still needs to be explored more deeply to achieve the goal of precision medicine. In recent years, deep learning language models have shown great promise in drug discovery, including understanding drug-drug interactions, protein design, and engineering. For example, ProGen, a language model produced by Madani A. et al. [93], can generate protein sequences with predictable functions and can be adapted to different protein families. Another example is the application of ChatGPT in predicting and interpreting common drug-drug interactions (DDIs) [94]. With the help of ChatGPT, clinicians and patients can effectively identify potential DDI effects and make the right decisions.
Author Contributions
Conceptualization, T.H. and Y.L.; methodology, Y.P. and H.R.; data curation, Y.P. and H.R.; writing—original draft preparation, Y.P. and H.R.; writing—review and editing, L.L., T.H. and Y.L.; funding acquisition, T.H. and Y.L. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
This review did not generate new data. The information on databases and methods mentioned in this review can be found in the URL columns in Table 1 and Table 2, respectively.
Conflicts of Interest
The authors declare no conflict of interest.
Funding Statement
This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB38050200, XDA26040304), the National Key R&D Program of China (2022YFF1203202, 2018YFC2000205), and the Self-supporting Program of Guangzhou Laboratory (SRPG22-007).
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
References
1. Nair N.U., Greninger P., Zhang X.H., Friedman A.A., Amzallag A., Cortez E., Das Sahu A., Lee J.S., Dastur A., Egan R.K., et al. A landscape of response to drug combinations in non-small cell lung cancer. Nat. Commun. 2023;14:3830. doi: 10.1038/s41467-023-39528-9.
2. Barretina J., Caponigro G., Stransky N., Venkatesan K., Margolin A.A., Kim S., Wilson C.J., Lehar J., Kryukov G.V., Sonkin D., et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–607. doi: 10.1038/nature11003.
3. Gorre M.E., Mohammed M., Ellwood K., Hsu N., Paquette R., Rao P.N., Sawyers C.L. Clinical Resistance to STI-571 Cancer Therapy Caused by BCR-ABL Gene Mutation or Amplification. Science. 2001;293:876–880. doi: 10.1126/science.1062538.
4. Chang G., Roth C.B. Structure of MsbA from E. coli: A Homolog of the Multidrug Resistance ATP Binding Cassette (ABC) Transporters. Science. 2001;293:1793–1800. doi: 10.1126/science.293.5536.1793.
5. Engelman J.A., Zejnullahu K., Mitsudomi T., Song Y., Hyland C., Park J.O., Lindeman N., Gale C.-M., Zhao X., Christensen J., et al. MET Amplification Leads to Gefitinib Resistance in Lung Cancer by Activating ERBB3 Signaling. Science. 2007;316:1039–1043. doi: 10.1126/science.1141478.
6. Mokhtari R.B., Homayouni T.S., Baluch N., Morgatskaya E., Kumar S., Das B., Yeger H. Combination therapy in combating cancer. Oncotarget. 2017;8:38022–38043. doi: 10.18632/oncotarget.16723.
7. Cheng F., Liang H., Butte A.J., Eng C., Nussinov R. Personal Mutanomes Meet Modern Oncology Drug Discovery and Precision Health. Pharmacol. Rev. 2019;71:1–19. doi: 10.1124/pr.118.016253.
8. Argelaguet R., Velten B., Arnol D., Dietrich S., Zenz T., Marioni J.C., Buettner F., Huber W., Stegle O. Multi-Omics Factor Analysis-a framework for unsupervised integration of multi-omics data sets. Mol. Syst. Biol. 2018;14:e8124. doi: 10.15252/msb.20178124.
9. Cokol M., Chua H.N., Tasan M., Mutlu B., Weinstein Z.B., Suzuki Y., Nergiz M.E., Costanzo M., Baryshnikova A., Giaever G., et al. Systematic exploration of synergistic drug pairs. Mol. Syst. Biol. 2011;7:544. doi: 10.1038/msb.2011.71.
10. Morris M., Clarke D., Osimiri L., Lauffenburger D. Systematic Analysis of Quantitative Logic Model Ensembles Predicts Drug Combination Effects on Cell Signaling Networks. CPT Pharmacomet. Syst. Pharmacol. 2016;5:544–553. doi: 10.1002/psp4.12104.
11. Lee J.-H., Kim D.G., Bae T.J., Rho K., Kim J.-T., Lee J.-J., Jang Y., Kim B.C., Park K.M., Kim S. CDA: Combinatorial Drug Discovery Using Transcriptional Response Modules. PLoS ONE. 2012;7:e42573. doi: 10.1371/journal.pone.0042573.
12. Wu L., Wen Y., Leng D., Zhang Q., Dai C., Wang Z., Liu Z., Yan B., Zhang Y., Wang J., et al. Machine learning methods, databases and tools for drug combination prediction. Brief. Bioinform. 2022;23:bbab355. doi: 10.1093/bib/bbab355.
13. Loewe S. The problem of synergism and antagonism of combined drugs. Arzneimittelforschung. 1953;3:285–290.
14. Bliss C.I. The Toxicity of Poisons Applied Jointly. Ann. Appl. Biol. 1939;26:585–615. doi: 10.1111/j.1744-7348.1939.tb06990.x.
15. Vakil V., Trappe W. Drug Combinations: Mathematical Modeling and Networking Methods. Pharmaceutics. 2019;11:208. doi: 10.3390/pharmaceutics11050208.
16. Chou T.C. What is synergy? Scientist. 2007;21:15.
17. Goldoni M., Johansson C. A mathematical approach to study combined effects of toxicants in vitro: Evaluation of the Bliss independence criterion and the Loewe additivity model. Toxicol. Vitr. 2007;21:759–769. doi: 10.1016/j.tiv.2007.03.003.
18. Laskey S.B., Siliciano R.F. A mechanistic theory to explain the efficacy of antiretroviral therapy. Nat. Rev. Microbiol. 2014;12:772–780. doi: 10.1038/nrmicro3351.
19. Chevereau G., Bollenbach T. Systematic discovery of drug interaction mechanisms. Mol. Syst. Biol. 2015;11:807. doi: 10.15252/msb.20156098.
20. Tonekaboni S.A.M., Ghoraie L.S., Manem V.S.K., Haibe-Kains B. Predictive approaches for drug combination discovery in cancer. Brief. Bioinform. 2018;19:263–276. doi: 10.1093/bib/bbw104.
21. Zheng S., Aldahdooh J., Shadbahr T., Wang Y., Aldahdooh D., Bao J., Wang W., Tang J. DrugComb update: A more comprehensive drug sensitivity data repository and analysis portal. Nucleic Acids Res. 2021;49:W174–W184. doi: 10.1093/nar/gkab438.
22. Liu H., Zhang W., Zou B., Wang J., Deng Y., Deng L. DrugCombDB: A comprehensive database of drug combinations toward the discovery of combinatorial therapy. Nucleic Acids Res. 2020;48:D871–D881. doi: 10.1093/nar/gkz1007.
23. Holbeck S.L., Camalier R., Crowell J.A., Govindharajulu J.P., Hollingshead M., Anderson L.W., Polley E., Rubinstein L., Srivastava A., Wilsker D., et al. The National Cancer Institute ALMANAC: A Comprehensive Screening Resource for the Detection of Anticancer Drug Pairs with Enhanced Therapeutic Activity. Cancer Res. 2017;77:3564–3576. doi: 10.1158/0008-5472.CAN-17-0489.
24. Seo H., Tkachuk D., Ho C., Mammoliti A., Rezaie A., Tonekaboni S.A.M., Haibe-Kains B. SYNERGxDB: An integrative pharmacogenomic portal to identify synergistic drug combinations for precision oncology. Nucleic Acids Res. 2020;48:W494–W501. doi: 10.1093/nar/gkaa421.
25. Gaulton A., Bellis L.J., Bento A.P., Chambers J., Davies M., Hersey A., Light Y., McGlinchey S., Michalovich D., Al-Lazikani B., et al. ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40:D1100–D1107. doi: 10.1093/nar/gkr777.
26. Wishart D.S., Feunang Y.D., Guo A.C., Lo E.J., Marcu A., Grant J.R., Sajed T., Johnson D., Li C., Sayeeda Z., et al. DrugBank 5.0: A Major Update to the DrugBank Database for 2018. Nucleic Acids Res. 2018;46:D1074–D1082. doi: 10.1093/nar/gkx1037.
27. Kim S., Chen J., Cheng T., Gindulyte A., He J., He S., Li Q., Shoemaker B.A., Thiessen P.A., Yu B., et al. PubChem 2023 update. Nucleic Acids Res. 2023;51:D1373–D1380. doi: 10.1093/nar/gkac956.
28. Barrett T., Wilhite S.E., Ledoux P., Evangelista C., Kim I.F., Tomashevsky M., Marshall K.A., Phillippy K.H., Sherman P.M., Holko M., et al. NCBI GEO: Archive for functional genomics data sets—Update. Nucleic Acids Res. 2013;41:D991–D995. doi: 10.1093/nar/gks1193.
29. Lamb J., Crawford E.D., Peck D., Modell J.W., Blat I.C., Wrobel M.J., Lerner J., Brunet J.-P., Subramanian A., Ross K.N., et al. The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease. Science. 2006;313:1929–1935. doi: 10.1126/science.1132939.
30. Kuhn M., Letunic I., Jensen L.J., Bork P. The SIDER database of drugs and side effects. Nucleic Acids Res. 2016;44:D1075–D1079. doi: 10.1093/nar/gkv1075.
31. Wu L., Yan B., Han J., Li R., Xiao J., He S., Bo X. TOXRIC: A comprehensive database of toxicological data and benchmarks. Nucleic Acids Res. 2023;51:D1432–D1445. doi: 10.1093/nar/gkac1074.
32. Alexandre B., Auerbach S.S., Houck K.A., Kleinstreuer N.C. Tox21BodyMap: A webtool to map chemical effects on the human body. Nucleic Acids Res. 2020;48:W472–W476. doi: 10.1093/nar/gkaa433.
33. Jassal B., Matthews L., Viteri G., Gong C., Lorente P., Fabregat A., Sidiropoulos K., Cook J., Gillespie M., Haw R., et al. The reactome pathway knowledgebase. Nucleic Acids Res. 2020;48:D498–D503. doi: 10.1093/nar/gkz1031.
34. Wishart D.S., Li C., Marcu A., Badran H., Pon A., Budinski Z., Patron J., Lipton D., Cao X., Oler E., et al. PathBank: A comprehensive pathway database for model organisms. Nucleic Acids Res. 2020;48:D470–D478. doi: 10.1093/nar/gkz861.
35. Kanehisa M., Furumichi M., Sato Y., Ishiguro-Watanabe M., Tanabe M. KEGG: Integrating viruses and cellular organisms. Nucleic Acids Res. 2021;49:D545–D551. doi: 10.1093/nar/gkaa970.
36. Zhou Y., Zhang Y., Lian X., Li F., Wang C., Zhu F., Qiu Y., Chen Y. Therapeutic target database update 2022: Facilitating drug discovery with enriched comparative data of targeted agents. Nucleic Acids Res. 2022;50:D1398–D1407. doi: 10.1093/nar/gkab953.
37. Gilson M.K., Liu T., Baitaluk M., Nicola G., Hwang L., Chong J. BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res. 2016;44:D1045–D1053. doi: 10.1093/nar/gkv1072.
38. Prasad T.S.K., Goel R., Kandasamy K., Keerthikumar S., Kumar S., Mathivanan S., Telikicherla D., Raju R., Shafreen B., Venugopal A., et al. Human Protein Reference Database—2009 update. Nucleic Acids Res. 2009;37:D767–D772. doi: 10.1093/nar/gkn892.
39. Szklarczyk D., Gable A.L., Nastou K.C., Lyon D., Kirsch R., Pyysalo S., Doncheva N.T., Legeay M., Fang T., Bork P., et al. The STRING database in 2021: Customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 2021;49:D605–D612. doi: 10.1093/nar/gkaa1074.
40. Szklarczyk D., Santos A., von Mering C., Jensen L.J., Bork P., Kuhn M. STITCH 5: Augmenting protein-chemical interaction networks with tissue and affinity data. Nucleic Acids Res. 2016;44:D380–D384. doi: 10.1093/nar/gkv1277.
41. Kuru H.I., Tastan O., Cicek A.E. MatchMaker: A Deep Learning Framework for Drug Synergy Prediction. IEEE/ACM Trans. Comput. Biol. Bioinform. 2022;19:2334–2344. doi: 10.1109/TCBB.2021.3086702.
42. Sidorov P., Naulaerts S., Ariey-Bonnet J., Pasquier E., Ballester P.J. Predicting Synergism of Cancer Drug Combinations Using NCI-ALMANAC Data. Front. Chem. 2019;7:509. doi: 10.3389/fchem.2019.00509.
43. Ye Z., Chen F., Zeng J., Gao J., Zhang M.Q. ScaffComb: A Phenotype-Based Framework for Drug Combination Virtual Screening in Large-Scale Chemical Datasets. Adv. Sci. 2021;8:e2102092. doi: 10.1002/advs.202102092.
44. Ke J., Li M.T., Huo Y.J., Cheng Y.Q., Guo S.F., Wu Y., Zhang L., Ma J., Liu A.J., Han Y. The Synergistic Effect of Ginkgo biloba Extract 50 and Aspirin Against Platelet Aggregation. Drug Des. Dev. Ther. 2021;15:3543–3560. doi: 10.2147/DDDT.S318515.
45. Lv Y., Wu L., Jian H., Zhang C., Lou Y., Kang Y., Hou M., Li Z., Li X., Sun B., et al. Identification and characterization of aging/senescence-induced genes in osteosarcoma and predicting clinical prognosis. Front. Immunol. 2022;13:997765. doi: 10.3389/fimmu.2022.997765.
46. Aissa A.F., Islam A., Ariss M.M., Go C.C., Rader A.E., Conrardy R.D., Gajda A.M., Rubio-Perez C., Valyi-Nagy K., Pasquinelli M., et al. Single-cell transcriptional changes associated with drug tolerance and response to combination therapies in cancer. Nat. Commun. 2021;12:1628. doi: 10.1038/s41467-021-21884-z.
47. Jin L., Tu J., Jia J., An W., Tan H., Cui Q., Li Z. Drug-repurposing identified the combination of Trolox C and Cytisine for the treatment of type 2 diabetes. J. Transl. Med. 2014;12:153. doi: 10.1186/1479-5876-12-153.
48. Prinz J., Koohi-Moghadam M., Sun H., Kocher J.A., Wang J. Novel Neural Network Approach to Predict Drug-Target Interactions Based on Drug Side Effects and Genome-Wide Association Studies. Hum. Hered. 2018;83:79–91. doi: 10.1159/000492574.
- 49.Chamberlin S.R., Blucher A., Wu G., Shinto L., Choonoo G., Kulesz-Martin M., McWeeney S. Natural Product Target Network Reveals Potential for Cancer Combination Therapies. Front. Pharmacol. 2019;10:557. doi: 10.3389/fphar.2019.00557. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Li P., Chen J., Wang J., Zhou W., Wang X., Li B., Tao W., Wang W., Wang Y., Yang L. Systems pharmacology strategies for drug discovery and combination with applications to cardiovascular diseases. J. Ethnopharmacol. 2014;151:93–107. doi: 10.1016/j.jep.2013.07.001. [DOI] [PubMed] [Google Scholar]
- 51.Wang T., Chen L., Zhao X. Prediction of Drug Combinations with a Network Embedding Method. Comb. Chem. High Throughput Screen. 2018;21:789–797. doi: 10.2174/1386207322666181226170140. [DOI] [PubMed] [Google Scholar]
- 52.Gertrudes J.C., Maltarollo V.G., Silva R.A., Oliveira P.R., Honorio K.M., da Silva A.B. Machine Learning Techniques and Drug Design. Curr. Med. Chem. 2012;19:4289–4297. doi: 10.2174/092986712802884259. [DOI] [PubMed] [Google Scholar]
- 53.Bajorath J., Kearnes S., Walters W.P., Meanwell N.A., Georg G.I., Wang S. Artificial Intelligence in Drug Discovery: Into the Great Wide Open. J. Med. Chem. 2020;63:8651–8652. doi: 10.1021/acs.jmedchem.0c01077. [DOI] [PubMed] [Google Scholar]
- 54.Talevi A., Morales J.F., Hather G., Podichetty J.T., Kim S., Bloomingdale P.C., Kim S., Burton J., Brown J.D., Winterstein A.G., et al. Machine Learning in Drug Discovery and Development Part 1: A Primer. CPT Pharmacomet. Syst. Pharmacol. 2020;9:129–142. doi: 10.1002/psp4.12491. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Tsigelny I.F. Artificial intelligence in drug combination therapy. Brief. Bioinform. 2019;20:1434–1448. doi: 10.1093/bib/bby004. [DOI] [PubMed] [Google Scholar]
- 56.Sheng Z., Sun Y., Yin Z., Tang K., Cao Z. Advances in computational approaches in identifying synergistic drug combinations. Brief. Bioinform. 2018;19:1172–1182. doi: 10.1093/bib/bbx047. [DOI] [PubMed] [Google Scholar]
- 57.Ryall K.A., Tan A.C. Systems biology approaches for advancing the discovery of effective drug combinations. J. Cheminform. 2015;7:7. doi: 10.1186/s13321-015-0055-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Noble W.S. What is a support vector machine? Nat. Biotechnol. 2006;24:1565–1567. doi: 10.1038/nbt1206-1565. [DOI] [PubMed] [Google Scholar]
- 59.Safavian S.R., Landgrebe D. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 1991;21:660–674. doi: 10.1109/21.97458. [DOI] [Google Scholar]
- 60.Friedman J.H. Stochastic gradient boosting. Comput. Stat. Data Anal. 2002;38:367–378. doi: 10.1016/S0167-9473(01)00065-2. [DOI] [Google Scholar]
- 61.Schmidhuber J. Deep Learning in Neural Networks: An Overview. Neural Netw. 2015;61:85–117. doi: 10.1016/j.neunet.2014.09.003. [DOI] [PubMed] [Google Scholar]
- 62.Kramer M.A. Nonlinear principal component analysis using autoassociative neural networks. AIChE J. 1991;37:233–243. doi: 10.1002/aic.690370209. [DOI] [Google Scholar]
- 63.Jiang P., Huang S., Fu Z., Sun Z., Lakowski T.M., Hu P. Deep graph embedding for prioritizing synergistic anticancer drug combinations. Comput. Struct. Biotechnol. J. 2020;18:427–438. doi: 10.1016/j.csbj.2020.02.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Movahedi F., Coyle J.L., Sejdic E. Deep Belief Networks for Electroencephalography: A Review of Recent Contributions and Future Outlooks. IEEE J. Biomed. Health Inform. 2018;22:642–652. doi: 10.1109/JBHI.2017.2727218. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65. Barabasi A.-L., Gulbahce N., Loscalzo J. Network medicine: A network-based approach to human disease. Nat. Rev. Genet. 2011;12:56–68. doi: 10.1038/nrg2918.
- 66. Ji L.S., Gao Q.T., Guo R.W., Zhang X., Zhou Z.H., Yu Z., Zhu X.J., Gao Y.T., Sun X.H., Gao Y.Q., et al. Immunomodulatory Effects of Combination Therapy with Bushen Formula plus Entecavir for Chronic Hepatitis B Patients. J. Immunol. Res. 2019;2019:8983903. doi: 10.1155/2019/8983903.
- 67. Kim J., Yoo M., Kang J., Tan A.C. K-Map: Connecting kinases with therapeutics for drug repurposing and development. Hum. Genom. 2013;7:20. doi: 10.1186/1479-7364-7-20.
- 68. Huang S., Cai N., Pacheco P.P., Narandes S., Wang Y., Xu W. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics. Cancer Genom. Proteom. 2018;15:41–51. doi: 10.21873/cgp.20063.
- 69. Wu L., Gao J., Zhang Y., Sui B., Wen Y., Wu Q., Liu K., He S., Bo X. A hybrid deep forest-based method for predicting synergistic drug combinations. Cell Rep. Methods. 2023;3:100411. doi: 10.1016/j.crmeth.2023.100411.
- 70. Galal A., Talal M., Moustafa A. Applications of machine learning in metabolomics: Disease modeling and classification. Front. Genet. 2022;13:1017340. doi: 10.3389/fgene.2022.1017340.
- 71. Xu Q., Xiong Y., Dai H., Kumari K.M., Xu Q., Ou H.Y., Wei D.-Q. PDC-SGB: Prediction of effective drug combinations using a stochastic gradient boosting algorithm. J. Theor. Biol. 2017;417:1–7. doi: 10.1016/j.jtbi.2017.01.019.
- 72. Li K., Yao S., Zhang Z., Cao B., Wilson C.M., Kalos D., Kuan P.F., Zhu R., Wang X. Efficient gradient boosting for prognostic biomarker discovery. Bioinformatics. 2022;38:1631–1638. doi: 10.1093/bioinformatics/btab869.
- 73. Tsai P.-L., Chang H.H., Chen P.S. Predicting the Treatment Outcomes of Antidepressants Using a Deep Neural Network of Deep Learning in Drug-Naïve Major Depressive Patients. J. Pers. Med. 2022;12:693. doi: 10.3390/jpm12050693.
- 74. Kriegeskorte N., Golan T. Neural network models and deep learning. Curr. Biol. 2019;29:R231–R236. doi: 10.1016/j.cub.2019.02.034.
- 75. Wang D., Gu J. VASC: Dimension Reduction and Visualization of Single-cell RNA-seq Data by Deep Variational Autoencoder. Genom. Proteom. Bioinform. 2018;16:320–331. doi: 10.1016/j.gpb.2018.08.003.
- 76. Liu Q., Xie L. TranSynergy: Mechanism-driven interpretable deep neural network for the synergistic prediction and pathway deconvolution of drug combinations. PLOS Comput. Biol. 2021;17:e1008653. doi: 10.1371/journal.pcbi.1008653.
- 77. Xu J., Xu J., Meng Y., Lu C., Cai L., Zeng X., Nussinov R., Cheng F. Graph embedding and Gaussian mixture variational autoencoder network for end-to-end analysis of single-cell RNA sequencing data. Cell Rep. Methods. 2023;3:100382. doi: 10.1016/j.crmeth.2022.100382.
- 78. Zhang Z., Chen L., Zhong F., Wang D., Jiang J., Zhang S., Jiang H., Zheng M., Li X. Graph neural network approaches for drug-target interactions. Curr. Opin. Struct. Biol. 2022;73:102327. doi: 10.1016/j.sbi.2021.102327.
- 79. Wang J., Liu X., Shen S., Deng L., Liu H. DeepDDS: Deep graph neural network with attention mechanism to predict synergistic drug combinations. Brief. Bioinform. 2022;23:bbab390. doi: 10.1093/bib/bbab390.
- 80. Zhou J., Cui G., Hu S., Zhang Z., Yang C., Liu Z., Wang L., Li C., Sun M. Graph neural networks: A review of methods and applications. AI Open. 2020;1:57–81. doi: 10.1016/j.aiopen.2021.01.001.
- 81. Ding Y., Wang F., Lei X., Liao B., Wu F.X. Deep belief network-Based Matrix Factorization Model for MicroRNA-Disease Associations Prediction. Evol. Bioinform. Online. 2020;16:1176934320919707. doi: 10.1177/1176934320919707.
- 82. Chen G., Tsoi A., Xu H., Zheng W.J. Predict effective drug combination by deep belief network and ontology fingerprints. J. Biomed. Inform. 2018;85:149–154. doi: 10.1016/j.jbi.2018.07.024.
- 83. Chen X., Yan C.C., Zhang X., You Z.-H. Long non-coding RNAs and complex diseases: From experimental results to computational models. Brief. Bioinform. 2017;18:558–576. doi: 10.1093/bib/bbw060.
- 84. Sun J., Shi H., Wang Z., Zhang C., Liu L., Wang L., He W., Hao D., Liu S., Zhou M. Inferring novel lncRNA–disease associations based on a random walk model of a lncRNA functional similarity network. Mol. Biosyst. 2014;10:2074–2081. doi: 10.1039/C3MB70608G.
- 85. Yin N., Ma W., Pei J., Ouyang Q., Tang C., Lai L. Synergistic and Antagonistic Drug Combinations Depend on Network Topology. PLoS ONE. 2014;9:e93960. doi: 10.1371/journal.pone.0093960.
- 86. Kaschek D., Sharanek A., Guillouzo A., Timmer J., Weaver R.J. A dynamic mathematical model of bile acid clearance in HepaRG cells. Toxicol. Sci. 2018;161:48–57. doi: 10.1093/toxsci/kfx199.
- 87. Cohen A.A., Geva-Zatorsky N., Eden E., Frenkel-Morgenstern M., Issaeva I., Sigal A., Milo R., Cohen-Saidon C., Liron Y., Kam Z., et al. Dynamic Proteomics of Individual Cancer Cells in Response to a Drug. Science. 2008;322:1511–1516. doi: 10.1126/science.1160165.
- 88. Geva-Zatorsky N., Dekel E., Cohen A.A., Danon T., Cohen L., Alon U. Protein Dynamics in Drug Combinations: A Linear Superposition of Individual-Drug Responses. Cell. 2010;140:643–651. doi: 10.1016/j.cell.2010.02.011.
- 89. Wong P.K., Yu F., Shahangian A., Cheng G., Sun R., Ho C.-M. Closed-loop control of cellular functions using combinatory drugs guided by a stochastic search algorithm. Proc. Natl. Acad. Sci. USA. 2008;105:5105–5110. doi: 10.1073/pnas.0800823105.
- 90. Zinner R.G., Barrett B.L., Popova E., Damien P., Volgin A.Y., Gelovani J.G., Lotan R., Tran H.T., Pisano C., Mills G.B., et al. Algorithmic guided screening of drug combinations of arbitrary size for activity against cancer cells. Mol. Cancer Ther. 2009;8:521–532. doi: 10.1158/1535-7163.MCT-08-0937.
- 91. Feala J.D., Cortes J., Duxbury P.M., Piermarocchi C., McCulloch A.D., Paternostro G. Systems approaches and algorithms for discovery of combinatorial therapies. Wiley Interdiscip. Rev. Syst. Biol. Med. 2010;2:181–193. doi: 10.1002/wsbm.51.
- 92. Julkunen H., Cichonska A., Gautam P., Szedmak S., Douat J., Pahikkala T., Aittokallio T., Rousu J. Leveraging multi-way interactions for systematic prediction of pre-clinical drug combination effects. Nat. Commun. 2020;11:6136. doi: 10.1038/s41467-020-19950-z.
- 93. Madani A., Krause B., Greene E.R., Subramanian S., Mohr B.P., Holton J.M., Olmos J.L., Jr., Xiong C., Sun Z.Z., Socher R., et al. Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 2023;41:1099–1106. doi: 10.1038/s41587-022-01618-2.
- 94. Juhi A., Pipil N., Santra S., Mondal S., Behera J.K., Mondal H. The Capability of ChatGPT in Predicting and Explaining Common Drug-Drug Interactions. Cureus. 2023;15:e36272. doi: 10.7759/cureus.36272.
Data Availability Statement
This review did not generate new data. Information on the databases and methods discussed in this review can be found in the URL columns of Table 1 and Table 2, respectively.