Journal of Advanced Research. 2023 Dec 2;65:329–343. doi: 10.1016/j.jare.2023.11.035

An interpretable artificial intelligence framework for designing synthetic lethality-based anti-cancer combination therapies

Jing Wang a,1, Yuqi Wen b,1, Yixin Zhang b,1, Zhongming Wang c, Yuyang Jiang c, Chong Dai d, Lianlian Wu c, Dongjin Leng b, Song He b, Xiaochen Bo b
PMCID: PMC11519055  PMID: 38043609

Graphical abstract

KDDSL opens the black box of the neural network to achieve interpretable prediction of SL (the two shields represent two SL genes). This can help us utilize SL to kill cancer cells.


Keywords: Synthetic lethality, Combination therapy, Knowledge-driven, Data-driven, Artificial intelligence

Highlights

  • Synthetic lethality offers a promising means for expanding drug target space for cancer treatment.

  • We developed a knowledge and data dual-driven AI framework for synthetic lethality prediction (KDDSL).

  • KDDSL is a “visible” AI framework that integrates domain knowledge to explore synthetic lethality and its implications in cancer treatment. This “visible” framework enhances the trust between biologists and AI.

  • KDDSL can help in identifying SL interactions and guiding the design of synergistic anti-cancer combination therapies.

  • By applying KDDSL, we identified the combination of MDM2 and CDK9 inhibitors, which demonstrated significant anticancer effects in vitro and in vivo.

Abstract

Introduction

Synthetic lethality (SL) provides an opportunity to leverage different genetic interactions when designing synergistic combination therapies. To further explore SL-based combination therapies for cancer treatment, it is important to identify and mechanistically characterize more SL interactions. Artificial intelligence (AI) methods have recently been proposed for SL prediction, but the results of these models are often not interpretable such that deriving the underlying mechanism can be challenging.

Objectives

This study aims to develop an interpretable AI framework for SL prediction and subsequently utilize it to design SL-based synergistic combination therapies.

Methods

We propose a knowledge and data dual-driven AI framework for SL prediction (KDDSL). Specifically, we use gene knowledge related to the SL mechanism to guide the construction of the model and develop a method to identify the most relevant gene knowledge for the predicted results.

Results

Experimental and literature-based validation confirmed a good balance between predictive and interpretable ability when using KDDSL. Moreover, we demonstrated that KDDSL could help to discover promising drug combinations and clarify associated biological processes, such as the combination of MDM2 and CDK9 inhibitors, which exhibited significant anti-cancer effects in vitro and in vivo.

Conclusion

These data underscore the potential of KDDSL to guide SL-based combination therapy design. There is a need for biomedicine-focused AI strategies to combine rational biological knowledge with developed models.

Introduction

Most targeted anti-cancer treatments currently center around the inhibition of specific oncogenes, but not all tumors harbor targetable oncogenes and the majority of cancer patients lack access to any appropriate targeted therapies [1], [2]. Relative to conventional mono-genetic targeted therapy strategies, synthetic lethality (SL)-based strategies expand the spectrum of druggable targets by taking interactions among genes into account. SL occurs when inhibiting one gene in isolation has a relatively limited impact on cellular viability, whereas simultaneously inhibiting two genes results in pronounced cell death [3]. Studies of SL have shown that it can be explained in part by two genes that participate in parallel biological processes or in a single essential process such that their combined inhibition is ultimately lethal [4], [5]. One particularly well-characterized SL gene interaction is that between PARP and BRCA [6], [7]. These two genes function in parallel redundant DNA repair biological processes, but their simultaneous inhibition results in irreparable DNA damage and cell death [5]. In light of these results, PARP inhibitors have received clinical approval for the treatment of breast cancer patients harboring loss-of-function BRCA mutations [8], [9].

Despite the clinical success of PARP inhibitors, only 3–4 % of women with breast cancer carry the requisite BRCA mutations, and the value of PARP inhibition in non-BRCA-mutated cancers has been limited [10]. Moreover, a single gene can also have a range of mutation-related variants, and their effects can be variable [11]. As such, there is a clear need to improve on the current SL-based drug discovery paradigm (Fig. 1a), which focuses on leveraging extant loss-of-function mutations in SL gene A to develop inhibitors against SL gene B. Accordingly, some researchers have proposed the implementation of “induced SL” strategies wherein treatment with one drug mimics the effects of an inactivating mutation in SL gene A, rendering tumor cells vulnerable to treatment with a second drug that inhibits SL gene B, thus enabling the design of a range of SL-based combination treatment regimens [10], [12], [13], [14], [15], [16]. These strategies can be classified into two types: gene-level SL and biological process (BP)-level SL (Fig. 1b). In gene-level SL approaches, drug combinations can simultaneously inhibit two SL gene targets [10], [13], [14], [17]. For example, combined EGFR and PARP inhibition has been successful as a means of treating breast cancer (ClinicalTrials.gov, NCT02158507) [10]. In BP-level SL strategies, drugs or therapies that mimic the effects of SL partner genes of gene B on a particular BP can be combined with a second drug that mimics the effects of inhibiting SL gene B. For example, a range of DNA damage-inducing drugs and treatments, including radiation therapy and platinum-based chemotherapy, have been shown to synergize with PARP inhibitors [12], [18]. This is because PARP related SL interactions are associated with DNA damage repair mechanism.

Fig. 1.


The drug discovery paradigm based on SL. a Conventional drug discovery paradigm based on SL. Loss-of-function mutation of gene A and drug inhibition of gene B lead to cell death. b Synergistic drug combination discovery paradigm based on induced SL. It can be divided into two strategies: (1) Gene-level induced SL (top). Drug combinations that inhibit the two SL genes, respectively, can lead to cell death. (2) BP (biological process)-level induced SL (bottom). The specific BP related to the mechanism of gene B-involved SL interactions is considered the primary drug target, followed by the inhibition of gene B to achieve a synergistic inhibitory effect on tumor cells.

Most studies to date have focused on developing drugs that synergize with PARP inhibitors based on gene- or BP-level SL interactions. To expand the scope of induced SL-based combination cancer treatments, there is a pressing need to comprehensively characterize more SL interactions. The advent of CRISPR/Cas9 and RNA interference (RNAi) technologies has provided the opportunity to systematically map the SL networks in human cancer cells. However, these lab-based approaches to identifying SL interactions are expensive and time-consuming, and using them to screen the combinatorially exploding number of gene combinations is not realistic [19]. The recent accumulation of biological data and the rapid development of artificial intelligence (AI) technologies have led researchers to propose data-driven AI methods to reduce the SL search space of lab-based screening methods. Several SL prediction studies have successfully leveraged data-driven AI methods [20], [21], [22], [23]. Even so, these methods generally do not explain their decision process [24], providing researchers with few clues regarding the potential mechanisms underlying SL phenotypes. This is problematic given that understanding these underlying mechanisms is critical to enabling the further design of SL-based combination treatment regimens.

Due to the rapid accumulation of prior domain knowledge and biological data, researchers have recently developed novel knowledge and data dual-driven AI platforms that combine conventional neural networks and biological knowledge [25], [26], [27], [28]. These models have the following advantages: (1) Enhanced human understanding. Unlike conventional black-box AI models, where neurons are abstract entities, these models represent neural networks using biological knowledge. The biological knowledge enables the model to be interpretable and “visible”, allowing biologists to gain a deeper understanding of its workings. (2) Improved model performance. Incorporating biological knowledge can reduce the number of parameters while enhancing overall model performance. (3) Mechanistic insights. These models enable the exploration of mechanistic information by identifying key biological knowledge. Given the above advantages, knowledge and data dual-driven AI models have shown promising potential when predicting yeast phenotype [28] and drug response [27]. To further explore SL for cancer treatment, it is challenging to develop a knowledge and data dual-driven AI framework that can capture both effective SL interactions and potential biological mechanism to assist in designing anti-cancer combination therapies.

In this study, we developed a knowledge and data dual-driven AI framework for SL prediction (KDDSL). Since the impact of gene perturbations on cell phenotype is achieved through multi-scale biological processes, incorporating biological process (BP) knowledge can help uncover the underlying mechanisms of SL [5]. Therefore, BP knowledge serves as a fundamental component in KDDSL. In addition, we used 978 landmark gene expression profile data [29] as the input feature. This gene expression feature can capture the information of cell-line specific genetic background more comprehensively. It is worth noting that SL interactions are highly dependent on cellular context [30], [31]; therefore, we trained KDDSL with cell-line specific feature and label data. Furthermore, to identify the most relevant BP knowledge driving each sample’s prediction, we defined an importance score by introducing a feed-forward neural network (FNN) for each BP term. The information learning capacity of each BP term is characterized accordingly and serves as the basis of each sample’s explanation. A global explanation across multiple SL samples is then derived from the per-sample explanations. After initially evaluating the predictive capacity and interpretability of this model, we demonstrated that it was able to reveal the mechanisms underlying SL. Subsequent in vitro and in vivo experiments were then conducted to verify the ability of this KDDSL model to guide the design of synergistic drug combinations. Specifically, we have identified a promising synergistic drug combination of MDM2 and CDK9 inhibitors, which could potentially facilitate their clinical application.

Materials and methods

Data processing

Label data. This study employed four label datasets: an A549 dataset comprised of SL samples screened in A549 cells, a K562 dataset comprised of SL samples screened in K562 cells, a MIAPaCa-2 dataset comprised of SL samples screened in MIAPaCa-2 cells, and a cell line-nonspecific Global dataset consisting of samples screened in a range of human cell lines. Samples in the A549 dataset were derived from SynLethDB [32], which provides SL and non-SL interactions in this human cell line, and from CRISPR double-knockout experiments [30], [33], [34], with scores [35] having been employed to detect negative and positive SL samples as detailed previously by Wan et al. [20]. In total, the A549 dataset contained 567 positive SL samples and 1080 negative SL samples. The positive samples in the K562 and MIAPaCa-2 datasets were retrieved from SynLethDB. The positive samples in the Global dataset were also retrieved from SynLethDB, which provides an SL statistic score that can be used to gauge the reliability of specific SL interactions. The maximum SL statistic score for SL interactions from computational predictions was adopted as a threshold value in this study. SL interactions with SL statistic scores above this threshold were regarded as positive samples. Negative samples consisted of random combinations of the genes involved in positive samples for these three datasets. Ultimately, the K562 dataset consisted of 1711 positive SL samples and 1711 negative SL samples. The MIAPaCa-2 dataset consisted of 545 positive SL samples and 545 negative SL samples. The Global dataset consisted of 6847 positive SL samples and 6847 negative SL samples.
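The negative sampling step described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `sample_negative_pairs` is hypothetical, and the exclusion of known positive pairs from the negatives is an assumption that the text does not state explicitly.

```python
import random

def sample_negative_pairs(positive_pairs, n_negatives, seed=0):
    """Draw random gene pairs from the genes seen in positive samples.

    Assumption (not stated in the text): known positive pairs and
    self-pairs are excluded, and pairs are treated as unordered.
    """
    rng = random.Random(seed)
    positives = {frozenset(p) for p in positive_pairs}
    genes = sorted({g for pair in positive_pairs for g in pair})
    # Number of distinct unordered pairs that are not known positives.
    max_pairs = len(genes) * (len(genes) - 1) // 2 - len(positives)
    if n_negatives > max_pairs:
        raise ValueError("not enough distinct gene pairs to sample from")
    negatives = set()
    while len(negatives) < n_negatives:
        x, y = rng.sample(genes, 2)  # two different genes
        pair = frozenset((x, y))
        if pair not in positives:
            negatives.add(pair)
    return sorted(tuple(sorted(p)) for p in negatives)

# Toy example with placeholder gene names.
toy_positives = [("A", "B"), ("C", "D"), ("A", "C")]
toy_negatives = sample_negative_pairs(toy_positives, 3)
```

Sampling the same number of negatives as positives reproduces the balanced 1:1 ratio reported for the K562, MIAPaCa-2, and Global datasets.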

Feature data. Gene knockdown-induced transcriptomic data from the LINCS L1000 project [29], which quantified the expression of 978 landmark genes under both control and perturbed (knockdown or compound treatment) conditions in a range of human cell lines, were used for feature extraction. Wan et al. [20] revealed the ability of the LINCS L1000 project to provide effective features for SL predictions, and gene features were thus constructed using 978-dimensional gene expression profiles under the conditions of the perturbation of a specific target gene. Specifically, gene expression profiles under conditions of gene knockdown perturbation were obtained from the Level 5 data in the LINCS L1000 project, which provides 978-dimensional z-scores for these gene expression profiles under gene knockdown conditions in several human cell lines. Given that knockdown experiments for a particular gene may have been performed under different conditions, gene expression profiles for some of these genes have the potential not to be unique. The moderated z-score (MODZ) method was therefore utilized to acquire Level 5 consensus replicate signatures from LINCS L1000 project Level 4 data [29], providing unique expression profiles for each gene in each included cell line. The expression profiles in A549 cells were used as the gene expression profile for the A549 dataset. The MODZ approach was further applied to the gene expression profiles of all cell lines to generate the unique expression profiles for each gene on the other three datasets.

Gene ontology preparation

Gene Ontology (GO) biological process terms were used to guide the KDDSL architecture. The curated literature-based GO database compiles annotations relating to the characteristics and functions of specific genes, including specific biological process, molecular function, and cellular component GO terms [36]. Each biological process term was defined as a model subsystem. Subsystem filtering was performed using the following criteria:

  1. To avoid the potential for inaccuracy or circularity, subsystems inferred by genetic interaction were removed, with the only retained subsystems being those coded with “part_of” and “is_a” in the GO database.

  2. Any subsystems containing fewer than 5 related genes were excluded. Related genes refer to the genes that are directly or indirectly connected to the subsystem.

  3. Any subsystems containing a minimum of 5 related genes more than any upstream subsystem were retained.

  4. The maximal network depth was confined to 4 layers to mitigate the vanishing gradient problem, with the bottom 3 layers and the top layer (biological process subsystem) being retained.

  5. When subsystems were deleted, all upstream and downstream subsystems were connected to ensure network integrity. When this reorganization resulted in the formation of shortcut connections, they were removed from the network. For example, if the resultant network contained both A-B-C and A-C connections, the latter would be deleted.

Based on these criteria, 1133 subsystems were ultimately retained in the final reorganized network and used to guide the KDDSL architecture.
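The shortcut removal in the last criterion amounts to dropping any direct connection that is already implied by a longer path. A minimal sketch, assuming the hierarchy is an acyclic list of directed `(from, to)` edges; the function name `remove_shortcuts` and the edge representation are illustrative choices, not taken from the paper:

```python
def remove_shortcuts(edges):
    """Drop edge (a, c) when c is also reachable from a via a longer path."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)

    def reachable_without(src, dst, skipped):
        # Depth-first search from src to dst that ignores the direct edge `skipped`.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            for nxt in children.get(node, ()):
                if (node, nxt) == skipped or nxt in seen:
                    continue
                if nxt == dst:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    return [e for e in edges if not reachable_without(e[0], e[1], e)]

# The example from the text: A-B-C plus the shortcut A-C.
pruned = remove_shortcuts([("A", "B"), ("B", "C"), ("A", "C")])
```

Here `pruned` keeps A-B and B-C and drops A-C, matching the example given in criterion 5.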

The framework of KDDSL

By constructing a hierarchical knowledge-embedded deep network, KDDSL was able to extract gene pair features. The functional state of each subsystem in KDDSL was represented by k neurons (from 2 to 10 per subsystem). The state of the subsystem t was determined based on directly connected genes and upstream subsystems in two forward propagation modules (gene-subsystem and subsystem-subsystem). Genes directly connected with subsystem t were treated as a gene set for gene-subsystem forward propagation. The directly connected gene vector was obtained by concatenating the expression profile values in the gene set of the SL genes x and y. For subsystem-subsystem forward propagation, the subsystem vector was obtained via the concatenation of the neuron activation values for its upstream subsystems. The state of the subsystem t was represented by the output vector $O_{\langle x,y\rangle}^{(t)}$:

$$O_{\langle x,y\rangle}^{(t)} = \mathrm{BatchNorm}\left(\mathrm{Tanh}\left(\mathrm{Linear}\left(I_{\langle x,y\rangle}^{(t)}\right)\right)\right)$$

where $I_{\langle x,y\rangle}^{(t)}$ is the input vector formed by concatenating the directly connected gene vector and the subsystem vectors, and $\mathrm{Linear}\left(I_{\langle x,y\rangle}^{(t)}\right) = W^{(t)} I_{\langle x,y\rangle}^{(t)} + b^{(t)}$ is the linear transformation of the input vector. Here, $W^{(t)}$ is a weight matrix with the dimensions $[L_I^{(t)}, L_O^{(t)}]$, in which $L_I^{(t)}$ is the length of the input vector, as determined by the number of the directly connected genes and upstream subsystems for subsystem t, $L_O^{(t)}$ is the length of the output vector, and $b^{(t)}$ is the bias vector of length $L_O^{(t)}$.

Tanh (hyperbolic tangent) was used as an activation function, while BatchNorm served as a normalizing function for regularization to mitigate the risk of overfitting. BatchNorm was also able to decrease the impact of internal covariate shift resulting from different scales of weights in $W^{(t)}$. The gene expression profile information for an SL interaction was propagated through the KDDSL network to make SL predictions.
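The forward pass of a single subsystem (Linear, then Tanh, then BatchNorm over a batch of gene pairs) can be sketched in plain Python. This is an illustrative sketch only: the helper names are hypothetical, the weights are stored with one row per output neuron as a convenience, and the batch normalization shown here omits the learnable scale and shift parameters and the running statistics that a deep learning library would normally maintain.

```python
import math

def linear(W, b, x):
    """Compute W x + b, with W stored as one row per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def batch_norm(batch, eps=1e-5):
    """Normalize each output dimension to zero mean and unit variance over the batch."""
    n, dims = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)]
    return [[(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dims)]
            for row in batch]

def subsystem_forward(W, b, gene_inputs, upstream_outputs):
    """Return BatchNorm(Tanh(Linear(I))) for one subsystem.

    gene_inputs and upstream_outputs each hold one list per sample in the batch.
    """
    inputs = [g + u for g, u in zip(gene_inputs, upstream_outputs)]  # concatenation
    activated = [[math.tanh(v) for v in linear(W, b, x)] for x in inputs]
    return batch_norm(activated)

# Toy subsystem: 2 gene inputs + 1 upstream activation -> 2 output neurons.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
b = [0.0, 0.0]
out = subsystem_forward(W, b, [[1.0, 2.0], [-1.0, 0.0]], [[0.5], [-0.5]])
```

Using two output neurons mirrors the setting the authors ultimately adopted for each subsystem.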

The training process of KDDSL

The A549 dataset was separated into training and test sets at a 9:1 ratio, and 10-fold cross-validation was performed. The Global dataset was similarly separated into training and test sets at a 4:1 ratio, and 5-fold cross-validation was performed. The input feature for this model was the concatenation of the gene expression profiles of two candidate SL genes. Because the model should not differentiate between the two input orders of a potential SL interaction xy (x-y or y-x), each training sample was presented in both orders, doubling the number of training samples. Both x-y and y-x representations were propagated through the network of KDDSL and averaged to obtain predictions, allowing KDDSL to achieve the same xy SL interaction predictions irrespective of order (x-y or y-x). The parameters were trained using a cross-entropy objective (loss) function:

$$\mathrm{CrossEntropyLoss} = -\sum_{i=1}^{K} p_i \log q_i$$

The objective function was optimized with Adam. The batch size was set to 64 and the learning rate was 0.0001. All network parameters were initialized by the Xavier algorithm.
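The order-invariant prediction and the cross-entropy loss can be illustrated with a short sketch. The `toy_model` below is a hypothetical stand-in for the trained network (it is deliberately order-sensitive so that the averaging has a visible effect), and interpreting p as the true label distribution and q as the predicted class probabilities is an assumption based on the usual definition of cross-entropy.

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """-sum_i p_i * log(q_i) for a true distribution p and a predicted distribution q."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))

def symmetric_predict(model, profile_x, profile_y):
    """Average the class probabilities obtained for the x-y and y-x input orders."""
    q_xy = model(profile_x + profile_y)
    q_yx = model(profile_y + profile_x)
    return [(a + b) / 2 for a, b in zip(q_xy, q_yx)]

def toy_model(features):
    """Hypothetical stand-in for the trained network: returns [P(non-SL), P(SL)]."""
    s = 1.0 / (1.0 + math.exp(-(features[0] - features[-1])))
    return [1.0 - s, s]

x, y = [1.0, 0.0], [0.0, 2.0]
q = symmetric_predict(toy_model, x, y)
loss = cross_entropy([0.0, 1.0], q)  # true label: SL
```

Because the two orderings are averaged, swapping `x` and `y` yields the same prediction even though `toy_model` itself depends on input order.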

The evaluation metrics

AUC and AUPR were used as performance evaluation metrics. To enhance the objectivity of the evaluation, we used a “unified score” to combine the results of each metric into a final comprehensive metric [37]. It is defined as follows:

$$U = -\ln\left(\frac{r}{N+1}\right)$$

The evaluation process involved assigning ranks (r) to each method for specific metrics (AUC and AUPR). The total number of methods evaluated was denoted as N. Higher scores indicated superior performance. In cases where a method was assessed across multiple datasets, we calculated its average unified scores.
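As a worked illustration of the unified score, the sketch below ranks methods on a single metric and converts each rank r into a score using the negative natural logarithm of r/(N+1). The method names and metric values are made-up placeholders, ties are not handled, and how scores are pooled across metrics is not specified here, so only the per-metric step is shown.

```python
import math

def unified_score(rank, n_methods):
    """Negative natural log of rank / (N + 1); rank 1 is the best method."""
    return -math.log(rank / (n_methods + 1))

def unified_scores(metric_values):
    """Map {method: metric value} to {method: unified score}; higher values rank first."""
    ordered = sorted(metric_values, key=metric_values.get, reverse=True)
    n = len(ordered)
    return {m: unified_score(r, n) for r, m in enumerate(ordered, start=1)}

# Placeholder AUC values for three hypothetical methods.
scores = unified_scores({"method_a": 0.9, "method_b": 0.8, "method_c": 0.7})
```

With three methods, the top-ranked method receives ln(4) and the lowest-ranked receives ln(4/3), so the score stays positive for every rank.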

Identification of important subsystems embedded in KDDSL

To define the important KDDSL subsystems involved in predicting SL interactions, a one-layer feed-forward neural network (FNN) was constructed for each subsystem consisting of a linear layer followed by the Tanh activation function. The FNN input was the activation value for the SL interaction for this subsystem, while the output was a two-dimensional vector that could be used for predicting label probabilities. FNN training was performed using the same strategy employed for KDDSL training. Cross-entropy (CE) loss was calculated between predicted and actual labels of the SL interactions for each subsystem. As these CE scores characterized the information learning capacity of these subsystems, they were adopted as subsystem-specific importance scores that were used to rank subsystems from the smallest to the largest CE score. Lower CE scores corresponded to better probability estimates. As CE scores were associated with the layer location of the subsystem in KDDSL, the top subsystems for each layer were selected as the topmost important subsystems. The top subsystems across multiple samples were determined using the mean CE scores for those samples.
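The per-layer ranking described above can be sketched as follows, assuming the per-sample CE scores of each subsystem have already been computed by its FNN. The function name, the dictionary layout, and the subsystem labels are illustrative, not taken from the paper.

```python
from collections import defaultdict

def top_subsystems_per_layer(ce_scores, layer_of, k=1):
    """Rank subsystems within each layer by mean CE score (lower is better).

    ce_scores: {subsystem: [CE score per sample]}
    layer_of:  {subsystem: layer index in the hierarchy}
    """
    by_layer = defaultdict(list)
    for name, scores in ce_scores.items():
        by_layer[layer_of[name]].append((sum(scores) / len(scores), name))
    return {layer: [name for _, name in sorted(items)[:k]]
            for layer, items in by_layer.items()}

# Placeholder subsystems and scores.
ce = {"bp_a": [0.2, 0.4], "bp_b": [0.6, 0.6], "bp_c": [0.1, 0.1]}
layers = {"bp_a": 1, "bp_b": 1, "bp_c": 2}
top = top_subsystems_per_layer(ce, layers, k=1)
```

Ranking within layers rather than globally reflects the observation that CE scores depend on where a subsystem sits in the hierarchy.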

Baseline models

Model performance was evaluated by comparing KDDSL with multiple baseline models, including a conventional ANN, a sparse ANN (S-ANN), and six state-of-the-art methods. The conventional ANN model utilized a fully connected ANN with the same number of layers and neurons as KDDSL, while the S-ANN was a sparse ANN matching the KDDSL architecture with shuffled connections. The utilized state-of-the-art methods included CMF, a matrix factorization-based method for SL prediction proposed by Liany et al. [38]; SL2MF, a matrix factorization-based method for SL prediction proposed by Liu et al. [39]; DDGCN, the first GNN-based predictive method for SL proposed by Cai et al. [23]; GCATSL, a GNN-based predictive method for SL proposed by Long et al. [22]; KG4SL, the first knowledge graph-based method for SL prediction proposed by Zhu et al. [40]; and SLGNN, a knowledge graph-based method for SL prediction proposed by Wang et al. [41].

In vitro and in vivo drug synergy and mechanism assays

Chemical and drugs. Small-molecule inhibitors of KDR (ZM 323,881 HCl, T1991), VHL (VH298, TQ0121), and CDK9 (NVP-2, T16363) were obtained from Targetmol (MA, USA). Inhibitors of MDM2 (Idasanutlin, S7205), MAPK12 (BMS-582949, S8124), USP1 (ML323, S7529), CDK9 (AZD-4573, S8719), and RELA (QNZ, S4902) were obtained from Selleck Chemicals (TX, USA).

Cell lines. A549 cells were obtained from the American Type Culture Collection (ATCC, VA, USA) and validated via short tandem repeat (STR) genotyping. These cells were cultured in RPMI-1640 (Hyclone, UT, USA) supplemented with 10 % fetal bovine serum (FBS; Hyclone, USA), and 1 % penicillin/streptomycin (Invitrogen, CA, USA) in a 37 °C 5 % CO2 incubator.

Antibodies. Rabbit antibodies specific for β-actin (20536–1-AP), ATM (27156–1-AP), RAD51 (14961–1-AP), γ-H2AX (10856–1-AP), and LC3 (14600–1-AP) were obtained from Proteintech (IL, USA). Rabbit antibodies specific for p53 (2527), p62 (39749), and cleaved-PARP (5625) were obtained from Cell Signaling Technology (MA, USA).

Viability and drug synergy assays. For viability assays, A549 cells were seeded in 96-well plates (5,000/well) overnight, followed by treatment with the indicated drugs for 72 h, with DMSO serving as vehicle control. A CCK-8 kit (Dojindo Laboratories, Kumamoto, Japan) was used to assess the viability of adherent cells according to the provided directions. Drug synergy assays utilized the Chou-Talalay equation to quantify synergistic effects for drug pairs. A combination index (CI) < 1 was indicative of synergism, while a CI > 1 was indicative of antagonism [41]. A549 cells were treated for 72 h with individual drugs or combinations at selected doses, after which viability was analyzed via CCK-8 kit. CI curves for these combinatorial treatments are shown in Supplementary Information Fig. S2.
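For orientation, the commonly cited two-drug form of the Chou-Talalay combination index can be written as a one-line calculation. The text does not spell out the equation, so this form (and the function name) is an assumption: d1 and d2 are the doses used in combination to reach a given effect level, and dx1 and dx2 are the doses of each drug alone that reach the same effect. The numbers are placeholders.

```python
def combination_index(d1, d2, dx1, dx2):
    """Two-drug combination index: CI < 1 synergism, CI > 1 antagonism.

    d1, d2:   doses of drug 1 and drug 2 used together to reach effect level x
    dx1, dx2: doses of each drug alone needed to reach the same effect level x
    """
    return d1 / dx1 + d2 / dx2

# Placeholder doses: each drug is used at a quarter of its single-agent dose.
ci = combination_index(1.0, 1.0, 4.0, 4.0)
```

In this toy case `ci` is 0.5, which would fall in the synergistic range under the threshold stated above.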

Quantitative real-time PCR. A549 cells were seeded in 6-well plates (1 × 10⁶/well) overnight, treated with appropriate inhibitors for the indicated periods of time, and then rinsed using PBS. TRIzol (Invitrogen, USA) was then used according to the provided directions to isolate cellular RNA, after which 2 μg of total RNA per sample was used for first-strand cDNA synthesis with oligo(dT) primers using Moloney murine leukemia virus reverse transcriptase (Promega, WI, USA). All qPCR reactions were performed in triplicate in a 20 μL volume containing 10 μL of SYBR Premix Ex Taq Master Mix (2×) (Takara, Shiga-ken, Japan), 0.5 mM of each primer, and 10 ng of cDNA. The comparative Ct method was used to compute relative target gene expression, and β-actin served as a normalization control. DNA repair-related primers are listed in Table S1.
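The comparative Ct calculation is usually expressed as 2 raised to the power of minus ΔΔCt. The sketch below assumes that standard form, which presumes roughly 100 % amplification efficiency; the text does not state the exact variant used. The function name and Ct values are placeholders.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative Ct (2^-ddCt) method."""
    d_ct_treated = ct_target_treated - ct_ref_treated   # normalize to reference gene
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** (-dd_ct)

# Placeholder Ct values, with the reference gene (e.g. β-actin) at Ct 18 in both conditions.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
```

Here the target crosses the threshold two cycles earlier in treated cells, giving a four-fold increase relative to control.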

Western immunoblotting. A549 cells were seeded in 6-well plates (1 × 10⁶/well) overnight, after which they were treated with appropriate inhibitors for the indicated periods of time. Cells were then rinsed using PBS followed by lysis using RIPA buffer supplemented with protease inhibitors. A Bradford assay was used to measure protein concentrations, and protein was then separated via 10 % SDS-PAGE prior to transfer onto a nitrocellulose membrane. Blots were then probed with appropriate primary and secondary antibodies in sequence, and were imaged with the Imaging system (Bio-Rad, CA, USA), with Image Lab being used to analyze the resultant images.

Animal xenograft models. The Institutional Animal Care Committee of Beijing Institute of Biotechnology approved all animal studies. Female nude mice were obtained from Vital River Laboratory Animal Technology (Beijing, China) and housed in a specific pathogen-free animal facility. Xenografts were established by subcutaneously injecting 0.1 mL of PBS containing 5 × 10⁶ A549 cells into the dorsal flank of each 6-week-old mouse. Calipers were used to measure the size of the resultant tumors over time, with tumor volume being calculated as follows: volume = (longest diameter × shortest diameter²)/2. When tumors were ∼100 mm³, mice were randomized into four groups and orally administered appropriate treatment drugs every 2 days or an equivalent volume of saline as a control. Mice were weighed and tumor volumes were measured every 3 days. Drug inhibitor activity was assessed based on tumor growth inhibition (TGI) according to the following formula: TGI (%) = (Vc − Vt) / (Vc − V0) × 100, where Vc and Vt respectively correspond to the median volume in the control and treated groups, while V0 is the median volume in the control group at the start of the study. Studies were terminated when tumors were a maximum of ∼1.5 cm in diameter. Mice were deeply anesthetized prior to euthanasia to minimize suffering, after which tumors were excised and weighed.
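The tumor volume and TGI formulas above translate directly into code. The sketch below uses hypothetical function names and made-up measurements purely to show the arithmetic.

```python
def tumor_volume(longest_diameter, shortest_diameter):
    """Volume = (longest diameter x shortest diameter^2) / 2, e.g. in mm^3 for mm inputs."""
    return longest_diameter * shortest_diameter ** 2 / 2

def tgi_percent(vc, vt, v0):
    """TGI (%) = (Vc - Vt) / (Vc - V0) * 100.

    vc: median control volume, vt: median treated volume,
    v0: median control volume at the start of the study.
    """
    return (vc - vt) / (vc - v0) * 100

# Placeholder measurements.
volume = tumor_volume(10.0, 6.0)        # 10 mm x 6 mm tumor
tgi = tgi_percent(1000.0, 400.0, 100.0)  # volumes in mm^3
```

With these placeholder values the tumor volume is 180 mm³ and the TGI is about 66.7 %.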

Statistical analysis. In vitro experiments were repeated three times, and the results were compared using one-way analyses of variance (ANOVAs) using SPSS 13.0 or GraphPad Prism 8.0. Data are reported as means ± SD, and P < 0.05 was the significance threshold.

Ethics statement

All experiments involving animals were conducted according to the ethical policies and procedures approved by the Institutional Animal Care Committee of Beijing Institute of Biotechnology (Approval no. IACUC-DWZX-2022–582).

Results

Overview of the KDDSL framework

KDDSL is a deep learning framework for the prediction and mechanistic interpretation of SL interactions that is embedded with hierarchical BP knowledge (Fig. 2). Since the impact of gene perturbations on cell phenotype is achieved through multi-scale biological processes, the biological process terms from GO (Gene Ontology) [36] were used to guide the construction of the KDDSL architecture, with each term being defined as a subsystem for the model in this study. The functional state of each of these subsystems was represented by between 2 and 10 neurons. Connections between subsystems were only considered to exist when there were GO relationships between an upstream subsystem (near input layer) and a downstream subsystem (near output layer), mapping the GO biological hierarchy. The number of neurons per subsystem was determined by comparing the performance of KDDSL with different neurons for each subsystem. As shown in Supplementary Information Fig. S1, the KDDSL model was robust when using different numbers of neurons for each subsystem. Two neurons were ultimately adopted for each subsystem, given that the KDDSL model using two neurons per subsystem exhibited relatively better performance together with decreased model complexity. In this way, neurons in the conventional neural network model are replaced by biological process knowledge terms, making the model more easily understandable and providing potential insights into biological mechanisms.

Fig. 2.


The KDDSL framework and the application of the KDDSL model to develop combination therapies.

The KDDSL model was mainly trained by SL interactions from the A549 cell line derived from the SynLethDB database [32] and from CRISPR double-knockout experiments [30], [33], [34] as label data. L1000 gene expression data were adopted as the input gene feature. We chose this feature data for the following reasons: (1) L1000 gene expression data quantified the expression of 978 landmark genes under both control and gene knockdown conditions. Each dimension of this feature data represents a landmark gene, so that the neurons representing BP knowledge can be connected to the input gene feature through the gene-BP connection knowledge; (2) L1000 gene expression data represents gene knockdown-induced transcriptomic perturbation in specific cell line, which can capture the information of cell-line specific genetic perturbation more comprehensively. The input feature was the concatenation of the gene expression profiles of two SL genes. In particular, SL describes a bidirectional relationship between two genes, where if gene x has an SL relationship with gene y, gene y also has an SL relationship with gene x. Therefore, the model should not differentiate between the input order of xy and yx. To address this, we adopt training and predictive strategies in DeepSynergy [42] to ensure that the same predictions would be generated for the SL interaction xy irrespective of ordinality (see Methods for further details).

After assessing the predictive capacity and interpretability of the KDDSL model, it was adapted to design combination cancer treatment regimens. These strategies can be classified into two types: gene-level SL and biological process (BP)-level SL. In gene-level SL approaches, drug combinations can simultaneously inhibit two SL gene targets [10], [13], [14]. In BP-level SL strategies, drugs or therapies that mimic the effects of SL partner genes of gene B on a particular BP can be combined with a second drug that mimics the effects of inhibiting SL gene B. In this study, for BP-level induced SL-based combination therapies, important BPs associated with key therapeutic target-related SL interactions identified by KDDSL were regarded as drug targets such that drugs targeting these processes may synergize with other key therapeutic targeted drugs. The results of these predictions were subjected to literature-based, in vitro, and ClinicalTrials.gov (searching combination therapies entering clinical trials on https://clinicaltrials.gov/) validation. For gene-level induced SL-based combination therapies, the top-ranked SL interactions predicted by KDDSL were selected as target drug pairs, and then synergistic drug combinations based on these pairs were validated in vitro and in vivo. The synergistic effects and underlying mechanisms for these regimens were then validated from multiple perspectives (Fig. 2).

(1) The KDDSL framework. KDDSL was constructed with hierarchical biological knowledge such that inputting unknown gene pairs into this model can provide outputs in the form of predicted SL interactions and the identification of important biological processes (BPs) associated with these predicted SL interactions.

(2) KDDSL-based development of combination therapies. This approach was broadly divided into two major strategies:

Combination therapies based on BP-level induced SL interactions (Upper). When key therapeutic target-related SL interactions (e.g., EGFR, mTOR) are input into KDDSL, the important BPs identified by this model can be regarded as the first drug target, synergizing with a drug that inhibits gene B to promote cell death. The results of these predictions were subjected to literature-based, in vitro, and ClinicalTrials.gov (searching combination therapies entering clinical trials on https://clinicaltrials.gov/) validation.

Combination therapies based on gene-level induced SL interactions (Lower). Combinations of drugs that inhibit two SL gene targets can promote cell death. Key BPs identified by KDDSL offer a potential mechanistic basis for this activity. In vitro and in vivo studies were performed to validate the synergistic effects of predicted combination therapies and the underlying mechanisms.

KDDSL exhibits competitive predictive performance

To assess the performance of KDDSL, it was initially compared to several baseline models as detailed in the Methods section, including a fully-connected artificial neural network (ANN) with the same number of layers and neurons as KDDSL, a sparse ANN (S-ANN) matching the architecture of KDDSL but with shuffled connections, and six state-of-the-art models developed for SL prediction (CMF [38], SL2MF [39], DDGCN [23], GCATSL [22], KG4SL [40], SLGNN [43]). As shown in Fig. 3a–b, KDDSL achieved the highest mean AUC and AUPR on the A549 dataset. It is noteworthy that KDDSL slightly outperformed the ANN and S-ANN. Given that the connections between neurons were the only differences among these three models, this suggests that the biological knowledge-based connection modality used by KDDSL can meaningfully guide SL prediction.
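The difference between knowledge-guided and shuffled connections can be illustrated with a binary connection mask. The sketch below is an assumption-laden toy, not the authors' code: the gene and BP names are hypothetical, and permuting all mask entries is only one plausible way to build an S-ANN-style control that keeps the number of connections while discarding the annotation.

```python
import random
from typing import Dict, List, Set


def build_mask(genes: List[str], bp_to_genes: Dict[str, Set[str]]) -> List[List[int]]:
    """Rows are BP neurons, columns are input genes; 1 marks an allowed connection."""
    return [[1 if g in members else 0 for g in genes] for members in bp_to_genes.values()]


def shuffle_mask(mask: List[List[int]], seed: int = 0) -> List[List[int]]:
    """Permute all entries: the connection count is kept, but positions
    no longer follow the gene-BP annotation (assumed shuffling scheme)."""
    n_cols = len(mask[0])
    flat = [v for row in mask for v in row]
    random.Random(seed).shuffle(flat)
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]


def masked_forward(
    x: List[float], weights: List[List[float]], mask: List[List[int]]
) -> List[float]:
    """Linear layer in which masked-out weights contribute nothing."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


# Hypothetical annotation: two BP neurons over four input genes.
genes = ["G1", "G2", "G3", "G4"]
bp_to_genes = {"BP_a": {"G1", "G2"}, "BP_b": {"G3"}}

knowledge_mask = build_mask(genes, bp_to_genes)
shuffled_mask = shuffle_mask(knowledge_mask, seed=1)

weights = [[1.0] * len(genes) for _ in bp_to_genes]
activations = masked_forward([1.0, 2.0, 3.0, 4.0], weights, knowledge_mask)
```

Because both masks contain the same number of ones, any performance gap between the two networks can be attributed to where the connections are placed rather than to how many there are.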

Fig. 3.

Comparison of the performance of different predictive models. a-b The performance of these models was assessed based on the AUC (area under the curve, a) and AUPR (area under the precision-recall curve, b) on the A549 dataset. *p < 0.05, **p < 0.01, ***p < 0.001 (t-test). c The performance of these models based on the average unified score across four datasets. We used the highest unified score as reference (KDDSL, 100 %). For more details about datasets and baseline models, refer to the Materials and Methods section.

To further evaluate the generalization performance of KDDSL, it was applied to three additional datasets. The K562 dataset contains SL samples from the K562 cell line. The MIAPaca-2 dataset contains SL samples from the MIAPaca-2 cell line. The Global dataset contains SL samples from multiple cell lines (see the Methods section for additional details). Table S2 compares the performance of KDDSL with that of the baseline models on the four datasets (including the A549 dataset). Across all four datasets, KDDSL consistently demonstrated competitive predictive performance. To make our evaluation more objective, we used an average unified score [37] (see “Methods”) and ranked the models accordingly. This score takes into account multiple metrics and datasets, providing a comprehensive assessment of model performance. As shown in Fig. 3c and Supplementary Information Table S3, KDDSL achieved the highest average unified score, surpassing the second-place model by a margin of 24 %. These results suggest that KDDSL has the potential to be extended to other SL datasets.

KDDSL identifies important biological processes underlying SL interactions

In contrast to conventional ANNs, KDDSL integrates knowledge of BPs to enable model interpretability, potentially revealing the mechanisms underlying SL interactions. To identify important subsystems associated with SL predictions, the cross-entropy (CE) scores for each of these subsystems were ranked (see the Methods section). A lower CE score indicates that a subsystem has learned more label-relevant information. As the CE score is associated with the layer in which a subsystem is located within the model, the top subsystems with the lowest CE scores for each layer (other than the fourth layer, which only contained a single BP subsystem) were taken as the final important subsystems. To validate that KDDSL had accurately identified important subsystems, neuron activation values for the top subsystems were visualized (Fig. 4), revealing that the top subsystem for each layer was able to distinguish between positive and negative samples with a higher CH (Calinski-Harabasz) index. The CH index [44] is generally utilized to assess clustering effectiveness, with higher CH index values denoting a greater inter-cluster distance and a smaller intra-cluster distance. In contrast, the bottom subsystem for each layer was unable to distinguish between samples with different labels (Fig. 4). These results indicate that this ranking approach can effectively identify important subsystems, which can provide valuable information to enable model interpretation.
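For readers unfamiliar with the CH index, the following sketch computes it from scratch using its standard definition: the ratio of between-cluster to within-cluster dispersion, each normalized by its degrees of freedom. The activation values are hypothetical and only illustrate why a subsystem that separates positive from negative samples yields a higher index.

```python
from typing import Dict, List, Sequence


def calinski_harabasz(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """CH = [B / (k - 1)] / [W / (n - k)] for n points in k labelled clusters."""
    n, dim = len(points), len(points[0])
    clusters: Dict[int, List[Sequence[float]]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least two clusters and more points than clusters")

    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0  # B: size-weighted squared distance of centroids to the overall mean
    within = 0.0   # W: squared distance of points to their own centroid
    for members in clusters.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        within += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members
        )
    # Note: W is zero (and the index undefined) if every cluster is a single repeated point.
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical one-dimensional neuron activations for four samples.
activations = [[0.0], [0.2], [1.0], [1.2]]
ch_separated = calinski_harabasz(activations, [0, 0, 1, 1])  # labels follow the activations
ch_mixed = calinski_harabasz(activations, [0, 1, 0, 1])      # labels cut across them
```

With these toy values, the labelling that follows the activation pattern gives a much larger index than the labelling that cuts across it, mirroring the contrast between top and bottom subsystems.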

Fig. 4.

KDDSL identifies important SL interaction-related subsystems. a The top subsystems for layers 1–3 were able to differentiate between samples with different labels (top), whereas the bottom subsystems for layers 1–3 were not (bottom). Negative and positive samples are represented by 0 and 1, respectively. b The potential mechanisms underlying SL interactions between ERCC1 and USP1 identified by KDDSL. Evidence supporting this candidate mechanism was derived from the literature.

To further demonstrate that the top subsystems embedded in KDDSL can clarify the mechanistic basis for SL interactions, the top 10 subsystems per layer were selected for each sample. Further analysis confirmed that these subsystems may harbor useful information regarding the mechanisms governing SL phenotypes. Take, for example, the case of an SL interaction between ERCC1 (excision repair cross complementing‐group 1) and USP1 (ubiquitin‐specific peptidase 1). Regulation of autophagy (ranked 1st in layer 3) and regulation of signal transduction by p53 class mediator (ranked 1st in layer 1) were among the top subsystems in our model. ERCC1 has been reported to participate in multiple DNA repair pathways [45]. USP1 has been shown to be associated with autophagy: Raimondi et al. demonstrated that USP1-deficient cells exhibit impaired canonical autophagic activity [46]. Given that the loss of autophagy can contribute to SL defects in DNA repair [47], we hypothesize that autophagy and DNA repair may underlie the SL interaction between ERCC1 and USP1. Consistently, the important subsystems identified by KDDSL included both autophagy and the key DNA repair regulator p53 (Fig. 4b). These results demonstrate that the top subsystems identified using KDDSL can offer insight into the molecular mechanisms driving SL phenotypes. Overall, KDDSL can thus generate accurate and interpretable SL predictions, providing clear value for the further exploration of candidate SL interactions.

KDDSL enables the design of combination therapies by identifying and mechanistically characterizing SL interactions

Development of combination therapies based on BP-level induced SL. Although SL was conventionally viewed as a straightforward genetic concept, the understanding of SL has evolved with deeper insights into its relationships [48]. Given that DNA damage repair (DDR) is related to the mechanisms underlying many PARP-related SL interactions (e.g., BRCA, WEE1), many PARPi and DNA-damaging agent-based combination therapies are undergoing clinical testing. The principle underlying this combination treatment strategy centers on BP-level induced SL. We thus hypothesize that the important subsystems for key therapeutic target-related SL interactions identified by KDDSL may highlight opportunities for synergistic drug combination treatment. EGFR mutations are among the most common driver mutations in lung cancer [49], with mutations activating EGFR tyrosine kinase activity, leading to uncontrolled tumor cell growth [50]. The mTOR protein kinase is a key regulator of cellular growth and cancer progression [51], with mTOR dysregulation being a hallmark of lung cancer and many other cancers [52]. Accordingly, we obtained the important subsystems for EGFR- and mTOR-related SL interactions by ranking the subsystems for the involved SL interactions according to their mean CE scores, as detailed in the Methods section. These important subsystems are listed in Tables S4 and S5.
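The ranking step can be sketched as follows. The exact CE definition is given in the Methods section, so this example rests on an assumption: each subsystem is taken to emit a probability per sample, its CE score is the binary cross-entropy against the SL label averaged over the samples of interest, and subsystems are ranked within their own layer from lowest to highest score. All subsystem names, layers, and values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def binary_cross_entropy(
    probs: Sequence[float], labels: Sequence[int], eps: float = 1e-12
) -> float:
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def rank_subsystems_by_layer(
    subsystem_probs: Dict[str, Sequence[float]],
    subsystem_layer: Dict[str, int],
    labels: Sequence[int],
) -> Dict[int, List[Tuple[str, float]]]:
    """Group subsystems by layer and sort each group by ascending mean CE."""
    by_layer: Dict[int, List[Tuple[str, float]]] = {}
    for name, probs in subsystem_probs.items():
        ce = binary_cross_entropy(probs, labels)
        by_layer.setdefault(subsystem_layer[name], []).append((name, ce))
    return {layer: sorted(items, key=lambda t: t[1]) for layer, items in by_layer.items()}


# Hypothetical outputs for four SL samples involving one therapeutic target.
labels = [1, 1, 0, 0]
subsystem_probs = {
    "BP_informative": [0.9, 0.8, 0.1, 0.2],
    "BP_uninformative": [0.5, 0.5, 0.5, 0.5],
    "BP_other": [0.7, 0.6, 0.4, 0.3],
}
subsystem_layer = {"BP_informative": 1, "BP_uninformative": 1, "BP_other": 2}

ranking = rank_subsystems_by_layer(subsystem_probs, subsystem_layer, labels)
```

Ranking within each layer, rather than across the whole network, reflects the observation above that CE scores depend on the layer in which a subsystem sits.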

Some important subsystems suggest synergistic drug combination opportunities for EGFR and mTOR-targeted therapies. As detailed in Table S4, three lipid-associated subsystems emerged as important subsystems associated with EGFR-related SL interactions, including the cholesterol biosynthetic process (ranked 7th in layer 1), cholesterol metabolism process (ranked 2nd in layer 2), and regulation of lipid metabolism process (ranked 8th in layer 3). This indicates that targeting lipid metabolism in combination with EGFR inhibitors may represent a promising treatment strategy. Lipids are important in many different cell functions, and lipid metabolic reprogramming has been proposed as an emerging hallmark of cancer [53], making it an important target for a class of antitumor therapies [54]. Many studies have demonstrated that targeting lipid metabolic reprogramming can improve the efficacy of EGFR inhibitors in lung cancer [55]. In one recent cohort study, the use of cholesterol-lowering statins was found to be significantly associated with prolonged survival in lung cancer patients undergoing EGFR inhibitor treatment [56]. We list several clinical trials in which lipid-regulating agents and EGFR-TKIs were used together to treat lung cancer (Fig. 5a). More detailed conclusions from the literature on the combined use of EGFR inhibitors and lipid-regulating drugs are listed in Table S6.

Fig. 5.

KDDSL can help to develop promising combination therapies based on BP-level induced SL. a Schematic overview of EGFR-related combination therapy and clinical trial validation. The green and purple blocks respectively correspond to EGFR inhibitors and lipid-regulating agents. Clinical Trial Registration Numbers are annotated on the grey line between these blocks. b Schematic overview of mTOR-related combination therapy and clinical trial validation. The green and purple blocks respectively represent mTOR inhibitors and DNA-damaging agents. Clinical Trial Registration Numbers are annotated on the grey line between these blocks. c The impact of Ixabepilone, Veliparib, or Adavosertib (DNA-damaging agents) with Everolimus (mTOR inhibitor) as single agents or drug combinations in A549 cells. Cell viability was measured 72 h after treatment with the indicated doses. Data are means ± SD for triplicate analyses. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Additionally, four DDR-related subsystems were identified as important subsystems for mTOR-related SL interactions (ranked 6th and 9th in layer 1, ranked 1st, 5th, and 9th in layer 2) (Table S5). Communication between the DDR and mTOR has been suggested to be beneficial for cancer treatment [57], [58]. We list several clinical trials exploring the combined use of DNA-damaging agents and mTOR inhibitors to treat lung cancer (Fig. 5b). A CCK-8 assay was also used to assess synergistic interactions between mTOR inhibitors and DNA-damaging agents. Notably, we found that combining several DNA-damaging agents (Ixabepilone, Veliparib, and Adavosertib) with the mTOR inhibitor Everolimus resulted in significantly greater inhibition of A549 cell growth than single-agent treatment (Fig. 5c). These data emphasize the potential promise of combining DNA-damaging agents and mTOR inhibitors to treat lung cancer. The above two cases highlight the potential of KDDSL for identifying the mechanisms that underlie key therapeutic target-related SL interactions, providing guidance that can enable BP-level induced SL-based combination treatment development.

Development of combination therapies based on gene-level induced SL. The concept of SL posits that drugs that inhibit two SL genes may exhibit synergistic efficacy. To validate this hypothesis, we used the trained KDDSL model to predict previously unknown SL interactions and prioritized these predictions by their scores. Top predicted interactions were selected as drug targets, including CDK9 (cyclin-dependent kinase 9) + MDM2 (mouse double minute 2 homolog) (ranked 15th among 2034 predicted SL interactions) and CDK9 + PARP2 (ranked 33rd among 2034 predicted SL interactions), given that highly selective inhibitors were available for these gene pairs. A Cell Counting Kit-8 (CCK-8) assay was then used to evaluate the synergistic antitumor activity of combinations of drugs targeting these genes. Notably, these analyses revealed that combining NVP-2 (CDK9 inhibitor) with Idasanutlin (MDM2 inhibitor) or Veliparib (PARP2 inhibitor) inhibited the growth of A549 cells more effectively than single-agent treatment (Fig. 6a). Combination index curves generated using the Chou-Talalay method [41] further revealed the highly potent synergistic effects of combining NVP-2 with Idasanutlin or Veliparib (Combination Index < 1; Supplementary Information Fig. S2a–S2b). Dose-response matrices and dose-response curves for combinations of NVP-2 with Idasanutlin or Veliparib highlighted high levels of potency and robust synergistic efficacy, as demonstrated by pronounced reductions in IC50 values for these individual drugs in the presence of the second drug (Combination Index < 1; Fig. 6c–6d; Supplementary Information Fig. S3–S4). A second CDK9 inhibitor, AZD-4573, also synergized with both Idasanutlin and Veliparib (Supplementary Information Fig. S2c–S2f). These results highlight the potent SL interactions achieved by inhibiting CDK9 and either MDM2 or PARP in A549 cells.
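As a reminder of how a Chou-Talalay combination index is read, the sketch below applies the standard median-effect equation, D_x = D_m · (f_a / (1 − f_a))^(1/m), and the mutually exclusive form of the index, CI = d_1/D_x1 + d_2/D_x2, where CI < 1 indicates synergy, CI = 1 additivity, and CI > 1 antagonism. All doses and curve parameters below are hypothetical and are not values from this study.

```python
def dose_for_effect(fa: float, dm: float, m: float) -> float:
    """Single-agent dose producing fraction affected `fa`.

    `dm` is the median-effect dose (e.g. IC50) and `m` the slope of the
    median-effect plot.
    """
    if not 0.0 < fa < 1.0:
        raise ValueError("fa must lie strictly between 0 and 1")
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(
    fa: float, d1: float, d2: float, dm1: float, m1: float, dm2: float, m2: float
) -> float:
    """CI for combination doses (d1, d2) that together produce effect `fa`."""
    return d1 / dose_for_effect(fa, dm1, m1) + d2 / dose_for_effect(fa, dm2, m2)


# Hypothetical single-agent parameters: drug 1 (Dm = 10, m = 1), drug 2 (Dm = 100, m = 1).
# A combination of 2 + 20 units reaching 50 % effect would be synergistic:
ci_synergy = combination_index(fa=0.5, d1=2.0, d2=20.0, dm1=10.0, m1=1.0, dm2=100.0, m2=1.0)

# A combination of 5 + 50 units reaching the same effect would be merely additive:
ci_additive = combination_index(fa=0.5, d1=5.0, d2=50.0, dm1=10.0, m1=1.0, dm2=100.0, m2=1.0)
```

In practice, D_m and m are fitted from each drug's single-agent dose-response data before the index is evaluated at several effect levels.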

Fig. 6.

KDDSL can help to develop promising combination therapies based on gene-level induced SL. Shown are results corresponding to the in vitro validation of synergistic efficacy of Idasanutlin (MDM2 inhibitor) or Veliparib (PARP2 inhibitor) and NVP-2 (CDK9 inhibitor). MDM2 + CDK9 and PARP2 + CDK9 were SL interactions predicted by KDDSL. a-b The effects of single-agent or combination treatment with Idasanutlin or Veliparib with NVP-2 on A549 cells. After treatment with the indicated drug doses for 72 h, cell viability was evaluated (a). A combinatorial index (CI) was calculated with the Chou-Talalay equation based on data for multiple doses and response points. CI values for three different indicated Fa are shown (b). c Dose-response curves for single-agent or combination Idasanutlin and NVP-2 treatment for 72 h at the indicated dose levels in A549 cells. d Veliparib and NVP-2 dose-response curves when used as single agents or combinations to treat A549 cells for 72 h. e qPCR analyses were used to assess the expression of the indicated genes in A549 cells following single-agent or combined Idasanutlin (10 nM) or NVP-2 (3.3 nM) treatment for 48 h. f Western immunoblotting was used to analyze A549 cells treated for 48 h with Idasanutlin (10 nM) and/or NVP-2 (3.3 nM). g A549 cells were treated for 48 h with Veliparib (0.5 μM) and/or NVP-2 (3.3 nM), after which the ratio of positive MDC staining was assessed by flow cytometry to enable quantitative analysis. h Western immunoblotting was used to analyze A549 cells treated for 48 h with Veliparib (0.5 μM) and/or NVP-2 (3.3 nM). Data are means ± SD for triplicate analyses. *p < 0.05, **p < 0.01 vs. vehicle (t-test).

Given these promising results, we sought to explore the mechanistic basis for these observed SL phenotypes. DDR was identified by KDDSL as an important subsystem for the CDK9 + MDM2 SL interaction (ranked 6th in layer 3). As such, we assessed the impact of CDK9 + MDM2 inhibition on the expression of DDR genes. As shown in Fig. 6e, treating A549 cells with NVP-2 and Idasanutlin resulted in the pronounced downregulation of a range of DDR genes, while also provoking DNA damage and apoptotic cell death as detected based on γ-H2AX, ATM, RAD51, p53, and cleaved PARP levels (Fig. 6f). These results suggest that combined CDK9 + MDM2 inhibition can induce DNA damage and tumor cell death. The regulation of autophagy is also an important subsystem associated with the CDK9 + PARP SL interaction identified by KDDSL (ranked 6th in layer 3). Accordingly, the fluorescent autophagosome-staining MDC dye was used to assess the autophagic ratio in A549 cells treated with NVP-2 and/or Veliparib. Treatment with NVP-2 or Veliparib alone only had a slight pro-autophagic effect in these cells (Fig. 6g), whereas combined NVP-2 + Veliparib treatment significantly enhanced the autophagic ratio. The autophagy substrate p62 negatively regulates autophagy and directly interacts with LC3 to recruit it into autophagosomes [59], [60]. Levels of LC3 and p62 were therefore assessed via Western immunoblotting in A549 cells to determine whether combination treatment was sufficient to induce autophagy (Fig. 6h). A marked increase in p62 activation was observed upon combination treatment with concomitant LC3-I conversion to LC3-II, consistent with the ability of CDK9 + PARP inhibition to induce autophagy in tumor cells.

To extend these results in vivo, the effects of single-agent or combination NVP-2 and Idasanutlin treatment were evaluated in nude mice bearing A549 xenograft tumors (Fig. 7a). Treatment with Idasanutlin (10 mg/kg) or NVP-2 (20 mg/kg) alone resulted in modest tumor growth inhibition (TGI)% values of 33.0 % and 47.2 %, respectively. In contrast, a TGI% of 89.2 % was achieved by combined NVP-2 + Idasanutlin treatment, consistent with a pronounced inhibitory effect on tumor growth upon combination treatment (Fig. 7b–7d). No apparent toxicity was associated with this combined treatment at the selected drug doses, as determined based on analyses of murine body weight over the experimental period (Fig. 7e). Combination treatment also provoked DNA damage and apoptotic cell death in vivo, as detected based on γ-H2AX, ATM, RAD51, p53, and cleaved PARP levels (Fig. 7f). Together, these results suggest that simultaneously targeting CDK9 and MDM2 holds promise for treating lung cancer. These findings further emphasize the accuracy of the predictions made by KDDSL and its ability to guide the design of synergistic drug combinations based on gene-level induced SL.
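The TGI% values above can be related to raw measurements with a short calculation. The formula used in this study is not stated in this section, so the sketch assumes one common end-point definition, TGI% = (1 − mean treated / mean control) × 100, applied to tumor weights or volumes. The measurements below are hypothetical and are not data from this study.

```python
from statistics import mean
from typing import Sequence


def tgi_percent(treated: Sequence[float], control: Sequence[float]) -> float:
    """Tumor growth inhibition relative to the vehicle group (assumed definition)."""
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return (1.0 - mean(treated) / control_mean) * 100.0


# Hypothetical end-point tumor weights in grams.
vehicle = [1.0, 1.2, 0.8]
combination = [0.10, 0.12, 0.08]
tgi_combination = tgi_percent(combination, vehicle)
```

Other definitions, for example ones based on the change in tumor volume from the start of treatment, give different values from the same data, so the definition should always be reported alongside the number.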

Fig. 7.

KDDSL can help to develop promising combination therapies based on gene-level induced SL. Shown are the results of in vivo validation experiments exploring the synergistic effects and mechanistic basis for Idasanutlin (MDM2 inhibitor) and NVP-2 (CDK9 inhibitor) treatment. a The treatment protocol for mice bearing A549 xenograft tumors and treated with vehicle, Idasanutlin, or NVP-2 alone or in combination. b-e Mice were treated as detailed in (a) beginning 30 days after xenograft implantation, and tumor growth was monitored every three days (b). Tumors were imaged (c) and weighed (d) at study end (n = 7/group). **p < 0.01 (t-test). (e) The body weight of BALB/c nude mice treated as detailed in (a) (n = 7/group). f Western immunoblotting results for tumors from each group. Data are means ± SD from seven tumors at each time point.

Discussion

Here, we developed a knowledge and data dual-driven AI framework, KDDSL, and demonstrated that it could be effectively applied to design combination anti-cancer therapies. A series of computational, experimental, and literature-based validation strategies were used to demonstrate the ability of KDDSL to accurately predict SL interactions and to clarify the underlying mechanism. The further application of this model to guide the development of SL-based combination therapies offers a promising means of expanding drug target pairs for the treatment of cancers.

Our KDDSL model differs significantly from most models previously used to predict SL interactions in two critical aspects. First, we integrate gene knowledge relating to SL into the architecture of KDDSL to guide its decision process, such that the resultant predictions are both improved and interpretable. Most AI models are data-driven, often lacking insight into how the living system actually functions. Because BPs are important participants in the mechanisms underlying SL, the BP knowledge embedded in KDDSL can help to optimize the predictive process while clarifying these mechanisms, aiding its further application. Second, KDDSL takes the cellular context associated with SL into consideration, which is noteworthy given the growing body of evidence emphasizing the dependence of SL phenotypes on the cellular context [30], [31]. As this model was trained using cell line-specific data, KDDSL was able to make SL predictions appropriate to this particular cellular context. Potentially owing to this property, the combined inhibition of CDK9 and MDM2 resulted in robust antitumor activity without any significant adverse effects on tumor-bearing mice (Fig. 7e).

Despite the success of anti-cancer drugs, the persistence of drug resistance and toxicity remains a large challenge in clinical practice [61]. Identifying synergistic drug combinations plays a crucial role in addressing drug resistance and minimizing toxicity. SL characterizes a form of genetic redundancy wherein functional compensation occurs upon inhibiting a specific target [62], [63]. Consequently, SL provides candidate drug target pairs for rational combination therapy design [17]. However, the vast search space of gene combinations makes identifying SL challenging. The predictive power and interpretability of KDDSL make it a promising model for designing SL-based combination therapies. Our results highlight the promise of both BP-level and gene-level induced SL-guided combination therapy development when further integrated with expert manual selection.

For BP-level SL-guided combination therapy design, we focus on synergistic therapies combined with key therapeutic targeted drugs. While key therapeutic targets, such as mTOR and EGFR, have shaped the field of precision oncology, the therapeutic effect of single-target inhibition is limited due to the clinical complexity of cancer [64], [65]. As a knowledge and data dual-driven model, KDDSL can offer important knowledge associated with key therapeutic target-related SL interactions. In this way, we can utilize the information from multiple SL interactions to discover drugs that may synergize with key therapeutic target inhibitors. In vitro and literature-based evidence highlighted the potential for the discovery of synergistic drugs with mTOR and EGFR inhibitors, respectively.

For gene-level SL-guided combination therapy design, we focus on designing combination therapy based on the predicted SL interactions. In clinical practice, some targeted drugs exhibit modest anti-cancer effects, and it is necessary to explore suitable combination therapy strategies [65]. In this study, gene pairs among the top-ranked SL interactions predicted by KDDSL were selected as the target pairs, leading to the successful in vitro validation of two synergistic drug combinations and the in vivo validation of one of these combinations. In particular, we have demonstrated the synergistic effect of MDM2 and CDK9 inhibitors in vivo. CDK9 has emerged as a promising therapeutic target for various cancers, with several inhibitors currently under clinical development [66], [67]. Many clinical trials involving different CDK9 inhibitors have yielded unfavorable outcomes, and combining CDK9 inhibitors with other anti-cancer drugs may lead to improved therapeutic outcomes compared to CDK9 inhibitors alone, indicating the potential of CDK9 inhibitors in combination therapies [66]. MDM2 is also an attractive anti-cancer target due to its ability to regulate the expression of p53 protein [68]. The synergistic activity observed upon CDK9 inhibitor and MDM2 inhibitor treatment holds potential clinical promise for the treatment of lung cancer and warrants further clinical investigation.

We anticipate further opportunities to improve the predictive capability and interpretability of KDDSL, so as to facilitate its application. These efforts will hinge on leveraging additional cell line-specific data and more comprehensive embedded knowledge. Given the inherent limitations in the available data and knowledge underlying this model, however, our study is subject to certain limitations. Firstly, owing to a limited number of available training samples, we were not able to apply KDDSL to many cell lines. However, we applied KDDSL to four datasets in total to further verify its generalization performance. We believe that this model will be of significant value as the volume of available cell line-specific SL data continues to grow. Secondly, only BP-related knowledge was integrated into the KDDSL model architecture. Curated literature-based knowledge may not be sufficiently comprehensive to clarify the mechanisms underlying SL phenotypes. In this study, although KDDSL can provide clues to the mechanisms underlying SL interactions, the mechanisms are not specific enough. Thus, in designing combination therapy, both AI-predicted results and expert manual selection were used to select target pairs. Future studies should incorporate more specialized biological knowledge, such as cancer gene regulatory networks, to enhance guidance for combination therapy design and clinical application.

In summary, this study utilized a knowledge and data dual-driven AI framework to explore SL-based anti-cancer combination therapies. To cope with complex diseases like cancer, single-target drug intervention has proven to be insufficient [69]. The predictive accuracy and interpretability of KDDSL make it a valuable tool to guide the design of combination therapies that leverage the SL relationship between targets. These results highlight the importance of using knowledge-embedded AI models to analyze biological mechanisms and facilitate drug target discovery. Future biomedicine-focused AI strategies also need to combine rational biological knowledge with developed models.

Compliance with ethics requirements

All Institutional and National Guidelines for the care and use of animals were followed.

CRediT authorship contribution statement

Jing Wang: Conceptualization, Investigation, Methodology, Data curation, Writing – original draft. Yuqi Wen: Conceptualization, Methodology, Writing – original draft. Yixin Zhang: Validation, Writing – original draft. Zhongming Wang: Methodology. Yuyang Jiang: Methodology. Chong Dai: Formal analysis. Lianlian Wu: Formal analysis. Dongjin Leng: Formal analysis. Song He: Funding acquisition, Supervision, Writing – review & editing. Xiaochen Bo: Funding acquisition, Supervision, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (62103436).

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jare.2023.11.035.

Contributor Information

Song He, Email: hes1224@163.com.

Xiaochen Bo, Email: boxiaoc@163.com.

Data availability

The dataset and code supporting the conclusions of this article are available in the GitHub repository https://github.com/Wingswang728/KDDSL.

References

  • 1. O'Neil N.J., Bailey M.L., Hieter P. Synthetic lethality and cancer. Nat Rev Genet. 2017;18:613–623. doi: 10.1038/nrg.2017.47.
  • 2. Dempster J.M., Rossen J., Kazachkova M., Pan J., Kugener G., Root D.E., et al. Extracting biological insights from the project achilles genome-scale CRISPR screens in cancer cell lines. bioRxiv. 2019
  • 3. Huang A., Garraway L.A., Ashworth A., Weber B. Synthetic lethality as an engine for cancer drug target discovery. Nat Rev Drug Discov. 2020;19:23–38. doi: 10.1038/s41573-019-0046-z.
  • 4. Tucker C.L., Fields S. Lethal combinations. Nat Genet. 2003;35:204–205. doi: 10.1038/ng1103-204.
  • 5. Li S., Topatana W., Juengpanich S., Cao J., Hu J., Zhang B., et al. Development of synthetic lethality in cancer: Molecular and cellular classification. Signal Transduct Target Ther. 2020;5:241. doi: 10.1038/s41392-020-00358-6.
  • 6. Bryant H.E., Schultz N., Thomas H.D., Parker K.M., Flower D., Lopez E., et al. Specific killing of BRCA2-deficient tumours with inhibitors of poly(ADP-ribose) polymerase. Nature. 2005;434:913–917. doi: 10.1038/nature03443.
  • 7. Farmer H., McCabe N., Lord C.J., Tutt A.N.J., Johnson D.A., Richardson T.B., et al. Targeting the DNA repair defect in BRCA mutant cells as a therapeutic strategy. Nature. 2005;434:917–921. doi: 10.1038/nature03445.
  • 8. Fong P.C., Boss D.S., Yap T.A., Tutt A., Wu P., Mergui-Roelvink M., et al. Inhibition of poly(ADP-ribose) polymerase in tumors from BRCA mutation carriers. N Engl J Med. 2009;361:123–134. doi: 10.1056/NEJMoa0900212.
  • 9. Tung N.M., Robson M.E., Ventz S., Santa-Maria C.A., Nanda R., Marcom P.K., et al. TBCRC 048: Phase II study of olaparib for metastatic breast cancer and mutations in homologous recombination-related genes. J Clin Oncol. 2020;38:4274–4282. doi: 10.1200/JCO.20.02151.
  • 10. Stringer-Reasor E.M., May J.E., Olariu E., Caterinicchia V., Li Y., Chen D., et al. An open-label, pilot study of veliparib and lapatinib in patients with metastatic, triple-negative breast cancer. Breast Cancer Res. 2021;23:30. doi: 10.1186/s13058-021-01408-9.
  • 11. Mullard A. DNA damage response drugs for cancer yield continued synthetic lethality learnings. Nat Rev Drug Discov. 2022;21:403–405. doi: 10.1038/d41573-022-00092-4.
  • 12. Wera A.C., Lobbens A., Stoyanov M., Lucas S., Michiels C. Radiation-induced synthetic lethality: Combination of poly(ADP-ribose) polymerase and RAD51 inhibitors to sensitize cells to proton irradiation. Cell Cycle. 2019;18:1770–1783. doi: 10.1080/15384101.2019.1632640.
  • 13. Kitange G.J., Mladek A.C., Schroeder M.A., Pokorny J.C., Carlson B.L., Zhang Y., et al. Retinoblastoma binding protein 4 modulates temozolomide sensitivity in glioblastoma by regulating DNA repair proteins. Cell Rep. 2016;14:2587–2598. doi: 10.1016/j.celrep.2016.02.045.
  • 14. Duskova K., Lejault P., Benchimol E., Guillot R., Britton S., Granzhan A., et al. DNA junction ligands trigger DNA damage and are synthetic lethal with DNA repair inhibitors in cancer cells. J Am Chem Soc. 2020;142:424–435. doi: 10.1021/jacs.9b11150.
  • 15. Angers S. Wnt signaling inhibition confers induced synthetic lethality to PARP inhibitors. EMBO Mol Med. 2021;13:e14002. doi: 10.15252/emmm.202114002.
  • 16. Bagnolini G., Milano D., Manerba M., Schipani F., Ortega J.A., Gioia D., et al. Synthetic lethality in pancreatic cancer: Discovery of a new RAD51-BRCA2 small molecule disruptor that inhibits homologous recombination and synergizes with olaparib. J Med Chem. 2020;63:2588–2619. doi: 10.1021/acs.jmedchem.9b01526.
  • 17. Han K., Jeng E.E., Hess G.T., Morgens D.W., Li A., Bassik M.C. Synergistic drug combinations for cancer identified in a CRISPR screen for pairwise genetic interactions. Nat Biotechnol. 2017;35:463–474. doi: 10.1038/nbt.3834.
  • 18. Xiong J., Barayan R., Louie A.V., Lok B.H. Novel therapeutic combinations with PARP inhibitors for small cell lung cancer: A bench-to-bedside review. Semin Cancer Biol. 2022 doi: 10.1016/j.semcancer.2022.07.008.
  • 19. Horlbeck M.A., Xu A., Wang M., Bennett N.K., Park C.Y., Bogdanoff D., et al. Mapping the genetic landscape of human cells. Cell. 2018;174 doi: 10.1016/j.cell.2018.06.010.
  • 20. Wan F.P., Li S.Y., Tian T.Z., Lei Y.P., Zhao D., Zeng J.Y. EXP2SL: A machine learning framework for cell-line-specific synthetic lethality prediction. Front Pharmacol. 2020;11 doi: 10.3389/fphar.2020.00112.
  • 21. Hao Z.F., Wu D., Fang Y., Wu M., Cai R.C., Li X.L. Prediction of synthetic lethal interactions in human cancers using multi-view graph auto-encoder. IEEE J Biomed Health. 2021;25:4041–4051. doi: 10.1109/Jbhi.2021.3079302.
  • 22. Long Y., Wu M., Liu Y., Zheng J., Kwoh C.K., Luo J., et al. Graph contextualized attention network for predicting synthetic lethality in human cancers. Bioinformatics. 2021 doi: 10.1093/bioinformatics/btab110.
  • 23. Cai R., Chen X., Fang Y., Wu M., Hao Y. Dual-dropout graph convolutional network for predicting synthetic lethality in human cancers. Bioinformatics. 2020;36:4458–4465. doi: 10.1093/bioinformatics/btaa211.
  • 24. Tang Z., Chuang K.V., DeCarli C., Jin L.W., Beckett L., Keiser M.J., et al. Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline. Nat Commun. 2019;10:2173. doi: 10.1038/s41467-019-10212-1.
  • 25.Kuenzi B.M., Park J., Fong S.H., Sanchez K.S., Lee J., Kreisberg J.F., et al. Predicting drug response and synergy using a deep learning model of human cancer cells. Cancer Cell. 2020;38:672–684. doi: 10.1016/j.ccell.2020.09.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Fortelny N., Bock C. Knowledge-primed neural networks enable biologically interpretable deep learning on single-cell sequencing data. Genome Biol. 2020;21:190. doi: 10.1186/s13059-020-02100-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Elmarakeby H.A., Hwang J., Arafeh R., Crowdis J., Gang S., Liu D., et al. Biologically informed deep neural network for prostate cancer discovery. Nature. 2021;598:348–352. doi: 10.1038/s41586-021-03922-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Ma J.Z., Yu M.K., Fong S., Ono K., Sage E., Demchak B., et al. Using deep learning to model the hierarchical structure and function of a cell. Nat Methods. 2018;15:290–298. doi: 10.1038/Nmeth.4627. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Subramanian A., Narayan R., Corsello S.M., Peck D.D., Natoli T.E., Lu X., et al. A next generation connectivity Map: L1000 platform and the first 1,000,000 Profiles. Cell. 2017;171 doi: 10.1016/j.cell.2017.10.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Shen J.P., Zhao D., Sasik R., Luebeck J., Birmingham A., Bojorquez-Gomez A., et al. Combinatorial CRISPR-Cas9 screens for de novo mapping of genetic interactions. Nat Methods. 2017;14:573–576. doi: 10.1038/nmeth.4225. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.A.A. Ku H.M. Hu X. Zhao K.N. Shah S. Kongara D. Wu et al. Integration of multiple biological contexts reveals principles of synthetic lethality that affect reproducibility Nature Communications 11 2020 ARTN 2375 10.1038/s41467-020-16078-y [DOI] [PMC free article] [PubMed]
  • 32.Guo J., Liu H., Zheng J. SynLethDB: Synthetic lethality database toward discovery of selective and sensitive anticancer drug targets. Nucleic Acids Res. 2016;44:D1011–D1017. doi: 10.1093/nar/gkv1108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Zhao D., Badur M.G., Luebeck J., Magana J.H., Birmingham A., Sasik R., et al. Combinatorial CRISPR-Cas9 metabolic screens reveal critical redox control points dependent on the KEAP1-NRF2 regulatory axis. Mol Cell. 2018;69 doi: 10.1016/j.molcel.2018.01.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Najm F.J., Strand C., Donovan K.F., Hegde M., Sanson K.R., Vaimberg E.W., et al. Orthologous CRISPR-Cas9 enzymes for combinatorial genetic screens. Nat Biotechnol. 2018;36:179–189. doi: 10.1038/nbt.4048. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Zamanighomi M., Jain S.S., Ito T., Pal D., Daley T.P., Sellers W.R. GEMINI: A variational Bayesian approach to identify genetic interactions from combinatorial CRISPR screens. Genome Biol. 2019;20:137. doi: 10.1186/s13059-019-1745-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.The Gene Ontology C. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res 45 2017 D331 D338 10.1093/nar/gkw1108 [DOI] [PMC free article] [PubMed]
  • 37.Leng D., Zheng L., Wen Y., Zhang Y., Wu L., Wang J., et al. A benchmark study of deep learning-based multi-omics data fusion methods for cancer. Genome Biol. 2022;23:171. doi: 10.1186/s13059-022-02739-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Liany H., Jeyasekharan A., Rajan V. Predicting synthetic lethal interactions using heterogeneous data sources. Bioinformatics. 2020;36:2209–2216. doi: 10.1093/bioinformatics/btz893. [DOI] [PubMed] [Google Scholar]
  • 39.Liu Y., Wu M., Liu C., Li X.L., Zheng J. SL(2)MF: Predicting synthetic lethality in human cancers via logistic matrix factorization. IEEE/ACM Trans Comput Biol Bioinform. 2020;17:748–757. doi: 10.1109/TCBB.2019.2909908. [DOI] [PubMed] [Google Scholar]
  • 40.Wang S., Xu F., Li Y., Wang J., Zhang K., Liu Y., et al. KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers. Bioinformatics. 2021;37:i418–i425. doi: 10.1093/bioinformatics/btab271. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Chou T.-C. Drug combination studies and their synergy quantification using the chou-talalay method. Cancer Res. 2010;70:440. doi: 10.1158/0008-5472.CAN-09-1947. [DOI] [PubMed] [Google Scholar]
  • 42.Preuer K., Lewis R.P.I., Hochreiter S., Bender A., Bulusu K.C., Klambauer G. DeepSynergy: Predicting anti-cancer drug synergy with deep learning. Bioinformatics. 2018;34:1538–1546. doi: 10.1093/bioinformatics/btx806. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Zhu Y., Zhou Y., Liu Y., Wang X., Li J. SLGNN: Synthetic lethality prediction in human cancers based on factor-aware knowledge graph neural network. Bioinformatics. 2023:39. doi: 10.1093/bioinformatics/btad015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Calinski R., Harabasz J. A dendrite method for cluster analysis. Commun Stat. 1974;3:1–27. [Google Scholar]
  • 45.Kim D.E., Dolle M.E.T., Vermeij W.P., Gyenis A., Vogel K., Hoeijmakers J.H.J., et al. Deficiency in the DNA repair protein ERCC1 triggers a link between senescence and apoptosis in human fibroblasts and mouse skin. Aging Cell. 2020;19 doi: 10.1111/acel.13072. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Raimondi M., Cesselli D., Di Loreto C., La Marra F., Schneider C., Demarchi F. USP1 (ubiquitin specific peptidase 1) targets ULK1 and regulates its cellular compartmentalization and autophagy. Autophagy. 2019;15:613–630. doi: 10.1080/15548627.2018.1535291. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Liu E.Y., Xu N., O'Prey J., Lao L.Y., Joshi S., Long J.S., et al. Loss of autophagy causes a synthetic lethal deficiency in DNA repair. Proc Natl Acad Sci U S A. 2015;112:773–778. doi: 10.1073/pnas.1409563112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Akimov Y., Aittokallio T. Re-defining synthetic lethality by phenotypic profiling for precision oncology. Cell Chem Biol. 2021;28:246–256. doi: 10.1016/j.chembiol.2021.01.026. [DOI] [PubMed] [Google Scholar]
  • 49.Jin R., Peng L., Shou J., Wang J., Jin Y., Liang F., et al. EGFR-mutated squamous cell lung cancer and its association with outcomes. Front Oncol. 2021;11:680804. doi: 10.3389/fonc.2021.680804. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Wu Q., Luo W.X., Li W., Wang T., Huang L., Xu F. First-generation EGFR-TKI Plus chemotherapy versus EGFR-TKI alone as first-line treatment in advanced NSCLC With EGFR activating mutation: A systematic review and meta-analysis of randomized controlled trials. Front Oncol. 2021;11 doi: 10.3389/fonc.2021.598265. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Saxton R.A., Sabatini D.M. mTOR signaling in growth, metabolism, and disease. Cell. 2017;168:960–976. doi: 10.1016/j.cell.2017.02.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Hua H., Kong Q., Zhang H., Wang J., Luo T., Jiang Y. Targeting mTOR for cancer therapy. J Hematol Oncol. 2019;12:71. doi: 10.1186/s13045-019-0754-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Butler L.M., Perone Y., Dehairs J., Lupien L.E., de Laat V., Talebi A., et al. Lipids and cancer: Emerging roles in pathogenesis, diagnosis and therapeutic intervention. Adv Drug Deliv Rev. 2020;159:245–293. doi: 10.1016/j.addr.2020.07.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Wegiel B., Vuerich M., Daneshmandi S., Seth P. Metabolic switch in the tumor microenvironment determines immune responses to anti-cancer therapy. Front Oncol. 2018;8:284. doi: 10.3389/fonc.2018.00284. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Eltayeb K., La Monica S., Tiseo M., Alfieri R., Fumarola C. Reprogramming of lipid metabolism in lung cancer: An overview with focus on EGFR-mutated non-small cell lung cancer. Cells. 2022:11. doi: 10.3390/cells11030413. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Nguyen P.A., Chang C.C., Galvin C.J., Wang Y.C., An S.Y., Huang C.W., et al. Statins use and its impact in EGFR-TKIs resistance to prolong the survival of lung cancer patients: A Cancer registry cohort study in Taiwan. Cancer Sci. 2020;111:2965–2973. doi: 10.1111/cas.14493. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Danesh Pazhooh R., Rahnamay Farnood P., Asemi Z., Mirsafaei L., Yousefi B., Mirzaei H. mTOR pathway and DNA damage response: A therapeutic strategy in cancer therapy. DNA Repair (Amst) 2021;104:103142. doi: 10.1016/j.dnarep.2021.103142. [DOI] [PubMed] [Google Scholar]
  • 58.Ma Y., Vassetzky Y., Dokudovskaya S. mTORC1 pathway in DNA damage response. Biochim Biophys Acta Mol Cell Res. 2018;1865:1293–1311. doi: 10.1016/j.bbamcr.2018.06.011. [DOI] [PubMed] [Google Scholar]
  • 59.Manley S., Ni H.M., Kong B., Apte U., Guo G., Ding W.X. Suppression of autophagic flux by bile acids in hepatocytes. Toxicol Sci. 2014;137:478–490. doi: 10.1093/toxsci/kft246. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Button R.W., Vincent J.H., Strang C.J., Luo S. Dual PI-3 kinase/mTOR inhibition impairs autophagy flux and induces cell death independent of apoptosis and necroptosis. Oncotarget. 2016;7:5157–5175. doi: 10.18632/oncotarget.6986. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Du J., Zhou Y., Li Y., Xia J., Chen Y., Chen S., et al. Identification of Frataxin as a regulator of ferroptosis. Redox Biol. 2020;32:101483. doi: 10.1016/j.redox.2020.101483. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Parrish P.C.R., Thomas J.D., Gabel A.M., Kamlapurkar S., Bradley P.K., Berger A.H. Discovery of synthetic lethal and tumor suppressor paralog pairs in the human genome. Cell Rep. 2021;36:109597. doi: 10.1016/j.celrep.2021.109597. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Cereda M., Ciccarelli M.T.P., Genetic Redundancy F.D. Functional compensation, and cancer vulnerability. Trends Cancer. 2016;2:160–162. doi: 10.1016/j.trecan.2016.03.003. [DOI] [PubMed] [Google Scholar]
  • 64.Levantini E., Maroni G., Del Re M., Tenen D.G. EGFR signaling pathway as therapeutic target in human cancers. Semin Cancer Biol. 2022;85:253–275. doi: 10.1016/j.semcancer.2022.04.002. [DOI] [PubMed] [Google Scholar]
  • 65.Murugan A.K. mTOR: Role in cancer, metastasis and drug resistance. Semin Cancer Biol. 2019;59:92–111. doi: 10.1016/j.semcancer.2019.07.003. [DOI] [PubMed] [Google Scholar]
  • 66.Wu T., Qin Z., Tian Y., Wang J., Xu C., Li Z., et al. Recent developments in the biology and medicinal chemistry of CDK9 inhibitors: An update. J Med Chem. 2020;63:13228–13257. doi: 10.1021/acs.jmedchem.0c00744. [DOI] [PubMed] [Google Scholar]
  • 67.Ma H., Seebacher N.A., Hornicek F.J., Duan Z. Cyclin-dependent kinase 9 (CDK9) is a novel prognostic marker and therapeutic target in osteosarcoma. EBioMedicine. 2019;39:182–193. doi: 10.1016/j.ebiom.2018.12.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Zhu H., Gao H., Ji Y., Zhou Q., Du Z., Tian L., et al. Targeting p53-MDM2 interaction by small-molecule inhibitors: learning from MDM2 inhibitors in clinical trials. J Hematol Oncol. 2022;15:91. doi: 10.1186/s13045-022-01314-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Barabasi A.L., Gulbahce N., Loscalzo J. Network medicine: A network-based approach to human disease. Nat Rev Genet. 2011;12:56–68. doi: 10.1038/nrg2918. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supplementary data 1
mmc1.docx (3.3MB, docx)

Data Availability Statement

The dataset and code supporting the conclusions of this article are available in the GitHub repository https://github.com/Wingswang728/KDDSL.


Articles from Journal of Advanced Research are provided here courtesy of Elsevier
