Abstract
Synthetic lethality (SL) is a promising type of genetic interaction for cancer therapy. Recent SL prediction methods integrate knowledge graphs (KGs) into graph neural networks (GNNs) and employ attention mechanisms to extract local subgraphs as explanations for target gene pairs. However, attention mechanisms often lack fidelity, typically generate a single explanation per gene pair, and fail to ensure trustworthy high-order structures in their explanations. To overcome these limitations, we propose Diverse Graph Information Bottleneck for Synthetic Lethality (DGIB4SL), a KG-based GNN that generates multiple faithful explanations for the same gene pair and effectively encodes high-order structures. Specifically, we introduce a novel DGIB objective, integrating a determinantal point process (DPP) constraint into the standard information bottleneck objective, and employ 13 motif-based adjacency matrices to capture high-order structures in gene representations. Experimental results show that DGIB4SL outperforms state-of-the-art baselines and provides multiple explanations for SL prediction, revealing diverse biological mechanisms underlying SL inference.
Keywords: synthetic lethality, machine learning explainability, graph neural network, information bottleneck
Introduction
Synthetic lethality (SL) is a promising type of genetic interaction in which the co-occurrence of two (or more) genetic events leads to cell death, while the occurrence of either event alone is compatible with cell viability. SL has become a cornerstone of anticancer drug research: targeting a gene that is nonessential in normal cells but synthetic lethal with a gene carrying cancer-specific alterations enables the selective killing of cancer cells without harming normal cells. For example, AZD1775, a WEE1 inhibitor, exploits the SL interaction between WEE1 and p53 mutations [1]. Despite extensive research on SL through high-throughput wet-lab screening, these methods face various challenges, such as high costs and inconsistencies across platforms. Predicting SL with computational models is therefore highly complementary to wet-lab approaches.
SL prediction approaches can be broadly categorized into statistical inference methods, network-based methods, and supervised machine learning (ML) methods. Among these, graph neural networks (GNNs) are currently the most popular models, largely owing to their ability to capture complex gene interactions [2]. Although many SL gene pairs have been identified, few of them have been applied to cancer treatment, as understanding the underlying biological mechanisms remains a critical challenge. Unfortunately, most GNNs lack the capability to explain SL mechanisms. To address this, methods incorporating attention mechanisms and knowledge graphs (KGs), heterogeneous graphs containing biological entities and their relationships, have emerged [2–5]. These approaches identify crucial edges or semantic features in KGs while predicting SL interactions.
Although KG-based methods with attention mechanisms improve the interpretability of SL predictions, they still face three major challenges. First, explanations based on attention mechanisms often lack reliability, since they tend to assign higher weights to frequent edges and produce unstable explanations across independent runs of the same model [6–10]. As illustrated by the examples in Fig. 1, the gray subgraph, predicted by attention-based methods, includes a red dashed edge labeled “repair.” This edge, irrelevant to the SL mechanism, is assigned higher importance due to its frequent occurrence in the KG. Second, existing KG-based methods generate only a single core subgraph to explain predictions for a given gene pair, even though multiple subgraphs may provide valid explanations [11]. As illustrated in Fig. 1, the purple subgraph highlights a mechanism where a single-strand break (SSB) converts to a double-strand break (DSB), while the blue subgraph represents replication fork blocking. Both subgraphs explain the SL interaction between PARP1 and BRCA [11]. Third, the high-order structures contained in the explanations generated by KG-based methods are often untrustworthy: the key step of these self-explainable methods, learning gene representations for prediction, captures the information between a gene and its neighbors (low-order) but cannot capture the interactions among the neighbors themselves (high-order). For instance, as shown in Fig. 1, the “DNA damage” node representation produced by KG-based methods remains unchanged, regardless of the high-order edge “HR → Trapped replication fork.” We thus ask: for a gene pair, how can we find multiple, rather than one, faithful core subgraphs and encode their high-order graph information for prediction?
Figure 1.

Toy example of a KG with self-loops integrating biological context and relevant mechanisms between the given gene pair BRCA1 and PARP1. The purple and blue subgraphs illustrate mechanisms where either the conversion of SSBs to DSBs or the blockage of replication forks leads to DNA damage in the absence of HR. The gray subgraph represents the predicted core subgraph of an attention-based method. A GIB-based method identifies only one correct subgraph, while our DGIB4SL can find all correct subgraphs (purple and blue). The self-loops are not depicted for brevity.
Our main contribution lies in addressing this question by proposing the Diverse Graph Information Bottleneck for Synthetic Lethality (DGIB4SL), an interpretable GNN model for SL prediction on KGs, which hinges on a motif-based GNN encoder and our proposed DGIB objective. First, to alleviate instability and the bias toward frequent edges in attention weights, unlike the cross-entropy loss commonly used in attention-based methods, DGIB4SL employs the GIB principle [12], widely applied in interpretable GNNs [13, 14], to define a core subgraph from the neighborhood of a gene pair. GIB provides a principled objective function for graph representation learning (GRL), determining which data aspects to preserve and discard [15]. However, the standard GIB objective identifies only a single core subgraph for each gene pair. To capture all relevant core subgraphs from the enclosing graph, such as the purple and blue subgraphs in Fig. 1, we propose the novel Diverse GIB (DGIB) objective function, which incorporates a determinantal point process (DPP) [16] constraint into GIB. DPP quantifies diversity by measuring differences between core subgraphs through the determinant of the inner product of their subgraph representations. Second, to encode both high-order and low-order pair-wise structural information from the candidate core subgraphs for prediction, DGIB4SL employs a motif-based GNN encoder [17]. Specifically, it uses 13 motif-based adjacency matrices to capture the high-order structure of a gene pair’s neighborhood, followed by a GNN with injective concatenation to combine motif-wise representations and produce the final representation of the core graph. We summarize our key contributions as follows.
We employ the GIB principle to define a core subgraph, providing a principled alternative to attention weights, which often exhibit instability and bias toward frequent edges.
We extend the GIB objective to handle data with multiple core subgraphs, resulting in DGIB, which serves as the objective for our DGIB4SL model.
We use a motif-based GNN encoder in DGIB4SL to capture both low- and high-order structures in node neighborhoods, ensuring reliable high-order structures in explanations.
Experimental results demonstrate that our DGIB4SL outperforms state-of-the-art methods in both accuracy and explanation diversity.
Related work
SL prediction methods can be categorized into three types: statistical inference methods, network-based methods, and supervised ML methods [18]. Statistical methods [19–22], such as ASTER [22], rely on predefined biological rules, which limit their applicability in complex systems due to strong underlying assumptions [23–25]. Network-based approaches [26–28], such as iMR-gMCSs [28], improve reproducibility by analyzing pathway-level interactions. However, their performance is often limited by noise and incomplete data. With advancements in ML, supervised techniques such as SVM [29] and RF [30], and their combination [31], have been developed to facilitate feature selection using manually crafted biological features. However, their dependency on manual feature engineering poses the risk of overlooking critical interactions. SL2MF [23] advances SL prediction by decomposing SL networks into matrices, offering a structured approach. However, its reliance on linear matrix decomposition struggles to capture the inherent complexity of SL networks. To overcome these limitations, deep learning methods [32–41] have been developed. For example, DDGCN [32], the first GNN-based model, employs GCNs with dual-dropout to mitigate SL data sparsity. Similarly, MPASL [39] improves gene representations by capturing SL interaction preferences and layer-wise differences on heterogeneous graphs. Although many SL gene pairs have been identified, few of them have been applied to cancer treatment. Understanding the underlying biological mechanisms is crucial for developing SL-based cancer therapies. Unfortunately, most ML models lack the capability to fully explain SL mechanisms. To address this, methods incorporating prior knowledge into the above models through KGs have been proposed [2–5, 38]. Most of these methods utilize attention mechanisms to identify important edges [2, 3], paths [4], or factors (subsets of relational features) [5] within the KG to explain the mechanisms underlying SL. For example, KR4SL [4] encodes structural information, textual semantic information, and sequential semantics to generate gene representations and leverages attention to highlight important edges across hops to form paths as explanations. Similarly, SLGNN [5] focuses exclusively on KG data for factor-based gene representation learning, where relational features in the KG constitute factors, and attention weights are used to identify the most significant ones. However, attention weights are often unstable, frequently assigning higher weights to frequent edges [10], and typically provide only a single explanation per sample. Additionally, these methods struggle to capture high-order structures for prediction.
To address these issues, DGIB4SL replaces attention mechanisms with graph information bottlenecks (IBs) to identify key edges and employs motif-based encoders along with DPP to encode high-order structures and generate multiple explanations. For further details on explainability in GNNs, please refer to our Supplementary Materials.
Preliminaries
Notations and problem formulation
An undirected SL graph is denoted by $\mathcal{G}_{SL} = (\mathcal{V}, \mathcal{E}, \mathbf{X})$, with the set of nodes (or genes) $\mathcal{V}$, the set of edges or SL interactions $\mathcal{E}$, and the node feature matrix $\mathbf{X}$. In addition to the SL interactions, we also have external knowledge about the functions of genes. We represent this information as a directed KG $\mathcal{G}_{KG} = (\mathcal{E}_{KG}, \mathcal{R})$ and let $\mathbf{X}_{KG}$ denote the node features associated with $\mathcal{G}_{KG}$, where $\mathcal{E}_{KG}$ is a set of entities and $\mathcal{R}$ is a set of relations. To achieve the goals outlined later, we define $\mathcal{G} = (\mathbf{A}, \mathbf{X}_{\mathcal{G}}, \mathbf{E}_{\mathcal{G}})$, where $\mathbf{A}$ is the adjacency matrix, $\mathbf{X}_{\mathcal{G}}$ is the node feature matrix, and $\mathbf{E}_{\mathcal{G}}$ is the edge feature matrix. Graph $\mathcal{G}$ represents the directed joint graph of $\mathcal{G}_{SL}$ and $\mathcal{G}_{KG}$, constructed by mapping genes from $\mathcal{G}_{SL}$ to entities in $\mathcal{G}_{KG}$ and adding edges labeled “SL” for corresponding gene pairs based on $\mathcal{E}$. We use $\mathbf{M}[i, j]$ to represent the element at the $i$th row and $j$th column of a matrix $\mathbf{M}$, and $\mathbf{M}[i, :]$ to represent the $i$th row of the matrix. A comprehensive list of the mathematical notations used in this paper is provided in Table S1 in the Supplementary Materials.
In this paper, we investigate the problem of interpretable SL prediction, which aims to extract a local subgraph around two target genes for link prediction, potentially in an end-to-end fashion. Formally, given the joint graph $\mathcal{G}$ that combines the SL graph and the KG, and a pair of genes $u$ and $v$, we first collect their $h$-hop neighborhoods from $\mathcal{G}$ for each gene, $\mathcal{N}_h(u) = \{x \mid d(x, u) \le h\}$ and $\mathcal{N}_h(v) = \{x \mid d(x, v) \le h\}$, where $d(\cdot, \cdot)$ denotes the shortest distance between two nodes. We then take the intersection of nodes between their neighborhoods to construct a pairwise enclosing graph [3] $\mathcal{G}_{uv}$. Our goal is to learn a function $f$, which maps the enclosing graph $\mathcal{G}_{uv}$ to an optimized subgraph $\mathcal{G}_{uv}^{\mathrm{sub}}$, and to learn a binary classifier $g_{\theta}$, parameterized by $\theta$, for SL prediction based on the optimized subgraph $\mathcal{G}_{uv}^{\mathrm{sub}}$.
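The enclosing-graph construction above can be sketched as follows; this is a minimal illustration assuming a plain dict-of-neighbor-lists adjacency and hypothetical helper names, not the authors' implementation.

```python
from collections import deque

def k_hop_nodes(adj, source, h):
    """Nodes within shortest-path distance h of `source`, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == h:
            continue  # do not expand beyond the h-hop frontier
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)

def enclosing_graph(adj, u, v, h):
    """Induced subgraph on the intersection of the h-hop neighborhoods
    of genes u and v (plus u and v themselves)."""
    nodes = k_hop_nodes(adj, u, h) & k_hop_nodes(adj, v, h)
    nodes |= {u, v}
    return {n: [m for m in adj.get(n, ()) if m in nodes] for n in nodes}
```

Restricting edges to the intersected node set keeps only the context shared by both genes, which is what the pairwise enclosing graph is meant to capture.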
Information bottleneck
In ML, it is crucial to determine which parts of the input data should be preserved and which should be discarded. IB [42] offers a principled approach for addressing this challenge by compressing the source random variable to keep the information relevant for predicting the target random variable and discarding target-irrelevant information.
Definition 1
(IB). Given random variables $X$ and $Y$, the IB principle aims to compress $X$ into a bottleneck random variable $Z$ while keeping the information relevant for predicting $Y$:

$$\max_{Z} \; I(Z; Y) - \beta I(X; Z), \qquad (1)$$

where $\beta$ is a Lagrangian multiplier that balances the two mutual information terms.
Recently, the IB principle has been applied to learn a bottleneck graph named IB-Graph for the input graph [12], which keeps minimal sufficient information in terms of the graph’s data. In our context of SL prediction, the IB-Graph is defined as follows.
Definition 2
(IB-Graph). For an enclosing graph $\mathcal{G}_{uv}$ around a pair of genes $u$ and $v$ and the associated label information $Y$, the optimal subgraph $\mathcal{G}_{uv}^{\mathrm{sub}}$ found by the IB principle is called an IB-Graph if

$$\mathcal{G}_{uv}^{\mathrm{sub}} = \arg\max_{\mathcal{G}'} \; I(\mathcal{G}'; Y) - \beta I(\mathcal{G}_{uv}; \mathcal{G}'), \qquad (2)$$

where $\mathbf{A}^{\mathrm{sub}}$ and $\mathbf{X}^{\mathrm{sub}}$ are the task-relevant adjacency matrix and the node feature matrix of $\mathcal{G}_{uv}^{\mathrm{sub}}$, respectively.

Intuitively, GIB (Equation 2) aims to learn the core subgraph of the input graph $\mathcal{G}_{uv}$, discarding information from $\mathcal{G}_{uv}$ by minimizing the mutual information $I(\mathcal{G}_{uv}; \mathcal{G}_{uv}^{\mathrm{sub}})$, while preserving target-relevant information by maximizing the mutual information $I(\mathcal{G}_{uv}^{\mathrm{sub}}; Y)$.
Methods
Overview
In this section, we present DGIB4SL, an interpretable SL prediction framework that incorporates a DPP-based diversity constraint into the GIB objective to generate multiple explanations, called IB-graphs, for the same gene pair. The framework consists of three key components: IB objective formulation, motif-based DGIB estimation, and prediction. First, we introduce our DGIB objective and derive its tractable upper bound. Next, given that most existing IB estimation approaches fail to capture high-order structural information, we propose a novel motif-based DGIB estimation method, which involves three phases: IB-graph learning through random noise injection to select significant edges, GRL, and prediction, as shown in Fig. 2(a)–(c). In the GRL phase, we employ the motif-wise representation learning method [17] to implement the GNN module in Fig. 2(b), enabling the capture of high-order structures in IB-graphs, as illustrated in Fig. 2(d).
Figure 2.
Overview of DGIB4SL. DGIB4SL takes the enclosing graph data $\mathcal{G}_{uv}$ around genes $u$ and $v$ as input, processes it through phases (a), (b), and (c), and outputs the interaction confidence of the gene pair together with $K$ IB-graphs $\mathcal{G}_{uv}^{(1)}, \ldots, \mathcal{G}_{uv}^{(K)}$ that capture the high-order graph structure. In phase (a), an IB-graph $\mathcal{G}_{uv}^{(k)}$ is generated by injecting random noise to select important edges, with edge weights $\mathbf{W}^{(k)}$ estimated from $\mathcal{G}_{uv}$ using the edge weight estimation module (Eq. S11). $\mathbf{W}^{(k)}$ serves as the parameter of a multidimensional Bernoulli distribution, from which an adjacency matrix of $\mathcal{G}_{uv}^{(k)}$ is sampled. In phase (b), IB-graph representations are learned via variational estimation. Each IB-graph is passed through the same motif-based GNN (Eq. 10) to obtain a distribution from which a representation $\mathbf{z}^{(k)}$ is sampled. The motif-based GNN, shown in subfigure (d), projects the IB-graph into 13 motif-based matrices. Each motif-based matrix is processed by a different GIN encoder to produce motif-wise representations, which are then concatenated (Eq. 15). In phase (c), each IB-graph representation is passed through a multilayer perceptron (MLP)-based classifier to make predictions (Eq. 11). During training, the representations and predictions are used to compute the DPP and GIB terms, which are jointly optimized in DGIB4SL.
Diverse graph information bottleneck
We now present our main result, which demonstrates how to generate $K$ different IB-graphs for any gene pair $(u, v)$, denoted as $\mathcal{G}_{uv}^{(1)}, \ldots, \mathcal{G}_{uv}^{(K)}$. We first reduce this problem to a special case of the subset selection problem where diversity is preferred, i.e. the problem of balancing two aspects: (i) each selected subgraph should satisfy the definition of an IB-Graph; and (ii) the selected subgraphs should be diverse as a group, so that the subset is as informative as possible.
DPP [16] is an elegant and effective probabilistic model designed to address one key aspect of the above problem: diversity. Formally, let $\mathcal{S}$ denote the set of all possible subgraphs of a graph $\mathcal{G}_{uv}$. A point process $\mathcal{P}$ defined on the ground set $\mathcal{S}$ is a probability measure over the power set of $\mathcal{S}$. $\mathcal{P}$ is called a DPP if, for any subset $S \subseteq \mathcal{S}$, the probability of selecting this subset is given by

$$\mathcal{P}(S) \propto \det(\mathbf{L}_S), \qquad (3)$$

where $\det(\cdot)$ represents the determinant of a given matrix and $\mathbf{L}$ is a real, positive semidefinite kernel matrix; thus there exists a matrix $\mathbf{B}$ such that

$$\mathbf{L} = \mathbf{B}\mathbf{B}^{\top}, \qquad (4)$$

where the $k$th row of $\mathbf{B}$ is $\mathbf{z}^{(k)} = \mathrm{GRL}(\mathcal{G}_{uv}^{(k)})$, the graph representation of the $k$th IB-graph $\mathcal{G}_{uv}^{(k)}$, and $\mathrm{GRL}(\cdot)$ denotes the GRL module. More details about the $\mathrm{GRL}(\cdot)$ module and intuitions about the ability of Equation 3 to measure diversity are provided in Eqs. 9–12 and the Supplementary Materials, respectively.
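To build intuition for how the determinant in Equation 3 measures diversity, the following sketch computes $\log\det(\mathbf{L})$ from a stack of $K$ subgraph representations. It assumes plain NumPy, row-wise L2 normalization of the representations, and a small diagonal jitter, none of which the paper prescribes.

```python
import numpy as np

def dpp_log_diversity(Z, eps=1e-6):
    """log det(L) for L = B B^T, where the rows of B are the
    (L2-normalized) subgraph representations z^(1)..z^(K)."""
    B = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    L = B @ B.T + eps * np.eye(B.shape[0])  # jitter keeps L positive definite
    sign, logdet = np.linalg.slogdet(L)
    return logdet
```

Identical rows make $\mathbf{L}$ nearly singular, driving the log-determinant toward negative infinity, while mutually orthogonal rows maximize it; maximizing this term therefore pushes the $K$ IB-graph representations apart.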
To learn $K$ different subgraphs from the enclosing graph $\mathcal{G}_{uv}$ for the gene pair $(u, v)$ that balance diversity with the IB-graph definition, we introduce the Diverse Graph Information Bottleneck (DGIB) objective function, formulated as follows:

$$\max \; \sum_{k=1}^{K} \Big[ I\big(\mathcal{G}_{uv}^{(k)}; Y\big) - \beta I\big(\mathcal{G}_{uv}; \mathcal{G}_{uv}^{(k)}\big) \Big] + \gamma \log \det(\mathbf{L}), \qquad (5)$$

where $\gamma$ is a Lagrangian multiplier to trade off the GIB and DPP terms. Intuitively, the GIB term focuses on learning multiple IB-graphs from the input graph $\mathcal{G}_{uv}$, while the DPP term ensures that these IB-graphs are as different as possible.

Due to the non-Euclidean nature of graph data and the intractability of mutual information, it is challenging to optimize the DGIB objective in Equation 5 directly. Therefore, we adopt the approach of Sun et al. [43] to derive tractable variational upper bounds for $-I(\mathcal{G}_{uv}^{(k)}; Y)$ and $I(\mathcal{G}_{uv}; \mathcal{G}_{uv}^{(k)})$:

$$\min \; \sum_{k=1}^{K} \Big[ \mathbb{E}\big[-\log q_{\theta}\big(Y \mid \mathbf{z}^{(k)}\big)\big] + \beta\, \mathrm{KL}\big(p_{\phi}\big(\mathbf{z}^{(k)} \mid \mathcal{G}_{uv}^{(k)}\big) \,\big\|\, q\big(\mathbf{z}^{(k)}\big)\big) \Big] - \gamma \log \det(\mathbf{L}). \qquad (6)$$

A detailed proof of Equation 6 is given in the Supplementary Materials.
Remark 1.
Each explanation or IB-graph generated by DGIB4SL consists of a single core subgraph rather than a combination of multiple core subgraphs. In datasets with tens of thousands of gene pairs, as used in our experiments, individual core subgraphs are more likely to be shared across different enclosing graphs than combinations of multiple core subgraphs. This is because the probability of a specific combination being repeatedly shared decreases exponentially with its complexity. In contrast, the structural simplicity of individual core subgraphs makes them more likely to be shared. Minimizing the compression term in DGIB allows DGIB4SL to select individual core subgraphs with higher shared frequency.
High-order motif-based DGIB estimation
We now address another key question of this work: how to compute the DGIB upper bound in Equation 6 without losing the high-order information, which is crucial for generating trustworthy explanations. For instance, in a KG, a gene’s functional relevance often depends on high-order structures, such as cooperative pathways or shared regulatory targets among its neighbors. Ignoring these structures can result in misleading explanations. To overcome this, we propose a novel high-order motif-based DGIB estimation method, DGIB4SL.
Mutual information estimation
We first outline the general procedure for estimating the DGIB upper bound defined in Equation 6, which is largely analogous to previous work [43, 44]. This procedure involves learning the $k$th IB-graph $\mathcal{G}_{uv}^{(k)}$ from the enclosing graph $\mathcal{G}_{uv}$ and deriving its representation $\mathbf{z}^{(k)}$ through a GRL function $\mathrm{GRL}(\cdot)$, such that $\mathbf{z}^{(k)} = \mathrm{GRL}(\mathcal{G}_{uv}^{(k)})$, assuming no information is lost during this transformation. Under this assumption, $I(\mathcal{G}_{uv}^{(k)}; Y) = I(\mathbf{z}^{(k)}; Y)$ and $I(\mathcal{G}_{uv}; \mathcal{G}_{uv}^{(k)}) = I(\mathcal{G}_{uv}; \mathbf{z}^{(k)})$. Consequently, the DGIB upper bound, which DGIB4SL aims to minimize, is expressed as

$$\mathcal{L}_{\mathrm{DGIB}} = \sum_{k=1}^{K} \Big[ \mathbb{E}\big[-\log q_{\theta}\big(Y \mid \mathbf{z}^{(k)}\big)\big] + \beta\, \mathrm{KL}\big(p_{\phi}\big(\mathbf{z}^{(k)} \mid \mathcal{G}_{uv}^{(k)}\big) \,\big\|\, q\big(\mathbf{z}^{(k)}\big)\big) \Big] - \gamma \log \det(\mathbf{L}). \qquad (7)$$

To calculate Equation 7, we follow a two-step process. In Step 1, we estimate an IB-graph $\mathcal{G}_{uv}^{(k)}$ based on all the subgraphs from $\mathcal{G}_{uv}$. In Step 2, we implement the $\mathrm{GRL}(\cdot)$ function to infer the graph representation $\mathbf{z}^{(k)}$ of $\mathcal{G}_{uv}^{(k)}$ and feed $\mathbf{z}^{(k)}$ into Equation 7.
(Step 1: IB-graph $\mathcal{G}_{uv}^{(k)}$ learning) We compress the information of $\mathcal{G}_{uv}$ via noise injection to estimate the $k$th IB-graph $\mathcal{G}_{uv}^{(k)}$. To construct $\mathcal{G}_{uv}^{(k)}$, we model all potential edges of the subgraph as mutually independent Bernoulli random variables. The parameters of these variables are determined by the learned importance weights $\mathbf{W}^{(k)} \in [0, 1]^{|\mathcal{V}_{uv}| \times |\mathcal{V}_{uv}|}$, where $\mathcal{V}_{uv}$ denotes the set of entities in $\mathcal{G}_{uv}$:

$$\mathbf{A}^{(k)}[i, j] \sim \mathrm{Bernoulli}\big(\mathbf{W}^{(k)}[i, j]\big), \qquad (8)$$

where $\mathbf{W}^{(k)}[i, j]$ represents the importance weight or sampling probability for the entity pair $(i, j)$. The computation of $\mathbf{W}^{(k)}$ (corresponding to the edge weights in Fig. 2(a)) is jointly optimized with relational graph learning, following the approach of Wang et al. [45]. Further details are provided in the Supplementary Materials. To sample the IB-graph, we employ the concrete relaxation [46] of the Bernoulli distribution. Additionally, we construct the node set of $\mathcal{G}_{uv}^{(k)}$ to be the same as that of $\mathcal{G}_{uv}$, since no nodes are removed during the construction of $\mathcal{G}_{uv}^{(k)}$. An example of the IB-graph construction is provided in the Supplementary Materials.
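The concrete relaxation of the Bernoulli sampling step can be sketched as follows; this is a generic binary-concrete sampler, not the authors' exact implementation, and `temperature` is a hypothetical hyperparameter.

```python
import numpy as np

def binary_concrete_sample(w, temperature=0.5, rng=None):
    """Differentiable (concrete) relaxation of Bernoulli(w), used to draw a
    soft adjacency mask from edge importance weights w in (0, 1).
    Lower temperatures give harder (closer to 0/1) samples."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(1e-8, 1 - 1e-8, size=np.shape(w))
    logistic_noise = np.log(u) - np.log1p(-u)  # standard logistic noise
    logits = np.log(w) - np.log1p(-w)          # log-odds of keeping the edge
    return 1.0 / (1.0 + np.exp(-(logits + logistic_noise) / temperature))
```

Because the sample is a smooth function of the logits, gradients can flow back into the edge-weight estimator during training, which a hard Bernoulli draw would not allow.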
(Step 2: IB-GRL and prediction) Using the previously constructed $\mathcal{G}_{uv}^{(k)}$, we compute the prediction, diversity, and KL terms in Equation 7 by implementing $\mathrm{GRL}(\cdot)$ through variational inference. For the KL term $\mathrm{KL}\big(p_{\phi}(\mathbf{z}^{(k)} \mid \mathcal{G}_{uv}^{(k)}) \,\|\, q(\mathbf{z}^{(k)})\big)$, we treat the prior $q(\mathbf{z}^{(k)})$ and the posterior $p_{\phi}(\mathbf{z}^{(k)} \mid \mathcal{G}_{uv}^{(k)})$ as parametric Gaussians, and thus this term has an analytic solution:

$$\mathrm{KL}\big(\mathcal{N}(\boldsymbol{\mu}^{(k)}, \boldsymbol{\Sigma}^{(k)}) \,\big\|\, \mathcal{N}(\mathbf{0}, \mathbf{I})\big) = \frac{1}{2} \Big[ \mathrm{tr}\big(\boldsymbol{\Sigma}^{(k)}\big) + \big\|\boldsymbol{\mu}^{(k)}\big\|_2^2 - d - \log \det\big(\boldsymbol{\Sigma}^{(k)}\big) \Big], \qquad (9)$$

where the outputs $\boldsymbol{\mu}^{(k)}$ and $\boldsymbol{\Sigma}^{(k)}$ represent the mean vector and the diagonal covariance matrix of the distribution for the graph embedding $\mathbf{z}^{(k)}$ of $\mathcal{G}_{uv}^{(k)}$, respectively. We model $p_{\phi}$ as a GNN parameterized by the weights $\phi$ with a $2d$-dimensional output and a readout or pooling operator. The first $d$ dimensions of this GNN’s output correspond to $\boldsymbol{\mu}^{(k)}$, while the remaining $d$ dimensions correspond to the diagonal of $\boldsymbol{\Sigma}^{(k)}$, formally expressed as

$$\big[\boldsymbol{\mu}^{(k)}; \mathrm{diag}\big(\boldsymbol{\Sigma}^{(k)}\big)\big] = \mathrm{GNN}_{\phi}\big(\mathcal{G}_{uv}^{(k)}\big). \qquad (10)$$

We treat $q(\mathbf{z}^{(k)})$ as a fixed $d$-dimensional spherical Gaussian distribution. To compute the prediction term, we adopt the equivalent cross-entropy loss function $\mathbb{E}\big[-\log q_{\theta}(Y \mid \mathbf{z}^{(k)})\big]$. The conditional distribution $q_{\theta}(Y \mid \mathbf{z}^{(k)})$ is implemented using a two-layer perceptron in this work, parameterized by trainable weights $\theta$, as described below:

$$q_{\theta}\big(Y \mid \mathbf{z}^{(k)}\big) = \mathrm{softmax}\big(\mathbf{W}_2\, \sigma\big(\mathbf{W}_1 \mathbf{z}^{(k)}\big)\big). \qquad (11)$$

Finally, for the diversity term $\log \det(\mathbf{L})$ (Equation 7, Equation 4), the matrix $\mathbf{B}$ is constructed by arranging the $K$ IB-graph representations as its rows. Specifically,

$$\mathbf{B} = \big[\mathbf{z}^{(1)}, \mathbf{z}^{(2)}, \ldots, \mathbf{z}^{(K)}\big]^{\top}. \qquad (12)$$
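The analytic KL term and the reparameterized sampling of $\mathbf{z}^{(k)}$ can be sketched as follows, a minimal NumPy illustration of the standard diagonal-Gaussian formulas using a log-variance parameterization that the paper does not specify.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), the analytic form of
    the compression term for one graph embedding (cf. Eq. 9)."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def reparameterize(mu, log_var, rng=None):
    """Sample z = mu + sigma * eps, so gradients can flow through mu and sigma."""
    rng = np.random.default_rng() if rng is None else rng
    return mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
```

The KL vanishes exactly when the posterior equals the standard-normal prior (zero mean, unit variance), so minimizing it squeezes task-irrelevant information out of the embedding.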
Generating high-order graph representation via motif
Most methods struggle to satisfy the no-information-loss assumption of the above mutual information estimation framework, since their GNN implementation in Equation 10 often fails to capture the high-order structure of the estimated explanation. Inspired by MGNN [17], we reduce this to the problem of enhancing the model’s representation power beyond the one-dimensional Weisfeiler–Leman (1-WL) graph isomorphism test [47]. Specifically, the 1-WL test distinguishes graph structures by iteratively compressing node neighborhood information into unique labels, making it a widely recognized tool for evaluating the expressive power of GNNs [17, 47, 48].
We first formalize three key definitions underlying our approach, starting with the notion of a network motif.
Definition 3
(Network motif). A motif is a connected graph of $n$ nodes ($n \ge 2$), with an $n \times n$ adjacency matrix $\mathbf{M}$ containing binary elements $\mathbf{M}[i, j] \in \{0, 1\}$.

Let $M_1, \ldots, M_{13}$ denote the different three-node motifs and $\mathbf{M}_1, \ldots, \mathbf{M}_{13}$ represent the corresponding adjacency matrices. An example of all possible three-node motifs is shown in Fig. 3. Chen et al. [17] demonstrated that three-node motifs are sufficiently expressive to capture graph structures. Thus, we only use motifs with three nodes in this work.
Figure 3.

All three-node motifs in a directed and unweighted graph.
Definition 4
(Motif set). The motif set of a three-node motif $M_t$ in a directed graph $\mathcal{G}$ is defined by

$$S(M_t) = \big\{ (i, j, k) \,\big|\, \mathbf{A}_{(i,j,k)} = \mathbf{M}_t \big\}, \qquad (13)$$

where $(i, j, k)$ is a tuple containing three node indices and $\mathbf{A}_{(i,j,k)}$ is the $3 \times 3$ adjacency matrix of the subgraph induced by $(i, j, k)$.

For example, the motif set of a given three-node motif in Fig. 1 can be read off directly from the graph. Based on the motif set, we define the operator $\psi(\cdot)$ to transform an ordered tuple into an unordered set, e.g. $\psi\big((i, j, k)\big) = \{i, j, k\}$. Using this operator, the motif-based adjacency matrix is defined as follows.
Definition 5
(Motif-based adjacency matrix). For a graph $\mathcal{G}$, a motif-based adjacency matrix $\mathbf{A}_{M_t}$ of $\mathcal{G}$ in terms of a given motif $M_t$ is defined by

$$\mathbf{A}_{M_t}[i, j] = \sum_{s \in S(M_t)} \mathbb{1}\big[\{i, j\} \subseteq \psi(s)\big]. \qquad (14)$$

Intuitively, $\mathbf{A}_{M_t}[i, j]$ denotes the number of times nodes $i$ and $j$ are connected through an element of the given motif set $S(M_t)$. The roles of these definitions will be discussed in Equation 15.
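Definition 5 can be illustrated by brute-force enumeration over ordered node triples; this is a didactic sketch for small graphs only (MGNN uses far more efficient matrix computations).

```python
import itertools
import numpy as np

def motif_adjacency(A, motif):
    """Count, for every node pair (i, j), how many ordered three-node
    tuples of directed adjacency matrix A induce exactly `motif`
    (a 3x3 binary matrix) and contain both i and j (cf. Eq. 14)."""
    n = A.shape[0]
    A_m = np.zeros((n, n), dtype=int)
    for triple in itertools.permutations(range(n), 3):
        sub = A[np.ix_(triple, triple)]       # induced 3x3 submatrix
        if np.array_equal(sub, motif):        # tuple belongs to the motif set
            for i, j in itertools.combinations(triple, 2):
                A_m[i, j] += 1
                A_m[j, i] += 1                # keep the matrix symmetric
    return A_m
```

For instance, in a three-node directed path 0 → 1 → 2 matched against the path motif, every node pair co-occurs in exactly one matching tuple, so all off-diagonal counts are 1.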
To generate graph embeddings with greater expressive power than the 1-WL test, Chen et al. [17] demonstrated that associating node or graph embeddings with different motif structures and combining these embeddings using an injective function effectively captures high-order and low-order graph structure. Specifically, we use a two-layer GIN [47] as the underlying GNN and employ different GINs to encode the structure of the 13 motifs in $\mathcal{G}_{uv}^{(k)}$, producing node embeddings through motif-based adjacency matrices. Then, the motif-wise embeddings are combined via injective concatenation. Mathematically, we construct the GNN module in Equation 10 as

$$\mathrm{GNN}_{\phi}\big(\mathcal{G}_{uv}^{(k)}\big) = \mathrm{GIN}_1\big(\mathbf{A}_{M_1}, \mathbf{X}\big) \,\big\|\, \mathrm{GIN}_2\big(\mathbf{A}_{M_2}, \mathbf{X}\big) \,\big\|\, \cdots \,\big\|\, \mathrm{GIN}_{13}\big(\mathbf{A}_{M_{13}}, \mathbf{X}\big), \qquad (15)$$

where $\mathbf{A}_{M_t}$ is the motif-based adjacency matrix of $\mathcal{G}_{uv}^{(k)}$ in terms of a given motif $M_t$ and $\|$ denotes a concatenation function. In summary, Equation 15 preserves high-order and low-order pair-wise structural information when calculating DGIB, enhancing the reliability of the high-order structure in an IB-graph.
Results
Experimental setup
Datasets and baselines
To evaluate the effectiveness of our DGIB4SL, we utilized the dataset provided by the Synthetic Lethality Benchmark (SLB) [49]. The dataset is collected from SynLethDB 2.0, a comprehensive repository of SL data, and includes 11 types of entities and 27 relationships. It contains 35 913 human SL gene pairs involving 9845 genes, along with a KG named SynLethKG, which comprises 54 012 nodes and 2 233 172 edges. Additional details on SynLethKG can be found in Tables S2–S3 in the Supplementary Materials.
We evaluated two categories of methods, selecting 13 recently published methods. These include three matrix factorization (MF)-based methods: GRSMF [25], SL2MF [23], and CMFW [24], and 10 GNN-based methods: DDGCN [32], GCATSL [35], SLMGAE [36], MGE4SL [34], PTGNN [37], KG4SL [2], PiLSL [3], NSF4SL [38], KR4SL [4], and SLGNN [5]. Among these, KG4SL, PiLSL, NSF4SL, KR4SL, and SLGNN integrate KGs into the generation of node representations. Detailed descriptions of these baselines can be found in Supplementary Materials.
Implementation details
We evaluated our method using five-fold cross-validation by splitting the gene pairs and using four ranking metrics: Normalized Discounted Cumulative Gain (NDCG@C), Recall@C, Precision@C, and Mean Average Precision (MAP@C). NDCG@C measures the positioning of known SL gene pairs within the model’s predicted list, while Recall@C and Precision@C assess the model’s ability to identify relevant content and rank the top C results accurately, respectively. MAP@C provides a comprehensive evaluation by combining precision and ranking across multiple queries, averaging the precision at each relevant prediction up to the Cth position. In this study, we evaluated these metrics using the top C=10 and top C=50 predictions. The coefficients $\beta$ and $\gamma$ in Equation 6 were set as described in the Supplementary Materials, which also provide more details on data preprocessing, hyperparameter settings for DGIB4SL, and baseline implementations.
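The ranking metrics above can be computed per query as in the following sketch, which uses the standard binary-relevance formulas; the benchmark's exact implementation may differ.

```python
import numpy as np

def ndcg_at_c(ranked_labels, c):
    """NDCG@C for one query: ranked_labels holds the 0/1 relevance of the
    predictions sorted by descending model score."""
    labels = np.asarray(ranked_labels, dtype=float)
    discounts = np.log2(np.arange(2, min(c, len(labels)) + 2))
    dcg = (labels[:c] / discounts).sum()
    ideal = np.sort(labels)[::-1]           # best possible ordering
    idcg = (ideal[:c] / discounts).sum()
    return dcg / idcg if idcg > 0 else 0.0

def precision_recall_at_c(ranked_labels, c):
    """Precision@C and Recall@C for one query with binary relevance."""
    labels = np.asarray(ranked_labels, dtype=float)
    hits = labels[:c].sum()
    total_pos = labels.sum()
    return hits / c, (hits / total_pos if total_pos > 0 else 0.0)
```

Per-query scores are then averaged over all test genes to obtain the reported NDCG@C, Precision@C, and Recall@C values.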
Performance evaluation
We evaluated the empirical performance of DGIB4SL against state-of-the-art baselines, as summarized in Table 1 and Table 2. Baseline performance was referenced from the public leaderboards provided by SLB [49], except for KR4SL, which was based on our experimental results. As shown in Table 1 and Table 2, DGIB4SL consistently outperformed all baselines on the SynLethDB 2.0 dataset [50]. Specifically, KR4SL achieved the second-best performance on NDCG@50, Recall@10, Precision@10, Precision@50, MAP@10, and MAP@50, while PiLSL and NSF4SL achieved the second-best performance on NDCG@10 and Recall@50, respectively. Our DGIB4SL further improved over KR4SL by 9.9%, 26.7%, 10.6%, 8.0%, 6.0%, and 5.5% on NDCG@50, Recall@10, Precision@10, Precision@50, MAP@10, and MAP@50, respectively, and outperformed PiLSL and NSF4SL by 11.5% and 14.2% on NDCG@10 and Recall@50, respectively. From these results, we draw the following conclusions: (i) the competitive performance of DGIB4SL and the KG-based baselines highlights the value of KGs in providing biological context for gene-related label prediction. (ii) The integration of motifs significantly enhances model performance by expanding the receptive field and effectively encoding high-order edges into predictions.
Table 1.
Performance of various methods in terms of NDCG and Recall under five-fold cross-validation. Values in parentheses indicate paired t-test p-values comparing baselines with DGIB4SL.

| | NDCG@10 | NDCG@50 | Recall@10 | Recall@50 |
|---|---|---|---|---|
| GRSMF | 0.2844 ( ) | 0.3153 ( ) | 0.3659 ( ) | 0.4460 ( ) |
| SL2MF | 0.2807 ( ) | 0.3110 ( ) | 0.2642 ( ) | 0.3401 ( ) |
| CMFW | 0.2390 ( ) | 0.2744 ( ) | 0.3257 ( ) | 0.4097 ( ) |
| DDGCN | 0.1568 ( ) | 0.1996 ( ) | 0.2379 ( ) | 0.3447 ( ) |
| GCATSL | 0.2642 ( ) | 0.2976 ( ) | 0.3363 ( ) | 0.4203 ( ) |
| SLMGAE | 0.2699 ( ) | 0.3160 ( ) | 0.3198 ( ) | 0.4421 ( ) |
| MGE4SL | 0.0028 ( ) | 0.0071 ( ) | 0.0020 ( ) | 0.0085 ( ) |
| PTGNN | 0.2358 ( ) | 0.2740 ( ) | 0.3361 ( ) | 0.4323 ( ) |
| KG4SL | 0.2505 ( ) | 0.2853 ( ) | 0.3347 ( ) | 0.4253 ( ) |
| PiLSL | 0.5166 ( ) | 0.5175 ( ) | 0.3970 ( ) | 0.4021 ( ) |
| NSF4SL | 0.2279 ( ) | 0.2706 ( ) | 0.3526 ( ) | 0.4624 ( ) |
| KR4SL | 0.5105 ( ) | 0.5248 ( ) | 0.4131 ( ) | 0.4135 ( ) |
| SLGNN | 0.1468 ( ) | 0.2004 ( ) | 0.2154 ( ) | 0.3717 ( ) |
| DGIB4SL | 0.5760 | 0.5766 | 0.5233 | 0.5280 |
Table 2.
Performance of various methods in terms of Precision and MAP under five-fold cross-validation.
| | Precision@10 | Precision@50 | MAP@10 | MAP@50 |
|---|---|---|---|---|
| GRSMF | 0.3683 ( ) | 0.4461 ( ) | 0.2568 ( ) | 0.2521 ( ) |
| SL2MF | 0.2694 ( ) | 0.3407 ( ) | 0.2861 ( ) | 0.2769 ( ) |
| CMFW | 0.3267 ( ) | 0.4098 ( ) | 0.2043 ( ) | 0.2069 ( ) |
| DDGCN | 0.2385 ( ) | 0.3447 ( ) | 0.1280 ( ) | 0.1321 ( ) |
| GCATSL | 0.3372 ( ) | 0.4204 ( ) | 0.2354 ( ) | 0.2382 ( ) |
| SLMGAE | 0.3222 ( ) | 0.4422 ( ) | 0.2514 ( ) | 0.2469 ( ) |
| MGE4SL | 0.0022 ( ) | 0.0085 ( ) | 0.0018 ( ) | 0.0024 ( ) |
| PTGNN | 0.3372 ( ) | 0.4324 ( ) | 0.1948 ( ) | 0.1975 ( ) |
| KG4SL | 0.3357 ( ) | 0.4254 ( ) | 0.2175 ( ) | 0.2208 ( ) |
| PiLSL | 0.4098 ( ) | 0.4035 ( ) | 0.5153 ( ) | 0.5149 ( ) |
| NSF4SL | 0.3563 ( ) | 0.4626 ( ) | 0.1881 ( ) | 0.1818 ( ) |
| KR4SL | 0.4845 ( ) | 0.4901 ( ) | 0.5175 ( ) | 0.5200 ( ) |
| SLGNN | 0.2172 ( ) | 0.3718 ( ) | 0.1259 ( ) | 0.1252 ( ) |
| DGIB4SL | 0.5359 | 0.5294 | 0.5485 | 0.5484 |
Explanation evaluation
Qualitative analysis
Leveraging the DGIB mechanism (Equation 6), our DGIB4SL not only predicts SL interactions but also provides $K$ explanations that reveal the biological mechanisms underlying the predictions for the same gene pair. For this case study, we selected the SL pair BRCA1 and E2F1 from the test data, where the predicted interaction between BRCA1 and E2F1 matched the actual label. To remove unimportant edges from the enclosing core graphs of (BRCA1, E2F1), we applied edge sampling probabilities with thresholds of 0.58 and 0.76 for the first and second core subgraph distributions (Equation 8), respectively; edges with probabilities exceeding these thresholds were retained. The filtered core graphs are shown in Fig. 4(a) and (b).
Figure 4.

Two explanations learned from our DGIB4SL provide different insights into the biological mechanisms underlying SL of the same gene pair (BRCA1, E2F1). For details on the edge nomenclature, please refer to Table S4 in Supplementary Materials.
We first analyzed the first core subgraph (Fig. 4(a)). The first core subgraph highlights two key mechanisms of SL between BRCA1 and E2F1: (1) homologous recombination (HR) deficiency due to BRCA1 mutation: the pathway “BRCA1 → TERF2 → ATM → HR → RAD51 → TIMELESS” indicates that BRCA1 mutation inactivates the HR pathway. This leaves DSBs, converted from unresolved SSBs, unrepaired. (2) SSB repair pathway blockage: the pathways “E2F1 → PARP1 → Regulation of DSB repair” and “Regulation of SSB repair → PARP1” demonstrate that E2F1 mutation weakens both SSB and DSB repair functions. These combined defects in SSB repair and HR result in unrepairable DNA damage, genomic instability, and ultimately cell death. Previous studies [51] have shown that E2F1 depletion impairs HR, disrupting DNA replication and causing DNA damage, further supporting these findings.
We then analyzed the second core subgraph (Fig. 4(b)). It identifies a different mechanism, centered on replication fork blockage, while sharing the premise of HR repair pathway loss. Specifically, the pathway “E2F1 → TIMELESS → Replication Fork Arrest” reveals that E2F1 mutation destabilizes replication forks, leading to stalled replication; TIMELESS, a downstream target of E2F1, plays a critical role in stabilizing replication forks during DNA replication stress.
Quantitative analysis
We evaluated Infidelity [52] and Sparseness [53] (see the supplementary materials for descriptions of these metrics) and used the DPP score to evaluate the diversity of explanations generated by DGIB4SL and other explainable SL prediction methods, including KG4SL, PiLSL, SLGNN, and KR4SL. To compare diversity, we additionally introduced a variant of KR4SL with a multihead attention mechanism, since the explainable baselines (except for SLGNN) generate a single explanation using similar attention mechanisms. As shown in Table 3, DGIB4SL outperforms the other methods in terms of Infidelity, Sparseness, and DPP. We draw the following conclusions:
Table 3.
Comparison of attention weights in KG-based SL prediction methods with explanations in DGIB4SL in terms of fidelity, sparsity, and diversity. Symbols (↑) and (↓) indicate that larger and smaller metric values are better, respectively.
| Method | Infidelity (↓) | Sparseness (↑) | DPP (↑) |
|---|---|---|---|
| KG4SL | | 0.330 ( ) | – |
| PiLSL | | 0.340 ( ) | – |
| SLGNN | | 0.120 ( ) | 1.59 ( ) |
| KR4SL | | 0.352 ( ) | – |
| KR4SL (multihead) | | 0.326 ( ) | 0.48 ( ) |
| DGIB4SL | | 0.463 | 1.67 |
- Diversity: despite using multihead attention, the KR4SL variant showed lower DPP values, indicating that multihead attention alone has a limited capacity for generating diverse explanations. SLGNN's DPP performance is competitive, owing to a distance correlation regularizer that encourages independent factor embeddings and thereby indirectly enhances diversity.
- Sparsity: baselines employing similar attention mechanisms showed comparable Sparseness values, except for SLGNN, which directly uses learnable weights to estimate the importance of different relational features.
- Fidelity: the fidelity of attention-based methods is relatively low (i.e. their Infidelity is high), possibly due to the inherent instability and high-frequency biases of attention mechanisms [6, 7].
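The DPP diversity score compared above can be illustrated with a minimal sketch: the determinant of a similarity kernel over explanation embeddings grows as the explanations become more mutually dissimilar. The cosine-similarity kernel and the toy embeddings below are assumptions for illustration, not the paper's exact construction (Eqs. 3 and 4).

```python
import numpy as np

def dpp_diversity(embeddings):
    """Diversity of a set of explanation embeddings as det(L), where L is the
    Gram (cosine-similarity) kernel of unit-normalized rows. det(L) is near 1
    for near-orthogonal (diverse) explanations and near 0 when explanations
    are close to collinear (redundant)."""
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalize rows
    L = X @ X.T                                       # similarity kernel
    return float(np.linalg.det(L))

# Hypothetical embeddings: orthogonal explanations vs. near-duplicates.
diverse   = dpp_diversity([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
redundant = dpp_diversity([[1.0, 0.0, 0.0], [0.99, 0.1, 0.0]])
```

Under this sketch, a higher score (as reported for DGIB4SL in Table 3) corresponds to a more mutually distinct set of explanations.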
Model analysis
Ablation study
As illustrated in Fig. 2, the DGIB objective (Equation 6), the DPP constraint (third line in Equation 6), and the motif-based graph encoder (Equation 15) are the key components of DGIB4SL. Based on these, we derived the following variants for the ablation study: (1) DGIB4SLw/oM: DGIB4SL without motif information, to assess the impact of motifs; (2) DGIB4SLw/oB: DGIB4SL without the DGIB objective (replacing it with an attention mechanism); and (3) DGIB4SLw/oP: DGIB4SL without the DPP constraint (essentially reducing the objective to GIB). To evaluate the contributions of motifs, DGIB, and DPP, we compared DGIB4SL against these variants. As shown in Fig. 5, DGIB4SL consistently outperformed DGIB4SLw/oM across all metrics, highlighting the importance of incorporating high-order structures through motifs in SL prediction. Second, the performance of DGIB4SLw/oB was comparable with that of DGIB4SL on all metrics. This result is expected, since DGIB4SLw/oB can still extract label-related input information via attention mechanisms, even if this information may not always faithfully reflect the model's behavior. Third, DGIB4SLw/oP also achieved performance comparable to DGIB4SL. This is intuitive: without the DPP constraint, DGIB4SLw/oP may find several similar explanations, each of which can overlap with one of the distinct explanations found by DGIB4SL. To further compare their explanations, we evaluated diversity using the DPP measure, calculated as the determinant of the similarity kernel over the learned explanation representations (Eqs. 3 and 4). As shown in the two rightmost columns of Fig. 5, DGIB4SL produced significantly more diverse explanations than DGIB4SLw/oP.
Figure 5.
Ablation study of DGIB4SL for Motif, DGIB, and DPP on NDCG@10, Recall@10, Precision@10, and MAP@10 (left Y-axis) and one diversity metric, DPP (right Y-axis).
Convergence analysis
In this section, we analyze the convergence behavior of DGIB4SL. For clarity, we write the DGIB objective in Eq. 7 as L = L_BCE + β1·L_KL + β2·L_DPP, where L_BCE is the binary cross-entropy loss, L_KL is the KL-divergence loss, L_DPP is the DPP loss, and β1 and β2 are the Lagrangian multipliers of Equation 8. Figure 6 illustrates the convergence trend of each component of the DGIB objective; solid lines correspond to training-set values and dashed lines to testing-set values. As shown in Fig. 6(a)-(b), both the overall objective L and L_BCE declined steeply during the initial epochs, with minimal separation between training and testing curves, indicating rapid learning and effective generalization. In Fig. 6(c), L_KL shows negligible differences between training and testing curves, suggesting that compressing the input information allows the model to generalize effectively on the test set. In contrast, Fig. 6(d) shows that L_DPP initially exhibits a more pronounced gap between training and testing curves; this gap narrows over time, demonstrating that the model learns diverse representations effectively, albeit more slowly than the other loss components.
Figure 6.
Convergence of DGIB4SL: (a) learning curve of the overall DGIB objective; (b)–(d) learning curves of each component of the DGIB4SL loss.
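To make the decomposition above concrete, here is a minimal sketch of such a composite objective. The weighting via β1 and β2, the −log det form of the DPP term, and all numeric values are assumptions for illustration, not the authors' exact Eq. 7.

```python
import numpy as np

def dgib_loss(probs, labels, kl_term, dpp_det, beta1=0.01, beta2=0.01):
    """Sketch of a composite DGIB-style objective:
    prediction (relevance) + compression + diversity.

    probs   : predicted SL probabilities for a batch of gene pairs
    labels  : binary SL labels
    kl_term : KL divergence between the subgraph posterior and its prior
    dpp_det : determinant of the explanation similarity kernel (diversity)
    """
    probs = np.clip(probs, 1e-8, 1 - 1e-8)
    l_bce = -np.mean(labels * np.log(probs)            # relevance term
                     + (1 - labels) * np.log(1 - probs))
    l_kl = kl_term                                     # compression term
    l_dpp = -np.log(dpp_det + 1e-8)                    # penalize redundancy
    return float(l_bce + beta1 * l_kl + beta2 * l_dpp)
```

In this sketch, a shrinking diversity determinant (more redundant explanations) strictly increases the total loss, mirroring the role of the DPP constraint in training.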
Parameter sensitivity
We explored the impact on SL prediction performance of the Lagrangian multipliers β1 and β2 in Equation 8, the graph representation dimension in Equation 9, and the number of explanations, denoted K, that DGIB4SL generates for each gene pair. The performance trends are shown in Fig. 7. From the results, we observed the following: (1) as illustrated in Fig. 7(a)-(b), DGIB4SL's performance was relatively insensitive to β1 and β2: values within a moderate range typically yielded robust and reliable performance, with a single common setting proving optimal in most cases. (2) Figure 7(c) indicates that increasing the representation dimension gradually improved performance up to a peak, after which performance began to decline, likely due to overfitting. As the gain beyond a moderate dimension was modest, we opted for the smaller setting to simplify the model while maintaining strong performance. (3) Figure 7(d) shows the impact of K within an approximate range (heuristically estimated; see the supplementary materials for details). The results indicate that DGIB4SL performs stably for small K, primarily for two reasons:
Figure 7.
Parameter sensitivity analysis for DGIB4SL on the Lagrangian multipliers, the graph embedding dimension, and the number of explanations generated for each gene pair.
- When the enclosing graph of a gene pair actually contains at least K core subgraphs, DGIB4SL only needs to identify at least one core subgraph to make accurate predictions, and the specific number of identified core subgraphs has minimal impact on the results.
- When K exceeds the actual number of core subgraphs, the weighting of the DPP term in DGIB biases the trade-off toward relevance (GIB) over diversity (DPP), causing DGIB4SL to prioritize generating core subgraphs relevant to the labels, even if some explanations overlap.
For substantially larger K, DGIB4SL's performance declines, primarily due to the increased number of parameters in the relational edge weight module (Eq. S11), which leads to overfitting. We set K = 3 for DGIB4SL because most gene pairs have no more than three core subgraphs; this choice effectively prevents overfitting and reduces explanation overlap.
Stability analysis
To evaluate the stability of DGIB4SL and attention-based methods, we introduced noise using three distinct random seeds and compared the resulting edge importance distributions. Specifically, we ran DGIB4SL and KG4SL three times with different random seeds; KG4SL was selected as a representative attention-based method due to its straightforward design and interpretability. For each run, kernel density estimation [54] was applied to the importance scores of each edge within the core graph of the gene pair (ACTR10, PELO). As shown in Fig. 8(a), the (unnormalized) attention weight distribution generated by KG4SL is unstable. In contrast, Fig. 8(b) and (c) show that the edge weight distributions (i.e. the edge sampling probabilities in Equation 8) generated by DGIB4SL for its two explanations largely overlap across different random seeds, demonstrating the stability of DGIB4SL's explanations.
Figure 8.
Edge weight distribution of DGIB4SL and KG4SL for the same gene pair (ACTR10, PELO) under different random seeds.
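This stability check can be sketched as follows: smooth each run's edge-importance scores with a Gaussian kernel density estimate and compare runs via their overlap coefficient. The hand-rolled KDE, the bandwidth, and the synthetic scores below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth=0.05):
    """Minimal 1-D Gaussian kernel density estimate evaluated on `grid`."""
    diffs = (grid[:, None] - np.asarray(samples)[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2 * np.pi)
    return np.exp(-0.5 * diffs**2).sum(axis=1) / norm

def kde_overlap(scores_a, scores_b, bandwidth=0.05):
    """Overlap coefficient of two smoothed edge-importance distributions.
    Near 1: the two runs assign nearly identical importance (stable);
    near 0: the runs disagree (unstable)."""
    grid = np.linspace(0.0, 1.0, 400)
    pa = gaussian_kde_1d(scores_a, grid, bandwidth)
    pb = gaussian_kde_1d(scores_b, grid, bandwidth)
    step = grid[1] - grid[0]
    return float(np.minimum(pa, pb).sum() * step)  # integral of min(pa, pb)

# Synthetic importance scores standing in for two seeds of a stable model
# and one seed of an unstable model (illustrative only).
rng = np.random.default_rng(0)
stable_run1 = rng.normal(0.7, 0.05, 100)
stable_run2 = rng.normal(0.7, 0.05, 100)
shifted_run = rng.normal(0.3, 0.05, 100)
```

A stable explainer should show a high overlap between any two seeds, as DGIB4SL does in Fig. 8(b)-(c), whereas a low overlap signals seed-dependent explanations.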
Conclusion and discussion
We present DGIB4SL, an interpretable SL prediction framework that ensures trustworthy and diverse explanations. DGIB4SL introduces a novel DGIB objective with a Determinantal Point Process constraint to enhance diversity and employs a motif-based strategy to capture high-order graph information. A variational upper bound is proposed to address computational challenges, enabling efficient estimation. Experimental results show that DGIB4SL outperforms all baselines on the SynLethDB 2.0 dataset.
A key limitation of DGIB4SL lies in the fixed number of explanations generated per gene pair, which may result in overlapping or incomplete explanations. Future work could explore an adaptive mechanism that dynamically adjusts this number for each enclosing graph. Additionally, DGIB4SL is a general framework for interaction prediction and can be applied to other domains requiring diverse and interpretable explanations, such as drug–drug interaction prediction and functional genomics research.
Key Points
We propose DGIB4SL, an interpretable knowledge-graph-based GNN that predicts SL interactions with diverse explanations.
We use the GIB principle to define a core subgraph of a gene pair, and extend the GIB objective to handle data with multiple core subgraphs, resulting in DGIB, which serves as the objective for DGIB4SL.
We apply motif-based GNNs to capture high-order graph structures.
The model’s effectiveness is validated through real-world data and case studies.
Supplementary Material
Acknowledgments
The authors thank the anonymous reviewers for their valuable suggestions.
Contributor Information
Xuexin Chen, School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China.
Ruichu Cai, School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China; Pazhou Laboratory (Huangpu), No. 248 Pazhou Qiaotou Street, Haizhu, Guangdong Province, Guangzhou, 510335, China.
Zhengting Huang, School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China.
Zijian Li, Machine Learning Department, Mohamed bin Zayed University of Artificial Intelligence, Masdar, Abu Dhabi, United Arab Emirates.
Jie Zheng, School of Information Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Pudong, Shanghai, 201210, China; School of Information Science and Technology, Shanghai Engineering Research Center of Intelligent Vision and Imaging, ShanghaiTech University, No. 393 Huaxia Middle Road, Pudong, Shanghai, 201210, China.
Min Wu, Institute for Infocomm Research (I2R), A*STAR, No. 2 Fusionopolis Way, Queenstown Planning, Singapore 138632, Singapore.
Conflict of interest: None declared.
Funding
This work is supported in part by funds from the National Science and Technology Major Project (2021ZD0111501), National Science Fund for Excellent Young Scholars (62122022), Natural Science Foundation of China (U24A20233), and this work is supported by the A*STAR’s Decentralised Gap Funding (I23D1AG081).
Data availability
Our source codes and pre-processed datasets are publicly available via https://github.com/CXX1113/DGIB4SL.
References
- 1. Leijen S, van Geel RMJM, Sonke GS. et al. Phase ii study of wee1 inhibitor azd1775 plus carboplatin in patients with tp53-mutated ovarian cancer refractory or resistant to first-line therapy within 3 months. J Clin Oncol 2016;34:4354–61. 10.1200/JCO.2016.67.5942 [DOI] [PubMed] [Google Scholar]
- 2. Wang S, Fan X, Li Y. et al. Kg4sl: Knowledge graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2021;37:i418–25. 10.1093/bioinformatics/btab271 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Liu X, Jiale Y, Tao S. et al. Pilsl: Pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2022;38:ii106–12. 10.1093/bioinformatics/btac476 [DOI] [PubMed] [Google Scholar]
- 4. Zhang K, Min W, Liu Y. et al. Kr4sl: Knowledge graph reasoning for explainable prediction of synthetic lethality. Bioinformatics 2023;39:i158–67. 10.1093/bioinformatics/btad261 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Zhu Y, Zhou Y, Liu Y. et al. Slgnn: Synthetic lethality prediction in human cancers based on factor-aware knowledge graph neural network. Bioinformatics 2023;39:btad015. 10.1093/bioinformatics/btad015 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Serrano S, Smith NA. Is attention interpretable? In: Korhonen A, Traum D, Márquez L (eds.), ACL. Florence, Italy: Association for Computational Linguistics, 2019.
- 7. Wiegreffe S, Pinter Y. Attention is not not explanation. In: Inui K, Jiang J, Ng V, Wan X (eds.), EMNLP. Hong Kong, China: Association for Computational Linguistics, 2019.
- 8. Brunner G, Liu Y, Pascual D. et al. On identifiability in transformers. In: ICLR. OpenReview.net, 2020.
- 9. Grimsley C, Mayfield E, Bursten JRS. Why attention is not explanation: Surgical intervention and causal reasoning about neural models. In: Calzolari N, Béchet F, Blache P. et al. (eds.), Proceedings of The 12th Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association, 2020.
- 10. Li Y, Sun X, Chen H. et al. Attention is not the only choice: Counterfactual reasoning for path-based explainable recommendation. IEEE Trans Knowl Data Eng 2024;36:4458–71. 10.1109/TKDE.2024.3373608 [DOI] [Google Scholar]
- 11. Helleday T. The underlying mechanism for the parp and brca synthetic lethality: Clearing up the misunderstandings. Mol Oncol 2011;5:387–93. 10.1016/j.molonc.2011.07.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Wu T, Ren H, Li P. et al. Graph information bottleneck. In: Larochelle H, Ranzato M, Hadsell R, Balcan M-F, Lin H-T (eds.), NeurIPS. Vancouver, BC, Canada: Curran Associates Inc., 2020.
- 13. Sun Q, Li J, Peng H. et al. Graph structure learning with variational information bottleneck. In: AAAI. AAAI Press, 2022.
- 14. Miao S, Liu M, Li P. Interpretable and generalizable graph learning via stochastic attention mechanism. In: Chaudhuri K, Jegelka S, Song L, Szepesvári C, Niu G, Sabato S (eds.), ICML. Baltimore, Maryland, USA: PMLR, 2022.
- 15. Pan Z, Niu L, Zhang J. et al. Disentangled information bottleneck. In: AAAI. AAAI Press, 2021.
- 16. Kulesza A, Taskar B. et al. Determinantal point processes for machine learning. Found Trends Mach Learn 2012;5:123–286. 10.1561/2200000044 [DOI] [Google Scholar]
- 17. Chen X, Cai R, Fang Y. et al. Motif graph neural network. IEEE Trans Neural Networks Learn Syst 2023;35:14833–47. 10.1109/TNNLS.2023.3281716 [DOI] [PubMed] [Google Scholar]
- 18. Wang J, Zhang Q, Han J. et al. Computational methods, databases and tools for synthetic lethality prediction. Briefings Bioinform 2022;23. 10.1093/bib/bbac106 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Sinha S, Thomas D, Chan S. et al. Systematic discovery of mutation-specific synthetic lethals by mining pan-cancer human primary tumor data. Nat Commun 2017;8:15580. 10.1038/ncomms15580 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20. Yang C, Guo Y, Qian R. et al. Mapping the landscape of synthetic lethal interactions in liver cancer. Theranostics 2021;11:9038–53. 10.7150/thno.63416 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21. Staheli JP, Neal ML, Navare A. et al. Predicting host-based, synthetic lethal antiviral targets from omics data. NAR Mol Med 2024;1:ugad001. 10.1093/narmme/ugad001 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Liany H, Jayagopal A, Huang D. et al. Aster: A method to predict clinically relevant synthetic lethal genetic interactions. IEEE J Biomed Health Inform 2024;28:1785–96. 10.1109/JBHI.2024.3354776 [DOI] [PubMed] [Google Scholar]
- 23. Liu Y, Wu M, Liu C. et al. SL2MF: Predicting synthetic lethality in human cancers via logistic matrix factorization. IEEE ACM Trans Comput Biol Bioinform 2020;17:748–57. 10.1109/TCBB.2019.2909908 [DOI] [PubMed] [Google Scholar]
- 24. Liany H, Jeyasekharan A, Rajan V. Predicting synthetic lethal interactions using heterogeneous data sources. Bioinformatics 2020;36:2209–16. 10.1093/bioinformatics/btz893 [DOI] [PubMed] [Google Scholar]
- 25. Huang J, Min W, Fan L. et al. Predicting synthetic lethal interactions in human cancers using graph regularized self-representative matrix factorization. BMC Bioinform 2019;20:657. 10.1186/s12859-019-3197-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. Megchelenbrink W, Katzir R, Xiaowen L. et al. Synthetic dosage lethality in the human metabolic network is highly predictive of tumor growth and cancer patient survival. PNAS 2015;112:12217–22. 10.1073/pnas.1508573112 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. Ku AA, Hsien-Ming H, Zhao X. et al. Integration of multiple biological contexts reveals principles of synthetic lethality that affect reproducibility. Nat Commun 2020;11:2375. 10.1038/s41467-020-16078-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28. Barrena N, Valcárcel LV, Olaverri-Mendizabal D. et al. Synthetic lethality in large-scale integrated metabolic and regulatory network models of human cells. npj Syst Biol Appl 2023;9:32. 10.1038/s41540-023-00296-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Paladugu SR, Zhao S, Ray A. et al. Mining protein networks for synthetic genetic interactions. Bmc Bioinformatics 2008;9:1–14. 10.1186/1471-2105-9-426 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30. Li JR, Lin L, Zhang Y-H. et al. Identification of synthetic lethality based on a functional network by using machine learning algorithms. J Cell Biochem 2019;120:405–16. 10.1002/jcb.27395 [DOI] [PubMed] [Google Scholar]
- 31. Dou Y, Ren Y, Zhao X. et al. Cssldb: Discovery of cancer-specific synthetic lethal interactions based on machine learning and statistic inference. Comput Biol Med 2024;170:108066. 10.1016/j.compbiomed.2024.108066 [DOI] [PubMed] [Google Scholar]
- 32. Cai R, Chen X, Fang Y. et al. Dual-dropout graph convolutional network for predicting synthetic lethality in human cancers. Bioinformatics 2020;36:4458–65. 10.1093/bioinformatics/btaa211 [DOI] [PubMed] [Google Scholar]
- 33. Xinguo L, Chen G, Li J. et al. Magcn: A multiple attention graph convolution networks for predicting synthetic lethality. IEEE/ACM Trans Comput Biol Bioinform 2023;20:2681–9. [DOI] [PubMed] [Google Scholar]
- 34. Lai M, Chen G, Yang H. et al. Predicting synthetic lethality in human cancers via multi-graph ensemble neural network. In: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society. IEEE. [DOI] [PubMed]
- 35. Long Y, Min W, Liu Y. et al. Graph contextualized attention network for predicting synthetic lethality in human cancers. Bioinformatics 2021;37:2432–40. 10.1093/bioinformatics/btab110 [DOI] [PubMed] [Google Scholar]
- 36. Hao Z, Di W, Fang Y. et al. Prediction of synthetic lethal interactions in human cancers using multi-view graph auto-encoder. IEEE J Biomed Health Informatics 2021;25:4041–51. 10.1109/JBHI.2021.3079302 [DOI] [PubMed] [Google Scholar]
- 37. Long Y, Min W, Liu Y. et al. Pre-training graph neural networks for link prediction in biomedical networks. Bioinformatics 2022;38:2254–62. 10.1093/bioinformatics/btac100 [DOI] [PubMed] [Google Scholar]
- 38. Wang S, Feng Y, Liu X. et al. Nsf4sl: Negative-sample-free contrastive learning for ranking synthetic lethal partner genes in human cancers. Bioinformatics 2022;38:ii13–9. 10.1093/bioinformatics/btac462 [DOI] [PubMed] [Google Scholar]
- 39. Zhang G, Chen Y, Yan C. et al. Mpasl: Multi-perspective learning knowledge graph attention network for synthetic lethality prediction in human cancer. Front Pharmacol 2024;15:1398231. 10.3389/fphar.2024.1398231 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40. Zhang K, Feng Y, Zheng J. Prompt-based generation of natural language explanations of synthetic lethality for cancer drug discovery. In: Calzolari N, Kan M-Y, Hoste V, Lenci A, Sakti S, Xue N (eds.), LREC-COLING 2024, pp. 13131–42. Torino, Italy: ELRA and ICCL, 2024.
- 41. Fan K, Tang S, Gökbağ B. et al. Multi-view graph convolutional network for cancer cell-specific synthetic lethality prediction. Front Genet 2023;13:1103092. 10.3389/fgene.2022.1103092 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42. Tishby N, Pereira FCN, Bialek W. The information bottleneck method. CoRR 2000;physics/0004057. [Google Scholar]
- 43. Sun Q, Li J, Peng H. et al. Graph structure learning with variational information bottleneck. In: AAAI. AAAI Press, 2022.
- 44. Tian Y, Sun C, Poole B. et al. What makes for good views for contrastive learning? NeurIPS 2020. [Google Scholar]
- 45. Wang X, He X, Cao Y. et al. Kgat: Knowledge graph attention network for recommendation. In: Teredesai A, Kumar V, Li Y, Rosales R, Terzi E, Karypis G (eds.), KDD. Anchorage, AK, USA: Association for Computing Machinery, 2019.
- 46. Jang E, Shixiang G, Poole B. Categorical reparameterization with gumbel-softmax. In: ICLR. OpenReview.net, 2017.
- 47. Xu K, Hu W, Leskovec J. et al. How powerful are graph neural networks? In: ICLR. OpenReview.net, 2019.
- 48. Maron H, Ben-Hamu H, Serviansky H. et al. Provably powerful graph networks. NeurIPS 2019;2153–64. [Google Scholar]
- 49. Feng Y, Long Y, Wang H. et al. Benchmarking machine learning methods for synthetic lethality prediction in cancer. Nat Commun 2024;15:9058. 10.1038/s41467-024-52900-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50. Wang J, Wu M, Huang X. et al. Synlethdb 2.0: A web-based knowledge graph database on synthetic lethality for novel anticancer drug discovery. Database 2022;2022:baac030. 10.1093/database/baac030 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51. Choi E-H, Kim KP. E2f1 facilitates dna break repair by localizing to break sites and enhancing the expression of homologous recombination factors. Exp Mol Med 2019;51:1–12. 10.1038/s12276-019-0307-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52. Yeh C-K, Hsieh C-Y, Suggala A. et al. On the (in) fidelity and sensitivity of explanations. Advances in neural information processing systems 2019;32:10965–76. [Google Scholar]
- 53. Chalasani P, Chen J, Chowdhury AR. et al. Concise explanations of neural networks using adversarial training. In: Hal Daumé III, Singh A (eds.), International Conference on Machine Learning, pp. 1383–91. PMLR, 2020. [Google Scholar]
- 54. Parzen E. On estimation of a probability density function and mode. Ann Math Statist 1962;33:1065–76. 10.1214/aoms/1177704472 [DOI] [Google Scholar]