Abstract
The reconstruction of gene regulatory networks (GRNs) is crucial for uncovering regulatory relationships between genes and understanding the mechanisms of gene expression within cells. With advancements in single-cell RNA sequencing (scRNA-seq) technology, researchers have sought to infer GRNs at the single-cell level. However, existing methods primarily construct global models encompassing entire gene networks. While these approaches aim to capture genome-wide interactions, they frequently suffer from decreased accuracy due to challenges such as network scale, noise interference, and data sparsity. This study proposes GRANet (Graph Residual Attention Network), a novel deep learning framework for inferring GRNs. GRANet leverages residual attention mechanisms to adaptively learn complex gene regulatory relationships while integrating multi-dimensional biological features for a more comprehensive inference process. We evaluated GRANet across multiple datasets, benchmarking its performance against state-of-the-art methods. The experimental results demonstrate that GRANet consistently outperforms existing methods in GRN inference tasks. In addition, in our case study on EGR1, CBFB, and ELF1, GRANet achieved high prediction accuracy, effectively identifying both known and novel regulatory interactions. These findings highlight GRANet’s potential to advance research in gene regulation and disease mechanisms.
Keywords: single-cell RNA sequencing, gene regulatory networks, deep learning frameworks, multi-dimensional features, graph attention mechanism
Introduction
Gene regulatory networks (GRNs) characterize the regulatory relationships between transcription factors (TFs) and their target genes, playing a crucial role in exploring gene interactions, cellular behaviors, and disease mechanisms. The study of GRNs provides insights into cell fate determination, disease development, and transcriptional regulatory mechanisms [1]. The rapid development of single-cell RNA sequencing (scRNA-seq) technology has revolutionized the study of cellular heterogeneity, enabling high-resolution inference of cell-specific GRNs [2].
With advancements in high-throughput sequencing technologies, single-cell transcriptomic data offer unprecedented opportunities for GRN reconstruction [3]. Various machine learning and deep learning approaches have been employed to infer GRNs at the single-cell level. For instance, SCODE [4] models the dynamic changes in gene expression using linear ordinary differential equations (ODEs). However, the assumption of linear regulatory relationships oversimplifies the inherently complex and nonlinear nature of GRNs. GENIE3 [5], a random forest–based algorithm, formulates GRN inference as a regression problem, learning the impact of regulatory genes on their targets. While effective, GENIE3 is computationally intensive, making it less scalable for large datasets.
SINCERITIES [6] focuses on temporal single-cell gene expression data, capturing regulatory relationships by analyzing changes in gene expression distributions across time points. It employs regularized linear regression (ridge regression) and partial correlation analysis to infer activation or inhibition interactions.
With the rise of deep learning [8], methods such as GNE [7] and CNNC [9] have outperformed traditional approaches by leveraging neural networks. GNE utilizes multi-layer perceptrons (MLPs) to quantify gene pair associations, facilitating the identification of active regulatory pathways under various conditions. CNNC employs a supervised learning framework, transforming gene pair relationships into histograms, which are then processed using convolutional neural networks (CNNs) to uncover regulatory relationships.
More advanced deep learning models, such as dynDeepDRIM [10] and DeepSEM [11], further enhance GRN inference. dynDeepDRIM utilizes high-dimensional CNNs to reconstruct GRNs from time-stamped scRNA-seq data, incorporating contextual gene interactions into its predictions. DeepSEM combines variational autoencoders (VAEs) with structural equation modeling (SEM) to capture nonlinear relationships and latent mechanisms, making it particularly robust against missing data.
Graph-based models, such as NSRGRN [12] and GENELink [13], integrate topological structures and gene embeddings to enhance GRN inference. NSRGRN introduces a network structure refinement algorithm that improves the inference network by considering both local and global topologies. GENELink utilizes graph attention networks to embed single-cell gene expression data, projecting TF-gene pairs into a low-dimensional space for causal inference. Similarly, matrix factorization-based models like GMFGRN [14] exploit latent feature spaces to uncover complex regulatory patterns beyond traditional approaches.
Other notable methods further expand GRN inference capabilities. DeepRIG [15] constructs weighted gene co-expression networks and employs graph autoencoders (GAEs) to model global regulatory dynamics. LogBTF [16] integrates logistic regression with Boolean threshold functions to reconstruct GRNs from time-series expression data. PSGRN [17] employs pseudo-Siamese networks and DenseNet frameworks to learn spatiotemporal features of gene interactions, while Gene regulatory network inference based on causal discovery integrating with graph neural network (GRINCD) [18] combines causal asymmetry learning with graph representation learning to capture both linear and nonlinear regulatory relationships.
These diverse approaches demonstrate the evolving methodologies in GRN inference, showcasing the integration of traditional computational models with cutting-edge deep learning techniques to address the challenges of single-cell and large-scale datasets.
GATCL [19], an extension of GENELink, enhances graph attention networks (GATs) by replacing the matrix multiplication between the feature matrix and the weight matrix with convolutional layers, significantly improving computational efficiency. Additionally, GATCL integrates multi-head graph attention with self-attention mechanisms, allowing for a more comprehensive consideration of multiple data features. This enables the model to adaptively assign different weights to the regulatory neighbors of each gene during optimization, dynamically selecting the most influential genes for inference. By reducing redundant information, this mechanism enhances model accuracy. However, GATCL’s limited preprocessing capabilities constrain its ability to handle noise and uncertainty in large-scale datasets. High training errors observed in multiple experiments indicate potential areas for improving its feature extraction layers.
While existing GRN inference methods demonstrate distinct strengths, they often fail to extract comprehensive information from gene expression data. In recent years, graph neural networks (GNNs), particularly GATs, have gained prominence in analyzing network-structured data due to their ability to dynamically assign weights to neighboring nodes through attention mechanisms [20]. Compared to traditional approaches, GATs are more effective in capturing the intricate regulatory relationships among genes, especially in the high-dimensional feature space of single-cell RNA-seq data.
We present the following key contributions in this study:
(i) We propose GRANet, a supervised deep learning framework based on multi-head GATs for constructing GRNs. GRANet addresses key limitations of existing methods, including inadequate gene modeling, limited integration of multi-source data, and suboptimal deep network expression capabilities. Our model demonstrates robust performance across multiple publicly available datasets.
(ii) We introduce a novel data extraction pipeline tailored for scRNA-seq data, generating three distinct gene feature representations from gene expression matrices: smoothed, discretized, and standardized feature sets. These representations enhance the integration of diverse gene expression features, improving GRN inference.
(iii) To address overfitting challenges in graph-based models applied to small-scale scRNA-seq data, we introduce an internal residual mechanism to stabilize GAT training. Additionally, we leverage multi-head attention mechanisms to model complex regulatory relationships among genes dynamically.
Materials and methods
Datasets
We evaluated the performance of GRANet in GRN inference using seven widely-used single-cell RNA sequencing (scRNA-seq) datasets, which are considered standard benchmarks for deep learning methods [21–23]. The datasets include: (i) human embryonic stem cells (hESCs); (ii) human mature liver cells (hHEPs); (iii) mouse dendritic cells (mDCs); (iv) mouse embryonic stem cells (mESCs); (v) mouse hematopoietic stem cells with erythroid lineage (mHSC-Es); (vi) mouse hematopoietic stem cells with granulocyte–monocyte lineage (mHSC-GMs); and (vii) mouse hematopoietic stem cells with lymphoid lineage (mHSC-Ls). These datasets contain real functional interaction networks sourced from the STRING database [24], nonspecific ChIP-seq experiments [25–27], and cell-type-specific ChIP-seq experiments [28–30]. The detailed results are presented in Table 1.
Table 1.
Comprehensive details of the seven datasets. Size of the training set for each ground-truth network, using all TFs and the 500 (or 1000) most-varying genes.
| Dataset | Species | STRING | Nonspecific | Cell-type-specific |
|---|---|---|---|---|
| hESC | Human | 4257 (5149) | 3441 (4617) | 4545 (7084) |
| hHEP | Human | 7523 (9003) | 4129 (5351) | 9939 (15558) |
| mDC | Mouse | 4815 (5898) | 3067 (3918) | 756 (1193) |
| mESC | Mouse | 7762 (8479) | 6893 (8030) | 29613 (42795) |
| mHSC-E | Mouse | 1371 (1826) | 1425 (1960) | 11557 (21975) |
| mHSC-GM | Mouse | 748 (1311) | 743 (1358) | 7364 (14135) |
| mHSC-L | Mouse | 137 (154) | 279 (317) | 4398 (5180) |
Each scRNA-seq dataset was processed using the method proposed by Pratapa et al. [23], with interactions inferred from TFs. In this analysis, we selected the top 500 and 1000 genes showing the most significant expression-level changes and included all TFs with corrected P-values below .01 [37]. Finally, we divided each dataset into three subsets: a training set, a validation set, and a test set. Further details are provided in Supplementary Text S1. The seven scRNA-seq datasets are available from the Gene Expression Omnibus under the following accession numbers: GSE81252 (hHEP), GSE75748 (hESC), GSE98664 (mESC), GSE48968 (mDC), and GSE81682 (mHSC). Additionally, all datasets with their true networks for four different types of interactions can be accessed at https://doi.org/10.5281/zenodo.3378975.
Framework of GRANet
GRANet, similar to other GNN models [31–34], predicts relationships between nodes based on their distinct features. In the task of GRN inference, nodes are categorized as either TFs or non-TFs, with edges representing regulatory interactions. As illustrated in Fig. 1, GRANet is composed of three main components: the multi-feature set extraction module, the graph attention fusion module, and the embedding prediction module.
Figure 1.
The framework of GRANet. (A) Multiple feature sets are extracted from single-cell RNA sequencing (scRNA-seq) data to capture comprehensive gene expression information. Through normalization, moving average smoothing, and discretization, data quality is improved, thereby enhancing model performance and providing robust support for subsequent analyses. (B) The processed multi-feature data are then fed into the prediction model. RGAT leverages a multi-head attention mechanism to capture diverse relationships between nodes, incorporating residual connections to enhance model stability. Subsequently, RCGAT integrates convolutional layers to extract local spatial features, further refining the node representations. (C) The final node embeddings are passed into independent fully connected modules, which separately process the features of transcription factors and target genes. Regulatory scores between nodes are computed using dot products of their respective embeddings.
The multi-feature set extraction module processes the original gene expression matrix through several transformations, including standardization, smoothing, and discretization, to modify its distribution. These transformations reduce sparsity and noise in the matrix, thereby enhancing the quality of input features.
The graph attention fusion module integrates two enhanced GATs and a CNN layer. The first GAT captures local interactions between TF nodes and their immediate neighbors, generating preliminary node representations. The second GAT refines these representations by incorporating global interactions over broader neighborhoods, resulting in higher-level node embeddings. This layer-wise feature extraction and information aggregation enable the model to capture the complex relationships and structural dynamics within gene regulatory networks, improving its performance.
Finally, the node embeddings are passed into the MLP layer (embedding prediction module) to further refine the representations for both TFs and non-TFs. The relationship score between two nodes is computed using the dot product of their respective embeddings.
Multi-feature set extraction module
In this study, we first reconstruct the gene expression matrix using a Variational Autoencoder (VAE) [35] to capture the underlying structure and reduce noise in the single-cell RNA sequencing (scRNA-seq) data. The VAE is a generative model that learns a low-dimensional latent representation of the high-dimensional gene expression data by optimizing a variational lower bound on the data likelihood. Specifically, the VAE consists of an encoder network, which maps the input gene expression data to a latent space, and a decoder network, which reconstructs the data from the latent representation [38]. The latent space is constrained to follow a prior distribution, encouraging the model to learn a smooth and continuous representation of the data. The VAE is trained separately on the scRNA-seq data using a loss function that combines a reconstruction loss (to ensure accurate data reconstruction) and a Kullback–Leibler (KL) divergence term (to regularize the latent space), as shown in the following equations:
$$\mathcal{L}_{\mathrm{recon}} = \lVert X - \hat{X} \rVert^{2} \quad (1)$$

$$\mathcal{L}_{\mathrm{KL}} = -\frac{1}{2} \sum_{j=1}^{d} \left( 1 + \log \sigma_{j}^{2} - \mu_{j}^{2} - \sigma_{j}^{2} \right) \quad (2)$$

where $\hat{X}$ is the reconstructed gene expression data, $\mu$ is the mean of the latent space, $\log \sigma^{2}$ is the log variance, and $d$ is the dimensionality of the latent space. This reconstruction step enhances data quality by denoising and imputing missing values, providing a more robust foundation for subsequent feature extraction.
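The combined VAE objective described above can be sketched as follows; the function and variable names, and the use of mean squared error for the reconstruction term, are illustrative assumptions rather than the authors' exact implementation:

```python
import numpy as np

def vae_loss(x, x_hat, mu, log_var):
    """Combined VAE objective: reconstruction error plus KL divergence.

    x, x_hat : original and reconstructed expression matrices
    mu, log_var : mean and log-variance of the latent distribution
    (names and the MSE reconstruction term are assumptions of this sketch)
    """
    recon = np.mean((x - x_hat) ** 2)  # reconstruction loss
    # KL(q(z|x) || N(0, I)), averaged over the batch
    kl = -0.5 * np.mean(np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1))
    return recon + kl

x = np.random.rand(4, 10)                       # 4 cells, 10 genes
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))
# perfect reconstruction and a standard-normal latent give zero loss
loss = vae_loss(x, x, mu, log_var)
```

In training, both terms would be minimized jointly by the encoder and decoder networks; here only the loss computation is shown.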
Following reconstruction, multiple feature sets are extracted from the processed scRNA-seq data to capture the underlying information in gene expression more comprehensively. These multi-dimensional feature extractions enable a more accurate characterization of gene expression patterns across different spatial locations, providing rich informational support for subsequent model training and prediction.
To eliminate batch effects and scale differences, we first performed normalization on the reconstructed gene expression data. This normalization process transforms the data to have zero mean and unit variance, thereby enhancing comparability and improving model stability.
To reduce noise in the reconstructed gene expression data, we applied a moving average smoothing method. For each gene’s expression value $x_{i}$, a rolling window of size 3 is used to compute the average of expression values within the window:

$$\tilde{x}_{i} = \frac{1}{w} \sum_{j=i-\lfloor w/2 \rfloor}^{i+\lfloor w/2 \rfloor} x_{j} \quad (3)$$

where $w$ is the window size. This process helps to reduce fluctuations in the data, thereby enhancing the stability and predictive performance of the model.
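The normalization and smoothing steps can be sketched as below; the per-gene z-score layout and the handling of window edges (a partial window rather than padding) are assumptions of this sketch:

```python
import numpy as np

def zscore_rows(expr):
    """Per-gene standardization to zero mean and unit variance (genes as rows).
    Constant genes (zero variance) are left at zero instead of dividing by 0."""
    mu = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, keepdims=True)
    return (expr - mu) / np.where(sd == 0, 1.0, sd)

def moving_average(values, w=3):
    """Rolling mean with window w; edge positions average the partial window."""
    s = np.convolve(values, np.ones(w), mode="same")          # windowed sums
    n = np.convolve(np.ones_like(values), np.ones(w), mode="same")  # counts
    return s / n

x = np.array([1.0, 5.0, 3.0, 7.0, 2.0])
sm = moving_average(x)   # -> [3.0, 3.0, 5.0, 4.0, 4.5]
```

Dividing the windowed sums by the windowed counts makes the boundary values an average over the entries actually present, which avoids the artificial damping that zero-padding would introduce.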
After applying the above smoothing step, we performed discretization on the reconstructed gene expression data by mapping extreme values (outliers) into a reasonable range of bins, thereby minimizing their impact on the model. The discretization steps are as follows:
First, we apply a log transformation to the nonzero reconstructed expression values $x$ to reduce skewness and make the data more normally distributed. Let the transformed values be denoted as $y = \log(x)$. Next, we compute the mean $\mu_{y}$, minimum $y_{\min}$, standard deviation $\sigma_{y}$, and maximum $y_{\max}$ of these log-transformed values. The discretization bounds are defined as:

$$b_{\mathrm{low}} = \max\left( y_{\min},\; \mu_{y} - k\sigma_{y} \right) \quad (4)$$

$$b_{\mathrm{high}} = \min\left( y_{\max},\; \mu_{y} + k\sigma_{y} \right) \quad (5)$$

The bucket width is determined as:

$$w_{b} = \frac{b_{\mathrm{high}} - b_{\mathrm{low}}}{N} \quad (6)$$

where $k$ is a scaling factor on the standard deviation and $N$ is the number of discrete bins used during the discretization. Once the discretization boundaries and bucket width are defined, each expression value is mapped to a corresponding bin:

$$\mathrm{bin}(y) = \left\lfloor \frac{y - b_{\mathrm{low}}}{w_{b}} \right\rfloor + 1 \quad (7)$$

Finally, any expression values exceeding the upper bound are capped at $N$, and values below the lower bound are set to 0. This discretization process transforms continuous gene expression values into discrete bins, simplifying data processing and model training while preserving key expression information.
Graph attention fusion module
In the proposed fusion prediction module, the first layer incorporates a multi-head graph attention mechanism with internal residual connections (RGAT), as shown in Fig. 2. Unlike traditional graph attention mechanisms [32], where all attention heads share the same input feature matrix, our model assigns a distinct input to each attention head, allowing each one to capture different aspects of the data.
Figure 2.
The architecture of the RGAT. This figure illustrates the graph attention mechanism with an internal residual connection. The input node features and their neighboring features are aggregated through attention-weighted summation to form the multi-head attention representation $h_{i}^{\prime}$. Simultaneously, the original input node features pass through a multilayer perceptron (MLP) to generate an adaptive residual connection. Finally, the residual connection is added to the multi-head attention results to obtain the final node representation $\hat{h}_{i}$. This module effectively captures complex graph structural relationships and enhances the model’s expressive power.
The input consists of three feature sets derived from the multi-feature extraction component, along with the corresponding adjacency matrices. For each attention head, the input feature matrix $X^{(k)} \in \mathbb{R}^{N \times F}$ is linearly transformed to produce a new node representation:

$$H^{(k)} = X^{(k)} W^{(k)} \quad (8)$$

where $W^{(k)} \in \mathbb{R}^{F \times F^{\prime}}$ is the linear transformation matrix for the k-th attention head, $F$ is the input dimension, and $F^{\prime}$ is the output dimension. The resulting matrix $H^{(k)}$ represents the transformed node features produced by the k-th attention head and serves as the input for subsequent attention-based aggregation. After obtaining the node representations for each head, a masking mechanism is applied to compute the attention weights $e_{ij}^{(k)}$ between node $i$ and its neighbor $j$:

$$e_{ij}^{(k)} = \mathrm{LeakyReLU}\left( a^{(k)\top} \left[ h_{i}^{(k)} \,\Vert\, h_{j}^{(k)} \right] \right) \quad (9)$$

where $a^{(k)}$ is the parameter vector used for computing the attention weights, $\Vert$ denotes the concatenation of features, and LeakyReLU [36] is the activation function. This step captures the relationship strength between nodes. These attention weights are then normalized using the softmax function to ensure they sum to 1:

$$\alpha_{ij}^{(k)} = \frac{\exp\left( e_{ij}^{(k)} \right)}{\sum_{l \in \mathcal{N}_{i}} \exp\left( e_{il}^{(k)} \right)} \quad (10)$$

where $\mathcal{N}_{i}$ represents the set of neighbors for node $i$, and $\alpha_{ij}^{(k)}$ is the normalized attention weight for node $j$ relative to node $i$. The normalized attention weights are then used to aggregate features from neighboring nodes to update the feature representation for node $i$. The outputs from all attention heads are concatenated (or averaged) to form the final node feature representation:

$$h_{i}^{\prime} = \sigma\left( \frac{1}{K} \sum_{k=1}^{K} \sum_{j \in \mathcal{N}_{i}} \alpha_{ij}^{(k)} h_{j}^{(k)} \right) \quad (11)$$

where $K$ represents the number of attention heads. The averaging operation enhances the model’s ability to capture diverse graph structural features.

To prevent vanishing gradients and accelerate convergence, residual connections are introduced. Specifically, the final feature representation $\hat{h}_{i}$ for node $i$ is a weighted combination of the original features and the output from the multi-head attention:

$$\hat{h}_{i} = \lambda_{1}\, \mathrm{MLP}\left( h_{i} \right) + \lambda_{2}\, h_{i}^{\prime} \quad (12)$$

where $\lambda_{1}$ and $\lambda_{2}$ are learnable parameters. This dynamic adjustment of residual coefficients during training improves the model’s performance by better integrating information across attention heads.
In the second layer, we introduce a module that combines convolutional layers with the graph residual attention mechanism, referred to as the Residual Convolutional Graph Attention Network (RCGAT). This module enhances the GRN inference by integrating spatial feature extraction with graph structure modeling. To improve the model’s ability to capture latent features, we prepend a convolutional layer to the RCGAT.
The output $\hat{H}^{(1)}$ from the first RGAT layer is processed through a convolutional layer as follows:

$$Z = \mathrm{Conv}\left( \hat{H}^{(1)} \right) \quad (13)$$

After the output $\hat{H}^{(1)}$ from the first RGAT layer undergoes convolutional embedding, the convolutional layer replaces the matrix multiplication typically used in GAT with a convolutional interface. This allows the model to extract local spatial features from the node representations. The aggregation of neighboring node features is then performed similarly to the RGAT layer:

$$h_{i}^{\prime\prime} = \sigma\left( \frac{1}{K} \sum_{k=1}^{K} \sum_{j \in \mathcal{N}_{i}} \alpha_{ij}^{(k)} z_{j}^{(k)} \right) \quad (14)$$

where $\alpha_{ij}^{(k)}$ represents the attention weights for each node pair, computed as in the first RGAT layer. Finally, the multi-head attention results are fused through averaging, and a residual connection is applied with the original input features:

$$\hat{h}_{i}^{(2)} = \lambda_{1}\, \mathrm{MLP}\left( \hat{h}_{i}^{(1)} \right) + \lambda_{2}\, h_{i}^{\prime\prime} \quad (15)$$
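The idea of replacing the dense matrix multiplication with a convolutional interface can be illustrated with a 1-D convolution along the feature axis of each node embedding; the kernel size and the "same" zero padding are assumptions of this sketch:

```python
import numpy as np

def conv1d_features(H, kernel):
    """Convolve each node's embedding along the feature axis, so the new
    value of a feature channel depends on its neighbouring channels.
    Kernel size and 'same' (zero) padding are assumptions of this sketch."""
    return np.stack([np.convolve(row, kernel, mode="same") for row in H])

H = np.arange(12, dtype=float).reshape(3, 4)       # 3 nodes, 4 feature channels
Z = conv1d_features(H, np.array([0.0, 1.0, 0.0]))  # identity kernel: Z == H
Z2 = conv1d_features(H, np.array([1.0, 1.0, 1.0])) # sums adjacent channels
```

Unlike a full linear map, the convolution shares a small kernel across all feature positions, which cuts the parameter count and biases the layer toward local patterns among adjacent channels.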
Embedding prediction module
After feature extraction by RCGAT, we refine the regulatory relationships between TFs and target genes by designing two independent fully connected feature extraction modules. These modules apply specific nonlinear transformations to the embedded feature matrix $\hat{H}^{(2)}$, allowing the model to learn dedicated representations for the TFs and target genes. The transformation process is as follows:

$$E_{\mathrm{TF}} = \sigma\left( \hat{H}^{(2)} W_{\mathrm{TF}} + b_{\mathrm{TF}} \right) \quad (16)$$

$$E_{\mathrm{TG}} = \sigma\left( \hat{H}^{(2)} W_{\mathrm{TG}} + b_{\mathrm{TG}} \right) \quad (17)$$

where $W_{\mathrm{TF}}$ and $W_{\mathrm{TG}}$ are the weight matrices for the linear layers, and $b_{\mathrm{TF}}$ and $b_{\mathrm{TG}}$ are the bias terms.

After obtaining the feature representations for transcription factors and target genes, we compute the regulatory score between each TF–target gene pair using the dot product operation to quantify their potential association strength:

$$s_{ij} = E_{\mathrm{TF},i} \cdot E_{\mathrm{TG},j}^{\top} \quad (18)$$

Finally, the GRANet model is optimized using a modified pointwise binary cross-entropy (BCE) loss function:

$$\mathcal{L} = -\frac{1}{|\mathcal{D}|} \sum_{(i,j) \in \mathcal{D}} \left[ w_{p}\, y_{ij} \log \hat{y}_{ij} + \left( 1 - y_{ij} \right) \log\left( 1 - \hat{y}_{ij} \right) \right] \quad (19)$$

where $w_{p}$ is the weight for positive samples, $y_{ij}$ is the ground-truth label for the pair, and $\hat{y}_{ij}$ is the predicted probability derived from $s_{ij}$; this weighting improves the model’s handling of imbalanced datasets.
Results
Performance comparison with state-of-the-art methods
To comprehensively evaluate the performance of GRANet, we compared it with several state-of-the-art gene regulatory network inference methods across seven datasets (hESC, hHEP, mDC, mESC, mHSC-E, mHSC-GM, mHSC-L), two scales (TFs + 500 and TFs + 1000), and three ground-truth networks (STRING, nonspecific ChIP-seq, and cell-type-specific ChIP-seq). The comparison included a diverse range of methods, spanning statistical approaches, machine learning models, and advanced deep learning frameworks, such as GATCL, GENELink, GNE, GRINCD, SCENIC, DeepRIG, GENIE3, Pearson Correlation Coefficient (PCC), and DeepSEM. These methods represent a broad spectrum of techniques widely used for gene regulatory network inference tasks.
For all experiments, we employed standardized evaluation metrics—AUROC and AUPRC—to assess the models’ performance in classifying positive and negative samples. Each metric was averaged over 50 independent runs to ensure statistical robustness [41, 42].
Extensive experiments were conducted on datasets derived from the STRING, nonspecific ChIP-seq, and cell-type-specific ChIP-seq networks to rigorously validate GRANet’s effectiveness in gene regulatory network inference. In terms of AUROC, GRANet consistently outperformed existing baseline models, including GATCL and GENELink, as illustrated in Fig. 3. Specifically, GRANet achieved average performance improvements of 1.62%, 2.24%, 3.62%, 23.97%, 45.47%, 26.76%, 31.53%, 32.23%, and 34.77% over GATCL, GENELink, DeepRIG, GNE, GRINCD, SCENIC, GENIE3, PCC, and DeepSEM, respectively.
Figure 3.
Average AUROC scores of 10 models across seven ground-truth networks on three kinds of datasets. This figure illustrates how GRANet and nine competing approaches perform in discriminating true regulatory interactions across three different ground-truth networks: (A) the STRING dataset, (B) the nonspecific ChIP-seq dataset, and (C) the cell-type-specific ChIP-seq dataset.
Similarly, the evaluation of GRANet using the AUPRC metric is presented in Fig. 4. GRANet achieved the best performance in 88% (37 out of 42) of the scRNA-seq datasets, demonstrating its robustness across diverse conditions. On average, GRANet outperformed the baseline methods by 10.66%, 25.62%, 70.48%, 88.29%, 68.21%, 102.55%, 124.14%, 123.57%, and 98.01% over GATCL, GENELink, DeepRIG, GNE, GRINCD, SCENIC, GENIE3, PCC, and DeepSEM, respectively.
Figure 4.
Average AUPRC scores of 10 models across seven ground-truth networks on three kinds of datasets. This figure highlights the precision-recall trade-offs for GRANet compared to nine alternative methods on various ground-truth networks: (A) the STRING dataset, (B) the nonspecific ChIP-seq dataset, and (C) the cell-type-specific ChIP-seq dataset.
These results further demonstrate GRANet’s superior ability to distinguish positive from negative samples, especially in scenarios where positive samples are scarce and imbalanced. Its consistent performance across diverse datasets underscores GRANet’s potential as a reliable tool for GRN inference.
Moreover, the experimental results demonstrate that GNN-based models consistently outperform traditional machine learning models and other neural network architectures across key performance metrics. This underscores the rising importance of GNNs in the field of GRN inference.
Compared to conventional machine learning methods, GNNs provide a more effective framework for analyzing the complex, non-Euclidean relationships among genes. Specifically, GNNs and their variants, such as GATs and graph convolutional networks, excel at capturing both the local topological structures of gene regulatory networks and long-range relational features through multi-layered embedding mechanisms. This dual capability makes them particularly well suited for uncovering intricate regulatory patterns within biological networks.
Ablation study
To further validate the impact of different modules on the overall performance of GRANet, we conducted a series of ablation experiments by progressively removing or replacing key components of the model and analyzing their effects on performance [39]. Specifically, we constructed four variants of GRANet (as shown in Table 2): (i) GRANet w/o M: GRANet without the multi-feature set extraction module, using only the raw gene expression matrix as input. (ii) GRANet w/o R: GRANet without the residual connections in the graph attention mechanism. (iii) GRANet w/o C: GRANet without the convolution operations in the second RCGAT layer, replaced by standard matrix multiplication. (iv) GRANet w/o V: GRANet without the VAE encoder, which removes the variational autoencoder module used for gene expression matrix reconstruction. Each variant was trained using the same hyperparameters as GRANet, and its performance was evaluated on the same seven benchmark scRNA-seq datasets (TFs + 500).
Table 2.
Variants of GRANet and their modifications.
| Model | Multi-feature set | Residual connections | Convolution | VAE |
|---|---|---|---|---|
| GRANet | √ | √ | √ | √ |
| GRANet w/o M | × | √ | √ | √ |
| GRANet w/o R | √ | × | √ | √ |
| GRANet w/o C | √ | √ | × | √ |
| GRANet w/o V | √ | √ | √ | × |
The check mark indicates that this module is included in the model, while the cross mark indicates that this module is not included in the model.
As shown in Fig. 5, removing any of the key components significantly degraded the model’s performance. Notably, the removal of the multi-feature set extraction module led to the most substantial performance drop, with the AUROC decreasing by 6.82% and the AUPRC decreasing by 6.45%. Removing the residual connections in GAT resulted in a 4.37% decrease in AUROC and a 4.12% decrease in AUPRC, while removing the VAE encoder caused a 5.23% decrease in AUROC and a 4.98% decrease in AUPRC. These results demonstrate that the multi-feature set extraction module, internal residual connections in GAT, embedded convolution operations in RCGAT, and the VAE encoder each contribute significantly to the performance improvement of GRANet.
Figure 5.
Ablation experiments. (A) Distribution of AUROC values for the four variant models and GRANet across seven datasets. The average performance degradation of each variant model is 6.82%, 4.37%, 4.36%, and 5.23%, respectively. (B) Distribution of AUPRC values for the four variant models and GRANet across seven datasets. The average performance degradation of each variant model is 6.45%, 4.12%, 4.76%, and 4.98%, respectively.
To further investigate the contribution of each feature type, we conduct additional ablation experiments in the Supplementary Text S2, where the standardized, smoothed, and discretized features are evaluated independently. Three variants of GRANet are shown in Supplementary Table S1, and the AUPRC values under four different cases in the Ablation Study are shown in Supplementary Fig. S1.
Impact of different parameters on the GRANet model
To investigate the sensitivity of GRANet to hyperparameters, we systematically adjusted several key hyperparameters, including the number of attention heads in the RCGAT, the embedding dimensions of TFs, the model’s learning rate, and the dropout rates across various modules. Performance was evaluated on the seven benchmark scRNA-seq datasets. Finally, we validated GRANet’s performance under alternative settings of these critical hyperparameters, with results presented in Supplementary Table S2.
Impact of the number of attention heads in the RCGAT layer on the GRANet model
We varied the number of attention heads in the second RCGAT layer (1, 2, 3, 6, 8, and 16), keeping other hyperparameters fixed. Performance was measured using average AUROC and AUPRC over 10 experiments (Table 3). Increasing the number of heads from 2 to 3 improved AUROC by 3.2% and AUPRC by 4.5%, indicating enhanced feature capture. However, increasing the number of heads to 6 yielded diminishing returns, and using 8 heads slightly degraded performance while increasing training time by 35%. This suggests a trade-off between feature representation and computational efficiency, with 3 heads being optimal.
Table 3.
The impact of varying the number of heads in the multi-head attention layer.
| Number of heads | AUROC | AUPRC | Time |
|---|---|---|---|
| 1 | 0.723 | 0.418 | 4m07s |
| 2 | 0.777 | 0.402 | 4m52s |
| 3 | 0.802 | 0.421 | 5m36s |
| 6 | 0.723 | 0.411 | 7m29s |
| 8 | 0.735 | 0.396 | 8m35s |
| 16 | 0.621 | 0.337 | 12m24s |
Impact of embedding dimensions on the GRANet model
We tested embedding dimensions of 8, 16, 32, and 64 for TFs and target genes. An embedding dimension of 16 achieved the highest AUROC and near-best AUPRC (Table 4). Lower dimensions (e.g. 8) limited the model’s representational capacity, while higher dimensions (e.g. 32, 64) introduced redundancy and increased computational costs without significant improvements. An embedding dimension of 16 thus offered an optimal balance between performance and efficiency, making it the ideal choice for modeling gene regulatory networks.
Table 4.
The impact of different embedding dimensions for gene features and TFs.
| Feature dimensions | AUROC | AUPRC | Time |
|---|---|---|---|
| 8 | 0.628 | 0.323 | 5m25s |
| 16 | 0.802 | 0.421 | 5m36s |
| 32 | 0.786 | 0.431 | 6m37s |
| 64 | 0.734 | 0.419 | 7m16s |
Impact of discretization level on the GRANet model
We evaluated the impact of varying discretization levels (5, 10, 20, 25, and 30) in the multi-feature extraction module, keeping other hyperparameters constant. Performance was measured using the average AUROC and AUPRC over 10 experiments (Table 5). Increasing the discretization level from 5 to 20 improved AUROC by 5.8% and AUPRC by 7.1%, indicating that higher discretization levels capture more detailed features. However, when the discretization level exceeded 20 (e.g. 25 or 30), a slight decrease in performance was observed, accompanied by a significant increase in computational cost. These results suggest that a discretization level of 20 offers an optimal balance between feature representation and computational efficiency.
Table 5.
The impact of different discretization levels in multi-feature extraction.
| Discretization level | AUROC | AUPRC | Time |
|---|---|---|---|
| 5 | 0.728 | 0.373 | 4m51s |
| 10 | 0.788 | 0.407 | 5m12s |
| 20 | 0.802 | 0.421 | 5m36s |
| 25 | 0.791 | 2.00 | 5m43s |
| 30 | 0.760 | 1.00 | 6m08s |
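As a rough illustration of what a discretization level means here, the sketch below bins an expression vector into a given number of equal-frequency levels with NumPy. This is one plausible scheme chosen for illustration; the exact binning rule used in GRANet's multi-feature extraction module may differ.

```python
import numpy as np

def discretize_expression(x, n_levels=20):
    """Quantile-discretise an expression vector into n_levels bins.

    A sketch of one plausible scheme (equal-frequency binning), not
    necessarily the paper's exact rule. Returns integer levels in
    [0, n_levels - 1]; more levels preserve finer expression detail
    at higher downstream cost.
    """
    # interior bin edges at evenly spaced quantiles of the observed values
    edges = np.quantile(x, np.linspace(0, 1, n_levels + 1)[1:-1])
    return np.digitize(x, edges)

rng = np.random.default_rng(1)
expr = rng.lognormal(size=1000)            # toy scRNA-seq-like values
levels = discretize_expression(expr, n_levels=20)
```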
Case study
In this section, we further evaluated the predictive capability of GRANet in identifying potential target genes of specific TFs through a case study [40]. The experiments were conducted on cell-type-specific datasets (mHSC-GM 500). Specifically, we used the trained model to predict unknown TF–gene pairs and selected the top 25 TF–gene interaction pairs with the highest prediction scores. These predictions were validated using the publicly available hTFtarget database, and the results are summarized in Fig. 6.
Figure 6.
Application of GRANet to predict potential target genes in the mHSC-GM dataset. (A) Ranking plots of the top 25 predicted potential target gene scores for CBFB. (B) Ranking plots of the top 25 predicted potential target gene scores for ELF1. (C) Ranking plots of the top 25 predicted potential target gene scores for EGR1.
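The top-25 selection step described above amounts to ranking candidate TF–gene pairs by predicted score and keeping the highest-scoring ones. A minimal sketch with made-up gene names and scores (not actual GRANet output):

```python
def top_k_targets(scores, k=25):
    """Rank one TF's candidate target genes by predicted score.

    scores: dict mapping gene name -> prediction score (toy, made-up data);
    returns the k highest-scoring (gene, score) pairs, best first.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]

# toy example; gene names and scores are illustrative, not GRANet output
preds = {"GeneA": 0.693, "GeneB": 0.532, "GeneC": 0.911, "GeneD": 0.120}
top_k_targets(preds, k=2)  # [("GeneC", 0.911), ("GeneA", 0.693)]
```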
We focused on three transcription factors: EGR1 (Early Growth Response Protein 1), CBFB (Core-Binding Factor Subunit Beta), and ELF1 (E74-Like Factor 1). EGR1 is a key TF involved in regulating various biological processes, including cell proliferation, differentiation, and apoptosis [43]. CBFB is a key component of the core-binding factor complex, playing a critical role in hematopoiesis and skeletal development [44]. ELF1, a member of the ETS family of transcription factors, regulates immune response, cell differentiation, and cancer progression [45]. Dysregulation or mutations in EGR1, CBFB, and ELF1 are closely associated with various diseases, including cancer and immune system disorders [46, 47].
Validation of the top 25 predicted target genes for EGR1, CBFB, and ELF1 against the hTFtarget database revealed highly accurate predictions: 20 of 25 targets for EGR1, all 25 for CBFB, and 24 of 25 for ELF1 were experimentally supported, as shown in Fig. 6. These results demonstrate the robustness and reliability of GRANet in identifying TF–gene regulatory relationships, both known and novel, as shown in Fig. 7. The near-perfect validation rates for CBFB and ELF1 in particular underscore the model’s precision in capturing biologically relevant interactions. Notably, the unvalidated targets, especially those for EGR1 and ELF1, may represent regulatory mechanisms or pathways not yet documented in existing databases or explored experimentally. These findings suggest that GRANet excels not only at recovering well-established TF–gene interactions but also holds potential for uncovering new biological insights. Further experimental validation of the unverified targets could reveal previously unknown regulatory networks or functional roles, deepening our understanding of transcriptional regulation in health and disease.
Figure 7.
Visualization of the gene regulatory network predicted by GRANet, highlighting the regulatory relationships, and their predicted strengths, between the transcription factors CBFB, EGR1, and ELF1 and their potential target genes. Nodes represent genes, edges represent potential regulatory relationships, and edge weights (e.g. 0.693, 0.532) indicate the predicted regulatory strengths.
Discussion and conclusion
This study introduces GRANet, a GRN inference method based on GNNs and residual attention mechanisms. By integrating multiple feature sets and incorporating a graph residual multi-head attention mechanism, GRANet effectively addresses the performance limitations of traditional methods when handling complex biological data. The model extracts and integrates standardized, smoothed, and discretized gene expression features to capture global expression patterns, mitigate noise, and highlight regulatory transitions. Extensive experimental evaluations demonstrate that GRANet outperforms current state-of-the-art GRN inference models across multiple benchmark datasets.
However, despite its strong performance in most scenarios, GRANet has some limitations. For instance, its performance degrades when processing extremely sparse datasets, particularly when the number of gene samples is insufficient [48].
Although GRANet has achieved impressive results in gene regulatory network inference, its full potential remains largely untapped. Future research could explore integrating additional types of biological data, such as proteomics and metabolomics, to construct more comprehensive multi-omics GRNs [49]. Additionally, incorporating self-supervised learning methods could reduce reliance on large amounts of labeled data, further enhancing the model’s generalization capabilities [50].
In summary, the application of artificial intelligence in single-cell biology [52–60] has substantially advanced the inference of GRNs. GRANet presents a powerful tool for this purpose, offering innovative solutions that open new avenues for research in bioinformatics [51]. While further optimization is needed, the model’s breakthroughs across multiple aspects provide fresh insights into the study of GRNs. In future work, we aim to enhance GRN inference by adopting a multi-omics approach, integrating data from various omics layers to better capture gene regulatory interactions. This multi-omics integration will not only improve the accuracy of GRN analysis but also enhance its interpretability.
Key Points
We propose a comprehensive multi-feature extraction strategy specifically designed for single-cell RNA-seq data.
We introduce GRANet, an advanced deep learning framework that integrates multi-head graph attention mechanisms, convolutional layers, and residual connections.
To address the challenges posed by noisy data and limited sample sizes, GRANet incorporates internal residual connections that stabilize training and improve generalization, thereby reducing the risk of overfitting.
We systematically assess GRANet’s performance across multiple benchmark datasets. Additionally, we conduct ablation studies to quantify the individual contributions of each architectural component, demonstrating the framework’s modular effectiveness.
In a focused case study involving key transcription factors such as EGR1, CBFB, and ELF1, GRANet not only accurately recapitulates known regulatory interactions but also predicts previously unreported mechanisms, offering new biological insights.
Supplementary Material
Acknowledgements
We thank members of the group for their valuable discussions and comments.
Contributor Information
Junliang Zhou, School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, No. 2 Chongwen Road, Nan'an District, Chongqing 400065, China.
Ningji Gong, Department of Emergency, The Second Hospital, Cheeloo College of Medicine, Shandong University, No. 247 Beiyuan Street, Tianqiao District, Jinan 250033, Shandong, China.
Yanjun Hu, Library, Shandong Normal University, No. 1 University Road, University Science Park, Changqing District, Jinan 250100, Shandong, China.
Hong Yu, School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, No. 2 Chongwen Road, Nan'an District, Chongqing 400065, China.
Guoyin Wang, School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, No. 2 Chongwen Road, Nan'an District, Chongqing 400065, China; National Center for Applied Mathematics in Chongqing, Chongqing Normal University, No. 37, Daxuecheng Middle Road, Shapingba District, Chongqing 401331, China.
Hao Wu, School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, No. 2 Chongwen Road, Nan'an District, Chongqing 400065, China; School of Software, Shandong University, No. 1500, Shunhua Road, High tech Zone, Jinan 250100, Shandong, China.
Author contributions
H.W. and J.Z. conceived the experiments; H.W., J.Z., and N.G. conducted and analyzed the experiments; H.W. and J.Z. wrote the manuscript; H.W., Y.H., H.Y., and G.W. reviewed the manuscript.
Funding
This work is supported by the National Natural Science Foundation of China (Grant Nos. 62272278 and 61972322), the National Key Research and Development Program of China (Grant No. 2021YFF0704103), and the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2024A1515012775). The funders played no role in the design of the study; the collection, analysis, and interpretation of data; or the writing of the manuscript.
Data availability
The datasets utilized in this study, along with comprehensive details of the GRANet code, are available for access at https://github.com/HaoWuLab-Bioinformatics/GRANet.
References
- 1. Alon U. Network motifs: theory and experimental approaches. Nat Rev Genet 2007;8:450–61. 10.1038/nrg2102
- 2. Unger Avila P, Padvitski T, Leote AC, et al. Gene regulatory networks in disease and ageing. Nat Rev Nephrol 2024;20:616–33. 10.1038/s41581-024-00849-7
- 3. Papalexi E, Satija R. Single-cell RNA sequencing to explore immune cell heterogeneity. Nat Rev Immunol 2018;18:35–45. 10.1038/nri.2017.76
- 4. Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet 2016;17:333–51. 10.1038/nrg.2016.49
- 5. Matsumoto H, Kiryu H, Furusawa C, et al. SCODE: an efficient regulatory network inference algorithm from single-cell RNA-seq during differentiation. Bioinformatics 2017;33:2314–21. 10.1093/bioinformatics/btx194
- 6. Huynh-Thu VA, Irrthum A, Wehenkel L, et al. Inferring regulatory networks from expression data using tree-based methods. PLoS One 2010;5:e12776. 10.1371/journal.pone.0012776
- 7. Papili Gao N, Ud-Dean SMM, Gandrillon O, et al. SINCERITIES: inferring gene regulatory networks from time-stamped single cell transcriptional expression profiles. Bioinformatics 2018;34:258–66. 10.1093/bioinformatics/btx575
- 8. Kc K, Li R, Cui F, et al. GNE: a deep learning framework for gene network inference by aggregating biological information. BMC Syst Biol 2019;13:38. 10.1186/s12918-019-0694-y
- 9. Yuan Y, Bar-Joseph Z. Deep learning for inferring gene relationships from single-cell expression data. Proc Natl Acad Sci 2019;116:27151–8. 10.1073/pnas.1911536116
- 10. Xu Y, Chen J, Lyu A, et al. dynDeepDRIM: a dynamic deep learning model to infer direct regulatory interactions using time-course single-cell gene expression data. Brief Bioinform 2022;23:bbac424. 10.1093/bib/bbac424
- 11. Shu H, Zhou J, Lian Q, et al. Modeling gene regulatory networks using neural network architectures. Nat Comput Sci 2021;1:491–501. 10.1038/s43588-021-00099-8
- 12. Liu W, Yang Y, Lu X, et al. NSRGRN: a network structure refinement method for gene regulatory network inference. Brief Bioinform 2023;24:bbad129. 10.1093/bib/bbad129
- 13. Chen G, Liu ZP. Graph attention network for link prediction of gene regulations from single-cell RNA-sequencing data. Bioinformatics 2022;38:4522–9. 10.1093/bioinformatics/btac559
- 14. Li S, Liu Y, Shen LC, et al. GMFGRN: a matrix factorization and graph neural network approach for gene regulatory network inference. Brief Bioinform 2024;25:bbad529. 10.1093/bib/bbad529
- 15. Wang J, Chen Y, Zou Q. Inferring gene regulatory network from single-cell transcriptomes with graph autoencoder model. PLoS Genet 2023;19:e1010942. 10.1371/journal.pgen.1010942
- 16. Li L, Sun L, Chen G, et al. LogBTF: gene regulatory network inference using Boolean threshold network model from single-cell gene expression data. Bioinformatics 2023;39:btad256. 10.1093/bioinformatics/btad256
- 17. Wang Q, Guo M, Chen J, et al. A gene regulatory network inference model based on pseudo-siamese network. BMC Bioinformatics 2023;24:163. 10.1186/s12859-023-05253-9
- 18. Feng K, Jiang H, Yin C, et al. Gene regulatory network inference based on causal discovery integrating with graph neural network. Quant Biol 2023;11:434–50. 10.1002/qub2.26
- 19. Liu J, Zhou S, Ma J, et al. Graph attention network with convolutional layer for predicting gene regulations from single-cell ribonucleic acid sequence data. Eng Appl Artif Intell 2024;136:108938. 10.1016/j.engappai.2024.108938
- 20. Wu Z, Pan S, Chen F, et al. A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst 2020;32:4–24. 10.1109/TNNLS.2020.2978386
- 21. Veličković P, Cucurull G, Casanova A, et al. Graph attention networks. In: International Conference on Learning Representations (ICLR), 2018.
- 22. Yuan J, Cao M, Cheng H, et al. A unified structure learning framework for graph attention networks. Neurocomputing 2022;495:194–204. 10.1016/j.neucom.2022.01.064
- 23. Pratapa A, Jalihal AP, Law JN, et al. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods 2020;17:147–54. 10.1038/s41592-019-0690-6
- 24. Szklarczyk D, Gable AL, Lyon D, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 2019;47:D607–13. 10.1093/nar/gky1131
- 25. Garcia-Alonso L, Holland CH, Ibrahim MM, et al. Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Res 2019;29:1363–75. 10.1101/gr.240663.118
- 26. Liu ZP, Wu C, Miao H, et al. RegNetwork: an integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse. Database 2015;2015:bav095. 10.1093/database/bav095
- 27. Han H, Cho JW, Lee S, et al. TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res 2018;46:D380–6. 10.1093/nar/gkx1013
- 28. Moore JE, Purcaro MJ, Pratt HE, et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 2020;583:699–710. 10.1038/s41586-020-2493-4
- 29. Oki S, Ohta T, Shioi G, et al. ChIP-Atlas: a data-mining suite powered by full integration of public ChIP-seq data. EMBO Rep 2018;19:e46255. 10.15252/embr.201846255
- 30. Xu H, Baroukh C, Dannenfelser R, et al. ESCAPE: database for integrating high-content published data collected from human and mouse embryonic stem cells. Database 2013;2013:bat045. 10.1093/database/bat045
- 31. Rusek K, Suárez-Varela J, Almasan P, et al. RouteNet: leveraging graph neural networks for network modeling and optimization in SDN. IEEE J Sel Areas Commun 2020;38:2260–70. 10.1109/JSAC.2020.3000405
- 32. Veličković P. Everything is connected: graph neural networks. Curr Opin Struct Biol 2023;79:102538. 10.1016/j.sbi.2023.102538
- 33. Lim J, Ryu S, Park K, et al. Predicting drug–target interaction using a novel graph neural network with 3D structure-embedded graph representation. J Chem Inf Model 2019;59:3981–8. 10.1021/acs.jcim.9b00387
- 34. Zhou H, Wang W, Jin J, et al. Graph neural network for protein–protein interaction prediction: a comparative study. Molecules 2022;27:6135. 10.3390/molecules27186135
- 35. Cinelli LP, Marins MA, Barros da Silva EA, et al. Variational autoencoder. In: Variational Methods for Machine Learning with Applications to Deep Networks. Cham: Springer, 2021, 111–49. 10.1007/978-3-030-70679-1_5
- 36. Maas AL, Hannun AY, Ng AY. Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of the 30th International Conference on Machine Learning (ICML), 2013.
- 37. Chawla NV, Bowyer KW, Hall LO, et al. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002;16:321–57. 10.1613/jair.953
- 38. Kingma DP, Welling M. An introduction to variational autoencoders. Found Trends Mach Learn 2019;12:307–92. 10.1561/2200000056
- 39. Vishnusai Y, Kulakarni TR, Sowmya Nag K. Ablation of artificial neural networks. In: International Conference on Innovative Data Communication Technologies and Application. Cham: Springer, 2019, 453–60.
- 40. Hartley J. Case study research. In: Cassell C, Symon G (eds). SAGE Publications, 2004, 323–33. 10.4135/9781446280119.n26
- 41. Bradley AP. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recogn 1997;30:1145–59. 10.1016/S0031-3203(96)00142-2
- 42. Davis J, Goadrich M. The relationship between Precision-Recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning (ICML '06). New York: ACM, 2006, 233–40.
- 43. Thiel G, Cibelli G. Regulation of life and death by the zinc finger transcription factor Egr-1. J Cell Physiol 2002;193:287–92. 10.1002/jcp.10178
- 44. Kundu M, Chen A, Anderson S, et al. Role of Cbfb in hematopoiesis and perturbations resulting from expression of the leukemogenic fusion gene Cbfb-MYH11. Blood 2002;100:2449–56. 10.1182/blood-2002-04-1064
- 45. Hollenhorst PC, McIntosh LP, Graves BJ. Genomic and biochemical insights into the specificity of ETS transcription factors. Annu Rev Biochem 2011;80:437–71. 10.1146/annurev.biochem.79.081507.103945
- 46. Lize M, Pilarski S, Dobbelstein M. E2F1-inducible microRNA 449a/b suppresses cell proliferation and promotes apoptosis. Cell Death Differ 2010;17:452–8. 10.1038/cdd.2009.188
- 47. Ju Y, Fang S, Liu L, et al. The function of the ELF3 gene and its mechanism in cancers. Life Sci 2024;346:122637. 10.1016/j.lfs.2024.122637
- 48. Barabasi AL, Oltvai ZN. Network biology: understanding the cell's functional organization. Nat Rev Genet 2004;5:101–13. 10.1038/nrg1272
- 49. Hasin Y, Seldin M, Lusis A. Multi-omics approaches to disease. Genome Biol 2017;18:1–15. 10.1186/s13059-017-1215-1
- 50. Jing L, Tian Y. Self-supervised visual feature learning with deep neural networks: a survey. IEEE Trans Pattern Anal Mach Intell 2020;43:4037–58. 10.1109/TPAMI.2020.2992393
- 51. Collins FS, Morgan M, Patrinos A. The human genome project: lessons from large-scale biology. Science 2003;300:286–90. 10.1126/science.1084564
- 52. Wu Y, Shi Z, Zhou X, et al. scHiCyclePred: a deep learning framework for predicting cell cycle phases from single-cell Hi-C data using multi-scale interaction information. Commun Biol 2024;7:923. 10.1038/s42003-024-06626-3
- 53. Shi Z, Wu H. CTPredictor: a comprehensive and robust framework for predicting cell types by integrating multi-scale features from single-cell Hi-C data. Comput Biol Med 2024;173:108336. 10.1016/j.compbiomed.2024.108336
- 54. Zhang Y, Zhang P, Wu H. Enhancer-MDLF: a novel deep learning framework for identifying cell-specific enhancers. Brief Bioinform 2024;25:bbae083. 10.1093/bib/bbae083
- 55. Liu H, Li D, Wu H. Lnclocator-imb: an imbalance-tolerant ensemble deep learning framework for predicting long non-coding RNA subcellular localization. IEEE J Biomed Health Inform 2023;28:538–47. 10.1109/JBHI.2023.3324709
- 56. Zhang P, Wu H. IChrom-deep: an attention-based deep learning model for identifying chromatin interactions. IEEE J Biomed Health Inform 2023;27:4559–68. 10.1109/JBHI.2023.3292299
- 57. Zhang P, Zhang H, Wu H. iPro-WAEL: a comprehensive and robust framework for identifying promoters in multiple species. Nucleic Acids Res 2022;50:10278–89. 10.1093/nar/gkac824
- 58. Wu H, Wu Y, Jiang Y, et al. scHiCStackL: a stacking ensemble learning-based method for single-cell Hi-C classification using cell embedding. Brief Bioinform 2022;23:bbab396. 10.1093/bib/bbab396
- 59. Wu H, Zhang P, Ai Z, et al. StackTADB: a stacking-based ensemble learning model for predicting the boundaries of topologically associating domains (TADs) accurately in fruit flies. Brief Bioinform 2022;23:bbac023. 10.1093/bib/bbac023
- 60. Zhang P, Wu Y, Zhou H, et al. CLNN-loop: a deep learning model to predict CTCF-mediated chromatin loops in the different cell lines and CTCF-binding sites (CBS) pair types. Bioinformatics 2022;38:4497–504. 10.1093/bioinformatics/btac575