Abstract
The inference of gene regulatory networks (GRNs) is critical for understanding the regulatory mechanisms underlying cellular development, functional specialization, and disease progression. Predicting regulatory gene interactions, often framed as a link prediction task, is a foundational step toward modeling cellular behavior. However, GRN inference from gene coexpression data alone is limited by noise, low interpretability, and difficulty in capturing indirect regulatory signals. Additionally, challenges such as data sparsity, nonlinearity, and complex gene interactions hinder accurate network reconstruction. To address these issues, we propose GT-GRN, a novel graph transformer (GT)-based framework that enhances GRN inference by integrating multimodal gene embeddings. Our method combines three complementary sources of information: (i) autoencoder-based embeddings, which capture high-dimensional gene expression patterns while preserving biological signals; (ii) structural embeddings, derived from previously inferred GRNs and encoded via random walks and a Bidirectional Encoder Representations from Transformers (BERT)-based language model to learn global gene representations; and (iii) positional encodings, capturing each gene’s role within the network topology. These heterogeneous features are fused and processed using a GT, allowing the joint modeling of both local and global regulatory structures. Experimental results on benchmark datasets show that GT-GRN outperforms existing GRN inference methods in predictive accuracy and robustness. Furthermore, it reconstructs cell-type-specific GRNs with high fidelity and produces gene embeddings that generalize to other tasks such as cell-type annotation.
Keywords: network inference, graph transformer, graph generation, gene expression, single-cell RNA seq, microarray, data fusion, embedding, global embeddings
Introduction
Systems biology seeks to understand the big picture of complex biological systems, focusing on the extraction of relevant biological information within an organism at the cellular level. Biological components, such as genes, interact with each other, and these interactions can be reconstructed as gene regulatory networks (GRNs) from observational gene expression data [1]. This process unveils a complex web of interactions, shedding light on the underlying patterns that govern gene regulation. Inferring GRNs from gene expression data is crucial for understanding the molecular interaction patterns among genes. A gene network consists of interlinked genes, where the expression of a gene influences the activity of other genes in the network [2]. An effective approach to describing GRNs involves the use of graphical and mathematical modeling, often grounded in graph-theoretic formalism to capture complex interactions between genes. Formally, a GRN is represented as a network of nodes and edges, where the nodes represent genes and the edges represent the regulatory interactions between them [3]. GRN inference involves predicting the connections among macromolecules by analyzing their relative expression patterns.
Technologies such as DNA microarray [4], single-cell RNA sequencing (scRNA-seq) [5], and single-nucleus RNA sequencing (snRNA-seq) [6] have revolutionized transcriptomics by offering diverse and detailed insights into gene expression. Although each of these technologies has its unique strengths, they also come with certain limitations. Common limitations include noisy gene expression data, which often complicate the inference process. Furthermore, the dynamic and nonlinear nature of gene–gene regulatory interactions presents a significant challenge, as traditional or linear methods often fail to capture the complex relationships comprehensively. In addition, GRNs tend to be sparse, further reducing the overall accuracy of inference methods. In the case of single-cell technologies, significant dropout events introduce a large number of zero counts in the expression matrix, adding another layer of complexity [7]. These challenges highlight the need for more robust and sophisticated approaches to effectively analyze and interpret gene expression data.
A wide range of supervised and unsupervised GRN inference methods has been developed to uncover the intricate relationships within gene networks [8, 9]. Early approaches relied on relatively simple techniques such as correlation analysis, mutual information (MI)-based methods, and differential equation models. Attempts have been made to understand these complex relationships [10–12]. However, many of these methods exhibit inherent limitations. Thus, the development of more reliable GRN inference techniques remains an important research goal, and numerous intelligent computational strategies have been proposed to address this challenge.
Despite these advancements, notable limitations persist. A key concern is that many approaches rely solely on a single source of information, typically gene expression data, to infer GRNs. This alone is often insufficient for accurate and reliable network prediction. In addition, some methods fail to incorporate knowledge from previously inferred GRNs, restricting their ability to build on existing insights. Furthermore, several techniques overlook the integration of topological information, which is essential to capture structural properties critical to robust GRN inference.
Rather than addressing each of the aforementioned issues in isolation, our approach adopts an integrated perspective driven by the intuition that combining multiple complementary sources of information, beyond gene expression alone, can enhance the quality of GRN inference. We propose GT-GRN, a novel approach that integrates the strengths of both unsupervised inference methods and supervised learning frameworks. Our method combines outcomes from the available inference techniques to minimize method-specific biases, ultimately deriving a more realistic and biologically meaningful GRN. To integrate multiple networks, we avoid GNN-based methods, which often suffer from over-smoothing when multiple layers are stacked, and instead adopt a state-of-the-art unsupervised approach based on NLP that effectively captures and integrates information across networks. GT-GRN leverages the latest advancements in Graph Transformer models to enhance GRN inference. GT-GRN integrates three distinct representations derived from input expression networks: (i) topological features, which capture the structural properties of the network; (ii) gene expression values, which are crucial for identifying gene interactions; and (iii) the positional importance of genes, which reflects their functional relevance within the network. By fusing the multimodal embeddings from diverse perspectives, our framework improves both the interpretability and predictive power of inferred GRNs, making it a robust solution for various biological applications. GT-GRN's superiority stems from the following design decisions.
(1) Multinetwork integration: A key challenge in supervised GRN inference is the absence of ground-truth networks. True GRNs are often incomplete or unavailable, so we must rely on inferred networks as proxies. However, using a single inferred network can introduce bias or overlook critical interactions. We incorporate multiple networks inferred by different methods, harnessing their complementary strengths. While various inference models exist, each with its own set of advantages and limitations, combining these diverse sources allows us to leverage their shared strengths. This approach helps mitigate methodological bias, ultimately enhancing the confidence and accuracy of our GRN predictions.
(2) Gene expression embedding: Capturing meaningful representations of gene expression data through advanced embedding techniques can provide a richer understanding of the underlying regulatory mechanisms and improve GRN inference.
(3) GT frameworks: Traditional GNNs rely on local message-passing mechanisms to infer graph structures. However, adopting GT-based frameworks offers a more effective encoding strategy by leveraging global attention mechanisms, enabling better capture of complex regulatory relationships in GRNs.
The contributions of our present work are listed below:
We capture the quantitative characteristics of gene expression profiles through an autoencoder that learns biologically meaningful latent representations, effectively summarizing complex gene activity patterns while preserving essential regulatory signals (Section “Gene Expression Feature Encoding”).
We introduce a method to consolidate prior knowledge from multiple inferred GRNs by converting networks into text-like sequences, enabling a BERT-based masked language model to learn global gene embeddings that integrate structural information across all networks (Section “Global Embeddings via multinetwork integration of the inferred GRNs”).
We propose a novel framework, GT-GRN, which leverages attention mechanisms within a GT model to learn rich gene embeddings by integrating multisource data—including gene expression profiles, structured inferred knowledge, and graph positional encodings from the input graph. These unified gene embeddings effectively capture the underlying biological relationships between genes, facilitating enhanced GRN inference (Section “Graph transformer for GRN inference”).
We demonstrate that GT-GRN effectively advances cell-type-specific GRN reconstruction. Moreover, the superior quality of the learned embeddings enables their successful application to cell type annotation tasks, highlighting the model’s robustness and generalizability.
The remainder of the paper is organized as follows. Section “Related work” reviews related work in GRN inference. Section “Materials and methodology” describes the proposed GT-GRN framework, including gene expression embedding, multinetwork integration, and graph positional encoding, along with details of the datasets used. Section “Results and analysis” presents experimental results and analysis. Section “Application of GT-GRN on cell-type classification” demonstrates the classification capabilities of GT-GRN. Finally, Section “Conclusion” concludes the paper by summarizing key findings and describing directions for future research.
Related work
With decades of effort within the research community dedicated to deciphering gene regulatory relationships from gene expression data, numerous methods have been proposed for reconstructing GRNs [13]. Traditional approaches include regression-based techniques [14] and MI-based methods, such as Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE) [15], Minimum Redundancy Networks (MRNET) [16], and Context Likelihood of Relatedness (CLR) [17], which assess statistical dependencies between genes to infer potential regulatory interactions. Network Structure Controlling-based GRN inference method (NSCGRN) [18] is a global network partitioning and local network motif-based control framework for GRN inference. Global structure dominates the overall network structure to enforce hierarchy and sparsity while the local topology is refined using four known network motifs to adjust the specific patterns to improve biological plausibility.
Efforts from the area of computational biology have addressed data imbalance and noise with innovative modeling strategies based on complex-valued polynomial models [19]. Other approaches based on optimization, such as PGRNIG [20], which combines a parallel whale optimization algorithm with decomposition and regularization strategies, have shown high accuracy and speed in GRN inference from time-series data.
Supervised machine learning methods have also been explored for GRN inference. Support vector machines (SVMs) have been utilized to reconstruct biological networks through local modeling approaches [21]. Extensions such as CompareSVM [22] and GRADIS [23] further leverage classification-based frameworks to enhance network prediction accuracy.
With the advent of deep learning, more powerful and data-driven models have emerged. For instance, Daoudi et al. [24] proposed a deep neural network (DNN) model to infer GRNs from experimental data. Turki et al. [25] integrated both supervised and unsupervised learning techniques to perform link prediction on time-series gene expression data. Mao et al. [26] introduced a 3D convolutional neural network (CNN) model utilizing single-cell transcriptomic data, employing a novel labeling trick to enhance performance. GNE [27], a graph-based deep learning framework, unified known gene interactions and expression profiles to robustly infer GRNs in a scalable manner. Other works, such as Teji et al. [28, 29], use synthetic data to evaluate various embedding models for GRN inference in a link prediction setup.
Significant progress has been made in developing approaches that incorporate machine learning techniques to infer the network from gene expression [30]. A recent trend in network science, the modeling of graph-based applications with graph neural networks (GNNs), has gained significant momentum. For example, Wang et al. propose the GRGNN [31] method to reconstruct GRNs from gene expression data in supervised and semi-supervised settings. GRN inference is formulated as a graph classification problem and evaluated on DREAM5 benchmarks. Q-graph attention network (GAT) [32] proposes a quadratic complexity neuronal network using a dual attention mechanism for GRN inference. The model is validated by introducing adversarial perturbations to the gene expression data on E. coli (Escherichia coli) and S. cerevisiae datasets. Huang et al. [33] propose a GNN-based model called MIGGRI for GRN inference using spatial expression images that capture gene regulation from multiple images. DeepRIG [34] emphasizes learning global regulatory structures by embedding entire graphs using a graph autoencoder, thereby capturing comprehensive latent representations. GMFGRN [35] applies graph convolutional networks (GCNs) to factorize scRNA-seq data into gene and cell embeddings, which are then used in a multilayer perceptron (MLP) for interaction prediction. GNNLink [36] frames GRN inference as a link prediction problem by employing a GCN-based interaction graph encoder to capture and infer potential regulatory dependencies between genes. AnomalGRN [37] addresses the challenges of heterogeneity and sparsity in GRNs by reformulating GRN inference as a node prediction task. To tackle the pronounced imbalance between positive and negative links, the authors cast the problem as a graph anomaly detection (GAD) task, enabling the identification of anomalous regulatory patterns within the network.
Another line of work also concentrates on Transformer-based architectures for GRN inference. TRENDY [38] leverages transformer models to construct a pseudo-covariance matrix as part of the WENDY [39] framework. Rather than generating GRNs from scratch, it enhances existing inferred GRNs. However, it does not incorporate additional structural side information into the inference process. STGRNS [40] is an interpretable transformer-based method for inferring GRNs from scRNA-seq data. This method only considers two genes for gene regulation excluding the possibility of indirect regulation for prediction. GRN-Transformer [41] utilizes multiple statistical features extracted from scRNA-seq data and uses inferred GRN extracted from a single inference algorithm PIDC [42].
Despite these advancements, many GRN inference methods still rely on a single source of data or even purely topological information. They often emphasize local or pairwise gene interactions. This narrow focus limits the depth and breadth of biological insights. Although deep learning approaches such as MLPs, CNNs, and GNNs have significantly improved inference performance, they frequently process each data modality in isolation, missing opportunities for deeper integration.
The present research takes a step forward by proposing a novel GT framework that integrates multiple sources of biological information for GRN inference. Unlike conventional methods that rely on convolution-based architectures, our approach leverages graph-based attention mechanisms to effectively model complex regulatory relationships. A key strength of this framework lies in its ability to fuse diverse embeddings derived from gene expression data, input graph structures, and both existing and previously inferred regulatory networks. By jointly leveraging these complementary sources within a unified model, it enables more accurate and biologically meaningful inference of GRNs.
Materials and methodology
This section outlines the methodology of the proposed GT-GRN framework for GRN inference, followed by a description of the datasets used for evaluation. The framework is composed of three key modules: (i) encoding gene expression profiles as embedding features using unsupervised deep learning; (ii) extracting global gene embeddings through multinetwork integration; and (iii) capturing graph positional encodings from the input network structure. These complementary representations—gene expression embeddings, prior gene representations, and graph positional encodings—are fused to enhance GRN interaction prediction within a GT model. The effectiveness of our approach is then evaluated using publicly available gene expression datasets.
Gene expression feature encoding
Gene expression data are increasingly complex due to advances in profiling technologies, making traditional linear models insufficient to capture their intricate patterns. Unsupervised deep learning, particularly variational autoencoders (VAEs) [43], offers a powerful way to nonlinearly encode such data into compact, informative representations that better reflect underlying biological structures. Figure 1 gives an overview of the VAE used for encoding gene expression. VAEs are a class of probabilistic deep generative neural networks designed to reconstruct input data by learning a compressed, low-dimensional representation that effectively characterizes the input. Using a VAE for gene expression encoding, we can efficiently capture complex expression dynamics and generate compact feature representations for further analysis and downstream tasks. VAEs are a powerful framework for unsupervised learning and generally comprise two interconnected components: an encoder and a decoder.
Figure 1.
Variational autoencoder (VAE) for gene expression embedding, where the encoder maps the original gene expression matrix $X$ into latent variables ($Z$), and the decoder reconstructs $\hat{X}$ from the latent representation $Z$ using a neural network to minimize the reconstruction loss.
Encoder: It maps the input gene expression matrix $X$ to a latent representation space $Z$. It approximates the posterior distribution $q_\phi(Z \mid X)$ using a neural network. The encoder outputs the parameters, i.e. the mean $\mu$ and variance $\sigma^2$ of a multivariate Gaussian distribution $\mathcal{N}(\mu, \sigma^2 I)$, that serves as an approximation of the true posterior $p_\theta(Z \mid X)$. This process captures the underlying biological variability and regulatory patterns among genes.

Decoder: It takes a sample $Z$ from the latent space and maps it back to the original gene expression space, generating a reconstructed matrix $\hat{X}$. This is modeled by the likelihood $p_\theta(X \mid Z)$, which represents the probability of generating the observed gene expression profiles $X$ given the latent variables $Z$. The decoder, implemented via a neural network, learns to reconstruct biologically plausible gene expression patterns from the learned latent representations.
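During training, the VAE aims to learn the parameters of the encoder and the decoder networks by maximizing the Evidence Lower Bound (ELBO), which is given by:

$$\mathcal{L}(\theta, \phi; X) = \mathbb{E}_{q_\phi(Z \mid X)}\left[\log p_\theta(X \mid Z)\right] - D_{\mathrm{KL}}\left(q_\phi(Z \mid X) \,\|\, p(Z)\right) \tag{1}$$

where $\mathbb{E}_{q_\phi(Z \mid X)}\left[\log p_\theta(X \mid Z)\right]$ is the reconstruction term that reconstructs the input gene expression data $X$ given the latent representation $Z$. $D_{\mathrm{KL}}\left(q_\phi(Z \mid X) \,\|\, p(Z)\right)$ is the KL (Kullback–Leibler) divergence that quantifies the distance between the approximate posterior $q_\phi(Z \mid X)$ and the prior distribution $p(Z)$.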
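Since sampling from learned distributions is inherently nondifferentiable, it hinders the use of gradient-based optimization during backpropagation. To address this, the reparameterization trick introduces a differentiable transformation by expressing the random variable as $Z = g_\phi(\epsilon, X)$, where $g_\phi$ is a deterministic function and $\epsilon$ is an auxiliary noise variable drawn from a fixed, independent distribution. The above problem can be rewritten as:

$$\mathcal{L}(\theta, \phi; X) = \mathbb{E}_{\epsilon \sim \mathcal{N}(0, I)}\left[\log p_\theta(X \mid Z)\right] - D_{\mathrm{KL}}\left(q_\phi(Z \mid X) \,\|\, p(Z)\right)$$

$$Z = \mu + \sigma \odot \epsilon$$

where $\epsilon \sim \mathcal{N}(0, I)$, $\odot$ is the element-wise product and $I$ is the identity matrix, which serves as the covariance matrix.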
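The reparameterization step and the two ELBO terms can be illustrated with a minimal NumPy sketch. It is not the training code of GT-GRN: the encoder and decoder networks are omitted, the prior is assumed to be a standard normal, the closed-form Gaussian KL term is used, and a squared-error reconstruction term stands in for the log-likelihood (an assumption that corresponds to a Gaussian decoder up to constants). The function names are hypothetical.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I); differentiable in mu and sigma."""
    eps = rng.standard_normal(mu.shape)
    sigma = np.exp(0.5 * log_var)
    return mu + sigma * eps

def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)) per row, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)

def negative_elbo(x, x_hat, mu, log_var):
    """Assumed loss: squared-error reconstruction plus KL, averaged over rows (genes)."""
    recon = np.sum((x - x_hat) ** 2, axis=1)
    return float(np.mean(recon + gaussian_kl(mu, log_var)))

rng = np.random.default_rng(0)
mu = np.zeros((4, 3))        # 4 genes, 3 latent dimensions (toy sizes)
log_var = np.zeros((4, 3))   # log sigma^2 = 0, i.e. sigma = 1
z = reparameterize(mu, log_var, rng)
x = rng.standard_normal((4, 10))
loss_perfect = negative_elbo(x, x, mu, log_var)
```

With `mu = 0` and `log_var = 0` the approximate posterior equals the assumed prior, so the KL term vanishes, and a perfect reconstruction then gives a loss of zero.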
Global embeddings via multinetwork integration of the inferred gene regulatory networks
We integrate multinetwork information to understand gene interactions as prior knowledge. Integrating data from diverse inference methods, each with unique strengths and limitations, provides a holistic and reliable view of the network. This approach overcomes the shortcomings of relying on a single method, enabling robust downstream analysis. Figure 2 illustrates the workflow.
Figure 2.
Global gene embeddings via multinetwork integration, where gene expression data are processed through inference algorithms to generate networks, sampled via random walks into node sequences beginning with a [CLS] token, tokenized and embedded, and passed through a transformer trained via masked node prediction to produce final gene embeddings.
Unsupervised network integration via random walks and transformers
We present an unsupervised learning approach for integrating multiple networks to generate global embeddings. Let a graph $G = (V, E)$ represent an inferred network, where $V$ denotes the set of nodes and $E$ represents the set of edges. The graph is characterized by its adjacency matrix $A$, where each entry $A_{ij}$ reflects the relationship between nodes $v_i$ and $v_j$. Specifically, $A_{ij} = 1$ if an edge $e_{ij}$ connects $v_i$ and $v_j$, and $A_{ij} = 0$ otherwise, indicating no direct connection.
We consider a collection of $K$ networks, $\{G_1, G_2, \ldots, G_K\}$, sharing the same set of $N$ nodes but differing in the number of edges in each network. We capture the structural information of the networks by converting them into text-like sequences using random walks, similar to node2vec [44]. The walks are encoded through an embedding matrix $W_e \in \mathbb{R}^{n_{\mathrm{vocab}} \times d}$, where $n_{\mathrm{vocab}}$ is the size of the vocabulary (total nodes across all networks) and $d$ is the desired embedding dimension. Positional encodings are used to account for node order as described in [45]:

$$PE_{(pos,\,2i)} = \sin\left(\frac{pos}{10000^{2i/d}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\left(\frac{pos}{10000^{2i/d}}\right) \tag{2}$$

Here, $PE_{(pos,\,i)}$ represents the $i$th coordinate of the position encoding at sequence position $pos$. These encodings are concatenated with the original input features or the embedding matrix.
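As an illustration of the positional encoding, the following NumPy sketch builds a sine/cosine table, assuming the standard sinusoidal scheme with base 10000 and an even embedding dimension. The function name `sinusoidal_encoding` and the toy sizes are hypothetical.

```python
import numpy as np

def sinusoidal_encoding(seq_len, dim):
    """Return a (seq_len, dim) table: even columns hold sines, odd columns cosines."""
    if dim % 2 != 0:
        raise ValueError("dim is assumed to be even in this sketch")
    pos = np.arange(seq_len)[:, None]            # sequence positions 0..seq_len-1
    i = np.arange(dim // 2)[None, :]             # index of each sine/cosine pair
    angles = pos / np.power(10000.0, 2 * i / dim)
    pe = np.zeros((seq_len, dim))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_encoding(seq_len=6, dim=8)
```

Each row of `pe` is the encoding of one position in a walk, so two walks of the same length share the same table regardless of which genes they visit.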
The embedding matrix and the decoding layer are initialized with uniform random values, while the transformer layer is initialized using Xavier’s initialization [46]. During training, all parameters are updated. Each sequence begins with a special classification token [CLS], while other tokens correspond to node-specific vectors from the embedding matrix. The final hidden state of the [CLS] token for a given sequence serves as the sequence representation.
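The conversion of several inferred networks into token sequences can be sketched as follows. This is a simplified illustration that uses uniform random walks rather than node2vec's biased walks, and the toy networks, gene names, and function names are hypothetical.

```python
import random

def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes; stops early at a dead end."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk

def build_corpus(networks, walks_per_node, length, seed=0):
    """Sample walks from every network and prefix each one with the [CLS] token."""
    rng = random.Random(seed)
    corpus = []
    for adj in networks:
        for node in sorted(adj):
            for _ in range(walks_per_node):
                corpus.append(["[CLS]"] + random_walk(adj, node, length, rng))
    return corpus

# Two toy networks over the same genes but with different edges.
net_a = {"G1": ["G2"], "G2": ["G1", "G3"], "G3": ["G2"]}
net_b = {"G1": ["G3"], "G2": [], "G3": ["G1"]}
corpus = build_corpus([net_a, net_b], walks_per_node=2, length=4)
```

Because walks from all networks land in a single corpus, a gene's context mixes neighborhoods from every inference method, which is what lets one embedding per gene summarize all networks.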
Masked language learning with BERT
We utilize the Masked Language Modeling (MLM) approach as implemented in BERT [46]. At its core, this method employs a transformer encoder composed of $L$ identical blocks. Each block includes a self-attention mechanism followed by a feedforward neural network (FFN), as described in [45].
Let $X = (x_1, \ldots, x_n)$ denote an input sequence of $n$ tokens, where each token is represented by a $d$-dimensional vector. A self-attention layer processes this sequence using the following transformation:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V \tag{3}$$

where $Q = XW_Q$, $K = XW_K$, and $V = XW_V$. Here, $W_Q$, $W_K$, and $W_V$ are learnable matrices that project the input into query, key, and value spaces of node sequences, respectively. $d_k$ is the dimension of the key vectors.
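A single-head version of scaled dot-product self-attention can be written in a few lines of NumPy. The sketch below uses random toy matrices in place of learned projections, and all names and sizes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence of token vectors X (n, d)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))   # (n, n); each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
n, d, d_k = 5, 8, 4                              # toy sizes
X = rng.standard_normal((n, d))
W_q, W_k, W_v = (rng.standard_normal((d, d_k)) for _ in range(3))
out, weights = self_attention(X, W_q, W_k, W_v)
```

Row `t` of `weights` shows how strongly token `t` in a walk attends to every other token, including itself.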
The feedforward layer, applied independently to each token, performs the transformation:

$$\mathrm{FFN}(x) = \max(0,\, xW_1 + b_1)\,W_2 + b_2 \tag{4}$$

where $W_1$ and $W_2$ are learnable matrices, and $b_1$ and $b_2$ are bias vectors. FFN is the feed-forward network and $x$ is the global gene embeddings.
The MLM task involves masking a random subset of input tokens and predicting their identities based on the remaining context. This encourages the model to capture bidirectional contextual relationships within sequences. Specifically, we mask a fixed fraction of the tokens (representing nodes) in each sequence and train the model to recover the masked tokens using a cross-entropy loss function:

$$\mathcal{L}_{\mathrm{MLM}} = -\frac{1}{B}\sum_{b=1}^{B}\sum_{t=1}^{T}\sum_{c=1}^{C} y_{b,t,c}\,\log\left(p_{b,t,c}\right) \tag{5}$$

where $B$ is the batch size, $T$ is the sequence length and $C$ is the number of classes (total number of possible tokens that can be predicted). $y_{b,t,c}$ is a binary indicator equal to $1$ if the correct class of token $t$ in batch $b$ is $c$, and $p_{b,t,c}$ is the predicted probability for this classification. The final embeddings are extracted from the embedding layer represented as $x^{\mathrm{global}}$.
This enables the model to learn rich contextual embeddings for nodes, capturing both structural and positional relationships within the networks.
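The masking step and the per-token loss can be sketched in plain Python. The masking rate of 0.15 used as a default below is an assumption borrowed from the original BERT setup, BERT's additional random-replacement rule is omitted, and the helper names are hypothetical.

```python
import math
import random

MASK = "[MASK]"

def mask_sequence(tokens, rng, mask_rate=0.15):
    """Replace a random subset of gene tokens with [MASK]; [CLS] is never masked.

    mask_rate=0.15 is an assumed default, not a value taken from the text.
    Returns the masked sequence and a {position: original token} dict.
    """
    masked, targets = list(tokens), {}
    candidates = [i for i, t in enumerate(tokens) if t != "[CLS]"]
    if not candidates:
        return masked, targets
    n_mask = max(1, round(mask_rate * len(candidates)))
    for i in rng.sample(candidates, n_mask):
        targets[i] = tokens[i]
        masked[i] = MASK
    return masked, targets

def cross_entropy(probs, target):
    """Negative log-probability that the model assigns to the true token."""
    return -math.log(probs[target])

rng = random.Random(0)
tokens = ["[CLS]", "G1", "G2", "G3", "G4"]
masked, targets = mask_sequence(tokens, rng)
loss = cross_entropy({"G1": 0.5, "G2": 0.5}, "G1")   # a 50/50 guess costs log(2)
```

Only the positions stored in `targets` contribute to the loss, so the model is scored on recovering genes it could not see directly.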
Graph transformer for gene regulatory network inference
After deriving features from available GRNs and gene expression data, we utilize the GT to learn comprehensive representations by injecting the underlying regulatory structure. Since GT is specifically designed to model complex dependencies in graph-structured data, it effectively captures gene-gene interactions based on attention mechanisms, making it well suited for GRN-based representation learning. GT-GRN is illustrated in Fig. 3.
Figure 3.
Architecture diagram of GT-GRN, where the graph transformer layer processes the input graph with gene expression, graph positional, and global embeddings to generate gene representations, followed by a link predictor module that estimates connections between two genes using their embeddings $h_i$ and $h_j$.
Graph positional encodings
NLP-oriented Transformers are supplied with positional encodings. Graph positional encodings play the analogous role in a GT, where they encode the position of each node in the graph. From the available graph structure, we make use of Laplacian eigenvectors and use them as graph positional encoding ($\lambda$) information. This is helpful to encode distance-aware information, i.e. nearby nodes have similar positional features and distant nodes have dissimilar ones. Eigenvectors are defined as the factorization of the graph Laplacian matrix:

$$\Delta = I - D^{-1/2} A D^{-1/2} = U^{\top} \Lambda U \tag{6}$$

where $A$ is the input adjacency matrix, $I$ is the identity matrix of size $n \times n$, $D$ is the degree matrix, $\Lambda$ is the eigenvalues and $U$ are the eigenvectors. We then use the $k$ smallest nontrivial eigenvectors of a node as its positional encoding, which is denoted by $\lambda_i$ for node $i$.
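The Laplacian positional encoding can be sketched with NumPy's symmetric eigensolver. The sketch assumes an undirected (symmetric) adjacency matrix and a connected graph, so that exactly one trivial eigenvector is skipped; a directed GRN would first need to be symmetrized. The function name is hypothetical.

```python
import numpy as np

def laplacian_positional_encoding(A, k):
    """Return the k eigenvectors after the trivial one of I - D^{-1/2} A D^{-1/2}."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5    # isolated nodes keep a zero scale
    lap = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)        # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1], eigvals           # row i is the encoding of node i

# Path graph on four nodes: 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
pos_enc, eigvals = laplacian_positional_encoding(A, k=2)
```

Eigenvectors are only defined up to sign, so practical pipelines often flip signs at random during training; that step is left out here.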
Input to GT-GRN
The input to the GT layer is the graph structure $G$ and its associated features $h^{0}$. The features $h^{0}$ are constructed as a combination of gene expression embeddings ($x^{\mathrm{expr}}$), global gene embeddings ($x^{\mathrm{global}}$), and graph positional embeddings ($\lambda$), which are derived from the graph $G$.
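- For gene expression embeddings, the embedding $x_i^{\mathrm{expr}}$ of each gene $i$ is passed through a linear projection layer to embed it into a $d$-dimensional space:

$$\hat{x}_i^{\mathrm{expr}} = W_{\mathrm{expr}}\, x_i^{\mathrm{expr}} + b_{\mathrm{expr}} \tag{7}$$

  where $W_{\mathrm{expr}}$ is a learnable weight matrix, $b_{\mathrm{expr}}$ is the learnable bias vector, and $\hat{x}_i^{\mathrm{expr}}$ is the projected embedding.
- For the global gene embeddings, each vector $x_i^{\mathrm{global}}$ is also embedded into a $d$-dimensional space using a separate linear projection layer. The transformation is defined as:

$$\hat{x}_i^{\mathrm{global}} = W_{\mathrm{global}}\, x_i^{\mathrm{global}} + b_{\mathrm{global}} \tag{8}$$

  where $W_{\mathrm{global}}$ is a learnable weight matrix and $b_{\mathrm{global}}$ is the learnable bias vector.
- The graph positional encodings are extracted from the input graph $G$. A particular gene’s positional encoding $\lambda_i$ is embedded into a $d$-dimensional space using a linear projection layer, which is given by:

$$\hat{\lambda}_i = W_{\mathrm{pos}}\, \lambda_i + b_{\mathrm{pos}} \tag{9}$$

  where $W_{\mathrm{pos}}$ is a learnable weight matrix and $b_{\mathrm{pos}}$ is a learnable bias vector.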
Finally, the gene expression embeddings $x_i^{\mathrm{expr}}$, global gene embeddings $x_i^{\mathrm{global}}$, and graph positional embeddings $\lambda_i$ are each projected into a shared $d$-dimensional space through separate linear transformation layers. These projected representations are then summed element-wise (often called fusion by summation) to form the final node features $h_i^{0}$:

$$h_i^{0} = \hat{x}_i^{\mathrm{expr}} + \hat{x}_i^{\mathrm{global}} + \hat{\lambda}_i \tag{10}$$

where $h_i^{0}$ is the unified, per-gene embedding that fuses its expression profile, its topological position in the regulatory network, and a dataset-wide global context. This final representation $h^{0}$ is then injected into the GT layer along with the adjacency matrix $A$ of the input graph $G$.
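Fusion by summation amounts to three independent linear projections into a common width followed by an element-wise sum. The NumPy sketch below uses random weights in place of learned ones, and the input widths (32, 64, and 4) and helper names are arbitrary assumptions.

```python
import numpy as np

def make_linear(d_in, d_out, rng):
    """Random stand-in for a learnable projection: weight (d_out, d_in) and zero bias."""
    W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
    b = np.zeros(d_out)
    return W, b

def project(x, layer):
    """Apply W x + b to every row (gene) of x."""
    W, b = layer
    return x @ W.T + b

rng = np.random.default_rng(0)
n_genes, d = 6, 16
x_expr = rng.standard_normal((n_genes, 32))     # VAE expression embeddings (toy)
x_global = rng.standard_normal((n_genes, 64))   # BERT-based global embeddings (toy)
lam = rng.standard_normal((n_genes, 4))         # Laplacian positional encodings (toy)

layers = {name: make_linear(x.shape[1], d, rng)
          for name, x in [("expr", x_expr), ("global", x_global), ("pos", lam)]}

# Element-wise sum of the three projected views gives one d-dimensional vector per gene.
h0 = (project(x_expr, layers["expr"])
      + project(x_global, layers["global"])
      + project(lam, layers["pos"]))
```

Summation keeps the fused width at `d` regardless of how many modalities are added, whereas concatenation would grow the input size of the GT layer with every new source.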
Graph transformer layer
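The node update in the GT at layer $\ell$ is defined as follows:

$$\hat{h}_i^{\ell+1} = O^{\ell}\, \Big\Vert_{k=1}^{H} \left( \sum_{j \in \mathcal{N}_i} w_{ij}^{k,\ell}\, V^{k,\ell} h_j^{\ell} \right) \tag{11}$$

$$w_{ij}^{k,\ell} = \mathrm{softmax}_j\left( \frac{Q^{k,\ell} h_i^{\ell} \cdot K^{k,\ell} h_j^{\ell}}{\sqrt{d_k}} \right) \tag{12}$$

where $Q^{k,\ell}, K^{k,\ell}, V^{k,\ell} \in \mathbb{R}^{d_k \times d}$ and $O^{\ell} \in \mathbb{R}^{d \times d}$ are learnable projection matrices, $\mathcal{N}_i$ is the set of neighbors of node $i$, $k = 1, \ldots, H$ indexes the attention heads ($H$ denotes the number of attention heads), and $\Vert$ denotes the concatenation of the head outputs. The attention outputs $\hat{h}_i^{\ell+1}$ are then passed to an FFN preceded and succeeded by residual connections and normalization layers as:

$$\hat{\hat{h}}_i^{\ell+1} = \mathrm{Norm}\left(h_i^{\ell} + \hat{h}_i^{\ell+1}\right) \tag{13}$$

$$\hat{\hat{\hat{h}}}_i^{\ell+1} = W_2^{\ell}\, \mathrm{ReLU}\left(W_1^{\ell}\, \hat{\hat{h}}_i^{\ell+1}\right) \tag{14}$$

$$h_i^{\ell+1} = \mathrm{Norm}\left(\hat{\hat{h}}_i^{\ell+1} + \hat{\hat{\hat{h}}}_i^{\ell+1}\right) \tag{15}$$

where $W_1^{\ell} \in \mathbb{R}^{2d \times d}$, $W_2^{\ell} \in \mathbb{R}^{d \times 2d}$, and $\hat{\hat{h}}_i^{\ell+1}$, $\hat{\hat{\hat{h}}}_i^{\ell+1}$ are the intermediate representations. Norm could be either LayerNorm [47] or BatchNorm [48].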
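One GT layer can be sketched in NumPy as neighbor-masked multi-head attention followed by a residual connection, normalization, and an FFN. The sketch makes several assumptions that are not taken from the paper: self-loops are added so that every node attends to at least itself, LayerNorm is used without learnable scale and shift, weights are random, and matrices are stored in row-vector form (transposed relative to the column-vector notation of the equations).

```python
import numpy as np

def layer_norm(h, eps=1e-5):
    """Per-node normalization over the feature axis, without learnable scale/shift."""
    mean = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mean) / np.sqrt(var + eps)

def graph_transformer_layer(h, A, params, n_heads):
    """Masked multi-head attention over graph neighbors, then residual + norm + FFN."""
    n, d = h.shape
    d_k = d // n_heads
    mask = (A + np.eye(n)) > 0                   # assumed self-loops avoid empty rows
    heads = []
    for k in range(n_heads):
        Q = h @ params["Q"][k]                   # (n, d_k)
        K = h @ params["K"][k]
        V = h @ params["V"][k]
        scores = np.where(mask, Q @ K.T / np.sqrt(d_k), -np.inf)
        scores = scores - scores.max(axis=1, keepdims=True)
        w = np.exp(scores)                       # non-neighbors get weight 0
        w = w / w.sum(axis=1, keepdims=True)
        heads.append(w @ V)
    h_att = np.concatenate(heads, axis=1) @ params["O"]
    h1 = layer_norm(h + h_att)                   # first residual + norm
    h2 = np.maximum(h1 @ params["W1"], 0.0) @ params["W2"]   # FFN with ReLU
    return layer_norm(h1 + h2)                   # second residual + norm

rng = np.random.default_rng(0)
n, d, n_heads = 5, 8, 2
d_k = d // n_heads
A = np.zeros((n, n))
for i in range(n):                               # ring graph on five nodes
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
params = {
    "Q": rng.standard_normal((n_heads, d, d_k)),
    "K": rng.standard_normal((n_heads, d, d_k)),
    "V": rng.standard_normal((n_heads, d, d_k)),
    "O": rng.standard_normal((d, d)),
    "W1": rng.standard_normal((d, 2 * d)),
    "W2": rng.standard_normal((2 * d, d)),
}
h_next = graph_transformer_layer(rng.standard_normal((n, d)), A, params, n_heads)
```

The mask is what distinguishes this layer from a standard transformer block: attention weights are normalized only over a node's neighbors instead of over all nodes.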
Link prediction with learned representations
The final module is designed to predict edges between nodes using the learned node representations $h$ obtained from the GT layer. The module takes as input the embeddings $H \in \mathbb{R}^{N \times d}$, where $N$ is the number of nodes and $d$ is the embedding dimension, along with an edge index representing node pairs. For each edge $(i, j)$, the embeddings of the source node $h_i$ and the destination node $h_j$ are extracted and concatenated to form a feature vector. This vector is passed through a decoder network consisting of a multilayer perceptron (MLP) with a hidden layer, ReLU activation, and an output layer, which reduces the concatenated vector to a scalar. The scalar represents the predicted likelihood of an edge between the nodes $i$ and $j$. By utilizing the updated node embeddings $H$, this module effectively learns to identify and score potential edges in the graph.
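The link predictor can be sketched as a two-layer MLP applied to concatenated node embeddings. In the NumPy sketch below, the weights are random, the hidden width is arbitrary, and a sigmoid is assumed for mapping the scalar output to a score in (0, 1); the function name is hypothetical.

```python
import numpy as np

def mlp_link_scores(H, edge_index, W1, b1, W2, b2):
    """Score candidate edges from concatenated source/destination embeddings."""
    src, dst = edge_index
    pair = np.concatenate([H[src], H[dst]], axis=1)   # (n_edges, 2d)
    hidden = np.maximum(pair @ W1 + b1, 0.0)          # hidden layer with ReLU
    logits = (hidden @ W2 + b2)[:, 0]                 # one scalar per candidate edge
    return 1.0 / (1.0 + np.exp(-logits))              # assumed sigmoid output

rng = np.random.default_rng(0)
n_nodes, d, hidden_dim = 5, 8, 8
H = rng.standard_normal((n_nodes, d))                 # stand-in for GT output
edge_index = np.array([[0, 1, 2],                     # source nodes
                       [1, 2, 4]])                    # destination nodes
W1, b1 = rng.standard_normal((2 * d, hidden_dim)), np.zeros(hidden_dim)
W2, b2 = rng.standard_normal((hidden_dim, 1)), np.zeros(1)
scores = mlp_link_scores(H, edge_index, W1, b1, W2, b2)
```

Because the two embeddings are concatenated in source-then-destination order, the score for $(i, j)$ can differ from the score for $(j, i)$, which suits directed regulatory edges.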
Next, we discuss the experimental setup used to demonstrate the superiority of GT-GRN.
Experimental setup
The performance of GT-GRN is evaluated on a Linux machine equipped with an NVIDIA RTX A3000 GPU. The software libraries used are PyTorch (https://pytorch.org/), DGL (https://www.dgl.ai/), scikit-learn (https://scikit-learn.org/stable/), and PyTorch Geometric (https://pytorch-geometric.readthedocs.io/en/latest/index.html).
Datasets
We establish our findings on two scRNA-seq cell types: human embryonic stem cells (hESC) [49] and mouse embryonic stem cells (mESC) [50] from the BEELINE [51] study. The cell-type-specific ChIP-seq ground-truth networks are used as a reference for these datasets. Additionally, we use synthetic expression profiles generated using GeneNetWeaver (GNW) [52], a simulation tool developed for DREAM (Dialogue on Reverse Engineering Assessment and Methods), along with their corresponding ground-truth networks. The details of the datasets are summarized in Table 1.
Table 1.
Dataset statistics
| Species/cell types | Type | Source | Number of genes | Number of edges |
|---|---|---|---|---|
| Yeast | Microarray | GNW | 4000 | 11,323 |
| hESC-500 | scRNA-seq | BEELINE | 910 | 3940 |
| mESC-500 | scRNA-seq | BEELINE | 1120 | 20,923 |
| hESC-1000 | scRNA-seq | BEELINE | 1410 | 6139 |
| mESC-1000 | scRNA-seq | BEELINE | 1620 | 30,254 |
Preprocessing of raw data
We preprocess the raw scRNA-seq data using an established method [51] to handle redundancy. We filter out low-expressed genes and prioritize the variable ones. First, genes expressed in <10% of cells are removed. Then, we compute the variance and $P$-values for each gene, selecting those with $P$-values below 0.01 after Bonferroni correction. Gene expression levels are log-transformed for normalization. This yields a feature matrix $X \in \mathbb{R}^{N \times M}$, where $N$ is the number of genes and $M$ is the number of cells. Furthermore, we adopt the approach of Pratapa et al. [51] to assess performance across different network sizes. Specifically, we rank genes by variance and select the most variable transcription factors (TFs), along with the top 500 and 1000 genes with the highest variability.
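The filtering and ranking steps can be sketched with NumPy as below. The sketch is deliberately partial: the $P$-value computation, Bonferroni correction, and TF selection are omitted, `log1p` is assumed as the log transform, and the variance is computed after the transform, which is one possible reading of the procedure. The function name and toy counts are hypothetical.

```python
import numpy as np

def preprocess(counts, min_cell_frac=0.10, n_top=500):
    """Filter rarely expressed genes, log-transform, and keep the most variable genes.

    counts: (genes, cells) matrix of raw expression values.
    Returns the filtered matrix and the original row indices of the kept genes.
    """
    counts = np.asarray(counts, dtype=float)
    frac_expressed = (counts > 0).mean(axis=1)
    keep = frac_expressed >= min_cell_frac           # drop genes seen in <10% of cells
    logged = np.log1p(counts[keep])                  # assumed transform: log(1 + x)
    order = np.argsort(logged.var(axis=1))[::-1]     # highest variance first
    top = order[:n_top]
    return logged[top], np.flatnonzero(keep)[top]

counts = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],   # never expressed: removed
    [5, 0, 5, 0, 5, 0, 5, 0, 5, 0],   # variable
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],   # constant: zero variance
    [0, 9, 0, 0, 9, 0, 0, 9, 0, 0],   # most variable after the transform
])
X, kept = preprocess(counts, n_top=2)
```

Returning the original indices alongside the matrix makes it possible to map rows of `X` back to gene names after filtering.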
Baseline methods
We evaluate the efficacy of GT-GRN against existing baseline methods commonly used for inferring GRNs, which are summarized in Table 2.
Table 2.
Summary of GRN inference methods classified by category
| Category | Method | Description |
|---|---|---|
| Graph neural network | GNNLink [36]a | Uses a GCN-based interaction graph encoder to capture gene expression patterns. |
| | GENELink [53]b | Leverages a GAT to infer GRNs via attention mechanisms. |
| | GNE [27]c | Uses an MLP to encode gene expression profiles and network topology for predicting gene regulatory links. |
| MI | ARACNE [15]d | Infers networks based on adaptive partitioning (AP) and MI. |
| | BC3NET [54]e | An ensemble technique derived from the C3NET algorithm employing bagging. |
| | C3MTC [55]f | Infers networks where edge weights are defined by MI values. |
| | C3NET [56]g | Uses MI and a maximization step to capture causal structure. |
| Feature selection | MRNET [16]h | Applies supervised gene selection using MRMR (maximum relevance/minimum redundancy). |
| Ensemble tree-based | GRNBOOST2 [57]i | A fast inference algorithm using stochastic gradient boosting regression. |
| | GENIE3 [58]j | A classic inference algorithm using random forest or extra trees regression. |
a https://github.com/sdesignates/GNNLink
b https://github.com/zpliulab/GENELink
c https://github.com/kckishan/GNE
d https://bioconductor.org/packages/release/bioc/html/minet.html
e https://cran.r-project.org/web/packages/bc3net/index.html
f https://cran.r-project.org/web/packages/bc3net/index.html
g https://cran.r-project.org/web/packages/c3net/index.html
h https://bioconductor.org/packages/release/bioc/html/minet.html
i https://github.com/aertslab/arboreto
j https://github.com/aertslab/arboreto
Results and Discussion
We report results on both single-cell and microarray gene expression datasets, comparing GT-GRN against representative methods from each major category of GRN inference techniques. These include MI-based methods, feature selection approaches, ensemble tree-based models, and graph neural network frameworks. This diverse selection allows us to comprehensively evaluate performance across different inference paradigms, ensuring a balanced comparison that highlights the strengths and limitations of each method.
Gene regulatory network inference via full network reconstruction
Fundamentally, GRN inference aims to reconstruct the entire regulatory network, capturing the full complexity of gene interactions. By striving for complete network reconstruction, it seeks to reveal the intricate web of regulatory relationships among all involved genes, reflecting the true, comprehensive regulatory architecture [59]. Following this motivation, we evaluate GT-GRN alongside existing GRN inference methods and report the results in Fig. 4, which compares the full network reconstruction performance of the different methods. For the scRNA-seq datasets, the performance of GT-GRN remains consistently higher, with only minor variations across datasets. This suggests that GT-GRN is robust across cell types and network sizes.
Figure 4.
Full network reconstruction performance of various methods for different datasets in terms of AUROC score. (a) BEELINE’s scRNA-seq datasets and (b) GNW’s Yeast dataset.
For the Yeast (microarray) dataset, GT-GRN significantly outperforms all other network inference methods, indicating its superior ability to capture gene interactions. Baseline methods such as ARACNE, BC3NET, C3MTC, C3NET, and MRNET perform at similar levels. In general, GT-GRN appears to be the most effective method for the microarray dataset, while for scRNA-seq expression profiles it demonstrates stable performance across conditions in terms of AUROC score.
Gene regulatory network inference via link prediction
Many benchmark methods treat GRN inference as a link prediction problem, focusing on identifying only a limited subset of interactions. We report results using this conventional approach. Table 3 reports the performance on the candidate datasets in terms of the Area Under the Receiver-Operating Characteristic Curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC), comparing GT-GRN with various baseline methods. GT-GRN achieves the highest AUROC on three of the four scRNA-seq datasets (mESC-1000, mESC-500, and hESC-500) and the highest AUPRC on both mESC datasets. GNE emerges as a strong contender: on the hESC-1000 dataset it achieves the highest AUPRC (0.9042) and the highest AUROC (0.9025) of all methods, and it also attains the highest AUPRC on hESC-500 (0.8466), indicating its effectiveness in balancing precision and recall. GENELink maintains strong AUROC across the datasets, while GNNLink performs well in AUROC but lags in AUPRC on the hESC datasets. In contrast, GENIE3 and GRNBOOST2 show consistently lower scores, indicating challenges in handling these complex datasets. For the Yeast-4000 dataset, GT-GRN demonstrates superior performance compared to its baseline counterparts. Overall, GT-GRN proves to be the most reliable across datasets, while GNE stands out for its precision, making them the top choices for biological network predictions in this study.
Table 3.
AUROC and AUPRC scores for different methods across the various scRNA-seq datasets
| Dataset | Method | AUROC | AUPRC |
|---|---|---|---|
| mESC-1000 | GT-GRN | **0.9483** | **0.8990** |
| | GNNLink | 0.8833 | 0.8660 |
| | GENELink | 0.9133 | 0.8103 |
| | GNE | 0.8984 | 0.8925 |
| mESC-500 | GT-GRN | **0.9402** | **0.8853** |
| | GNNLink | 0.8768 | 0.8331 |
| | GENELink | 0.9057 | 0.8004 |
| | GNE | 0.8378 | 0.8416 |
| hESC-500 | GT-GRN | **0.8793** | 0.5932 |
| | GNNLink | 0.8251 | 0.4542 |
| | GENELink | 0.8618 | 0.5581 |
| | GNE | 0.8402 | **0.8466** |
| hESC-1000 | GT-GRN | 0.8784 | 0.8604 |
| | GNNLink | 0.8442 | 0.5011 |
| | GENELink | 0.8657 | 0.5610 |
| | GNE | **0.9025** | **0.9042** |
Bold values indicate the best performance.
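For readers who want to see how the two metrics in Table 3 are defined for link prediction, the following from-scratch sketch may help. It is an illustration only and not the evaluation code used in this study: the function names are hypothetical, the average-precision estimate of AUPRC assumes no tied scores, and the pairwise AUROC computation is only practical for small edge sets.

```python
import numpy as np

def auroc(y_true, scores):
    """Probability that a random true edge outranks a random non-edge (ties count 0.5)."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    # Compare every positive score with every negative score.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * neg.size)

def auprc(y_true, scores):
    """Average precision: mean of the precision measured at the rank of each true edge."""
    y_true = np.asarray(y_true, dtype=bool)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = y_true[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return precision_at_k[hits].mean()

# Toy example: five candidate edges, two of which are true regulatory links.
labels = [1, 0, 1, 0, 0]
predicted = [0.9, 0.8, 0.7, 0.3, 0.1]
print(auroc(labels, predicted), auprc(labels, predicted))
```

Because AUPRC depends on the fraction of true edges among the candidates, it is more sensitive than AUROC to false positives in sparse networks, which is one reason the two columns of Table 3 can rank methods differently.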
Impact of hyperparameters
Additionally, we investigate the role of various hyperparameters (HPs) that influence the overall performance of the models. Optimal selection of HPs is a difficult and time-consuming task. We analyze the behavior of the learning models compared to GT-GRN by selecting key HPs from each baseline model and assessing their impact on that model's overall performance. Table 4 describes the parametric configurations that we tuned for each learning model, with their respective explanations. We summarize the overall impact of HP tuning for different models using a boxplot, as shown in Fig. 5. This visualization presents a comprehensive comparison of our model, GT-GRN, against several baseline approaches in terms of AUROC performance across multiple datasets.
Table 4.
Tuned hyperparameters (HPs) for different models, where an epoch represents one full pass over the dataset, the learning rate controls the update pace, the output dimension defines the final prediction layer shape, attention heads compute and aggregate attention over input elements, and layers indicate the number of transformation steps applied sequentially to produce the output and pass to the next layer
| Model | Epochs | Learning rate | Output dimensions | Attention heads | Layers |
|---|---|---|---|---|---|
| GT-GRN | – | 0.001, 0.003, 0.0005 | – | 2, 4, 8 | 4, 6, 8 |
| GNNLink | 100, 200, 300 | 0.001, 0.005, 0.01 | 128, 256, 512 | – | – |
| GENELink | 10, 20, 30 | 0.001, 0.003, 0.0005 | 128, 256, 512 | – | – |
| GNE | 10, 20, 30 | 0.001, 0.005, 0.01 | 128, 256, 512 | – | – |
Figure 5.
Overall HP tuning plot for various models.
A key observation is that GT-GRN consistently achieves higher median AUROC scores with notably lower variance across datasets, highlighting its robustness and reliability under varying experimental conditions. Each model was tuned with its respective optimal HPs, yet GT-GRN exhibits both stability and effectiveness across settings, clearly outperforming the existing baselines on all candidate datasets. In contrast, GNNLink shows a higher variance, particularly in the hESC-500 dataset, where its performance fluctuates significantly. GENELink and GNE display relatively stable performances, though both fall short of the superior AUROC achieved by GT-GRN. In particular, the mESC-1000 dataset highlights the clear dominance of GT-GRN, with its AUROC surpassing that of all other methods by a substantial margin.
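As an illustration of how the GT-GRN row of Table 4 translates into concrete configurations, the sketch below enumerates every combination of the listed values. Whether the search in this study was exhaustive is not stated, so the full grid is an assumption, and `enumerate_configs`, `select_best`, and the `evaluate` callback are hypothetical names.

```python
from itertools import product

# Values taken from the GT-GRN row of Table 4.
GT_GRN_GRID = {
    "learning_rate": [0.001, 0.003, 0.0005],
    "attention_heads": [2, 4, 8],
    "layers": [4, 6, 8],
}

def enumerate_configs(grid):
    """Return one dict per combination of hyperparameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]

def select_best(configs, evaluate):
    """Score each configuration with a user-supplied `evaluate(config) -> AUROC`
    callback (hypothetical) and return the (score, config) pair with the highest score."""
    scored = [(evaluate(cfg), cfg) for cfg in configs]
    return max(scored, key=lambda pair: pair[0])

configs = enumerate_configs(GT_GRN_GRID)
print(len(configs))  # 3 learning rates x 3 head counts x 3 depths = 27
```

The AUROC values collected from such a sweep, one per configuration and dataset, are the kind of samples a boxplot like Fig. 5 summarizes.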
Application of GT-GRN on cell-type classification
Cell-type-specific GRNs are crucial for defining transcriptional states during development, with each cell-type being characterized by a unique set of active TFs. These GRNs offer an unbiased method for studying gene regulation, providing valuable insights into the mechanisms driving cellular diversity. In this context, we explore the effectiveness of GT-GRN in reconstructing cell-type-specific GRNs, with the goal of cell-type annotation. To achieve this, we apply GT-GRN to scRNA-seq data from over 8000 human peripheral blood mononuclear cells (PBMCs8k), sourced from 10X Genomics (https://www.10xgenomics.com/datasets/8-k-pbm-cs-from-a-healthy-donor-2-standard-2-1-0). The data are preprocessed using the Scanpy framework [60], ensuring efficient handling and analysis of the single-cell data. For the ground truth network, we utilize the hTFtarget database [61], which integrates ChIP-seq data, TF binding sites, and epigenetic modification information. This comprehensive resource provides detailed insights into gene regulation and TF-target interactions, making it an invaluable tool for studying gene regulatory mechanisms.
In order to perform cell-type annotation, it is essential to first reconstruct the GRN. This involves generating embeddings that represent the cell types based on their GRNs. These embeddings serve as the foundation for cell-type classification, enabling accurate annotation of the cell types based on their unique regulatory patterns. By leveraging the power of GT-GRN in inferring cell-type-specific GRNs, we aim to advance the cell-type annotation in scRNA-seq data, ultimately improving our understanding of the complex regulatory landscapes that define cellular identities.
GT-GRN for PBMC network reconstruction
We investigate GRN inference using GT-GRN by formulating it as a network regeneration problem on the PBMC dataset. First, we filter the data by removing genes expressed in <5% of cells and discarding cells that express <200 genes. Next, we normalize the total counts per cell to 10,000, ensuring comparability across cells. To further refine the data, we apply MAGIC [62] imputation, which reduces noise and improves expression patterns. Finally, we perform a logarithmic transformation to improve interpretability and optimize the data for downstream analysis.
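The filtering, normalization, and log-transformation steps can be sketched with NumPy as below. This is a simplified illustration rather than the Scanpy-based pipeline used here: the function name and parameters are hypothetical, MAGIC imputation is omitted, and applying the gene filter before the cell filter is an assumption.

```python
import numpy as np

def preprocess_pbmc(counts, min_gene_cell_fraction=0.05,
                    min_genes_per_cell=200, target_sum=1e4):
    """`counts` is a cells x genes matrix of raw counts."""
    counts = np.asarray(counts, dtype=float)
    # 1. Drop genes detected in fewer than 5% of cells.
    gene_mask = (counts > 0).mean(axis=0) >= min_gene_cell_fraction
    counts = counts[:, gene_mask]
    # 2. Drop cells that express fewer than `min_genes_per_cell` genes.
    cell_mask = (counts > 0).sum(axis=1) >= min_genes_per_cell
    counts = counts[cell_mask]
    # 3. Scale every remaining cell so that its counts sum to `target_sum`.
    totals = counts.sum(axis=1, keepdims=True)
    normalized = counts / totals * target_sum
    # (MAGIC imputation would be applied at this point; it is omitted in this sketch.)
    # 4. Log-transform; log1p keeps zeros at zero.
    return np.log1p(normalized), gene_mask, cell_mask
```

Normalizing before the log transform means that differences between cells reflect relative expression rather than sequencing depth.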
After preprocessing, we employ GT-GRN to reconstruct the PBMC GRN and compare its performance against existing baseline methods. The comparison of the PBMC hTFtarget (Gold standard) network with the generated networks in Table 5 reveals key structural differences and variations in predictive performance. GT-GRN achieves the highest AUROC (0.9852) while maintaining balanced connectivity and clustering, making it the best-performing model. Supplementary Fig. S1 reports the ROC curve of the GT-GRN model, showing high discriminative ability with performance well above the random baseline (dashed line). GENELink and GNNLink exhibit dense connectivity, high clustering, and shorter path lengths but are highly disassortative, indicating a strong preference for high-degree nodes connecting to low-degree ones. GNE, with lower connectivity and clustering, results in longer path lengths and the lowest AUROC (0.7596) but retains some structural similarities to the Gold network. The Gold standard itself maintains moderate connectivity and a sparse clustering structure, serving as a key benchmark. We also measure how closely the characteristics of each generated network match those of the input network (hTFtarget) with a single score, the Pearson correlation coefficient (PCC) between the two sets of characteristics. All methods exhibit strong correlation, with GNE (0.9992) achieving the highest agreement, followed by GT-GRN (0.9838); GNNLink and GENELink both score 0.9817. These insights highlight the trade-offs between network structure and predictive performance, guiding model selection for biological network analysis.
Table 5.
Comparison of network characteristics between the PBMC hTFtarget network and the generated networks, with AUROC scores. Maximum degree is the largest degree over all vertices. Assortativity is the Pearson correlation of the degrees of connected nodes. Triangle count is the number of triangles (sets of three mutually connected nodes) in the network. Clustering coefficient measures the tendency of nodes in a network to form triangles. Characteristic path length is the average shortest path length between all node pairs in a network. PCC is the Pearson correlation coefficient between the characteristics of the hTFtarget network and those of the generated network.
| Model | Maximum degree | Assortativity | Triangle count | Clustering coefficient | Characteristic path length | AUROC | PCC |
|---|---|---|---|---|---|---|---|
| Gold | 1837 | −0.4762 | 9596 | 0.0060 | 2.7392 | – | – |
| GT-GRN | 2593 | −0.6614 | 207,579 | 0.0255 | 2.1934 | 0.9852 | 0.9838 |
| GENELink | 3999 | −0.9867 | 5,671,530 | 0.0389 | 1.9731 | 0.8810 | 0.9817 |
| GNNLink | 3999 | −0.9867 | 5,671,530 | 0.0389 | 1.9731 | 0.8467 | 0.9817 |
| GNE | 924 | −0.1872 | 4014 | 0.0094 | 2.9621 | 0.7596 | 0.9992 |
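The five structural characteristics in Table 5 and the PCC between two networks can be computed from an adjacency matrix as sketched below. This is an illustrative implementation under stated assumptions: the graph is treated as undirected, unweighted, and connected, the clustering coefficient is taken as the average of the local coefficients, and the PCC is computed over the raw five-element characteristic vectors. The function names are hypothetical, and graph libraries such as NetworkX provide equivalent routines that scale better.

```python
import numpy as np

def network_characteristics(adj):
    """`adj`: symmetric 0/1 adjacency matrix without self-loops (connected graph)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    degree = adj.sum(axis=1)
    # diag(A^3) counts closed 3-walks; each triangle through a node is counted twice.
    tri_per_node = np.diag(adj @ adj @ adj) / 2
    triangle_count = tri_per_node.sum() / 3  # every triangle touches three nodes
    # Local clustering: triangles through a node / possible neighbour pairs.
    pairs = degree * (degree - 1) / 2
    clustering = np.divide(tri_per_node, pairs, out=np.zeros(n), where=pairs > 0).mean()
    # Assortativity: Pearson correlation of the degrees at both ends of every edge.
    src, dst = np.nonzero(adj)  # each undirected edge appears in both directions
    assortativity = np.corrcoef(degree[src], degree[dst])[0, 1]
    # Characteristic path length: breadth-first search from all nodes at once.
    dist = np.full((n, n), np.inf)
    np.fill_diagonal(dist, 0)
    reached = np.eye(n, dtype=bool)
    frontier = np.eye(n, dtype=bool)
    step = 0
    while frontier.any():
        step += 1
        frontier = ((frontier.astype(float) @ adj) > 0) & ~reached
        dist[frontier] = step
        reached |= frontier
    path_length = dist[~np.eye(n, dtype=bool)].mean()
    return np.array([degree.max(), assortativity, triangle_count, clustering, path_length])

def characteristic_pcc(reference_adj, generated_adj):
    """Pearson correlation between the characteristic vectors of two networks."""
    ref = network_characteristics(reference_adj)
    gen = network_characteristics(generated_adj)
    return float(np.corrcoef(ref, gen)[0, 1])
```

Because the characteristic vector mixes quantities of very different magnitude, a PCC computed this way is dominated by the largest entries (maximum degree and triangle count), which is worth keeping in mind when interpreting a single summary score.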
Further, we analyze the degree distributions of the generated networks in comparison to the input PBMC hTFtarget network. Figure 6 shows the log–log degree distribution of the input (Gold) and generated networks (GT-GRN, GENELink, GNNLink, and GNE). The Gold network follows a natural decay consistent with a scale-free degree distribution, while GT-GRN shows a similar trend with slight deviations. GNE displays a more scattered pattern but a similarly tailed degree distribution. GENELink and GNNLink exhibit significantly higher maximum degrees, i.e. they generate more high-degree nodes than the input network.
Figure 6.
Log–log degree distribution plot of the input PBMC’s hTFTarget network and generated network for different models.
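A degree distribution of the kind plotted in Fig. 6 can be derived from an adjacency matrix as in the sketch below. The helper names are hypothetical, and the least-squares slope on log–log axes is only a rough visual check of heavy-tailed decay, not a rigorous power-law test.

```python
import numpy as np

def degree_distribution(adj):
    """Return the observed degrees k (k > 0) and the fraction of nodes P(k) with that degree."""
    degree = np.asarray(adj).sum(axis=1).astype(int)
    counts = np.bincount(degree)
    k = np.flatnonzero(counts)
    k = k[k > 0]  # log(0) is undefined, so isolated nodes are excluded
    return k, counts[k] / degree.size

def loglog_slope(k, p_k):
    """Slope of a straight-line fit to log10 P(k) versus log10 k."""
    slope, _intercept = np.polyfit(np.log10(k), np.log10(p_k), 1)
    return slope
```

Plotting `k` against `P(k)` on logarithmic axes for the Gold network and each generated network gives curves comparable to those in the figure; a markedly longer right tail corresponds to the higher maximum degrees noted for GENELink and GNNLink.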
To optimize the architecture of the GT-GRN model, we conducted a comprehensive HP search across different numbers of layers and attention heads on the PBMC dataset. We evaluated model performance using the AUROC metric across varying combinations of input modalities: positional, global, and gene expression embeddings individually (unimodal), all pairwise combinations (bimodal), and the full trimodal input. Supplementary Fig. S2 shows the AUROC performance for each configuration, focusing on the number of layers and attention heads. Each line represents a different head configuration (2, 4, and 8). For single-modal embeddings (top row), performance varies moderately with layer depth: positional embeddings show a decline at higher layers, whereas global and gene expression embeddings remain relatively stable. In two-modal combinations (middle row), AUROC is generally higher than in single-modal cases, indicating that combining modalities improves predictive performance. Some combinations benefit from deeper layers, while others peak at intermediate depths. For the three-modal combination (bottom row), integrating all three embeddings achieves the highest AUROC overall, although the optimal layer and head configuration differ slightly, reflecting complex interactions between modalities.
Furthermore, we report the computational efficiency of GT-GRN in comparison to the baselines on the PBMC dataset, as detailed in Supplementary Table T1. It highlights a clear trade-off between predictive performance and resource requirements. GT-GRN incurs a higher computational cost compared with lightweight methods such as GNNLink and GNE, requiring approximately 1.5 h for execution on the PBMC dataset, while GNNLink and GNE complete in under 12 min and 1 min, respectively. Although GT-GRN is more resource-intensive, this additional cost stems from its GT architecture, which jointly integrates positional, global, and gene expression embeddings. In contrast, baseline methods rely on simpler architectures with limited feature integration, resulting in faster runtimes but reduced representational capacity. Importantly, GT-GRN remains significantly more efficient than GENELink, which exceeds 2 h of runtime, suggesting that our method balances accuracy and computational feasibility.
Next, we assess how well our embeddings capture the community structure by clustering gene embeddings. We utilize the Leiden algorithm [63] to cluster the resultant gene embeddings. Figure 7 presents a uniform manifold approximation and projection (UMAP) dimensional reduction of gene representations for various methods. The methods compared include Gene Expression data, GENELink, GNNLink, GNE, and GT-GRN. The visualization shows distinct community structures produced by the GNNLink and GT-GRN embeddings, indicating effective preservation of biological modules. Notably, GT-GRN embeddings exhibit tighter and more biologically coherent clusters that align with functional gene modules, suggesting that the model captures regulatory programs. These preserved clusters provide evidence of pathway-level organization, highlighting the ability of GT-GRN to reveal biologically meaningful communities relevant to cellular processes.
Figure 7.
UMAP visualization of gene representations for the PBMC dataset according to different methods.
GT-GRN for cell-type annotation
We further investigate these embeddings for the cell-type annotation task. We manually annotate cell types in the gene expression data and focus on the cell types with the highest number of cells: CD4+ T cells, CD14+ monocytes, and CD8+ T cells. We train a three-layer multilayer perceptron (MLP) classifier to annotate cell types. The classifier is trained in a multiclass classification setting with five-fold cross-validation. We benchmark GT-GRN against the GENELink, GNNLink, and GNE methods. Figure 8 demonstrates that GT-GRN effectively captures the cell types in this gene-representation-based classification setup.
Figure 8.
Cell-type classification. AUROC and AUPRC scores of GT-GRN-, GENELink-, GNNLink-, and GNE-based embeddings in annotating cell types using an MLP classifier in a five-fold cross-validation setting.
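The evaluation protocol described above (an MLP classifier scored with five-fold cross-validation) can be sketched with scikit-learn as follows. This is an assumption-laden illustration rather than the classifier used in this study: the hidden layer sizes, the stratified folds, the one-vs-rest macro averaging, and the function name `cross_validated_scores` are all choices made for the example, and how per-cell feature vectors are built from the gene embeddings is not covered.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import label_binarize

def cross_validated_scores(features, labels, n_splits=5, seed=0):
    """Mean multiclass AUROC and AUPRC over stratified folds (three or more classes)."""
    features, labels = np.asarray(features), np.asarray(labels)
    classes = np.unique(labels)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    aurocs, auprcs = [], []
    for train_idx, test_idx in folds.split(features, labels):
        # Two hidden layers plus the output layer (assumed reading of "three-layer").
        clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=seed)
        clf.fit(features[train_idx], labels[train_idx])
        proba = clf.predict_proba(features[test_idx])
        onehot = label_binarize(labels[test_idx], classes=classes)
        aurocs.append(roc_auc_score(labels[test_idx], proba, multi_class="ovr"))
        auprcs.append(average_precision_score(onehot, proba, average="macro"))
    return float(np.mean(aurocs)), float(np.mean(auprcs))
```

Running the same function on features derived from each method's embeddings gives directly comparable AUROC and AUPRC values, since the folds and classifier settings are held fixed.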
Next, we delve into the individual contributions of each embedding modality within the GT-GRN framework through a systematic ablation study on the PBMC dataset. This dataset provides a biologically rich and diverse single-cell expression landscape, making it an ideal benchmark for evaluating the role of each embedding component in GRN inference.
Ablation studies
To assess the overall efficacy and robustness of GT-GRN, we conducted the ablation study in two stages: first at the modality level, followed by the GT layer level. At the modality level, we systematically examined the contributions of structural positional encodings, global embeddings, and gene expression embeddings, both individually and in combination. Subsequently, we performed ablation experiments to assess the role of the internal components of the GT layer. We first report the results of the modality-level study, and then present the findings from the GT layer ablation.
Modality-level ablation study
The modality-level ablation study systematically examines the individual and combined contributions of its embedding modalities: structural positional encodings, global embeddings, and gene expression embeddings. This experiment is crucial, given the multicomponent nature of our framework, as it allows us to systematically assess the role and efficacy of each component in the GRN inference process under a link prediction setup. We organize our baselines into three categories: uni-modal, bi-modal, and tri-modal, where the prefixes “uni,” “bi,” and “tri” denote the number of information sources used. This study establishes the necessity of each module in the overall architecture, demonstrating that omitting any single modality leads to a significant performance drop—thereby justifying the integration of all components for optimal GRN reconstruction.
Structural positional encodings: In this uni-modal baseline, we use the positional encodings of the input network, which are fed into GT-GRN. Each gene's positional encoding has length 512. This vector is used by the model to predict the possibility of a link with other gene vectors based on the input link information.
Global embeddings: This uni-modal baseline extracts the global knowledge of inferred GRNs in a multinetwork integration framework using the BERT model. The output length of each gene embedding is 512. The vector is given as input to the model to estimate the likelihood of forming links with other gene vectors for link inference.
Gene expression embeddings: In this uni-modal baseline, raw gene expression is converted into an embedding vector that captures the latent information using an autoencoder model. The length of the embedding vector for each gene is 512, and it is used to estimate the likelihood of a link within the GT-GRN framework.
Structural positional encodings + Global embeddings: This multimodal approach combines the structural positional encodings of the input network with global embeddings derived from the BERT-based multinetwork integration framework. The fused representation is used in the GT-GRN model to enhance link prediction performance.
Structural positional encodings + Gene expression embeddings: This approach integrates the structural positional encodings of the input network with gene expression embeddings obtained through an autoencoder. The combined vector representation is fed into the GT-GRN model to infer potential links between genes.
Global embeddings + Gene expression embeddings: In this setup, global embeddings capturing inferred GRN knowledge are combined with gene expression embeddings. The resulting representation is used to predict gene interactions within the GT-GRN framework.
Structural positional encodings + Global embeddings + Gene expression embeddings: This comprehensive multimodal approach fuses all three embeddings (structural positional encodings, global embeddings, and gene expression embeddings) to provide a richer representation for link inference. It aims to leverage complementary information from multiple modalities to enhance predictive performance.
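The seven configurations listed above can be generated programmatically, as in the sketch below. Fusing modalities by concatenating their 512-dimensional vectors is an assumption made for illustration; the fusion operator actually used by GT-GRN may differ. The modality keys and the function name are hypothetical.

```python
from itertools import combinations

import numpy as np

MODALITIES = ("positional", "global", "expression")

def ablation_feature_sets(embeddings):
    """Yield (modality names, fused matrix) for every uni-, bi-, and tri-modal setting.

    `embeddings` maps each modality name to an array of shape (n_genes, 512).
    """
    for size in (1, 2, 3):
        for names in combinations(MODALITIES, size):
            # Assumed fusion: concatenate the per-gene vectors along the feature axis.
            fused = np.concatenate([embeddings[name] for name in names], axis=1)
            yield names, fused

# Toy usage with random embeddings for four genes.
rng = np.random.default_rng(0)
toy = {name: rng.normal(size=(4, 512)) for name in MODALITIES}
for names, fused in ablation_feature_sets(toy):
    print(" + ".join(names), fused.shape)
```

Each fused matrix would then be passed to the same link prediction model, so that any difference in AUROC can be attributed to the input modalities alone.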
Table 6 reports the ablation results for different feature sets of GT-GRN in terms of AUROC. Among unimodal representations, global embeddings achieve the highest AUROC (0.8860), followed by gene expression embeddings (0.8693) and structural positional encodings (0.8480), indicating that global information is the most informative single modality for link inference.
Table 6.
Ablation study in terms of different features for PBMC’s dataset for GT-GRN framework
| Modality | Feature sets | AUROC |
|---|---|---|
| Unimodal | Structural positional encodings | 0.8480 |
| | Global embeddings | 0.8860 |
| | Gene expression embeddings | 0.8693 |
| Bimodal | Structural positional encodings + Global embeddings | 0.8843 |
| | Structural positional encodings + Gene expression embeddings | 0.8666 |
| | Global embeddings + Gene expression embeddings | 0.8841 |
| Trimodal | Structural positional encodings + Global embeddings + Gene expression embeddings | **0.9852** |
The bold value indicates the best performance.
The bimodal combinations do not improve on the best unimodal result: Structural positional encodings + Global embeddings (0.8843) and Global embeddings + Gene expression embeddings (0.8841) perform best among them but remain marginally below global embeddings alone (0.8860). The trimodal combination of all three embeddings achieves the highest AUROC of 0.9852, demonstrating that integrating all modalities provides the most effective representation for GRN inference.
Overall, the study highlights the importance of multimodal integration, with global embeddings playing a key role in enhancing predictive performance.
Graph transformer layer ablation study
This study explores how different components of the GT layer impact the model’s performance, specifically in the context of predicting GRNs. The ablation study isolates specific components to assess their importance for the overall performance of the model. These components include attention heads, depth, the feed-forward network (FFN), normalization, and residual connections.
Table 7 presents the impact of different components of the GT layer on GRN inference performance. It summarizes various model variants, detailing the specific modifications made to each component, and reports the corresponding AUROC scores. This analysis highlights how changes to the GT layer architecture influence model effectiveness, providing insights into the relative importance of each component. Firstly, the full model demonstrates the highest performance, with an AUROC of 0.9852, emphasizing the importance of all components working in harmony to provide the most effective representation for GRN inference. When individual components, such as attention heads, FFN, or layer depth, are removed, the performance decreases, highlighting the critical role each element plays in the model’s effectiveness.
Table 7.
Ablation study of GT Layer components
| Model variant | Description of change | Metric (AUROC) |
|---|---|---|
| Full model | All components included | **0.9852** |
| Reduced heads | Head = 1 | 0.9741 |
| Reduced depth | Layers = 2 | 0.9619 |
| Feedforward network | Remove FFN | 0.8815 |
| Normalization | Disable normalization | 0.8442 |
| Residual connections | Remove skip-connections between layers | 0.8436 |
The bold value indicates the best performance.
Among the various ablations, the largest performance drops occur when residual connections are removed (0.8436) or normalization is disabled (0.8442), followed by removal of the FFN (0.8815). This indicates that these components are especially vital for the success of the GT layer, whereas reducing the number of heads (0.9741) or layers (0.9619) has a smaller effect. In general, the ablation study underscores the benefits of incorporating multiple attention heads, a deeper structure, an FFN, normalization, and residual connections within the GT layer. Each of these components improves the predictive capabilities of the model, and removing any of them lowers performance. Therefore, to achieve optimal results in GRN inference, it is essential to retain all these components.
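To make the ablated components concrete, the sketch below implements a generic transformer layer in NumPy with one switch per row of Table 7. It is not the GT-GRN layer: the weight shapes, ReLU activation, post-normalization order, and dense all-to-all attention (a graph transformer may restrict attention to neighbouring genes) are assumptions, and all names are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def init_layer(dim, rng):
    """Random weights for one layer (illustrative initialization)."""
    shapes = {"wq": (dim, dim), "wk": (dim, dim), "wv": (dim, dim),
              "wo": (dim, dim), "w1": (dim, 2 * dim), "w2": (2 * dim, dim)}
    return {name: rng.normal(scale=dim ** -0.5, size=shape)
            for name, shape in shapes.items()}

def gt_layer(x, p, heads=4, use_ffn=True, use_norm=True, use_residual=True):
    """One layer over node features x of shape (n_genes, dim); dim must be divisible by heads."""
    n, dim = x.shape
    head_dim = dim // heads

    def split(m):  # (n, dim) -> (heads, n, head_dim)
        return m.reshape(n, heads, head_dim).transpose(1, 0, 2)

    q, k, v = split(x @ p["wq"]), split(x @ p["wk"]), split(x @ p["wv"])
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(head_dim))
    mixed = (attn @ v).transpose(1, 0, 2).reshape(n, dim) @ p["wo"]
    h = x + mixed if use_residual else mixed   # "Residual connections" row
    if use_norm:                               # "Normalization" row
        h = layer_norm(h)
    if use_ffn:                                # "Feedforward network" row
        ffn = np.maximum(h @ p["w1"], 0.0) @ p["w2"]
        h = h + ffn if use_residual else ffn
        if use_norm:
            h = layer_norm(h)
    return h

def gt_encoder(x, layers, **switches):
    """Stack layers; the length of `layers` is the depth varied in the 'Reduced depth' row."""
    for p in layers:
        x = gt_layer(x, p, **switches)
    return x
```

For example, `gt_encoder(x, layers[:2], heads=4)` corresponds to the reduced-depth variant and `gt_encoder(x, layers, use_residual=False)` to the variant without skip connections, which makes it easy to see that each ablation changes exactly one part of the computation.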
Conclusion
In this work, we proposed a novel GRN inference framework, GT-GRN, which leverages GTs to infer regulatory links by incorporating graph-based techniques. Our approach begins by generating gene embeddings through an autoencoder. We then integrate prior network knowledge from known GRNs using an NLP-based BERT model, where these graphs are converted into sequences to extract contextual embeddings. Additionally, we incorporate graph positional information to enhance the inference process.
Through extensive experiments, GT-GRN demonstrates superior link prediction performance compared to baseline methods. We further assess the quality of the generated embeddings by evaluating their community structure, showing that GT-GRN effectively supports cell-type annotation in real PBMC gene expression datasets. Our ablation study reveals that the combination of gene expression data, global gene context, and positional information significantly contributes to improved GRN inference precision. While the current work focuses on accurate reconstruction of GRNs using multimodal embeddings, ongoing efforts are directed toward extending the framework to prioritize disease-associated genes. By leveraging the inferred GRNs, the goal is to identify key regulatory hubs and pathways that may play pivotal roles in disease development and progression.
Key Points
GT-GRN is a novel graph transformer framework for enhanced gene regulatory network inference.
It leverages multimodal embeddings by integrating gene expression data, prior biological network knowledge, and graph positional encodings.
The proposed model outperforms baseline models, achieving higher predictive accuracy and robustness on various datasets.
It achieves strong performance on cell-type classification using the peripheral blood mononuclear cell single-cell RNA sequencing dataset and provides a scalable and extensible framework for diverse biological network analysis tasks.
Supplementary Material
Contributor Information
Binon Teji, Network Reconstruction & Analysis (NetRA) Lab, Department of Computer Applications, Sikkim University, 6th Mile, Tadong 737102, Sikkim, India.
Swarup Roy, Network Reconstruction & Analysis (NetRA) Lab, Department of Computer Applications, Sikkim University, 6th Mile, Tadong 737102, Sikkim, India; Department of Computer Science and Engineering, Tezpur University, Napaam, Tezpur 784028, Assam, India.
Dinabandhu Bhandari, Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata 700107, West Bengal, India.
Jugal Kalita, Department of Computer Science, University of Colorado, Colorado Springs, CO, 80918, United States.
Conflict of interest
None declared.
Funding
The research work is supported by the Department of Biotechnology (DBT), GoI, under the project BT/PR51150/NER/95/1996/2023. The work is also partially supported by IDEAS-TIH, ISI-Kolkata.
Data availability
All data and code used in this study are publicly available. The source code for GT-GRN can be accessed at https://github.com/Netralab/GT-GRN.
The datasets used in our experiments are available at the following locations:
References
- 1. Bellot P, Olsen C, Salembier P. et al. Netbenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference. BMC Bioinformatics 2015;16:1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2. de la Fuente A. What are gene regulatory networks? In: Handbook of Research on Computational Methodologies in Gene Regulatory Networks, Hershey, PA, USA. pages 1–27. IGI Global, 2010, 10.4018/978-1-60566-685-3.ch001. [DOI] [Google Scholar]
- 3. Guzzi PH, Roy S. Biological Network Analysis: Trends, Approaches, Graph Theory, and Algorithms. USA: Elsevier. 2020. [Google Scholar]
- 4. Shyamsundar R, Kim YH, Higgins JP. et al. A DNA microarray survey of gene expression in normal human tissues. Genome Biol 2005;6:404–9. 10.1186/gb-2005-6-9-404 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Kolodziejczyk AA, Kim JK, Svensson V. et al. The technology and biology of single-cell RNA sequencing. Mol Cell 2015;58:610–20. 10.1016/j.molcel.2015.04.005 [DOI] [PubMed] [Google Scholar]
- 6. Grindberg RV, Yee-Greenbaum JL, McConnell MJ. et al. RNA-sequencing from single nuclei. Proc Natl Acad Sci USA 2013;110:19802–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7. Talwar D, Mongia A, Sengupta D. et al. Autoimpute: autoencoder based imputation of single-cell RNA-seq data. Sci Rep 2018;8:16329. 10.1038/s41598-018-34688-x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Huynh-Thu VA, Sanguinetti G. Gene Regulatory Network Inference: An Introductory Survey. In: Sanguinetti G, Huynh-Thu V. (eds) Gene Regulatory Networks. Methods in Molecular Biology, vol 1883. Humana Press, New York, NY. 2019. 10.1007/978-1-4939-8882-2_1 [DOI] [PubMed] [Google Scholar]
- 9. Jha M, Roy S, Kalita JK. Prioritizing disease biomarkers using functional module based network analysis: a multilayer consensus driven scheme. Comput Biol Med 2020;126:104023. 10.1016/j.compbiomed.2020.104023 [DOI] [PubMed] [Google Scholar]
- 10. Roy S, Bhattacharyya DK, Kalita JK. Reconstruction of gene co-expression network from microarray data using local expression patterns. BMC Bioinformatics 2014;15:1–14. 10.1186/1471-2105-15-S7-S10 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11. Sebastian S, Roy S, Kalita J. A generic parallel framework for inferring large-scale gene regulatory networks from expression profiles: application to Alzheimer’s disease network. Brief Bioinform 2023;24:bbac482. [DOI] [PubMed] [Google Scholar]
- 12. Sebastian S, Roy S, Kalita J. Network-based analysis of Alzheimer’s disease genes using multi-omics network integration with graph diffusion. J Biomed Inform 2025;164: 104797. 10.1016/j.jbi.2025.104797 [DOI] [PubMed] [Google Scholar]
- 13. Mochida K, Koda S, Inoue K. et al. Statistical and machine learning approaches to predict gene regulatory networks from transcriptome datasets. Front Plant Sci 2018;9:421043. 10.3389/fpls.2018.01770
- 14. Haury A-C, Mordelet F, Vera-Licona P. et al. Tigress: trustful inference of gene regulation using stability selection. BMC Syst Biol 2012;6:1–17.
- 15. Margolin AA, Nemenman I, Basso K. et al. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 2006;7:1–15. 10.1186/1471-2105-7-S1-S7
- 16. Meyer PE, Kontos K, Lafitte F. et al. Information-theoretic inference of large transcriptional regulatory networks. EURASIP J Bioinform Syst Biol 2007;2007:1–9. 10.1155/2007/79879
- 17. Faith JJ, Hayete B, Thaden JT. et al. Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 2007;5:e8. 10.1371/journal.pbio.0050008
- 18. Liu W, Sun X, Yang L. et al. NSCGRN: a network structure control method for gene regulatory network inference. Brief Bioinform 2022;23:1–14. 10.1093/bib/bbac156
- 19. Bao W, Yang B. Protein acetylation sites with complex-valued polynomial model. Front Comp Sci 2024;18:183904. 10.1007/s11704-023-2640-9
- 20. Yang B, Bao W, Chen B. PGRNIG: novel parallel gene regulatory network identification algorithm based on GPU. Brief Funct Genomics 2022;21:441–54. 10.1093/bfgp/elac028
- 21. Bleakley K, Biau G, Vert J-P. Supervised reconstruction of biological networks with local models. Bioinformatics 2007;23:i57–65. 10.1093/bioinformatics/btm204
- 22. Gillani Z, Akash MSH, Matiur Rahaman MD. et al. CompareSVM: supervised, support vector machine (SVM) inference of gene regularity networks. BMC Bioinformatics 2014;15:1–7.
- 23. Razaghi-Moghadam Z, Nikoloski Z. Supervised learning of gene-regulatory networks based on graph distance profiles of transcriptomics data. NPJ Syst Biol Appl 2020;6:21. 10.1038/s41540-020-0140-1
- 24. Daoudi M, Meshoul S. Deep neural network for supervised inference of gene regulatory network. In: Modelling and Implementation of Complex Systems: Proceedings of the 5th International Symposium, MISC 2018, December 16–18, 2018, Laghouat, Algeria, pp. 149–57. Springer, vol. 64. 10.1007/978-3-030-05481-6_11
- 25. Turki T, Wang JTL, Rajikhan I. Inferring gene regulatory networks by combining supervised and unsupervised methods. In: 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 140–5. Anaheim, CA, USA: IEEE (Institute of Electrical and Electronics Engineers), 2016.
- 26. Mao G, Pang Z, Zuo K. et al. Gene regulatory network inference using convolutional neural networks from scRNA-seq data. J Comput Biol 2023;30:619–31. 10.1089/cmb.2022.0355
- 27. Kc K, Li R, Cui F. et al. GNE: a deep learning framework for gene network inference by aggregating biological information. BMC Syst Biol 2019;13:1–14.
- 28. Teji B, Das JK, Roy S. et al. Predicting missing links in gene regulatory networks using network embeddings: a qualitative assessment of selective embedding techniques. In: Intelligent Systems: Proceedings of ICMIB 2021, pp. 143–54. IGIT Sarang; Sarang, Odisha, India: Springer, 2022.
- 29. Teji B, Roy S, Dhami DS. et al. Graph embedding techniques for predicting missing links in biological networks: an empirical evaluation. IEEE Trans Emerg Top Comput 2023;12:190–201.
- 30. Dewey GT, Galas DJ. Gene regulatory networks. In: Madame Curie Bioscience Database [Internet]. Austin, TX, USA: Landes Bioscience, 2013.
- 31. Wang J, Ma A, Ma Q. et al. Inductive inference of gene regulatory network using supervised and semi-supervised graph neural networks. Comput Struct Biotechnol J 2020;18:3335–43. 10.1016/j.csbj.2020.10.022
- 32. Zhang H, An X, He Q. et al. Quadratic graph attention network (q-GAT) for robust construction of gene regulatory networks. arXiv, arXiv:2303.14193. 2023, preprint: not peer reviewed. https://arxiv.org/pdf/2303.14193
- 33. Huang Y, Yu G, Yang Y. MIGGRI: a multi-instance graph neural network model for inferring gene regulatory networks for Drosophila from spatial expression images. PLoS Comput Biol 2023;19:e1011623. 10.1371/journal.pcbi.1011623
- 34. Wang J, Chen Y, Zou Q. Inferring gene regulatory network from single-cell transcriptomes with graph autoencoder model. PLoS Genet 2023;19:e1010942. 10.1371/journal.pgen.1010942
- 35. Li S, Liu Y, Shen L-C. et al. GMFGRN: a matrix factorization and graph neural network approach for gene regulatory network inference. Brief Bioinform 2024;25:bbad529.
- 36. Mao G, Pang Z, Zuo K. et al. Predicting gene regulatory links from single-cell RNA-seq data using graph neural networks. Brief Bioinform 2023;24:1–11. 10.1093/bib/bbad414
- 37. Zhou Z, Wei J, Liu M. et al. AnomalGRN: deciphering single-cell gene regulation network with graph anomaly detection. BMC Biol 2025;23:73. 10.1186/s12915-025-02177-z
- 38. Tian X, Patel Y, Wang Y. Trendy: gene regulatory network inference enhanced by transformer. bioRxiv, 2024, preprint: not peer reviewed. https://www.biorxiv.org/content/10.1101/2024.10.14.618189v1
- 39. Wang Y, Zheng P, Cheng Y-C. et al. Wendy: covariance dynamics based gene regulatory network inference. Math Biosci 2024;377:109284. 10.1016/j.mbs.2024.109284
- 40. Jing X, Zhang A, Liu F. et al. STGRNS: an interpretable transformer-based method for inferring gene regulatory networks from single-cell transcriptomic data. Bioinformatics 2023;39:btad165.
- 41. Shu H, Ding F, Zhou J. et al. Boosting single-cell gene regulatory network reconstruction via bulk-cell transcriptomic data. Brief Bioinform 2022;23:1–12. 10.1093/bib/bbac389
- 42. Chan TE, Stumpf MPH, Babtie AC. Gene regulatory network inference from single-cell data using multivariate information measures. Cell Systems 2017;5:251–267.e3. 10.1016/j.cels.2017.08.014
- 43. Kingma DP, Welling M. Auto-encoding variational bayes. arXiv, arXiv:1312.6114. 2013, preprint: not peer reviewed. https://arxiv.org/pdf/1312.6114
- 44. Grover A, Leskovec J. node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13–17, 2016, pp. 855–64. ACM, 2016.
- 45. Vaswani A, Shazeer N, Parmar N. et al. Attention is all you need. In: Advances in Neural Information Processing Systems (NeurIPS 2017), vol. 30, pp. 5998–6008, 2017.
- 46. Devlin J. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv, arXiv:1810.04805. 2018, preprint: not peer reviewed. https://arxiv.org/pdf/1810.04805
- 47. Ba JL. Layer normalization. arXiv, arXiv:1607.06450. 2016, preprint: not peer reviewed. https://arxiv.org/pdf/1607.06450
- 48. Ioffe S. Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv, arXiv:1502.03167. 2015, preprint: not peer reviewed. https://arxiv.org/pdf/1502.03167
- 49. Yuan Y, Bar-Joseph Z. Deep learning for inferring gene relationships from single-cell expression data. Proc Natl Acad Sci USA 2019;116:27151–8. 10.1073/pnas.1911536116
- 50. Camp JG, Sekine K, Gerber T. et al. Multilineage communication regulates human liver bud development from pluripotency. Nature 2017;546:533–8. 10.1038/nature22796
- 51. Pratapa A, Jalihal AP, Law JN. et al. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods 2020;17:147–54. 10.1038/s41592-019-0690-6
- 52. Schaffter T, Marbach D, Floreano D. GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods. Bioinformatics 2011;27:2263–70. 10.1093/bioinformatics/btr373
- 53. Chen G, Liu Z-P. Graph attention network for link prediction of gene regulations from single-cell RNA-sequencing data. Bioinformatics 2022;38:4522–9. 10.1093/bioinformatics/btac559
- 54. de Matos Simoes R, Emmert-Streib F. Bagging statistical network inference from large-scale gene expression data. PLoS One 2012;7:e33624. 10.1371/journal.pone.0033624
- 55. de Matos Simoes R, Emmert-Streib F. Influence of statistical estimators of mutual information and data heterogeneity on the inference of gene regulatory networks. PLoS One 2011;6:e29279. 10.1371/journal.pone.0029279
- 56. Altay G, Emmert-Streib F. Inferring the conservative causal core of gene regulatory networks. BMC Syst Biol 2010;4:1–13. 10.1186/1752-0509-4-132
- 57. Moerman T, Santos SA, González-Blas CB. et al. GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks. Bioinformatics 2019;35:2159–61. 10.1093/bioinformatics/bty916
- 58. Huynh-Thu VA, Irrthum A, Wehenkel L. et al. Inferring regulatory networks from expression data using tree-based methods. PLoS One 2010;5:e12776. 10.1371/journal.pone.0012776
- 59. Teji B, Roy S, Guzzi PH. et al. Application of generative graph models in biological network regeneration: a selective review and qualitative analysis. In: 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 5786–92. Lisbon, Portugal: IEEE, 2024.
- 60. Wolf FA, Angerer P, Theis FJ. Scanpy: large-scale single-cell gene expression data analysis. Genome Biol 2018;19:1–5.
- 61. Zhang Q, Liu W, Zhang H-M. et al. hTFtarget: a comprehensive database for regulations of human transcription factors and their targets. Genom Proteom Bioinformat 2020;18:120–8. 10.1016/j.gpb.2019.09.006
- 62. Van Dijk D, Sharma R, Nainys J. et al. Recovering gene interactions from single-cell data using data diffusion. Cell 2018;174:716–729.e27. 10.1016/j.cell.2018.05.061
- 63. Traag VA, Waltman L, Van Eck NJ. From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep 2019;9:1–12.
Data Availability Statement
All data and code used in this study are publicly available. The source code for GT-GRN can be accessed at https://github.com/Netralab/GT-GRN.
The datasets used in our experiments are available at the following locations: