PLOS Computational Biology
2025 Oct 24;21(10):e1013603. doi: 10.1371/journal.pcbi.1013603

Improved gene regulatory network inference from single cell data with dropout augmentation

Hao Zhu 1,*, Donna K Slonim 1,*
Editor: Saurabh Sinha2
PMCID: PMC12574904  PMID: 41134910

Abstract

A major challenge in working with single-cell RNA sequencing data is the prevalence of “dropout,” when some transcripts’ expression values are erroneously not captured. Addressing this issue, which produces zero-inflated count data, is crucial for many downstream data analyses, including the inference of gene regulatory networks (GRNs). In this paper, we introduce two novel contributions. First, we propose Dropout Augmentation (DA), a simple but effective model regularization method that improves resilience to zero inflation in single-cell data by augmenting the data with synthetic dropout events. DA offers a new perspective on solving the “dropout” problem beyond imputation. Second, we present DAZZLE, a stabilized and robust version of the autoencoder-based structural equation model for GRN inference using the DA concept. Benchmark experiments illustrate the improved performance and increased stability of the proposed DAZZLE model over existing approaches. The practical application of the DAZZLE model to a longitudinal mouse microglia dataset containing over 15,000 genes illustrates its ability to handle real-world single-cell data with minimal gene filtration. The improved robustness and stability of DAZZLE make it a practical and valuable addition to the toolkit for GRN inference from single-cell data. Finally, we propose that Dropout Augmentation may have wider applications beyond the GRN-inference problem. Project website: https://bcb.cs.tufts.edu/DAZZLE.

Author summary

The prevalence of false zeros in single-cell data, or “dropout,” affects many downstream analyses. A common approach is to eliminate these zeros through data imputation. We propose an alternative solution that focuses on regularizing the model and increasing model robustness against dropout noise. Counter-intuitively, this is done by augmenting the input data with a small number of zeros to simulate additional dropout noise. Validation is performed on the task of gene regulatory network inference. Our proposed model, DAZZLE, which uses the dropout augmentation idea, shows improved performance and robustness.

1. Introduction

Gene Regulatory Network (GRN) inference from expression data offers a contextual model of the interactions between genes in vivo [1–3]. Understanding these interactions is crucial for gaining insight into development, pathology, and key points of regulation that may be amenable to therapeutic intervention [4].

While GRN inference from bulk transcriptomic data has a long history, many recent studies consider the contextual specificity offered by single-cell RNA sequence data (scRNA-seq) [5]. Single cell RNA sequencing allows researchers to analyze transcriptomic profiles of individual cells, providing a more detailed and accurate view of cellular diversity than traditional bulk methods. However, opportunities come with challenges. A recent benchmark paper on GRN inference summarized the major issues in single-cell data that cause challenges for GRN inference: cellular diversity, inter-cell variation in sequencing depth, cell-cycle problems, and sparsity due to dropout [6].

Despite these challenges, many methods have been proposed for context-specific GRN inference from single-cell RNA-sequencing data alone. Among established methods, GENIE3 [7] and GRNBoost2 [8] are tree-based approaches, initially proposed for bulk data, that have been found to work well on single-cell data without modification. LEAP [9] estimates pseudotime to infer gene co-expression over several lagged windows, suggesting that the lags can be used to infer regulatory relationships. SCODE [10] and SINGE [11] apply a similar pseudotime idea, combined with ordinary differential equations (ODEs) and Granger causality ensembles, to model the results. PIDC uses partial information decomposition to incorporate mutual information among sets of genes, modeling cellular heterogeneity [12].

Other methods infer GRNs by integrating transcriptomic and other data sources. For example, SCENIC [13] starts by identifying gene co-expression modules using GENIE3/GRNBoost2, followed by identifying key transcription factors (TFs) that regulate these modules or regulons. scMTNI [14] studies GRNs in different cell clusters using a multi-task learning framework. GRNUlar [15] uses recently developed unrolled algorithms to infer undirected GRNs from single-cell data by incorporating TF information. NetREX-CF [16] performs optimizations based on prior GRN networks and uses collaborative filtering to address the incompleteness of prior data. PANDA [17] further optimizes prior GRN networks using message passing. However, single-cell data alone is much more widely available and accessible in specific contexts than integrated multi-omic data sets.

The application of neural networks (NNs) in the analysis of single-cell data has advanced rapidly in the last couple of years. One of the leading NN-based GRN inference methods, DeepSEM [18], parameterizes the adjacency matrix and uses a variational autoencoder (VAE) [19] architecture optimized on reconstruction error. In fact, on the BEELINE benchmarks [6], where the “right” networks are (approximately) known, DeepSEM reports better performance than other methods and runs significantly faster than most.

However, as shown later in this paper, one of the issues with DeepSEM is that as training continues, the quality of the inferred networks may degrade quickly. A possible explanation is that soon after the model converges, it may begin to over-fit the dropout noise in the data.

Single-cell data is often characterized by an excessive number of zero expression counts, referred to as “zero-inflation.” For example, in the nine data sets examined in [20], 57 to 92 percent of the observed counts are zeros. Among these zero values, “dropout” describes the situation in which transcripts, often those with low or moderate expression in a cell, are not counted by the sequencing technology. Later droplet-based protocols, such as inDrops [21] and 10X Genomics Chromium [22], helped improve detection rates. However, the “dropout” problem still persists, as even recent methods have relatively low sensitivity [23,24].

Therefore, there has been research into data imputation methods for use in single-cell analysis. Several methods have been proposed to identify and replace missing data with imputed values [23–27]. Yet many of these methods depend on restrictive assumptions, and some require additional information, such as known GRNs or bulk transcriptomic data.

In this paper, we introduce two novel contributions to the fields of single-cell analysis and GRN inference. First, we propose “dropout augmentation” (DA), a novel approach to mitigate the impact of the zero-inflation problem by augmenting the data with a small amount of simulated dropout noise. We found that this idea, while seemingly counter-intuitive, can effectively regularize models so that they remain robust against dropout noise.

It has long been known that adding noise to the input data during training can improve the robustness, and sometimes even the performance, of many machine learning models. Bishop first pointed out that adding noise is equivalent to Tikhonov regularization [28]. Hinton further introduced the idea of applying random “dropout” to either the input or the model parameters to improve training performance [29]. The theoretical foundations of DA thus rest on well-established results.

Our second contribution is the DAZZLE model, or Dropout Augmentation for Zero-inflated Learning Enhancement. DAZZLE uses the same VAE-based GRN learning framework introduced by DeepSEM and DAG-GNN [18,30], but it employs dropout augmentation and several other model modifications. These include a new sparsity control strategy for the adjacency matrix, a simplified model structure, and a closed-form prior. Compared to DeepSEM, DAZZLE shows better model stability and robustness in our benchmark experiments. We further illustrate how DAZZLE’s network inference facilitates interpreting typical-sized data sets efficiently, in this case explaining microglial expression dynamics across the mouse lifespan. (This noise augmentation concept has been further developed in our RegDiffusion software [31], which relies instead on a diffusion-based learning framework.)

2. Results

2.1. Dropout augmentation and the DAZZLE model

GRN inference in DAZZLE is based on the structural equation model (SEM) framework previously employed by DAG-GNN and DeepSEM [18,30]. The input of the model is the gene expression matrix representing the scRNA-seq data, where each raw count x is transformed to log(x+1) to reduce the variance and avoid taking the log of zero. We assume that the rows of the input matrix represent cells and the columns represent genes. The adjacency matrix A is parameterized and used on both sides of an autoencoder, as shown in Fig 1. The model is trained to reconstruct the input, and the weights of the trained adjacency matrix are retrieved as a by-product of training. Since ground truth networks are never available to the training model, this type of SEM model should be considered an unsupervised learning method for GRN inference. We include a detailed explanation of why the learned adjacency matrix A represents the underlying GRN, as well as other methodological details for DA and DAZZLE, in the Methods section.
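A minimal sketch of this preprocessing step (the function name is ours, not taken from the DAZZLE codebase): the log(x+1) transform maps raw counts to log space while leaving zeros at zero.

```python
import numpy as np

def preprocess_counts(raw_counts):
    """Log-transform raw scRNA-seq counts as log(x + 1).

    `raw_counts` is a cells x genes array of non-negative counts.
    The +1 offset avoids log(0) and compresses the variance of
    very large counts, as described in the text.
    """
    return np.log1p(raw_counts.astype(float))

# rows = cells, columns = genes
X = preprocess_counts(np.array([[0, 1, 99], [3, 0, 0]]))
```

Note that zero counts stay exactly zero under this transform, which is why zero-inflation in the raw data carries straight through to the model input.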

Fig 1. One of the major differences between DAZZLE and DeepSEM is the use of Dropout Augmentation.


Dropout augmentation regularizes model training by simulating small amounts of random dropout at each training iteration such that the model is protected against the negative impact of dropout noise. Rounded boxes indicate trainable model parameters.

One unique design aspect that differentiates DAZZLE from DeepSEM is the use of DA, as shown in Fig 1. As a model regularization method, DA can be applied to any model design that is continuously optimized. At each training iteration, we introduce a small amount of simulated dropout noise by sampling a proportion of the expression values and setting them to zero. Over multiple training iterations, the model is exposed to multiple versions of the same data with slightly different batches of dropout noise. As a result, it is less likely to over-fit any particular batch.

DAZZLE also includes a noise classifier to predict the chance that each zero is an augmented dropout value; this classifier is trained together with the autoencoder. Since we generate the locations of the augmented dropout ourselves, we can confidently use them as labels for training. The purpose of this classifier is to move the values that are more likely to be dropout noise to a similar region in the latent space Z, so that the decoder will learn to put less weight on them when reconstructing the input data.

We made several additional model design and training choices that further distinguish our model from that of DeepSEM. First, we improved the stability of the model by delaying the introduction of the sparse loss term by a customizable number of epochs. Another difference is that, to estimate the prior, DeepSEM estimates a separate latent variable, while DAZZLE uses a closed-form Normal distribution. These changes lead to reduced model sizes and computational time. For example, to process the BEELINE-hESC dataset with 1,410 genes, the original DeepSEM implementation used 2,584,205 parameters and ran in 49.6 seconds (wall-clock time) on an H100 GPU. By eliminating some unnecessary calculations, our DAZZLE implementation reduces the model to 2,022,030 parameters (a 21.7% reduction) without changing the size of the hidden layers. On the same device, our implementation finished inference in 24.4 seconds (a 50.8% reduction in running time). Finally, while DeepSEM is trained with two separate optimizers in an alternating manner (one on the adjacency matrix and the other on the rest of the neural networks), DAZZLE is trained using a single optimizer with different learning rates. This improvement helps DAZZLE stay modular, so it can be integrated with other network components more easily in the future.
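The single-optimizer design can be sketched with a plain SGD step that applies a different learning rate to each parameter group. This is an illustrative simplification, not the DAZZLE implementation (which uses a standard deep-learning optimizer with parameter groups); the names and learning-rate values here are examples only.

```python
import numpy as np

def sgd_step(params, grads, lrs):
    """One update of a single optimizer with per-group learning rates:
    the adjacency matrix ("adj") and the rest of the network ("nn")
    are updated together, each with its own step size."""
    return {name: params[name] - lrs[name] * grads[name] for name in params}

params = {"adj": np.ones((3, 3)), "nn": np.ones(4)}
grads = {"adj": np.full((3, 3), 0.5), "nn": np.full(4, 0.5)}
lrs = {"adj": 2e-5, "nn": 1e-4}  # adjacency matrix updated more slowly
new_params = sgd_step(params, grads, lrs)
```

Keeping everything under one optimizer, rather than alternating between two, is what lets the adjacency matrix be treated as just another parameter group that other network components can share.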

2.2. DAZZLE improves GRN inference on BEELINE benchmarks

We performed a benchmark comparison of DAZZLE to DeepSEM [18], GENIE3 [7], GRNBoost2 [8], and PIDC [12] on the BEELINE single cell benchmark, which includes seven datasets (two from human and five from mouse) and three sets of ground truth networks. The details of the BEELINE benchmarks are described in the Methods section. We chose to compare to GENIE3 and GRNBoost2 as representatives of non-deep-learning methods because these two decision-tree-based methods are among the most widely used GRN inference algorithms. PIDC is another strong baseline method using mutual information on triplets of genes. Recent benchmarks on single cell data [32,6] also showed that these methods generally outperformed SCODE [10], ppcor [33], and SINCERITIES [34]. We also included DeepSEM, the model that inspired this work and the previous front-runner.

Note that a single run of DeepSEM is not stable (a point that we discuss further in later Results sections). Thus, DeepSEM was proposed as an ensemble algorithm, combining results from a set of 10 runs. Here, we include both single runs (1x) and the ensemble model (10x) in the comparison for both DAZZLE and DeepSEM. For a fair comparison, both DAZZLE and DeepSEM use the same size neural networks, and both are trained for 120 iterations as suggested by the DeepSEM paper. The hyperparameter settings are identical except for the changes already mentioned in Sect 2.1 (see also the comparison of hyperparameters in Sect 2 of S1 Text).
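One plausible reading of the 10x ensemble strategy is to average edge-weight magnitudes across independent runs; the sketch below shows that idea (the papers' exact combination rule may differ, and `ensemble_grn` is our name).

```python
import numpy as np

def ensemble_grn(adjacency_list):
    """Combine adjacency matrices from independent runs into a
    consensus network by averaging absolute edge weights, so that
    edges recovered consistently across runs rank highest."""
    return np.mean([np.abs(a) for a in adjacency_list], axis=0)

# ten mock single-run adjacency matrices for a 5-gene toy network
runs = [np.random.default_rng(seed).normal(size=(5, 5)) for seed in range(10)]
consensus = ensemble_grn(runs)
```

Averaging over runs trades a tenfold increase in compute for reduced variance, which matches the 10x-vs-1x behavior reported in Table 1.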

The main benchmark results are provided in Table 1. The numbers reported are the average area under the precision-recall curve (AUPRC) ratios over 10 runs, where higher values represent better performance. In Table 1, the evaluation is done separately for the STRING network, the Non-celltype-specific ChIP-Seq network, and the Celltype-specific ChIP-Seq network. Note that for celltype-specific ChIP-Seq, we followed the recommendations of the DeepSEM paper and applied a very small L1 sparsity regularization to the adjacency matrix for better performance.
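The AUPRC ratio divides a method's AUPRC by the AUPRC expected from a random predictor, which equals the fraction of true edges. A hypothetical sketch of the metric (our own implementation of average precision, not BEELINE's evaluation code):

```python
import numpy as np

def auprc_ratio(scores, labels):
    """AUPRC divided by the random-predictor baseline (edge prevalence).

    `scores` are predicted edge weights and `labels` are binary
    ground-truth indicators, both flattened to 1-D arrays.
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    y = np.asarray(labels)[order]
    tp = np.cumsum(y)
    precision = tp / np.arange(1, len(y) + 1)
    # average precision: mean of precision at each true-positive hit
    auprc = np.sum(precision * y) / y.sum()
    return auprc / y.mean()       # ratio over the random baseline

# a predictor that ranks all true edges first reaches 1 / prevalence
ratio = auprc_ratio([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
```

With half the candidate edges true, a perfect ranking yields a ratio of 2.0; a ratio near 1.0 indicates performance close to random guessing.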

Table 1. DAZZLE shows improved GRN inference capacity on BEELINE benchmarks.

hESC hHep mDC mESC mHSC-E mHSC-GM mHSC-L
# of Genes 1410 1448 1321 1620 1204 1132 692
# of Cells 758 425 383 421 1071 889 847
Ground Truth: STRING
# of True Edges 5,149 9,000 5,898 8,479 1,826 1,311 154
GENIE3 1.98 (0.01) 1.86 (0.01) 1.72 (0.01) 2.05 (0.01) 4.19 (0.05) 6.27 (0.04) 7.02 (0.08)
GRNBoost2 1.67 (0.01) 1.50 (0.01) 1.43 (0.01) 1.88 (0.02) 3.65 (0.05) 5.16 (0.11) 7.07 (0.08)
PIDC 2.01 (0.00) 1.90 (0.00) 1.58 (0.00) 2.06 (0.00) 5.09 (0.00) 6.27 (0.00) 6.72 (0.00)
DeepSEM-1x 2.00 (0.06) 1.66 (0.04) 1.55 (0.06) 2.20 (0.02) 5.27 (0.16) 5.93 (0.38) 7.28 (0.60)
DeepSEM-10x 2.10 (0.03) 1.82 (0.05) 1.68 (0.02) 2.33 (0.03) 5.68 (0.11) 6.91 (0.12) 7.47 (0.14)
DAZZLE-1x 2.39 (0.05) 1.80 (0.02) 1.58 (0.03) 2.27 (0.04) 5.87 (0.21) 6.56 (0.10) 7.50 (0.07)
DAZZLE-10x 2.44 (0.02) 1.82 (0.01) 1.62 (0.01) 2.34 (0.03) 6.07 (0.03) 6.63 (0.04) 7.53 (0.05)
Ground Truth: Non-celltype-specific ChIP-Seq
# of True Edges 4,597 5,335 3,918 8,030 1,960 1,358 317
GENIE3 0.97 (0.00) 1.01 (0.01) 1.56 (0.01) 1.65 (0.01) 2.54 (0.02) 3.56 (0.03) 2.78 (0.04)
GRNBoost2 1.01 (0.01) 1.09 (0.01) 1.36 (0.02) 1.49 (0.02) 2.39 (0.04) 2.84 (0.03) 2.45 (0.05)
PIDC 1.13 (0.00) 1.20 (0.00) 1.65 (0.00) 1.42 (0.00) 2.65 (0.00) 3.35 (0.00) 2.91 (0.00)
DeepSEM-1x 1.22 (0.05) 1.29 (0.08) 1.73 (0.07) 1.58 (0.05) 2.94 (0.32) 3.27 (0.15) 2.50 (0.56)
DeepSEM-10x 1.24 (0.01) 1.41 (0.03) 1.93 (0.02) 1.66 (0.02) 3.18 (0.06) 3.48 (0.12) 3.01 (0.20)
DAZZLE-1x 1.27 (0.05) 1.44 (0.04) 1.85 (0.04) 1.62 (0.05) 3.30 (0.05) 3.45 (0.06) 3.00 (0.12)
DAZZLE-10x 1.29 (0.01) 1.44 (0.01) 1.89 (0.02) 1.64 (0.01) 3.35 (0.02) 3.51 (0.03) 3.21 (0.11)
Ground Truth: Celltype-specific ChIP-Seq
# of True Edges 7,050 15,410 1,193 42,795 21,975 14,135 5,180
GENIE3 0.96 (0.00) 1.10 (0.00) 1.01 (0.01) 1.06 (0.00) 0.99 (0.00) 0.98 (0.00) 1.05 (0.00)
GRNBoost2 1.01 (0.00) 1.02 (0.00) 1.00 (0.01) 1.02 (0.00) 0.99 (0.00) 0.99 (0.00) 1.01 (0.00)
PIDC 0.98 (0.00) 1.03 (0.00) 1.04 (0.00) 1.02 (0.00) 0.95 (0.00) 0.96 (0.00) 0.99 (0.00)
DeepSEM-1x 1.06 (0.02) 1.03 (0.01) 1.04 (0.03) 1.03 (0.01) 1.01 (0.00) 1.02 (0.01) 1.06 (0.01)
DeepSEM-10x 1.07 (0.01) 1.03 (0.00) 1.05 (0.01) 1.03 (0.00) 1.01 (0.00) 1.02 (0.00) 1.06 (0.00)
DAZZLE-1x 1.10 (0.02) 1.02 (0.01) 1.03 (0.03) 1.03 (0.01) 1.01 (0.01) 1.01 (0.01) 1.08 (0.01)
DAZZLE-10x 1.10 (0.01) 1.03 (0.00) 1.05 (0.01) 1.03 (0.01) 1.01 (0.00) 1.08 (0.00) 1.08 (0.00)

Metric: AUPRC Ratio; Number of target genes: 1000.

Numbers reported are the mean and standard deviation of the AUPRC ratio relative to random guessing, over 10 runs. Higher ratios indicate better performance. Here, italicized text in dark shading indicates the best algorithm and lightly shaded cells indicate the 2nd-best algorithms. In this experiment, we ran each algorithm on all available data with the default settings. The 10x models ensemble the inferred networks from 10 runs, while the 1x models are single-run models.

Further explanations of the model improvements are discussed in Sects 2.3 and 2.4.

Among all the evaluations in Table 1, the ensemble version of DAZZLE (DAZZLE-10x) has the best performance of all methods in over half the cases, and in all other cases it ranks either 2nd or 3rd, with a result within 6% of the top score. The DAZZLE-10x results also typically have lower variance than the DeepSEM-10x results, illustrating the stability of the DAZZLE model. Further, across nearly all benchmarks, a single run of DAZZLE, DAZZLE-1x, has a higher AUPRC ratio than the comparable single-run DeepSEM-1x. DAZZLE-1x is typically also more stable: some DeepSEM-1x results in Table 1 have substantial variance, but this unwanted model behavior is eliminated in DAZZLE-1x; the reasons are explained in Sect 2.4. We see similar findings when we use the Early Precision Ratio (EPR) or the Area Under the Receiver Operating Characteristic curve (AUROC) as the evaluation metric (Supplement Tables B and C in S1 Text).

For the cell-type specific data, the highest observed AUPRC ratio for any data set or method is 1.10, and the highest observed EPR is 1.20 (just slightly better than random performance), suggesting that for all methods, reproducing these cell-specific networks is nearly impossible. A possible explanation of this effect appears in S2 Fig, which shows that the results are highly variable across training iterations. We confirmed that using very low L1 regularization yields better performance on this ground truth network; the reason for this is still unclear.

In summary, under similar training conditions, DAZZLE outperforms DeepSEM and demonstrates more stable performance. We found that while a single run of DeepSEM can produce unstable results, the ensemble strategy proposed in the DeepSEM paper significantly enhances its performance and reduces variance. However, this improvement comes at the cost of a tenfold increase in execution time. For DAZZLE, the ensemble strategy also yields excellent results, making DAZZLE-10x the best performer in our comparison. Note, however, that the performance of DAZZLE-1x itself approaches that of an ensemble method, in essence because it functions as one, as described in the Methods section below. For datasets with a large number of genes and cells, a single pass of DeepSEM-1x or DAZZLE-1x can take several hours to compute, even on modern GPUs. In such cases, a single pass of DAZZLE-1x offers a sufficiently good solution at a more reasonable cost.

2.3. Dropout augmentation contributes to model robustness

In this section, we discuss the effects of dropout augmentation alone on GRN inference. S2 Fig shows the quality of the inferred networks from DeepSEM-1x as a function of the number of training steps. Specifically, it shows that the quality of the inferred networks from DeepSEM tends to drop quickly if the model is over-trained, suggesting that the model may be overfitting to unwanted patterns in the training data. One possible solution is to stop training early, as suggested in the DeepSEM paper, which ends model training at 120 iterations. The tricky part is that the point of peak performance is very difficult to predict and may depend highly on the number of genes and cells in the dataset. In practice, when ground-truth networks are not available, it is very difficult to identify a good convergence point for the model. It would be ideal if inferred GRN accuracy remained robust, or at least if it dropped slowly and consistently after the performance peak, so that the cost of picking a sub-optimal stopping point would be low. We believe that dropout augmentation is potentially an effective solution to this problem.

For fairness, all the comparisons shown in Fig 2 were performed with DAZZLE-1x. We varied only the percentage of augmented dropout values while keeping all the other hyper-parameters the same. Without DA (thick black lines, representing a DA probability of 0%), a common pattern is that the AUPRC ratios drop after the performance peaks, similar to what we observed with the DeepSEM method (S2 Fig). However, when we train the model with some amount of DA, in most cases the AUPRC ratios either stay flat or decrease at a slower pace. For some data sets, such as hESC and hHep, DA also improves accuracy significantly. However, we observed that different datasets and ground truth networks seem to require different optimal levels of dropout augmentation. For hESC and hHep on the STRING network, 20% dropout augmentation yields the best AUPRC scores, but that amount of DA is too high for mHSC-E, mHSC-GM, and mHSC-L, since it either slows down the model’s convergence or leads to lower accuracy. After reviewing all cases, we recommend using a 10% DA probability as the default for DAZZLE.

Fig 2. Appropriate amount of augmented dropout helps maintain model robustness and may contribute to better performance.


Color reflects the probability of dropout augmentation. The two thick lines represent two important conditions, 0% - no augmented dropout, and 10% - the default dropout augmentation level we recommend. Dashed lines show the default number of training iterations used in DeepSEM and DAZZLE.

To further study the actual impact of DA, we conducted a controlled ablation study on the BEELINE benchmarks (with STRING as ground truth) by training variants of DAZZLE-1x (with different DA probabilities) and DeepSEM-1x under identical hyperparameter settings (NN learning rate = 1×10⁻⁴, adjacency matrix learning rate = 2×10⁻⁵, batch size = 64). In all cases, we zeroed out a certain proportion of data points before the models saw the data to simulate background dropout noise. As shown in S1 Fig, DAZZLE-1x maintains its performance advantages over DeepSEM-1x in nearly all cases. In many cases, such as hESC, hHep, mDC, mHSC-E, mHSC-GM, and mHSC-L, DAZZLE-1x running on data with 10% additional background dropout noise performs better than DeepSEM-1x running on the full data set, demonstrating DAZZLE’s resilience with zero-inflated data.

2.4. Sparsity control strategy improves DAZZLE’s stability

In addition to increased robustness, DAZZLE also generates more stable GRN inference results thanks to its improved sparsity control strategy. As mentioned in Sect 2.2, results from DeepSEM-1x are not stable. Fig 3 examines this issue using 100 runs of DeepSEM-1x and DAZZLE-1x on the hESC dataset evaluated using the STRING network. In Fig 3a, we show histograms of the AUPRC ratios for both methods. The accuracy of the 100 runs of DeepSEM-1x can be separated into two groups. As shown in S3 Fig, other benchmark data sets produced similar results. Further investigation suggested that the L1 regularization on the adjacency matrix might be the main cause of those less ideal results.

Fig 3. Comparison of 100 runs of DeepSEM-1x and DAZZLE-1x on hESC evaluated using the STRING network.


In DeepSEM-1x, the prioritization of the L1 sparsity control at an early stage is the main cause of unstable GRN inference performance. DAZZLE solves this issue by delaying the introduction of the L1 sparse loss by 5 iterations. a) Histogram of AUPRC ratios. b) Sparse loss over time for DeepSEM-1x, colored by AUPRC ratio at convergence. c) Sparse loss over time for DAZZLE-1x, colored by AUPRC ratio at convergence.

Both DAZZLE and DeepSEM are trained with L1 regularization on the adjacency matrix. This regularization term ensures that the model doesn’t add unnecessary weights to the adjacency matrix; it also helps keep the matrix (I − A), which the decoder must invert, non-singular. Experiments have shown that a large enough coefficient for this sparse loss is required to generate a meaningful adjacency matrix prediction in GRN inference. However, as shown in Fig 3b, for DeepSEM-1x, where this L1 sparse loss is introduced at the very beginning of the training process, the optimization of this loss may be prioritized such that the values in the adjacency matrix quickly drop to near zero within the first several training steps. Most runs exhibiting this behavior end up in the low-performance group. In DAZZLE, we propose a simple solution to overcome this limitation by delaying the introduction of the L1 sparse loss by a few training steps. As shown in Fig 3c, after 5 steps of training without the L1 sparse loss, the values and the gradients of the adjacency matrix parameters are stabilized sufficiently that the introduction of the sparse loss is less likely to destabilize the training process. We also see similar findings on other benchmark datasets; detailed results are included in S3 Fig.
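The delayed-sparsity schedule can be sketched as a loss whose L1 coefficient is zero for the first few steps and constant thereafter; the delay of 5 steps comes from the text, while the coefficient value here is illustrative only.

```python
def sparse_loss_weight(step, delay=5, alpha=1e-2):
    """L1 coefficient schedule: zero during the first `delay` training
    steps, then a constant alpha, so the adjacency matrix stabilizes
    before sparsity pressure is applied."""
    return 0.0 if step < delay else alpha

def total_loss(recon_loss, l1_of_adjacency, step):
    """Reconstruction loss plus the (possibly delayed) sparse loss."""
    return recon_loss + sparse_loss_weight(step) * l1_of_adjacency
```

Because the coefficient is exactly zero early on, the first gradient steps are driven purely by reconstruction, avoiding the early collapse of adjacency weights observed in Fig 3b.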

2.5. Case study: Using DAZZLE to predict temporal changes in GRNs in mouse microglia

To test the effectiveness of running DAZZLE on more typical-sized sets of single-cell data, we applied the method to published data characterizing mouse microglia at different developmental stages [35]. For each time point, DAZZLE inferred an adjacency matrix for all the input genes. All edges whose weights have an absolute value over 0.001 are extracted and analyzed.
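A hypothetical sketch of this edge-extraction step (the function name and output format are ours; only the 0.001 magnitude threshold comes from the text):

```python
import numpy as np

def extract_edges(adjacency, gene_names, threshold=1e-3):
    """List (regulator, target, weight) for every off-diagonal edge
    whose absolute weight exceeds the threshold, strongest first."""
    edges = []
    m = adjacency.shape[0]
    for i in range(m):
        for j in range(m):
            if i != j and abs(adjacency[i, j]) > threshold:
                edges.append((gene_names[i], gene_names[j], adjacency[i, j]))
    return sorted(edges, key=lambda e: -abs(e[2]))

# toy 2-gene example: one strong edge kept, one sub-threshold edge dropped
A = np.array([[0.0, 0.5], [0.0002, 0.0]])
edges = extract_edges(A, ["Tmem119", "Apoe"])
```

On real DAZZLE output this filtering is what reduces a dense m×m weight matrix to the sparse networks analyzed in Fig 4.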

For this case study, we focused on validation using literature evidence, since our goal was to confirm those context-specific regulatory interactions. General-purpose regulation databases, such as TRRUST v2, are invaluable for general quantitative benchmarking (as described in Sect 3.3.1 and Table 1), but literature curation is better suited for verifying if the discovered links are relevant to the specific cell types and conditions analyzed here.

These GRN inference results confirm that gene regulation is a dynamic process that changes across the lifespan. Fig 4a lists the top ten regulated genes at each time point, ranked by the summed edge weights of regulating relationships on each gene. At the earliest life stages, most of the top regulated genes are associated with cell proliferation and differentiation. For example, Tuba1a (Tubulin alpha 1a) encodes proteins in microtubules, which form the mitotic spindle for cell division and motility structures that move cells to their correct positions. At later ages, however, we see more regulation of immune response genes. Many of these genes, such as Tmem176B (transmembrane protein 176B), H2.D1 (histocompatibility 2, D region locus 1), and PISD (phosphatidylserine decarboxylase), encode key proteins, receptors, and enzymes of microglial immune response.

Fig 4. a. Top 10 regulated genes at each life stage in mouse microglia.


b. Predicted local networks around Tmem119 and Apoe in mouse microglia. Here, edge weights are min-max scaled at each time point. Top genes are selected according to their maximum weight across all time points.

Fig 4b shows a closer view of two specific genes of interest. First, since Tmem119 (transmembrane protein 119) is often used as a biomarker to differentiate microglia from other immune cells in the brain [36], we chose it as the center of the local network to analyze. Fig 4b shows that regulation of and by Tmem119 only becomes heavily active starting in the late juvenile stage. Seven of the top 10 predicted regulators, C1qc, Cd81, Cx3cr1, Hexb, Lgmn, P2ry12, and Selplg, are commonly considered to be part of the microglial transcriptional “signature,” as they are generally expressed at low levels in other immune cells [37–41]. In addition, Csf1r has recently been identified as a regulator of pathogenesis for microglia and macrophages [42]. It is reasonable to hypothesize that this surrounding local neighborhood includes much of the core functionality of healthy microglia.

Apoe (apolipoprotein E) is another well-studied gene that encodes a protein playing a central role in lipid metabolism, neurobiology, and neurodegenerative diseases. Unlike Tmem119, Apoe is highly regulated at embryonic day E14. As the mice mature, the relative impact of Apoe drops, but it increases again in old age. The list of top predicted links by DAZZLE is consistent with recent studies showing the variety of Apoe’s roles in cellular activities. For example, at E14, the top three genes regulating Apoe are Ftl1 (ferritin light polypeptide 1), Tmsb4x (thymosin beta 4 X-linked), and Itm2b (integral membrane protein 2B). Although the connection between Apoe and Ftl1 is not well established, a recent study [43] shows that Apoe deficiency leads to increased iron levels. Another study [44] suggests iron loading is a prominent marker of activated microglia in Alzheimer’s disease patients. Further, both genes are located on mouse chromosome 7, about 20cM apart. This evidence suggests a plausible regulatory connection between Ftl1 and Apoe and further reflects the important role of iron in early brain development.

Fig 4 illustrates DAZZLE’s predicted regulation patterns for these two typical, well-expressed genes. However, we have found that DAZZLE’s regulatory predictions make sense even for genes whose overall expression levels are low. (See, e.g., comparable images for Ifit3 (Interferon Induced Protein With Tetratricopeptide Repeats 3) on the project web site https://bcb.cs.tufts.edu/DAZZLE/hammond.html.)

Another molecule worth mentioning here is Malat1 (Metastasis Associated Lung Adenocarcinoma Transcript 1), which appears in our predicted networks as one of the top regulators for many microglia core genes, including (Fig 4) Tmem119, Apoe, and H2.D1. As a long non-coding RNA (lncRNA), Malat1 has been identified in many pathological processes with immunological components, including several types of cancer [45] and diabetes [46]. It has also been identified as a key regulator in the microglial inflammatory response [47,48], but beyond that its function in microglia is mostly unexplored.

Overall, our analysis of this data set confirms that DAZZLE can handle typical real-world single-cell data with more than 10,000 genes and thousands of cells. On examination, the predicted networks appear generally consistent with current research on gene regulation in microglia. Beyond previously identified links, DAZZLE also suggests novel yet plausible links that may be confirmed through future experiments.

3. Materials and methods

3.1. Dropout augmentation

Previous research suggests that zero values in single cell data include both biologically real zeros, corresponding to truly absent genes, and random dropout events. A successful single-cell model should remain robust regardless of how dropout values are distributed. This idea informs the dropout augmentation algorithm.

Let X ∈ ℝ^{n×m} be a gene expression matrix from a single-cell experiment, where n is the number of cells and m is the number of genes. We randomly sample a proportion p of the data points and temporarily replace these values with zeros. Equivalently, the augmented data can be treated as the sum of the original expression data X and a dropout noise term E, where E is the Hadamard product of −X and a mask obtained by Bernoulli sampling. Denoting the dropout augmentation mask by M_DA and the probability of augmented dropout by p, we can write the dropout noise term E and the augmented data X̃ as follows:

E = −X ⊙ M_DA, where (M_DA)_ij ~ Bernoulli(p), (1)
X̃ = X + E. (2)

In theory, the dropout augmentation algorithm could be applied to any iterative learning algorithm, such as expectation maximization, gradient boosting, or neural networks. As shown in Fig 1, M_DA is re-sampled at every training step, so the augmented data X̃ changes in every iteration. During the entire training process, the model is only exposed to the altered X̃ instead of X.
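For concreteness, the augmentation step can be sketched in a few lines of NumPy. This is an illustrative sketch, not the released DAZZLE implementation; the function name `dropout_augment` is ours.

```python
import numpy as np

def dropout_augment(X, p=0.1, rng=None):
    """Corrupt X (cells x genes) with synthetic dropout, following Eqs 1-2.

    A fresh Bernoulli(p) mask M_DA is sampled on every call, so each
    training iteration sees a different augmented matrix X_tilde.
    """
    rng = np.random.default_rng(rng)
    M_DA = rng.random(X.shape) < p      # True marks an augmented dropout
    E = -X * M_DA                       # E = -X (Hadamard) M_DA
    return X + E, M_DA                  # X_tilde = X + E

X = np.array([[2.0, 0.0, 3.0],
              [1.5, 4.0, 2.5]])
X_tilde, M_DA = dropout_augment(X, p=0.4, rng=0)
assert np.all(X_tilde[M_DA] == 0)            # masked entries are zeroed
assert np.all(X_tilde[~M_DA] == X[~M_DA])    # all other entries unchanged
```

Because the mask is re-drawn on each call, invoking this function once per training step reproduces the behavior described above.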

3.2. GRN inference with DAZZLE

The task of GRN inference is to infer a weighted adjacency matrix A ∈ ℝ^{m×m} based on the expression data X. Previous methods DAG-GNN [30] and DeepSEM [18] rely on a linear additive assumption that can be written as

X = XA + Z, (3)

where Z ∈ ℝ^{n×m} is a random variable characterizing the noise, essentially describing the gap between the expected counts of the genes based on their regulators (XA) and the observed counts (X). To embrace the idea that the observed data are noisy due to dropout, we modify this assumption and rewrite Eq 3 as:

X̃ = X̃A + Z̃, (4)

where Z̃ is defined for X̃ analogously to the definition of Z for X. Since dropout is so prevalent in single-cell data, we believe Eq 4 describes the actual situation more accurately. Following a transformation similar to that in DAG-GNN and DeepSEM, we can rearrange the terms and rewrite Eq 4 in the following two forms:

Z̃ = X̃(I − A), (5)
X̃ = Z̃(I − A)⁻¹. (6)

Eq 5 infers the noise Z̃ from X̃, and Eq 6 is a generative model that reconstructs X̃ from that noise term. These two equations fit naturally into a VAE framework, with Eq 5 as an encoder and Eq 6 as a decoder. When we parameterize both the VAE model and the adjacency matrix, the encoder can be denoted by q_ϕ(Z̃|X̃) and the decoder by p_θ(X̃|Z̃), with A being part of both ϕ and θ. In this case, Z̃ is the latent variable.
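To see why Eqs 5 and 6 form an encoder-decoder pair, the purely linear, deterministic skeleton can be checked numerically. The full model wraps these maps in neural transformations and variational sampling; this sketch only verifies that the two linear maps are exact inverses.

```python
import numpy as np

m = 4                                  # number of genes
rng = np.random.default_rng(0)
A = 0.1 * rng.normal(size=(m, m))      # candidate adjacency matrix
np.fill_diagonal(A, 0.0)               # no self-regulation
X_tilde = rng.normal(size=(5, m))      # augmented expression for 5 cells

I = np.eye(m)
Z_tilde = X_tilde @ (I - A)                # encoder, Eq 5
X_rec = Z_tilde @ np.linalg.inv(I - A)     # decoder, Eq 6

assert np.allclose(X_rec, X_tilde)     # exact round trip in the linear case
```

In the linear case the round trip is lossless; the VAE's stochasticity and nonlinearity are what make the learned A informative rather than trivial.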

For a VAE, the problem of finding the set of parameters θ that maximizes the log evidence log p(X̃) is intractable. Instead, it is common to maximize the evidence lower bound (ELBO), which we write as

ELBO = −D_KL(q_ϕ(Z̃|X̃) ‖ p_θ(Z̃)) + 𝔼_{Z̃~q_ϕ(Z̃|X̃)}[log p_θ(X̃|Z̃)], (7)

where the first term is the negated KL divergence between the approximate posterior and the prior, and the second term is the expected reconstruction log-likelihood.

The random variable Z̃ describes the deviation of the observed value X̃ from the expected value X̃A. In a particular cell, if the expression value of a particular gene happens to be observed as 0 due to a dropout event, we are more likely to see a larger deviation Z̃. In other words, Z̃ contains information that can be used to infer whether a value comes from a dropout event. Following this rationale, we add a classifier C_DA supervised by the known dropout augmentation mask M_DA, as shown in Eq 8 below. As a naïve approach, here we choose a simple 3-layer multi-layer perceptron (MLP, activated by tanh) followed by a sigmoid as the DA classifier:

M̂_DA = sigmoid(C_DA(Z̃)). (8)

The loss function of this classifier is simply binary cross entropy:

L_BCE = −𝔼[M_DA ⊙ log M̂_DA + (1 − M_DA) ⊙ log(1 − M̂_DA)]. (9)
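A minimal sketch of such a classifier and its loss in plain NumPy, with toy weight shapes chosen for illustration (the actual DAZZLE classifier is trained jointly with the rest of the network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def da_classifier(Z, W1, W2, W3):
    """Toy stand-in for C_DA: a 3-layer tanh MLP followed by a sigmoid (Eq 8)."""
    h = np.tanh(np.tanh(Z @ W1) @ W2)
    return sigmoid(h @ W3)                # predicted dropout probabilities M_hat

def bce_loss(M, M_hat, eps=1e-8):
    """Binary cross entropy between the known DA mask and the prediction (Eq 9)."""
    return -np.mean(M * np.log(M_hat + eps) + (1 - M) * np.log(1 - M_hat + eps))

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 8))                   # latent deviations, 5 cells x 8 genes
W1, W2, W3 = (rng.normal(size=s) for s in ((8, 16), (16, 16), (16, 8)))
M = (rng.random((5, 8)) < 0.1).astype(float)  # ground-truth augmentation mask
M_hat = da_classifier(Z, W1, W2, W3)

assert M_hat.shape == M.shape
assert np.all((M_hat > 0) & (M_hat < 1))      # sigmoid outputs are valid probabilities
```

In practice the weights would be learned by backpropagating the BCE loss; the sketch only shows the forward pass and the loss.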

The dropout augmentation classifier can be trained either separately or together with the main model using the same optimizer. In our experiments, we add the classification loss to the ELBO function, scaled by a hyperparameter γ. Additionally, following the design of DeepSEM, we include an L1 sparsity term that regularizes the learned adjacency matrix. The final objective is to minimize the following loss function:

Loss = −𝔼_{Z̃~q_ϕ(Z̃|X̃)}[log p_θ(X̃|Z̃)] + α‖A‖₁ + β·D_KL(q_ϕ(Z̃|X̃) ‖ p_θ(Z̃)) + γ·L_BCE(M_DA, M̂_DA). (10)
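Putting the pieces together, the full objective of Eq 10 can be assembled from its precomputed components as below. The weights alpha, beta, and gamma here are illustrative placeholders, not the tuned hyperparameters used in our experiments.

```python
import numpy as np

def bce_loss(M, M_hat, eps=1e-8):
    # Eq 9: binary cross entropy on the dropout augmentation mask
    return -np.mean(M * np.log(M_hat + eps) + (1 - M) * np.log(1 - M_hat + eps))

def dazzle_loss(recon_nll, kl, A, M, M_hat, alpha=0.1, beta=1.0, gamma=0.1):
    """Eq 10: reconstruction NLL + L1 sparsity + KL + classifier loss."""
    l1 = np.abs(A).sum()              # sparsity penalty on the adjacency matrix
    return recon_nll + alpha * l1 + beta * kl + gamma * bce_loss(M, M_hat)

rng = np.random.default_rng(0)
A = 0.1 * rng.normal(size=(6, 6))                 # toy adjacency matrix
M = (rng.random((4, 6)) < 0.1).astype(float)      # DA mask for 4 cells x 6 genes
M_hat = np.clip(rng.random((4, 6)), 0.01, 0.99)   # classifier predictions
loss = dazzle_loss(recon_nll=1.2, kl=0.3, A=A, M=M, M_hat=M_hat)
assert np.isfinite(loss) and loss > 0
```

In the actual model the reconstruction term and KL term come from the VAE forward pass; here they are passed in as scalars to keep the sketch self-contained.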

3.3. Datasets used

3.3.1. BEELINE single-cell benchmarks.

The BEELINE benchmarks consist of synthetic expression data generated from curated ground-truth networks, as well as seven pre-processed real single-cell RNA-seq datasets [6]. These scRNA-seq datasets come from both human and mouse samples and have undergone different pre-processing steps, including normalization, depending on the original data format (e.g., raw counts or processed data). In some respects, this variety reflects the wide array of differences we encounter in real-world data.

Next, BEELINE combines the scRNA-seq data with three different sources of “ground truth” data about regulatory relationships: the functional interaction network represented in the STRING database v11 [49], non-cell-type-specific transcriptional networks, and cell-type-specific networks. The non-specific ChIP-seq network combines links from DoRothEA [50], RegNetwork [51], and TRRUST v2 [52]. The cell-type-specific networks were created by the BEELINE authors for each dataset by searching the ENCODE [53], ChIP-Atlas [54], and ESCAPE [55] databases. To generate a benchmarking dataset, BEELINE identifies highly variable transcription factors and genes and randomly samples from this pool to create a benchmark of the desired size.

3.3.2. Hammond microglial data.

To assess DAZZLE’s performance in a more practical context, we use a published dataset from [35] (available from NCBI’s Gene Expression Omnibus database [56] under accession number GSE121654). The Hammond mouse microglial dataset includes RNA sequencing counts for cells supporting several possible comparisons. In our analysis, we selected data from five mouse developmental stages, each of which includes single-cell data from four healthy male mice. To preprocess the data, following suggestions from [57] for the same data, we filtered out cells with fewer than 400 or more than 3,000 unique genes, cells with more than 10,000 UMIs, and cells with over 3% of reads mapping to mitochondrial genes. Note that here we adopt a cross-sectional slicing strategy and treat each developmental time point as an independent sample. The top predicted regulators for key genes are then compared with the existing literature to validate the biological plausibility of the inferred networks.
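The cell-level QC cuts described above can be sketched in NumPy as below. This is an illustrative sketch, assuming a raw count matrix with cells as rows and that mitochondrial genes are identified by an "mt-" name prefix (an assumption for this example; the actual preprocessing scripts are in the project repository).

```python
import numpy as np

def filter_cells(counts, gene_names, min_genes=400, max_genes=3000,
                 max_umis=10000, max_mito=0.03):
    """Apply the cell-level QC cuts from the text to a raw count matrix
    (cells x genes); thresholds default to the values described above."""
    genes_per_cell = (counts > 0).sum(axis=1)
    umis_per_cell = counts.sum(axis=1)
    is_mito = np.char.startswith(np.char.lower(gene_names), "mt-")
    mito_frac = counts[:, is_mito].sum(axis=1) / np.maximum(umis_per_cell, 1)
    keep = ((genes_per_cell >= min_genes) & (genes_per_cell <= max_genes)
            & (umis_per_cell <= max_umis) & (mito_frac <= max_mito))
    return counts[keep], keep

# Tiny toy example with loosened thresholds just to exercise the logic
counts = np.array([[5, 10, 0],
                   [50, 10, 10]])
names = np.array(["mt-Co1", "Apoe", "Tmem119"])
kept, keep = filter_cells(counts, names, min_genes=1, max_genes=3,
                          max_umis=100, max_mito=0.4)
assert keep.tolist() == [True, False]   # second cell fails the mitochondrial cut
```

Standard toolkits such as Scanpy provide equivalent QC utilities; the sketch just makes the thresholds from the text explicit.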

Many standard analysis approaches further reduce the dataset size by filtering the gene set, keeping only the most variable genes. Here, however, we only remove genes with zero detected transcripts across all cells. We further removed all gene models, mitochondrial genes, and ribosomal genes from this pool to simplify the interpretation of the resulting networks. The expression values were normalized using a natural log transformation.

After this cell and gene filtering, the final data set includes 49,972 cells from five time points across the mouse lifespan: Embryonic (embryonic day E14: 11,262 cells, 15,673 genes), Early Postnatal (postnatal day P4/5: 13,316 cells, 15,039 genes), Late Juvenile (P30: 9,431 cells, 13,929 genes), Adulthood (P100: 8,259 cells, 13,998 genes), and Old Age (P540: 7,704 cells, 14,140 genes). Note that compared to the original paper [35], we take a very different approach, analyzing changes in potential regulatory links across time rather than attempting to identify microglial subpopulations defined by specific injury-responsive cell clusters.

3.4. Evaluation metrics

In this report, we follow the recommendations of the BEELINE paper and use the Area Under the Precision-Recall Curve Ratio (AUPRC Ratio) and the Early Precision Ratio as the main evaluation metrics. The main reason AUPRC is preferred over the Area Under the Receiver Operating Characteristic (AUROC) is that we are ultimately classifying each potential link between a TF and a target gene as either existing or not existing. In the ground truth data, there are usually far more non-existing edges than existing edges; for example, the hESC dataset has 5,149 true edges out of 578,100 potential edges. AUPRC is generally considered a better metric when there is such a class imbalance between the positive and negative groups [58]. For easy comparison with other methods, we still provide results evaluated by the AUROC metric in Table C in supplement S1 Text. The AUROC results show trends similar to those of the other metrics.

In this paper, AUPRC is approximated using its discretized form without interpolation, as shown in Eq 11:

AUPRC = ∫₀¹ P(R) dR ≈ Σ_{n=1}^{N} P_n · (R_n − R_{n−1}), (11)

where P(R) is the precision at recall level R, N is the number of unique predictions, and R_n and P_n are the recall and precision scores at item n. This approximation is often referred to as average precision [59] and is the same analysis performed in the DeepSEM paper. The AUPRC Ratio is calculated by dividing the calculated AUPRC score by the theoretical AUPRC score of a random predictor. The expected precision of a random predictor is equal to the proportion of positive cases; on the precision-recall plot, its performance forms a horizontal line (see Fig 5D in [58]). Therefore, the AUPRC of a random predictor is simply the proportion of positive cases, and we can calculate the AUPRC Ratio with Eq 12:

AUPRC Ratio ≈ (Total Instances / Positives) · Σ_{n=1}^{N} P_n · (R_n − R_{n−1}). (12)
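The interpolation-free average precision and its ratio form can be computed directly; a small self-contained sketch (illustrative, not the exact benchmark script):

```python
import numpy as np

def auprc_ratio(scores, labels):
    """Average precision (Eq 11) divided by the positive rate, i.e. Eq 12."""
    order = np.argsort(-np.asarray(scores))
    y = np.asarray(labels, float)[order]
    tp = np.cumsum(y)
    precision = tp / np.arange(1, len(y) + 1)
    recall = tp / y.sum()
    # sum of P_n * (R_n - R_{n-1}); recall only increases at true positives
    ap = np.sum(precision * np.diff(np.concatenate(([0.0], recall))))
    return ap / (y.sum() / len(y))

scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
# AP = 1*(1/2) + (2/3)*(1/2) = 5/6; positive rate = 1/2; so the ratio is 5/3
assert abs(auprc_ratio(scores, labels) - 5/3) < 1e-9
```

The worked values in the comment follow directly from Eq 11: recall increments by 1/2 at each true positive, with precisions 1 and 2/3 at those ranks.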

Early Precision (EP) is the fraction of true positives among the top-k candidates, where k is the number of edges in the ground truth network. The Early Precision Ratio further divides that fraction by the expected Early Precision of a random predictor, which is the edge density of the ground truth network [6].
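Similarly, a minimal sketch of the Early Precision Ratio (illustrative only):

```python
import numpy as np

def early_precision_ratio(scores, labels):
    """EP = precision among the top-k predictions, with k = number of true
    edges; EPR divides EP by the ground-truth edge density."""
    y = np.asarray(labels, float)
    k = int(y.sum())                        # number of edges in the ground truth
    top_k = np.argsort(-np.asarray(scores))[:k]
    ep = y[top_k].mean()
    density = y.sum() / len(y)              # expected EP of a random predictor
    return ep / density

# Half of the top-2 predictions are true edges; density is 0.5, so EPR = 1
assert early_precision_ratio([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]) == 1.0
```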

Since the number of possible edges and the number of true edges are usually very large in actual GRNs, the raw AUPRC and Early Precision values tend to be small. As noted in the BEELINE paper, converting them to ratios makes it easier to compare performance across benchmarks.

4. Discussion

In this study, we tackle the dropout problem in real-world single-cell data paradoxically, by adding more dropout. While the idea of using noise to improve the robustness of machine learning models is long established, thanks to the pioneering work of Bishop and Hinton [28,29], it has rarely been recognized as a useful method in the ’omics community, even though our data suffer greatly from sparsity and noise. Our work shows that it is useful for GRN inference, and it is reasonable to expect that it would also benefit many other applications in the single-cell domain, improving robustness and perhaps performance at very low cost. Of course, as in other machine learning domains, dropout augmentation is not a magic bullet that guarantees a large performance gain. But it is clear that Dropout Augmentation can improve model robustness, which is equally important, making observed performance gains believable. Building on the principles established in this paper, we have since developed RegDiffusion [31], a follow-up method that frames the noise-augmentation concept within a more formal diffusion-model framework. Instead of adding dropout noise, RegDiffusion incrementally adds Gaussian noise and learns to reverse the process. This subsequent approach offers further improvements in computational speed and inference accuracy, demonstrating the extensibility of the core idea introduced in DAZZLE.

In the specific case of GRN inference, our proposed method DAZZLE not only stabilizes the output but also produces better predictions. On the BEELINE benchmarks, a single run of DAZZLE yields results comparable to an ensemble of 10 repeated trials of the previously most accurate method. Our experiment with the microglia dataset shows that DAZZLE has the capacity to run on large single-cell data with minimal gene filtration. While we do not have ground-truth networks suitable for direct comparison in these cases, the predicted networks are consistent with current understanding and include plausible novel links. These novel links may be good candidates for future investigations of key regulatory relationships.

Finally, DAZZLE, like GENIE3 and GRNBOOST2, only requires the gene expression matrix as input. Therefore, since DAZZLE has better performance and runs faster, it can be seamlessly used in popular downstream GRN analysis tools such as SCENIC [13,60] and SCENIC Plus [61]. As suggested in the SCENIC papers, the results of DAZZLE could be pruned using cisTarget data [62] to remove unlikely edges and further improve accuracy. The learned regulations could be used to calculate AUCell scores [13] to describe the TF activities for each cell and to build dimension-reduction plots for cells. We provide a tutorial on how to integrate the results from DAZZLE and RegDiffusion into the SCENIC workflow on the documentation site for RegDiffusion (https://tuftsbcb.github.io/RegDiffusion). We are also developing a GPU-based SCENIC calculation pipeline called flashscenic (https://github.com/haozhu233/flashscenic).

One limitation of this method is that it cannot directly predict whether a regulation is positive or negative. One of the advantages of DeepSEM and DAZZLE is that they use neural networks to perform nonlinear transformations; as a result, the learned adjacency matrix actually describes relationships among the nonlinearly transformed data, so the sign of a regulation is not directly computable. One possible solution would be, after the existence of a regulatory link is confirmed, to fit a simple linear model between the TF and the target and use the sign of its coefficient as the direction of the inferred relationship.

A practical limitation of the current model architecture is that its space complexity scales quadratically with the number of genes. With 15,000 genes, the model requires 30 GB of GPU memory, which still fits on a single modern GPU; even larger use cases may require multiple GPUs. Another limitation is that the current version is designed to be applied to each individual dataset (time point or cell cluster), so it does not by itself build a universal understanding of gene interactions. How to relax these restrictions, how to learn these connections more efficiently, and how to interpret the inferred networks biologically are important questions for future work.

Supporting information

S1 Fig. Additional ablation study shows the usefulness of Dropout Augmentation.

Certain proportions of data points (x-axis) were dropped at the start of training to simulate background dropout noise.

(TIFF)

S2 Fig. Quality of inferred GRN from DeepSEM-1x: AUPRC Ratio as a function of the number of training iterations.

Quality may degrade quickly after the performance peak. The dashed line marks the recommended stopping point from DeepSEM. Note that for the cell-type-specific data sets, performance is particularly volatile.

(TIFF)

S3 Fig. Distribution of 100 runs of DAZZLE-1x and DeepSEM-1x on BEELINE evaluated using the STRING network.

Results from DAZZLE-1x tend to be more stable than results from DeepSEM-1x.

(TIFF)

S1 Text. Additional supplementary information, including hyperparameter choices and additional benchmarks.

(PDF)


Acknowledgments

We thank Liping Liu, Rebecca Batorsky, Yijie Wang, and Teresa Przytycka for their thoughtful comments. We also appreciate the help of Hantao Shu, Jianyang Zeng, and Jianzhu Ma for sharing their data and code from the original DeepSEM paper.

Data Availability

The source code for this project (software + preprocessing scripts) is available at https://github.com/TuftsBCB/dazzle. The processed data used in this study is deposited at https://zenodo.org/records/15762315. The BEELINE benchmark data was obtained from the authors of DeepSEM but could be regenerated using the provided data and code from BEELINE (https://github.com/Murali-group/Beeline). The actual gene expression data files used in BEELINE came from GEO datasets with the following accession numbers: GSE81252 (hHEP), GSE75748 (hESC), GSE98664 (mESC), GSE48968 (mDC), and GSE81682 (mHSC). The Hammond microglia data was obtained from NCBI’s GEO database under accession number GSE121654 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE121654).

Funding Statement

The author(s) received no specific funding for this work.

References

1. Davidson E, Levin M. Gene regulatory networks. Proc Natl Acad Sci U S A. 2005;102(14):4935. doi: 10.1073/pnas.0502024102
2. Karlebach G, Shamir R. Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Biol. 2008;9(10):770–80. doi: 10.1038/nrm2503
3. Penfold CA, Wild DL. How to infer gene networks from expression profiles, revisited. Interface Focus. 2011;1(6):857–70. doi: 10.1098/rsfs.2011.0053
4. Emmert-Streib F, Dehmer M, Haibe-Kains B. Gene regulatory networks and their applications: understanding biological and medical problems in terms of networks. Front Cell Dev Biol. 2014;2:38. doi: 10.3389/fcell.2014.00038
5. Svensson V, Vento-Tormo R, Teichmann SA. Exponential scaling of single-cell RNA-seq in the past decade. Nat Protoc. 2018;13(4):599–604. doi: 10.1038/nprot.2017.149
6. Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods. 2020;17(2):147–54. doi: 10.1038/s41592-019-0690-6
7. Huynh-Thu VA, Irrthum A, Wehenkel L, Geurts P. Inferring regulatory networks from expression data using tree-based methods. PLoS One. 2010;5(9):e12776. doi: 10.1371/journal.pone.0012776
8. Moerman T, Aibar Santos S, Bravo González-Blas C, Simm J, Moreau Y, Aerts J, et al. GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks. Bioinformatics. 2019;35(12):2159–61. doi: 10.1093/bioinformatics/bty916
9. Specht AT, Li J. LEAP: constructing gene co-expression networks for single-cell RNA-sequencing data using pseudotime ordering. Bioinformatics. 2017;33(5):764–6. doi: 10.1093/bioinformatics/btw729
10. Matsumoto H, Kiryu H, Furusawa C, Ko MSH, Ko SBH, Gouda N, et al. SCODE: an efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation. Bioinformatics. 2017;33(15):2314–21. doi: 10.1093/bioinformatics/btx194
11. Deshpande A, Chu L-F, Stewart R, Gitter A. Network inference with Granger causality ensembles on single-cell transcriptomics. Cell Rep. 2022;38(6):110333. doi: 10.1016/j.celrep.2022.110333
12. Chan TE, Stumpf MPH, Babtie AC. Gene regulatory network inference from single-cell data using multivariate information measures. Cell Syst. 2017;5(3):251-267.e3. doi: 10.1016/j.cels.2017.08.014
13. Aibar S, González-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, et al. SCENIC: single-cell regulatory network inference and clustering. Nat Methods. 2017;14(11):1083–6. doi: 10.1038/nmeth.4463
14. Zhang S, Pyne S, Pietrzak S, Halberg S, McCalla SG, Siahpirani AF, et al. Inference of cell type-specific gene regulatory networks on cell lineages from single cell omic datasets. Nat Commun. 2023;14(1):3064. doi: 10.1038/s41467-023-38637-9
15. Shrivastava H, Zhang X, Song L, Aluru S. GRNUlar: a deep learning framework for recovering single-cell gene regulatory networks. J Comput Biol. 2022;29(1):27–44. doi: 10.1089/cmb.2021.0437
16. Wang Y, Lee H, Fear JM, Berger I, Oliver B, Przytycka TM. NetREX-CF integrates incomplete transcription factor data with gene expression to reconstruct gene regulatory networks. Commun Biol. 2022;5(1):1282. doi: 10.1038/s42003-022-04226-7
17. Glass K, Huttenhower C, Quackenbush J, Yuan G-C. Passing messages between biological networks to refine predicted interactions. PLoS One. 2013;8(5):e64832. doi: 10.1371/journal.pone.0064832
18. Shu H, Zhou J, Lian Q, Li H, Zhao D, Zeng J, et al. Modeling gene regulatory networks using neural network architectures. Nat Comput Sci. 2021;1(7):491–501. doi: 10.1038/s43588-021-00099-8
19. Kingma DP, Welling M. Auto-encoding variational bayes. arXiv preprint. 2013. https://arxiv.org/abs/1312.6114
20. Ghazanfar S, Bisogni AJ, Ormerod JT, Lin DM, Yang JYH. Integrated single cell data analysis reveals cell specific networks and novel coactivation markers. BMC Syst Biol. 2016;10(Suppl 5):127. doi: 10.1186/s12918-016-0370-4
21. Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell. 2015;161(5):1187–201. doi: 10.1016/j.cell.2015.04.044
22. Zheng GXY, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:14049. doi: 10.1038/ncomms14049
23. Rao J, Zhou X, Lu Y, Zhao H, Yang Y. Imputing single-cell RNA-seq data by combining graph convolution and autoencoder neural networks. iScience. 2021;24(5):102393. doi: 10.1016/j.isci.2021.102393
24. Kang Y, Zhang H, Guan J. scINRB: single-cell gene expression imputation with network regularization and bulk RNA-seq data. Brief Bioinform. 2024;25(3):bbae148. doi: 10.1093/bib/bbae148
25. Li WV, Li JJ. An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nat Commun. 2018;9(1):997. doi: 10.1038/s41467-018-03405-7
26. Arisdakessian C, Poirion O, Yunits B, Zhu X, Garmire LX. DeepImpute: an accurate, fast, and scalable deep neural network method to impute single-cell RNA-seq data. Genome Biol. 2019;20(1):211. doi: 10.1186/s13059-019-1837-6
27. Qi Y, Han S, Tang L, Liu L. Imputation method for single-cell RNA-seq data using neural topic model. Gigascience. 2022;12:giad098. doi: 10.1093/gigascience/giad098
28. Bishop CM. Training with noise is equivalent to Tikhonov regularization. Neural Computation. 1995;7(1):108–16. doi: 10.1162/neco.1995.7.1.108
29. Hinton GE, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov RR. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint. 2012. https://arxiv.org/abs/1207.0580
30. Yu Y, Chen J, Gao T, Yu M. DAG-GNN: DAG structure learning with graph neural networks. arXiv preprint. 2019. p. 7154–63.
31. Zhu H, Slonim D. From noise to knowledge: diffusion probabilistic model-based neural inference of gene regulatory networks. J Comput Biol. 2024;31(11):1087–103. doi: 10.1089/cmb.2024.0607
32. Kang Y, Thieffry D, Cantini L. Evaluating the reproducibility of single-cell gene regulatory network inference algorithms. Front Genet. 2021;12:617282. doi: 10.3389/fgene.2021.617282
33. Kim S. ppcor: an R package for a fast calculation to semi-partial correlation coefficients. Commun Stat Appl Methods. 2015;22(6):665–74. doi: 10.5351/CSAM.2015.22.6.665
34. Papili Gao N, Ud-Dean SMM, Gandrillon O, Gunawan R. SINCERITIES: inferring gene regulatory networks from time-stamped single cell transcriptional expression profiles. Bioinformatics. 2018;34(2):258–66. doi: 10.1093/bioinformatics/btx575
35. Hammond TR, Dufort C, Dissing-Olesen L, Giera S, Young A, Wysoker A, et al. Single-cell RNA sequencing of microglia throughout the mouse lifespan and in the injured brain reveals complex cell-state changes. Immunity. 2019;50(1):253-271.e6. doi: 10.1016/j.immuni.2018.11.004
36. Bennett ML, Bennett FC, Liddelow SA, Ajami B, Zamanian JL, Fernhoff NB, et al. New tools for studying microglia in the mouse and human CNS. Proc Natl Acad Sci U S A. 2016;113(12):E1738-46. doi: 10.1073/pnas.1525528113
37. Masuda T, Amann L, Sankowski R, Staszewski O, Lenz M, D Errico P, et al. Novel Hexb-based tools for studying microglia in the CNS. Nat Immunol. 2020;21(7):802–15. doi: 10.1038/s41590-020-0707-4
38. Holtman IR, Raj DD, Miller JA, Schaafsma W, Yin Z, Brouwer N, et al. Induction of a common microglia gene expression signature by aging and neurodegenerative conditions: a co-expression meta-analysis. Acta Neuropathol Commun. 2015;3:31. doi: 10.1186/s40478-015-0203-5
39. Boche D, Gordon MN. Diversity of transcriptomic microglial phenotypes in aging and Alzheimer’s disease. Alzheimers Dement. 2022;18(2):360–76. doi: 10.1002/alz.12389
40. Schwabenland M, Brück W, Priller J, Stadelmann C, Lassmann H, Prinz M. Analyzing microglial phenotypes across neuropathologies: a practical guide. Acta Neuropathol. 2021;142(6):923–36. doi: 10.1007/s00401-021-02370-8
41. Pettas S, Karagianni K, Kanata E, Chatziefstathiou A, Christoudia N, Xanthopoulos K, et al. Profiling microglia through single-cell RNA sequencing over the course of development, aging, and disease. Cells. 2022;11(15):2383. doi: 10.3390/cells11152383
42. Hagan N, Kane JL, Grover D, Woodworth L, Madore C, Saleh J, et al. CSF1R signaling is a regulator of pathogenesis in progressive MS. Cell Death Dis. 2020;11(10):904. doi: 10.1038/s41419-020-03084-7
43. Ma J, Qian C, Bao Y, Liu M-Y, Ma H-M, Shen M-Q, et al. Apolipoprotein E deficiency induces a progressive increase in tissue iron contents with age in mice. Redox Biol. 2021;40:101865. doi: 10.1016/j.redox.2021.101865
44. Kenkhuis B, Somarakis A, de Haan L, Dzyubachyk O, IJsselsteijn ME, de Miranda NFCC, et al. Iron loading is a prominent feature of activated microglia in Alzheimer’s disease patients. Acta Neuropathol Commun. 2021;9(1):27. doi: 10.1186/s40478-021-01126-5
45. Amodio N, Raimondi L, Juli G, Stamato MA, Caracciolo D, Tagliaferri P, et al. MALAT1: a druggable long non-coding RNA for targeted anti-cancer approaches. J Hematol Oncol. 2018;11(1):63. doi: 10.1186/s13045-018-0606-4
46. Gordon AD, Biswas S, Feng B, Chakrabarti S. MALAT1: a regulator of inflammatory cytokines in diabetic complications. Endocrinol Diabetes Metab. 2018;1(2):e00010. doi: 10.1002/edm2.10
47. Zhou H-J, Wang L-Q, Wang D-B, Yu J-B, Zhu Y, Xu Q-S, et al. Long noncoding RNA MALAT1 contributes to inflammatory response of microglia following spinal cord injury via the modulation of a miR-199b/IKKβ/NF-κB signaling pathway. Am J Physiol Cell Physiol. 2018;315(1):C52–61. doi: 10.1152/ajpcell.00278.2017
48. Cai L-J, Tu L, Huang X-M, Huang J, Qiu N, Xie G-H, et al. LncRNA MALAT1 facilitates inflammasome activation via epigenetic suppression of Nrf2 in Parkinson’s disease. Mol Brain. 2020;13(1):130. doi: 10.1186/s13041-020-00656-8
49. Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47(D1):D607–13. doi: 10.1093/nar/gky1131
50. Garcia-Alonso L, Holland CH, Ibrahim MM, Turei D, Saez-Rodriguez J. Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Res. 2019;29(8):1363–75. doi: 10.1101/gr.240663.118
51. Liu Z-P, Wu C, Miao H, Wu H. RegNetwork: an integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse. Database (Oxford). 2015;2015:bav095. doi: 10.1093/database/bav095
52. Han H, Cho J-W, Lee S, Yun A, Kim H, Bae D, et al. TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. 2018;46(D1):D380–6. doi: 10.1093/nar/gkx1013
53. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57–74. doi: 10.1038/nature11247
54. Zou Z, Ohta T, Miura F, Oki S. ChIP-Atlas 2021 update: a data-mining suite for exploring epigenomic landscapes by fully integrating ChIP-seq, ATAC-seq and Bisulfite-seq data. Nucleic Acids Res. 2022;50(W1):W175–W182.
55. Xu H, Baroukh C, Dannenfelser R, Chen EY, Tan CM, Kou Y, et al. ESCAPE: database for integrating high-content published data collected from human and mouse embryonic stem cells. Database (Oxford). 2013;2013:bat045. doi: 10.1093/database/bat045
56. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30(1):207–10. doi: 10.1093/nar/30.1.207
57. Green LA, O’Dea MR, Hoover CA, DeSantis DF, Smith CJ. The embryonic zebrafish brain is seeded by a lymphatic-dependent population of mrc1+ microglia precursors. Nat Neurosci. 2022;25(7):849–64. doi: 10.1038/s41593-022-01091-9
58. Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015;10(3):e0118432. doi: 10.1371/journal.pone.0118432
59. Manning CD, Raghavan P, Schutze H. Introduction to information retrieval. Cambridge University Press; 2008.
60. Van de Sande B, Flerin C, Davie K, De Waegeneer M, Hulselmans G, Aibar S, et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat Protoc. 2020;15(7):2247–76. doi: 10.1038/s41596-020-0336-2
61. Bravo González-Blas C, De Winter S, Hulselmans G, Hecker N, Matetovici I, Christiaens V, et al. SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat Methods. 2023;20(9):1355–67. doi: 10.1038/s41592-023-01938-4
62. Herrmann C, Van de Sande B, Potier D, Aerts S. i-cisTarget: an integrative genomics method for the prediction of regulatory features and cis-regulatory modules. Nucleic Acids Res. 2012;40(15):e114. doi: 10.1093/nar/gks543
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013603.r001

Decision Letter 0

Saurabh Sinha, Jian Ma

4 Feb 2025

PCOMPBIOL-D-24-01735

Improved gene regulatory network inference from single cell data with dropout augmentation

PLOS Computational Biology

Dear Dr. Zhu,

Thank you for submitting your manuscript to PLOS Computational Biology. As with all papers, your manuscript was reviewed by members of the editorial board. Based on our assessment, we have decided that the work does not meet our criteria for publication and will therefore be rejected. If external reviews were secured, reviewers' comments will be included at the bottom of this email.

We are sorry that we cannot be more positive on this occasion. We very much appreciate your wish to present your work in one of PLOS's Open Access publications. Thank you for your support, and we hope that you will consider PLOS Computational Biology for other submissions in the future.

Yours sincerely,

Saurabh Sinha

Academic Editor

PLOS Computational Biology

Jian Ma

Section Editor

PLOS Computational Biology

Additional Editor Comments (if provided):

The reviewers have raised substantial concerns about the computational innovation in the work, as well as empirical demonstration of the new method's advantage over established methods.

[Note: HTML markup is below. Please do not edit.]

Reviewers' Comments (if peer reviewed):

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: In this manuscript, the authors consider inferring gene regulatory networks (GRNs) from single cell data. The authors introduce a strategy, DA, that drops out some random data points. Then DA is applied to enhance a known GRN inference method, DeepSEM, and the new method is named DAZZLE. DAZZLE is tested on some data sets, and it performs better than other methods on BEELINE data sets.

DAZZLE performs better than DeepSEM. Why can you claim that the improvement is from solving the zero inflation problem? For deep learning models, interpretability is always an issue, and it is difficult to state that the improvement is due to certain reasons. One reasonable solution is: for a good data set without many zeros, apply DeepSEM to obtain performance measurement 1. Then randomly dropout some data points and apply DAZZLE to obtain performance measurement 2. If measurement 2 is close to or better than measurement 1, then you can claim that the zero inflation problem is solved.

Section 2.5, DAZZLE is applied to data at each stage of development. If I understand correctly (cf. Eq.3), DeepSEM and DAZZLE work for gene expression data at stationarity. However, gene expression during development is far from stationary. In this case, are the temporal changes of GRN real, or is DAZZLE applied to an improper situation and produces unreliable results?

DAZZLE chooses to delay the introduction of sparsity loss, and stops training after 120 iterations. Since the training procedure is not described in detail, I am a little confused: did you do hyperparameter fine-tuning (including the dropout augmentation probability) and evaluation on the same data sets? If so, how can you guarantee that the good performance is not due to overfitting?

The code is well-organized. Good job.

Besides the above problems, the novelty (minor modifications of a published method) might not meet the standard of PLOS CB. I think it is better to revise it thoroughly and submit to a lower-level journal.

In introduction, explain in detail about the data type. What type of data is used for training? What types of data can DAZZLE be applied to? For data types, you just specify that they should be scRNAseq. Are they measured at one time point, or multiple time points (or you need to infer pseudo-time)? If you need data from multiple time points, do you need to measure the same cell multiple times? Do you need the cells to be intervened (i.e., driven away from stationary) before measurement?

In introduction, explain whether your method can determine the direction of regulation between gene i and gene j. Also, does it distinguish between activation and inhibition?

The first time that you use “adjacency matrix”, mention that it is the GRN you want to infer.

The introduction should contain a little more details about your methods: roughly describe the structure of your model, and the training data. Readers who haven’t read the DeepSEM paper might find that the introduction is not informative enough, and the whole paper is difficult to read.

L.112, DAZZLE looks self-contained. Give an example of what other network components can be integrated with DAZZLE.

BEELINE paper recommends GENIE3, GRNBoost2, and PIDC. Any reason that PIDC is not compared?

L.125-126: grammar problem.

AUROC and AUPRC are both commonly used in the evaluation of GRN inference methods. Any reason that AUROC is not reported? (L.386-390 explains that AUPRC is better than AUROC, although I personally prefer adding AUROC, since it has some good properties.)

Explain more about BEELINE. What is STRING?

L.292-293, each position has an independent probability p to be dropped? Rewrite this sentence in a rigorous way (as you did in L.296-299).

L.301, training step: is this the same as epoch?

Eq.4: how can you prevent A=I? Add some explanations.

Eq.6: why is I-A invertible?

Eq.9, as BCE, why is it I (identity matrix) - M, not ones(m, n) - M?

What if a position is 0 by natural dropout, and it is reset to 0 by the mask?

Eq.11: n is already used as cell number. Use another notation. Also, it seems that the accurate expression is \sum (P_n + P_{n-1}) / 2 \times (R_n - R_{n-1}). Is your approximation necessary?
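For readers following this comment, the trapezoidal expression the reviewer proposes, and the coarser rectangle approximation it refines, can be sketched in Python (an illustrative sketch only; the variable names `P` and `R` are not from the manuscript):

```python
# Two numerical approximations of the area under a precision-recall
# curve. P[k] and R[k] are the precision and recall at the k-th
# threshold, ordered by increasing recall.
def auprc_trapezoid(P, R):
    # sum over k of (P_k + P_{k-1}) / 2 * (R_k - R_{k-1})
    return sum((P[k] + P[k - 1]) / 2 * (R[k] - R[k - 1])
               for k in range(1, len(P)))

def auprc_step(P, R):
    # rectangle (step) approximation: sum over k of P_k * (R_k - R_{k-1})
    return sum(P[k] * (R[k] - R[k - 1]) for k in range(1, len(P)))
```

On two points with precision 1.0 at recall 0.0 and precision 0.5 at recall 1.0, the trapezoid rule gives 0.75 while the step approximation gives 0.5, illustrating how the choice of approximation affects the reported AUPRC.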

About the measurements, there is another issue: can your method distinguish between positive regulation (activation) and negative regulation (inhibition)? Do the ground truth GRNs in your datasets distinguish them? If yes, AUPRC needs to be modified since it is designed for binary classification.

Reviewer #2: In the manuscript, the authors proposed the innovative Dropout Augmentation (DA) method and the DAZZLE model to address the zero-inflation issue in single-cell RNA sequencing data, enhancing the accuracy and stability of gene regulatory network (GRN) inference. The DA strategy regularizes data by adding simulated dropout noise, and the DAZZLE model, under the structural equation model framework, uses the variational autoencoder architecture with optimizations like adjacency matrix control, structure simplification and closed-form priors. However, several points need further discussion:

1. Single-cell multi-omics data for GRN inference is a leading trend. Despite using the BEELINE dataset for benchmarking scRNA-seq-based GRN inference methods, the authors should explore the DAZZLE model’s feasibility and effectiveness in leveraging multi-omics data and its integration for better GRN inference.

2. Figure 1 shows DAZZLE’s advantage over DeepSEM due to DA. Authors should clarify the DA classifier’s role and impact on the final GRN inference results to help readers understand its contribution.

3. It would be beneficial for the authors to discuss DAZZLE’s unique advantages over existing methods, especially those brought by the Dropout Augmentation technique. Notably, its role in enhancing GRN inference stability warrants in-depth exploration, such as analyzing how it reduces inference errors and perturbations for more reliable GRN predictions.

4. Table 1 indicates DAZZLE’s performance boost over other methods, yet explanations focus on dropout and structural enhancements, lacking in-depth analysis of how they translate to accurate GRN predictions. Theoretical and empirical justifications for this are needed.

5. As seen in Table 1, DAZZLE was compared with classic methods. Given the rapid methodological advances in GRN inference, further comparison with cutting-edge methods is recommended to showcase DAZZLE’s innovation.

6. The authors used AUPRC and AUPRC Ratio to evaluate DAZZLE, a common practice. But considering the complexity of GRN inference, more comprehensive metrics like AUPR, F1-Score and AUROC should be incorporated to fully evaluate and present the capabilities and limitations of DAZZLE.

Reviewer #3: The main innovation of this manuscript lies in the proposal of the Dropout Augmentation (DA) method, which can effectively address the zero-inflation issue in scRNA-seq data. In addition, this manuscript combines DA with the DeepSEM method to propose the DAZZLE model. Both DeepSEM and DAZZLE are gene regulatory network inference methods based on the structure equation model. The difference between the above two methods is the use of DA. My primary concerns about this manuscript are its innovativeness and effectiveness, as detailed below.

1. Since the main innovation of this manuscript is DA, it would better demonstrate the importance of DA if the authors could combine DA with other methods instead of just with DeepSEM. Another reason for this consideration is that the current innovation of this paper is relatively modest. The biggest difference compared to DeepSEM is DA, while DeepSEM is a study published in 2021, which is relatively old in terms of publication year.

2. I have observed that this manuscript, following the recommendation from BEELINE, has adopted AUPRC and EPR for performance demonstration. However, BEELINE also utilized AUROC. I recommend that the authors present the experimental results for AUROC. Although AUPRC is more appropriate for assessing imbalanced datasets than AUROC, it is worth noting that AUROC, being the most common evaluation metric in the field of GRN inference, can reflect the predictive ability of the model.

3. According to the results in Table 1, DAZZLE did not achieve the best performance on some datasets. For instance, when inferring the STRING network, GENIE3 exhibited the optimal performance on the hHEP and mDC datasets; when inferring non-specific networks, GENIE3 showed the best performance on the mESC and mHSC-GM datasets; and when reconstructing cell type-specific networks, GENIE3 achieved the best performance on the hHEP dataset. These experimental results indicate that the predictive performance of DAZZLE cannot surpass that of GENIE3, which was published in 2010.

4. The comparison methods are GENIE3, GRNBoost2, and DeepSEM. The authors could include more studies published after 2021 for comparison.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: None

Reviewer #3: None

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013603.r003

Decision Letter 1

Saurabh Sinha, Jian Ma

10 Aug 2025

PCOMPBIOL-D-24-01735R1

Improved gene regulatory network inference from single cell data with dropout augmentation

PLOS Computational Biology

Dear Dr. Zhu,

Thank you for submitting your manuscript to PLOS Computational Biology.

Two of the original referees re-reviewed your revision: one of them approves (Reviewer 2) and the other (Reviewer 1) was not convinced. Reviewer 3 did not respond to the re-review request, so we obtained an additional opinion from a fourth reviewer who is favorable, with some minor suggestions. Given the mixed reviewer opinions and our own reading of your revision, we invite you to submit a revised manuscript addressing the following minor comments. Your responses will be verified by the editorial team, for a quicker turnaround time.

Reviewer 1: “Explain that each entry of the gene expression matrix represents the scRNAseq result, where the original value x is replaced by log(1+x). Otherwise, readers might wonder how 0 is handled.”

Reviewer 1: “PIDC method has a GitHub page: https://github.com/Tchanders/network_inference_tutorials. What do you mean by “this package is no longer available”?”

Reviewer 4: Comments 1, 2, 6.

(Please see reviewer comments below.)

Please submit your revised manuscript within 30 days, by Oct 10, 2025, 11:59 PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Saurabh Sinha

Academic Editor

PLOS Computational Biology

Jian Ma

Section Editor

PLOS Computational Biology

Journal Requirements:

1) We note that your Figures and Supplementary Figures files are duplicated on your submission. Please remove any unnecessary or old files from your revision, and make sure that only those relevant to the current version of the manuscript are included.

2) Your manuscript's sections are not in the correct order.  Please amend to the following order: Abstract, Introduction, Results, Discussion, and Methods.

3) We notice that your supplementary figures are uploaded with the file type 'Figure'. Please amend the file type to 'Supporting Information'. Please ensure that each Supporting Information file has a legend listed in the manuscript after the references list.

Note: If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Reviewers' comments:

Reviewer's Responses to Questions

Reviewer #1: Some comments in my previous round of review (especially those regarding math) were neither discussed in the rebuttal letter nor followed in the manuscript. Why?

The advantage of deep learning methods in GRN inference is still questionable. Besides, the math of DeepSEM is not quite reasonable. Therefore, DA as a simple improvement of DeepSEM (or other supervised learning methods) might not be sufficiently significant. I think DA does not really solve the zero inflation problem (see below), which also decreases its significance.

My suggestion is to revise it and submit it to a lower-level journal. Also, the authors should respect every review comment.

DA and zero inflation: Figure S1 shows that with or without zero inflation, DA can improve DeepSEM. However, DA cannot fully cancel the effect of zero inflation. The authors should lower the tone. Instead of the false impression that “DA can solve the zero inflation problem”, it is better to state that DA can improve performance even with zero inflation.

Development is impossible if the dynamics of gene expression (not the GRN) are at stationarity. The application of DAZZLE in Section 2.5 is not satisfactory, especially because there is only qualitative verification, not quantitative evaluation.

Explain that each entry of the gene expression matrix represents the scRNAseq result, where the original value x is replaced by log(1+x). Otherwise, readers might wonder how 0 is handled.

PIDC method has a GitHub page: https://github.com/Tchanders/network_inference_tutorials. What do you mean by “this package is no longer available”?

Reviewer #2: The authors have addressed my concerns and I have no further comments.

Reviewer #4: The manuscript introduces DAZZLE, a method that incorporates Dropout Augmentation to reconstruct gene regulatory networks using scRNA-seq data from human or mouse. This approach refines the existing DeepSEM framework into a new pipeline and benchmarks its performance against several ground truth GRNs from the popular BEELINE toolkit. The study presents a comprehensive evaluation using both ground truth and real-world datasets. The following minor comments will further strengthen the manuscript.

[Major comments]

1. The authors claimed improved efficiency and reduced computational time for DAZZLE. Line 108: “These changes lead to reduced model sizes and computational time”. And Lines 168-169: “a single pass of DeepSEM or DAZZLE can take several hours to compute, even on modern GPUs”. While it may not be feasible to provide detailed computational time or GPU specs, the authors could strengthen their claim by including example run times or comparative results to illustrate the efficiency gains.

2. “For datasets with a large number of genes and cells, a single pass of DeepSEM or DAZZLE can take several hours to compute, even on modern GPUs. In such cases, a single pass of DAZZLE offers a sufficiently-good solution at a more reasonable cost.” It is unclear whether the first mention of DAZZLE refers to DAZZLE-10x and the second to DAZZLE-1x. If so, please clarify, as the current phrasing appears contradictory.

3. In Section 2.5, in addition to literature evidence, the authors may consider databases such as TRRUST v2 and TFLink to query experimentally validated TF-gene interactions. This would allow for a more quantitative evaluation of DAZZLE’s performance by reporting how many ground truth TF-gene interactions were identified.

[Minor comments]

4. DeepSEM and DEEPSEM were used interchangeably in the manuscript.

5. Line 137, “celltype-specific ChiP-Seq” should be “celltype-specific ChIP-Seq”.

6. Lines 454-460, references are needed for this paragraph. For example, cisTarget and AUCell scores haven’t been mentioned in the manuscript.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: None

Reviewer #4: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #4: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013603.r005

Decision Letter 2

Saurabh Sinha, Ferhat Ay

9 Oct 2025

Dear Dr. Zhu,

We are pleased to inform you that your manuscript 'Improved gene regulatory network inference from single cell data with dropout augmentation' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Ferhat Ay, Ph.D

Section Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013603.r006

Acceptance letter

Saurabh Sinha, Ferhat Ay

PCOMPBIOL-D-24-01735R2

Improved gene regulatory network inference from single cell data with dropout augmentation

Dear Dr Zhu,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

For Research, Software, and Methods articles, you will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Additional ablation study shows the usefulness of Dropout Augmentation.

Certain proportions of data points (x-axis) were dropped to simulate background dropout noise at the very beginning.

    (TIFF)

    pcbi.1013603.s001.tiff (883.4KB, tiff)
    S2 Fig. Quality of inferred GRN from DeepSEM-1x: AUPRC Ratio as a function of the number of training iterations.

Quality may degrade quickly after the performance peak. The dashed line marks the recommended stopping point from DeepSEM. Note that for the celltype-specific data sets, performance is particularly volatile.

    (TIFF)

    pcbi.1013603.s002.tif (565.7KB, tif)
    S3 Fig. Distribution of 100 runs of DAZZLE-1x and DeepSEM-1x on BEELINE evaluated using the STRING network.

    Results from DAZZLE-1x tend to be more stable than results from DeepSEM-1x.

    (TIFF)

    pcbi.1013603.s003.tif (289.9KB, tif)
S1 Text. Additional supplementary information, including hyperparameter choices and additional benchmarks.

    (PDF)

    pcbi.1013603.s004.pdf (148.4KB, pdf)
    Attachment

    Submitted filename: DAZZLE_response_to_reviewers_5.31.pdf

    pcbi.1013603.s005.pdf (128.3KB, pdf)
    Attachment

    Submitted filename: response_to_reviewers_dazzle.pdf

    pcbi.1013603.s006.pdf (139.7KB, pdf)

    Data Availability Statement

The source code for this project (software + preprocessing scripts) is available at https://github.com/TuftsBCB/dazzle. The processed data used in this study is deposited at https://zenodo.org/records/15762315. The BEELINE benchmark data was obtained from the authors of DeepSEM but could be regenerated using the provided data and code from BEELINE (https://github.com/Murali-group/Beeline). The actual gene expression data files used in BEELINE came from GEO datasets with the following accession numbers: GSE81252 (hHEP), GSE75748 (hESC), GSE98664 (mESC), GSE48968 (mDC), and GSE81682 (mHSC). The Hammond microglia data was obtained from NCBI's GEO database under accession number GSE121654 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE121654).


    Articles from PLOS Computational Biology are provided here courtesy of PLOS

    RESOURCES