PLOS Computational Biology
. 2025 Sep 17;21(9):e1013462. doi: 10.1371/journal.pcbi.1013462

Reconstructing noisy gene regulation dynamics using extrinsic-noise-driven neural stochastic differential equations

Jiancheng Zhang 1,#, Xiangting Li 2,#, Xiaolu Guo 3,*,#, Zhaoyi You 4, Lucas Böttcher 5,6, Alex Mogilner 7, Alexander Hoffmann 3, Tom Chou 2,8,*, Mingtao Xia 9,*
Editor: Michael A Beer10
PMCID: PMC12513633  PMID: 40961166

Abstract

Proper regulation of cell signaling and gene expression is crucial for maintaining cellular function, development, and adaptation to environmental changes. Reaction dynamics in cell populations is often noisy because of (i) inherent stochasticity of intracellular biochemical reactions (“intrinsic noise”) and (ii) heterogeneity of cellular states across different cells that are influenced by external factors (“extrinsic noise”). In this work, we introduce an extrinsic-noise-driven neural stochastic differential equation (END-nSDE) framework that utilizes the Wasserstein distance to accurately reconstruct SDEs from stochastic trajectories measured across a heterogeneous population of cells (extrinsic noise). We demonstrate the effectiveness of our approach using both simulated and experimental data from three different systems in cell biology: (i) circadian rhythms, (ii) RPA-DNA binding dynamics, and (iii) NFκB signaling processes. Our END-nSDE reconstruction method can model how cellular heterogeneity (extrinsic noise) modulates reaction dynamics in the presence of intrinsic noise. It also outperforms existing time-series analysis methods such as recurrent neural networks (RNNs) and long short-term memory networks (LSTMs). By inferring cellular heterogeneities from data, our END-nSDE reconstruction method can reproduce noisy dynamics observed in experiments. In summary, the reconstruction method we propose offers a useful surrogate modeling approach for complex biophysical processes, where high-fidelity mechanistic models may be impractical.

Author summary

In this work, we propose extrinsic-noise-driven neural stochastic differential equations (END-nSDE) to reconstruct noisy regulated gene expression dynamics. One of our main contributions is that we generalize a recent Wasserstein-distance-based SDE reconstruction approach to incorporate extrinsic noise (parameters that vary across different cells). Our approach can thus capture intrinsic fluctuations in gene regulatory dynamics modulated by extrinsic noise (heterogeneity among cells), offering an advantage over deterministic models and outperforming other benchmarks. By inferring noise intensities from batches of experimental data, our END-nSDE can partially capture noisy experimental signaling dynamics and provides a surrogate model for biomolecular processes that are too complex to model directly.

1. Introduction

Reactions that control signaling and gene regulation are important for maintaining cellular function, development, and adaptation to environmental changes, which impact all aspects of biological systems, from embryonic development to an organism’s ability to sense and respond to environmental signals. Variations in gene regulation, arising from noisy biochemical processes [1,2], can result in phenotypic heterogeneity even in a population of genetically identical cells [3].

Noise within cell populations can be categorized as (i) “intrinsic noise,” which arises from the inherent stochasticity of biochemical reactions and quantifies, e.g., biological variability across cells in the same state [2,4,5], and (ii) “extrinsic noise,” which encompasses heterogeneities in environmental factors or differences in cell state across a population. A substantial body of literature has focused on quantifying intrinsic and extrinsic noise from experimental and statistical perspectives [1,2,6–13]. Experimental studies have specifically identified relevant sources of noise in various organisms, including Escherichia coli (E. coli), yeast, and mammalian systems [2,14–17].

Extrinsic noise is associated with uncertainties in biological parameters that vary across different cells. The distribution over physical and chemical parameters determines the observed variations in cell states, concentrations, locations of regulatory proteins and polymerases [1,2,18], and transcription and translation rates [19]. For example, extrinsic noise is the main contributor to the variability of concentrations of oscillating p53 protein levels across cell populations [20]. On the other hand, intrinsic noise, i.e., inherent stochasticity of cells in the same state, can limit the accuracy of expression and signal transmission [2,5]. Based on the law of mass action [21,22], ordinary differential equations (ODEs) apply only in some deterministic or averaged limit and do not take into account intrinsic noise. Therefore, stochastic models are necessary to accurately represent biological processes, such as thermodynamic fluctuations inherent to molecular interactions within regulatory networks [1,5,18] or random event times in birth-death processes.

Existing stochastic modeling methods that account for intrinsic noise include Markov jump processes [23,24] and SDEs [25–27]. These approaches are applicable to different system sizes: Markov jump processes provide exact descriptions for discrete molecular systems, while SDEs serve as continuous approximations to Markov processes when molecular abundances are sufficiently high. SDE approaches may not be suitable for gene expression systems with very low copy numbers, where discrete master equation descriptions are more accurate. However, SDE approaches become more appropriate when modeling protein dynamics or when gene regulatory interactions are modeled implicitly through Hill functions. Additionally, a hierarchical Markov model was designed in [28] for parameter inference in dual-reporter experiments to separate the contributions of extrinsic noise, intrinsic noise, and measurement error when both extrinsic and intrinsic noise are present. The described methods have been effective in the reconstruction of low-dimensional noisy biological systems. Discrete master-equation methods to model the evolution of probabilities in systems characterizing, e.g., gene regulatory dynamics [29–31], can be computationally expensive and usually require specific forms of a stochastic model with unknown parameters that need to be inferred. It is unclear whether such methods and their generalizations can be applied to more complex (e.g., higher-dimensional) systems for which a mechanistic description of the underlying biophysical dynamics is not available or impractical.

SDEs can capture both the mean dynamics (as ODEs do) and random fluctuations, offering a practical and scalable alternative to master equations in complex systems. Thus, we introduce an extrinsic-noise-driven neural stochastic differential equation (END-nSDE) reconstruction method that builds upon a recently developed Wasserstein distance (W2 distance) nSDE reconstruction method [32]. Our method is used to identify macromolecular reaction kinetics and cell signaling dynamics from noisy observational data in the presence of both extrinsic and intrinsic noise. A key question we address in this paper is how extrinsic noise that characterizes cellular heterogeneity influences the overall stochastic dynamics of the population.

The major differences between the approach presented here and prior work [32] are: (i) the inclusion of extrinsic noise into the framework allowing one to model cell-to-cell variability through parameter heterogeneity, and (ii) the ability of our method to learn the dependency of the SDE on those parameters, enabling reconstruction of a family of SDEs rather than a single SDE model. In contrast, the method developed in reference [32] focuses on reconstructing a single SDE without considering parameter variations or extrinsic noise sources. In Fig 1, we provide an overview of the specific applications that we study in this work.

Fig 1. Workflow of our proposed END-nSDE prediction on parameters altering stochastic dynamics.


A. Workflow for training and testing of the extrinsic-noise-driven neural SDE (END-nSDE). Predicted trajectories are simulated (see B) using a range of model parameters (see Sect 2.2) before splitting into training and testing sets (see Fig E in S1 Text for details on the splitting strategy). Model parameters and state variables serve as inputs to a neural network that reconstructs drift and diffusion terms (see C). Network weights are optimized by minimizing the Wasserstein distance (Eq 8) between the training set and predicted trajectories. B. Predicted trajectories are generated by the reconstructed SDE $d\hat{X}=\hat{f}(\hat{X};\omega)\,dt+\hat{\sigma}(\hat{X};\omega)\,dB_t$. C. The drift and diffusion functions, $\hat{f}$ and $\hat{\sigma}$, are approximated using parameterized neural networks. The parameterized neural-network-based drift function $\hat{f}(\hat{X};\omega)$ and diffusion function $\hat{\sigma}(\hat{X};\omega)$ take the system state $\hat{X}$ and biological parameters $\omega$ as inputs. D. Table of three examples illustrating the nSDE input, along with training and testing datasets. For the last (NFκB) example, a more detailed workflow for validation on experimental datasets is illustrated in Fig 8.

Our approach employs neural networks as SDE approximators in conjunction with the torchsde package [33,34] for reconstructing noisy dynamics from data. Previous work showed that for SDE reconstruction tasks, the W2 distance nSDE reconstruction method outperforms other benchmark methods such as generative adversarial networks [32,35]. Compared to other probabilistic metrics such as the KL divergence, the Wasserstein distance better incorporates the metric structure of the underlying space. This geometric property makes the Wasserstein distance particularly suitable for trajectory and image data on high-dimensional manifolds, where the supports of different distributions do not always overlap [36]. Additionally, the W2-distance-based nSDE reconstruction method can directly extract the underlying SDE from temporal trajectories without requiring specific mathematical forms of the terms in the underlying SDE model. We apply our END-nSDE methodology to three biological processes: (i) circadian clocks, (ii) RPA-DNA binding dynamics, and (iii) NFκB signaling to illustrate the effectiveness of the END-nSDE method in predicting how extrinsic noise modulates stochastic dynamics with intrinsic noise. Additionally, our method demonstrates superior performance compared to several time-series modeling methods including recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and Gaussian processes. In summary, the reconstruction method we propose provides a useful surrogate modeling approach for complex biophysical and biochemical processes, especially in scenarios where high-fidelity mechanistic models are impractical.

2. Methods and models

In this work, we extend the temporally decoupled squared W2-distance SDE reconstruction method proposed in Refs. [32,37] to reconstruct noisy dynamics across a heterogeneous cell population (“extrinsic noise”). Our goal is to not only reconstruct SDEs for approximating noisy cellular signaling dynamics from time-series experimental data, but to also quantify how heterogeneous biological parameters, such as enzyme- or kinase-mediated biochemical reaction rates, affect such noisy cellular signaling dynamics.

2.1. SDE reconstruction with heterogeneities in biological parameters

The W2-distance-based neural SDE reconstruction method proposed in [32] aims to approximate the SDE

 $dX(t)=f(X(t),t)\,dt+\sigma(X(t),t)\,d\mathbf{B}(t),\quad X(t)\in\mathbb{R}^d$, (1)

using an approximated SDE

 $d\hat{X}(t)=\hat{f}(\hat{X}(t),t)\,dt+\hat{\sigma}(\hat{X}(t),t)\,dB(t),\quad \hat{X}(t)\in\mathbb{R}^d$, (2)

where f^ and σ^ are two parameterized neural networks that approximate the drift and diffusion functions f and σ in Eq (1), respectively. These two neural networks are trained by minimizing a temporally decoupled squared W2-distance loss function

$\widetilde{W}_2^2(\mu,\hat{\mu})=\int_0^T\inf_{\pi\in\Pi(\mu(t),\hat{\mu}(t))}\mathbb{E}_\pi\!\left[\|X(t)-\hat{X}(t)\|^2\right]\mathrm{d}t$, (3)

where Π(μ(t),μ̂(t)) denotes the set of all coupling distributions π of two distributions μ(t),μ̂(t) on the probability space $\mathbb{R}^d$, and X(t) and X̂(t) are the observed trajectories at time t and trajectories generated by the approximate SDE model Eq (2) at time t, respectively. μ and μ̂ are the probability distributions associated with the stochastic processes $\{X(t)\}_{0\le t\le T}$ and $\{\hat{X}(t)\}_{0\le t\le T}$, respectively, while μ(t) and μ̂(t) are the probability distributions of X(t) and X̂(t) at a specific time t. A coupling π ∈ Π(μ(t),μ̂(t)) between μ(t) and μ̂(t) is defined by

$\pi(A,\mathbb{R}^d)=\mu(A),\quad \pi(\mathbb{R}^d,B)=\hat{\mu}(B),\quad \forall A,B\in\mathcal{B}(\mathbb{R}^d)$, (4)

where $\mathcal{B}(\mathbb{R}^d)$ is the Borel σ-algebra on $\mathbb{R}^d$ and $\mathbb{E}_\pi\!\left[\|X(t)-\hat{X}(t)\|^2\right]$ represents the expectation when $(X,\hat{X})\sim\pi$.

The $\widetilde{W}_2^2$ term in Eq (3) is the temporally decoupled squared W2 distance loss function; for simplicity, in this paper, we shall also refer to Eq (3) as the squared W2 loss. The infimum is taken over all possible coupling distributions π ∈ Π(μ(t),μ̂(t)), and ‖·‖ denotes the ℓ2 norm of a vector. That is,

$\|X(t)\|^2\equiv\sum_{i=1}^{d}|X_i(t)|^2$. (5)
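For scalar trajectories, the loss in Eq (3) can be estimated from finite samples by using the fact that the optimal coupling between two one-dimensional empirical distributions pairs sorted samples. The following is a minimal numerical sketch (our own illustrative code, not the authors' implementation; it assumes equal trajectory counts and a uniform time grid):

```python
import numpy as np

def decoupled_w2_squared(X, X_hat, dt):
    """Estimate the temporally decoupled squared W2 loss (Eq 3) for scalar data.

    X, X_hat: arrays of shape (n_traj, n_times) sampling the observed and
    model-generated processes on a uniform time grid with spacing dt.
    At each time slice, the optimal 1D coupling matches sorted samples.
    """
    Xs = np.sort(X, axis=0)
    Xh = np.sort(X_hat, axis=0)
    w2_sq_t = np.mean((Xs - Xh) ** 2, axis=0)  # squared W2 at each time point
    return float(np.sum(w2_sq_t) * dt)         # Riemann-sum approximation of the time integral
```

As a sanity check, shifting every generated trajectory by a constant c yields a loss of c²T, reflecting the metric's sensitivity to location differences between the two ensembles.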

Across different cells, extrinsic noise, i.e., cellular heterogeneities such as differences in kinase or enzyme abundances, can lead to variable, cell-specific gene regulatory dynamics. Such heterogeneous and stochastic gene expression (both intrinsic and extrinsic noise) can be modeled using SDEs with distributions of parameter values reflecting cellular heterogeneity. To address heterogeneities in gene dynamics across different cells, we propose an END-nSDE method that is able to reconstruct a family of SDEs for the same gene expression process under different parameters. Specifically, for a given set of (biological) parameters ω, we are interested in reconstructing

 dX(t;ω)=f(X(t;ω);ω)dt+σ(X(t;ω);ω)dB(t), (6)

using the approximate SDE

 dX^(t;ω)=f^(X^(t;ω);ω)dt+σ^(X^(t;ω);ω)dB^(t), (7)

in the sense that the errors $f(X(t;\omega);\omega)-\hat{f}(X(t;\omega);\omega)$ and $\sigma(X(t;\omega);\omega)-\hat{\sigma}(X(t;\omega);\omega)$ are minimized for all values of ω. In Eq (7), f̂ and σ̂ are represented by two parameterized neural networks that take both the state variable X̂ and the parameters ω as inputs. To train these two neural networks, we propose an extrinsic-noise-driven temporally decoupled squared W2 distance loss function

 $L(\Lambda)=\sum_{\omega\in\Lambda}\widetilde{W}_2^2\big(\mu(\omega),\hat{\mu}(\omega)\big)$, (8)

where μ(ω) and μ̂(ω) are the distributions of the trajectories $\{X(t;\omega)\}_{0\le t\le T}$ and $\{\hat{X}(t;\omega)\}_{0\le t\le T}$, and $\widetilde{W}_2^2$ is the temporally decoupled squared W2 loss function in Eq (3). Λ denotes the set of parameters ω. Eq (8) is different from the local squared W2 loss in Refs. [38,39] since we neither require a continuous dependence of $\{X(t;\omega)\}_{t\in[0,T]}$ on the parameter ω nor require that ω is a continuous variable. The extrinsic-noise-driven temporally decoupled squared W2 loss function Eq (8) takes into account both parameter heterogeneity and intrinsic fluctuations as a result of the Wiener processes B(t) and B̂(t) in Eqs (6) and (7).
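The outer sum in Eq (8) simply accumulates the per-parameter W2 loss over groups of trajectories sharing the same ω. A hedged sketch for the scalar case (the dictionary-based grouping and function names are our own convention, not the authors' code):

```python
import numpy as np

def end_nsde_loss(groups, dt):
    """Eq (8): sum the temporally decoupled squared W2 loss over parameter groups.

    groups: dict mapping each parameter tuple omega to a pair (X, X_hat) of
    arrays of shape (n_traj, n_times) holding observed and model-generated
    scalar trajectories (equal trajectory counts per group assumed).
    """
    total = 0.0
    for omega, (X, X_hat) in groups.items():
        Xs, Xh = np.sort(X, axis=0), np.sort(X_hat, axis=0)  # 1D optimal coupling
        total += float(np.sum(np.mean((Xs - Xh) ** 2, axis=0)) * dt)
    return total
```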

Our END-nSDE method is outlined in Fig 1A–1C. With observed noisy single-cell dynamic trajectories as the training data, we train two parameterized neural networks [40] by minimizing Eq (8) to approximate the drift and diffusion terms in the SDE. The reconstructed nSDE is a surrogate model of single-cell dynamics (see Fig 1B and 1C). The hyperparameters and settings for training the neural SDE model are summarized in Table A in S1 Text. Through the examples outlined in Fig 1D, we will show that our W2-distance-based method can yield very small errors $f-\hat{f}$ and $\sigma-\hat{\sigma}$ in the reconstructed drift and diffusion functions.

Algorithm 1 END-nSDE training and prediction framework.

Obtain training trajectories {X(t;ω)} (simulated or experimental time-series data). Maximum training epochs: i_max.

Preprocess the relevant training trajectories by grouping them according to different biophysical parameters ω.

Phase 1: Training

for i ≤ i_max do

  Input the initial state X0 and ωΛ into the END-nSDE to generate new predictions X^(t;ω).

  Calculate the loss function L(Λ) in Eq (8) and perform gradient descent to train the END-nSDE model.

end for

return the trained END-nSDE model

Phase 2: Prediction

Input initial condition X0 and corresponding noise parameters ω from testing data into the trained END-nSDE model.

Generate predicted trajectories X^(t;ω) from the learned model.

2.2. Biological models

We consider three biological examples where stochastic dynamics play a critical role and use our END-nSDE method to reconstruct noisy single-cell gene expression dynamics under both intrinsic and extrinsic noise (also summarized in Fig 1D). In these applications, we investigate the extent to which the END-nSDE can efficiently capture and infer changes in the dynamics driven by extrinsic noise.

2.2.1. Noisy oscillatory circadian clock model.

Circadian clocks, with a typical period of approximately 24 hours, are ubiquitous, intrinsically noisy biological rhythms generated at the single-cell molecular level [41].

We consider a minimal SDE model of the periodic gene dynamics responsible for per gene expression which is critical in the circadian cycle. Since per gene expression is subject to intrinsic noise [42], we describe it using a linear damped-oscillator SDE

$dx=-\alpha x\,dt-\beta y\,dt+\xi_{x1}\,dB_{1,t}+\xi_{x2}\,dB_{2,t},\qquad dy=\beta x\,dt-\alpha y\,dt+\xi_{y1}\,dB_{1,t}+\xi_{y2}\,dB_{2,t},$ (9)

where x and y are the dimensionless concentrations of the per mRNA transcript and the corresponding per protein, respectively. $B_{1,t}$ and $B_{2,t}$ are two independent Wiener processes, and the parameters α>0 and β>0 denote the damping rate and angular frequency, respectively. A stability analysis at the steady state (x,y)=(0,0) in the noise-free case ($\xi_x=\xi_y=0$ in Eq (9)) reveals that the real parts of the eigenvalues of the Jacobian matrix $\begin{pmatrix}-\alpha & -\beta\\ \beta & -\alpha\end{pmatrix}$ at (x,y)=(0,0) are all negative, indicating that the origin is a stable steady state when the system is noise-free. Noise prevents the state (x(t),y(t)) from remaining at (0,0); thus, fluctuations in the single-cell circadian rhythm are noise-induced [42].

To showcase the effectiveness of our proposed END-nSDE method, we consider different forms of the diffusion functions ξx and ξy in Eq (9), together with different values of the noise strength and of the correlation between the diffusion functions in the dynamics of x and y.
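For concreteness, training trajectories from the damped-oscillator SDE Eq (9) can be generated by Euler–Maruyama integration; the sketch below uses the constant correlated diffusion matrix σ0[[1, c], [c, 1]] (our own illustrative code; step count and default values are assumptions, not the authors' simulation settings):

```python
import numpy as np

def simulate_circadian(alpha=0.19, beta=0.21, sigma0=0.1, c=0.2,
                       x0=0.0, y0=1.0, T=1.0, n_steps=200, rng=None):
    """Euler-Maruyama integration of the damped-oscillator SDE with
    drift (-alpha*x - beta*y, beta*x - alpha*y) and a constant
    correlated diffusion matrix sigma0 * [[1, c], [c, 1]]."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    traj = np.empty((n_steps + 1, 2))
    traj[0] = (x0, y0)
    x, y = x0, y0
    for k in range(n_steps):
        dB1, dB2 = rng.normal(0.0, np.sqrt(dt), size=2)  # independent Wiener increments
        x_new = x + (-alpha * x - beta * y) * dt + sigma0 * (dB1 + c * dB2)
        y_new = y + (beta * x - alpha * y) * dt + sigma0 * (c * dB1 + dB2)
        x, y = x_new, y_new
        traj[k + 1] = (x, y)
    return traj
```

In the noise-free limit (sigma0 = 0), the simulated state spirals toward the stable fixed point at the origin, consistent with the stability analysis above.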

2.2.2. RPA-DNA binding model.

Regulation of gene expression relies on complex interactions between proteins and DNA, often described by the kinetics of binding and dissociation. Replication protein A (RPA) plays a pivotal role in various DNA metabolic pathways, including DNA replication and repair, through its dynamic binding with single-stranded DNA (ssDNA) [43–46]. By modulating the accessibility of ssDNA, RPA regulates multiple biological mechanisms and functions, acting as a critical regulator within the cell [47]. Understanding the dynamics of RPA-ssDNA binding is therefore a research area of considerable biological interest and significance.

Multiple binding modes and volume exclusion effects complicate the modeling of RPA-ssDNA dynamics. RPA first binds to ssDNA in 20-nucleotide (nt) mode, which occupies 20 nt of the ssDNA. When the subsequent 10 nt of ssDNA are free, a 20nt-mode RPA can transform to 30nt-mode, further stabilizing its binding to ssDNA, as illustrated in Fig 2. Occupied ssDNA is not available for other proteins to bind. Consequently, the gap size between adjacent ssDNA-bound RPAs determines the ssDNA accessibility to other proteins.

Fig 2. A continuous-time discrete Markov chain model for multiple RPA molecules binding to long ssDNA.


The possible steps in the biomolecular kinetics of multiple RPA molecules binding to ssDNA. The RPA in the free solution can bind to ssDNA with rate k1 provided there are at least 20 nucleotides (nt) of consecutive unoccupied sites. This bound “20nt mode” RPA unbinds with rate k−1. When space permits, the 20nt-mode RPA can extend and bind an additional 10nt of DNA at a rate of k2, converting it to a 30nt-mode bound protein. The 30nt-mode RPA transforms back to 20nt-mode spontaneously with the rate k−2. However, when the gap is not large enough to accommodate the RPA, the binding or conversion is prohibited (k1=0 and k2=0).

Mean-field mass-action type chemical kinetic ODE models cannot describe the process very well because they do not capture the intrinsic stochasticity. A stochastic model that tracks the fraction of two different binding modes of RPA, 20nt-mode (x1) and 30nt-mode (x2), has been developed to capture the dynamics of this process. A brute-force approach using forward stochastic simulation algorithms (SSAs) [48] was then used to fit the model to experimental data [47]. However, a key challenge in this approach is that the model is nondifferentiable with respect to the kinetic parameters, making it difficult to estimate parameters. Yet, simple spatially homogeneous stochastic chemical reaction systems can be well approximated by a corresponding SDE of the form given in Eq (1) when the variables are properly scaled in the large system size limit [49]. While interparticle interactions shown in Fig 2 make it difficult to find a closed-form SDE approximation, results from [49] motivate the possibility of an SDE approximation for the RPA-ssDNA binding model in terms of the variables x1 and x2.

Here, to address the non-differentiability issue associated with the underlying Markov process, we use our END-nSDE model to construct a differentiable surrogate for SSAs, allowing it to be readily trained from data. Further details on the models and data used in this study are provided in Appendix B of S1 Text. Throughout our analysis of RPA-DNA binding dynamics, we benchmark the SDE reconstructed by our extended W2-distance approach against those found using other time series analysis and reconstruction methods such as the Gaussian process, RNN, LSTM, and the neural ODE model. We show that our surrogate SDE model is most suitable for approximating the RPA-DNA binding process because it can capture the intrinsic stochasticity in the dynamics.
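The underlying Markov process of Fig 2 can be simulated by the Gillespie stochastic simulation algorithm. Below is a deliberately simplified SSA sketch (our own code with illustrative rate values, not the published model or its fitted parameters) that tracks bound RPAs as footprints on a 1D lattice and returns the coverage fractions x1 and x2:

```python
import numpy as np

def rpa_ssa(L=120, k1=1e-2, km1=1e-4, k2=1e-3, km2=1e-4, T=500.0, rng=None):
    """Minimal Gillespie (SSA) sketch of the lattice model in Fig 2.

    Each bound RPA is stored as [start, footprint], footprint 20 (20nt mode)
    or 30 (30nt mode). Returns event times and the ssDNA coverage fractions
    x1 (20nt mode) and x2 (30nt mode).
    """
    rng = np.random.default_rng() if rng is None else rng
    bound = []                       # bound RPAs as [start, footprint]
    t, times, x1s, x2s = 0.0, [0.0], [0.0], [0.0]
    while t < T:
        occ = np.zeros(L, dtype=bool)
        for s, fp in bound:
            occ[s:s + fp] = True
        # candidate binding positions: 20 consecutive free sites
        free20 = [s for s in range(L - 19) if not occ[s:s + 20].any()]
        # 20nt-mode RPAs with 10 free downstream sites can extend to 30nt mode
        ext = [i for i, (s, fp) in enumerate(bound)
               if fp == 20 and s + 30 <= L and not occ[s + 20:s + 30].any()]
        mode20 = [i for i, (_, fp) in enumerate(bound) if fp == 20]
        mode30 = [i for i, (_, fp) in enumerate(bound) if fp == 30]
        a = np.array([k1 * len(free20), km1 * len(mode20),
                      k2 * len(ext), km2 * len(mode30)])    # propensities
        if a.sum() == 0.0:
            break
        t += rng.exponential(1.0 / a.sum())                 # waiting time
        event = rng.choice(4, p=a / a.sum())
        if event == 0:                                      # bind in 20nt mode
            bound.append([int(rng.choice(free20)), 20])
        elif event == 1:                                    # unbind 20nt-mode RPA
            bound.pop(int(rng.choice(mode20)))
        elif event == 2:                                    # extend 20nt -> 30nt
            bound[int(rng.choice(ext))][1] = 30
        else:                                               # revert 30nt -> 20nt
            bound[int(rng.choice(mode30))][1] = 20
        n20 = sum(1 for _, fp in bound if fp == 20)
        times.append(t)
        x1s.append(20 * n20 / L)
        x2s.append(30 * (len(bound) - n20) / L)
    return np.array(times), np.array(x1s), np.array(x2s)
```

Because binding and extension are forbidden whenever the required window is occupied, the simulated coverage fractions always satisfy x1 + x2 ≤ 1; it is exactly this event-driven, non-differentiable structure that the END-nSDE surrogate replaces.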

2.2.3. NFκB signaling model.

Macrophages can sense environmental information and respond accordingly with stimulus-response specificity encoded in signaling pathways and decoded by downstream gene expression profiles [50]. The temporal dynamics of NFκB, a key transcription factor in immune response and inflammation, encodes stimulus information [51]. NFκB targets and regulates vast immune-related genes [52–54]. While NFκB signaling dynamics are stimulus-specific, they exhibit significant heterogeneity across individual cells under identical conditions [51]. Understanding how specific cellular heterogeneity (extrinsic noise) contributes to heterogeneity in NFκB signaling dynamics can provide insight into how noise affects the fidelity of signal transduction in immune cells.

A previous modeling approach employs a 52-dimensional ODE system to quantify the NFκB signaling network [51] and recapitulate the signaling dynamics of a representative cell. This ODE model includes 52 molecular entities and 47 reactions across a TNF-receptor module, an adaptor module, and a core module with an NFκB-IKK-IκBα (IκBα is an inhibitor of NFκB, while IKK is the IκB kinase complex that regulates IκBα degradation) feedback loop (see Fig 3) [55]. However, such an ODE model is deterministic and assumes no intrinsic fluctuations in the biomolecular processes. Yet, from experimental data, the NFκB signaling dynamics fluctuate strongly; such fluctuations cannot be quantitatively described by any deterministic ODE model. Due to the system’s high dimensionality and nonlinearity, it is challenging to quantify how intrinsic noise influences temporal coding in NFκB dynamics.

Fig 3. Simplified schematic of the NFκB Signaling Network.


TNF binds its receptor, activating IKK, which degrades IκBα and releases NFκB. The free NFκB translocates to the nucleus and promotes IκBα transcription. Newly synthesized IκBα then binds NFκB and exports it back to the cytoplasm. Red arrows indicate noise that we consider in the corresponding SDE system.

To incorporate the intrinsic noise within the NFκB signaling network, we introduce noise terms into the 52-dimensional ODE system to build an SDE that can account for the observed temporally fluctuating nuclear NFκB trajectories. While NFκB signaling pathways involve many variables, experimental constraints limit the number of measurable components. Among these, nuclear NFκB concentration is the most direct and critical experimental readout. As a minimal stochastic model, we hypothesize that only the biophysical and biochemical processes of NFκB translocation (which directly affects experimental measurements) and IκBα transcription (a key regulator of NFκB translocation) are subject to Brownian-type noise (red arrows in Fig 3), as these processes play crucial roles in the oscillatory dynamics of NFκB [55].

The intensity of Brownian-type noise in the NFκB dynamics may depend on factors such as cell volume (smaller volumes result in higher noise intensity), or copy number (lower copy numbers lead to greater noise intensity), and is therefore considered a form of extrinsic noise. Noise intensity parameters thus capture an aspect of cellular heterogeneity. There are other sources of cellular heterogeneity, such as variations in kinase or enzyme abundances, which are too complicated to model and are thus not included in the current model. For simplicity, all kinetic parameters, except for the noise intensity (σ), are assumed to be consistent with those of a representative cell [55]. The 52-dimensional ODE model for describing NFκB dynamics is given in Refs. [51,56]. We extend this model by adding noise to the dynamics of the sixth, ninth, and tenth ODEs of the 52-dimensional ODE model. We retain 49 ODEs but convert the equations for the sixth, ninth, and tenth components to SDEs:

$$\begin{aligned}
du_6 &= \Big(k_{\mathrm{basal}}+k_{\max}\,\frac{u_{52}^{\,n_{\mathrm{NF\kappa B}}}}{u_{52}^{\,n_{\mathrm{NF\kappa B}}}+K_{\mathrm{NF\kappa B}}^{\,n_{\mathrm{NF\kappa B}}}}-k_{\mathrm{deg}}u_6\Big)\,dt+\sigma_1\,dB_{1,t},\\
du_9 &= \Big({-k_{\mathrm{imp}}}u_9-k_{a\text{-}\mathrm{I\kappa B}\text{-}\mathrm{NF\kappa B}}\,u_2u_9-k_{\mathrm{deg}\text{-}\mathrm{NF\kappa B}}\,u_9+v^{-1}k_{\mathrm{exp}}u_{10}+k_{d\text{-}\mathrm{I\kappa B}\text{-}\mathrm{NF\kappa B}}\,u_4+k_{\mathrm{phos}}u_7\Big)\,dt-\sigma_2\,dB_{2,t},\\
du_{10} &= \Big({-k_{\mathrm{exp}}}u_{10}-k_{a\text{-}\mathrm{I\kappa B}\text{-}\mathrm{NF\kappa B}}\,u_3u_{10}+v\,k_{\mathrm{imp}}u_9+k_{d\text{-}\mathrm{I\kappa B}\text{-}\mathrm{NF\kappa B}}\,u_5\Big)\,dt+\sigma_2\,dB_{2,t}.
\end{aligned}$$ (10)

In Eq (10), u2 is the concentration of IκBα in the cytoplasm; u3 is the concentration of IκBα in the nucleus; u4 is the concentration of the IκBα-NFκB complex; u5 is the concentration of the IκBα-NFκB complex in the nucleus; u6 is the mRNA of IκBα; u7 is the IKK-IκBα-NFκB complex; u9 is the cytoplasmic NFκB concentration; u10 is the nuclear NFκB concentration; and u52 is the nuclear concentration of NFκB with RNA polymerase II that is ready to initiate mRNA transcription. A description of the parameters and their typical values are given in Table C in S1 Text. The quantities σ1dB1,t and σ2dB2,t are noise terms associated with IκBα transcription and NFκB translocation, respectively. The remaining variables are latent variables and their dynamics are regulated via the remaining 49-dimensional ODE in Refs. [51,56]. The activation of NFκB is quantified by the total nuclear NFκB concentration (u5+u10), which is also measured in experiments.
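The deterministic part of the IκBα mRNA equation in Eq (10) combines basal transcription, Hill-type NFκB-activated transcription, and first-order degradation. A small sketch of this drift term (parameter values in the test are illustrative placeholders; the actual values are given in Table C in S1 Text):

```python
def ikba_mrna_drift(u6, u52, k_basal, k_max, K_nfkb, n_nfkb, k_deg):
    """Drift of the IkBa mRNA equation (first line of Eq 10):
    basal transcription + NFkB-driven Hill activation - degradation."""
    hill = u52**n_nfkb / (u52**n_nfkb + K_nfkb**n_nfkb)  # NFkB-driven activation
    return k_basal + k_max * hill - k_deg * u6
```

At u52 = K_nfkb, the Hill term sits at half-maximal activation, which is the standard interpretation of the Michaelis-type constant in this transcription module.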

Within this example, we wish to determine if our proposed parameter-associated nSDE can accurately reconstruct the dynamics underlying experimentally observed NFκB trajectory data.

3. Results

3.1. Accurate reconstruction of circadian clock dynamics

As an illustrative example, we use the W2-distance nSDE reconstruction method to first reconstruct the minimal model for damped oscillatory circadian dynamics (see Eq (9)) under different forms of the diffusion function. We set the two parameters α=0.19 and β=0.21 in Eq (9) and impose three different forms for the diffusion functions ξx1,ξx2,ξy1,ξy2: a constant diffusion function [57], a Langevin [58] diffusion function, and a linear diffusion function [59]. These functions, often used to describe fluctuating biophysical processes, are

 const: $\begin{pmatrix}\xi_{x1} & \xi_{x2}\\ \xi_{y1} & \xi_{y2}\end{pmatrix}=\sigma_0\begin{pmatrix}1 & c\\ c & 1\end{pmatrix}$, (11)
 Langevin: $\begin{pmatrix}\xi_{x1} & \xi_{x2}\\ \xi_{y1} & \xi_{y2}\end{pmatrix}=\sigma_0\begin{pmatrix}|x| & c|y|\\ c|x| & |y|\end{pmatrix}$, (12)

and

 linear: $\begin{pmatrix}\xi_{x1} & \xi_{x2}\\ \xi_{y1} & \xi_{y2}\end{pmatrix}=\sigma_0\begin{pmatrix}x & c|y|\\ c|x| & y\end{pmatrix}$. (13)

There are two additional parameters in Eqs (11), (12), and (13): σ0, which determines the intensity of the Brownian-type fluctuations, and c, which controls the correlation of fluctuations between the two dimensions. For each type of diffusion function, we trained a different nSDE model, each of which takes the state variables (x,y) and the two parameters (c,σ0) as inputs and outputs the values of the reconstructed drift and diffusion functions.
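The three ground-truth diffusion families of Eqs (11)-(13) can be collected into a single function of the state and the extrinsic parameters (σ0, c); a small sketch (function and keyword names are our own):

```python
import numpy as np

def diffusion_matrix(kind, x, y, sigma0, c):
    """Ground-truth diffusion matrices of Eqs (11)-(13)."""
    if kind == "const":
        m = [[1.0, c], [c, 1.0]]
    elif kind == "langevin":
        m = [[abs(x), c * abs(y)], [c * abs(x), abs(y)]]
    elif kind == "linear":
        m = [[x, c * abs(y)], [c * abs(x), y]]
    else:
        raise ValueError(f"unknown diffusion type: {kind}")
    return sigma0 * np.array(m)
```

Note that only the linear family (Eq 13) preserves the sign of the state on its diagonal, which is one way its reconstruction differs from the Langevin family.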

We take 25 combinations $(\sigma_0,c)\in\{(0.1+0.05i,\,0.2+0.2j):\ i\in\{0,\dots,4\},\ j\in\{0,\dots,4\}\}$; for each form of the diffusion functions $(\xi_{x1},\xi_{x2},\xi_{y1},\xi_{y2})$ and each parameter combination, we generate 50 trajectories from the ground-truth SDE Eq (9) as training data with $t\in[0,1]$. The initial condition is set as (x(0),y(0))=(0,1). To test the accuracy of the reconstructed diffusion and drift functions, we measure the following relative errors:

 $$\text{Error in } f=\frac{1}{M}\sum_{i=1}^{M}\frac{\sum_{j=0}^{T}\big\|f(X_i(t_j;\omega);\omega)-\hat{f}(X_i(t_j;\omega);\omega)\big\|_1}{\sum_{j=0}^{T}\big\|f(X_i(t_j;\omega);\omega)\big\|_1}, \quad (14)$$
 $$\text{Error in } \sigma=\frac{\sum_{i=1}^{M}\sum_{j=0}^{T}\big\|\sigma\sigma^{\mathsf T}(X_i(t_j;\omega);\omega)-\hat{\sigma}\hat{\sigma}^{\mathsf T}(X_i(t_j;\omega);\omega)\big\|_m}{\sum_{i=1}^{M}\sum_{j=0}^{T}\big\|\sigma\sigma^{\mathsf T}(X_i(t_j;\omega);\omega)\big\|_m}. \quad (15)$$

Here, $f=(-\alpha x-\beta y,\ \beta x-\alpha y)^{\mathsf T}$ is the vector of ground-truth drift functions and $\hat{f}$ is the reconstructed drift function. σ is the matrix of ground-truth diffusion functions $[\xi_{x1},\xi_{x2};\xi_{y1},\xi_{y2}]$ given in Eqs (11), (12), and (13). M is the number of training samples, $\|\cdot\|_1$ denotes the ℓ1 norm of a vector, and the matrix norm is $\|A\|_m\equiv\sum_{i=1}^{m}\sum_{j=1}^{n}|A_{ij}|$ for a matrix $A\in\mathbb{R}^{m\times n}$. The errors are measured separately for different parameters ω ≡ (σ0,c).
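The relative drift error of Eq (14) can be evaluated directly on the saved sample trajectories; a sketch (function names and the list-of-arrays layout are our own conventions):

```python
import numpy as np

def drift_error(f_true, f_hat, trajs, omega):
    """Relative l1 error of the reconstructed drift (Eq 14).

    trajs: list of M arrays, each of shape (n_times, d), sampling one
    trajectory at the saved time points; f_true and f_hat map
    (state, omega) to the d-dimensional drift vector.
    """
    ratios = []
    for X in trajs:
        num = sum(np.abs(f_true(x, omega) - f_hat(x, omega)).sum() for x in X)
        den = sum(np.abs(f_true(x, omega)).sum() for x in X)
        ratios.append(num / den)
    return float(np.mean(ratios))   # average over the M sample trajectories
```

A reconstruction that uniformly overestimates the drift by 10% yields a relative error of exactly 0.1 under this metric, which makes the reported values easy to interpret.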

The errors in the reconstructed drift function f̂ and diffusion function σ̂ as well as the temporally decoupled squared W2 loss Eq (3) associated with different forms of the diffusion function and different values of (σ0,c) are shown in Fig 4. When the diffusion function is a constant Eq (11), the mean reconstruction error of the drift function is 0.15, the mean reconstruction error of the diffusion function is 0.16, and the mean temporally decoupled squared W2 loss between the ground truth trajectories and the predicted trajectories is 0.074 (averaged over all sets of parameters (σ0,c)). When a Langevin-type diffusion function Eq (12) is used as the ground truth, the mean errors for the reconstructed drift and diffusion functions are 0.069 and 0.29, respectively, and the mean temporally decoupled squared W2 loss between the ground truth and predicted trajectories is 0.020. For a linear-type diffusion function as the ground truth, mean reconstruction errors of the drift and diffusion functions are 0.19 and 0.41, respectively, and the mean temporally decoupled squared W2 distance is 0.013. For all three forms of diffusion, our END-nSDE method can accurately reconstruct the drift function $(-\alpha x-\beta y,\ \beta x-\alpha y)$ (see Fig 4D–4F). When the diffusion function is a constant, our END-nSDE model can also accurately reconstruct this constant (see Fig 4G). When the diffusion function takes a more complicated form such as the Langevin-type diffusion function Eq (12) or the linear-type diffusion function Eq (13), the reconstructed nSDE model can still approximate the diffusion function well for most combinations of (σ0,c), especially when the correlation c>0.2 (see Fig 4H and 4I). Overall, our proposed END-nSDE model can accurately reconstruct the minimal stochastic circadian dynamical model Eq (9) in the presence of extrinsic noise (different values of (σ0,c)); the accuracy of the reconstructed drift and diffusion functions is maintained for most combinations of (σ0,c).
While the drift function is reconstructed with high accuracy, the reconstructed diffusion function exhibits larger relative errors, particularly for models with more complex diffusion forms. How these errors depend on the functional form of the diffusion remains to be investigated.

Fig 4. Reconstructing the circadian model using END-nSDE.


Temporally decoupled squared W2 losses Eq (3) and errors in the reconstructed drift and diffusion functions for different types of the diffusion function and different values of (σ0,c). A-C. The temporally decoupled squared W2 loss between the ground truth trajectories and the trajectories generated by the reconstructed nSDEs for the constant-type diffusion function Eq (11), the Langevin-type diffusion function Eq (12), and the linear-type diffusion function Eq (13). D-F. Errors in the reconstructed drift function for the three different types of ground truth diffusion functions. G-I. Errors in the reconstructed diffusion function for the three different types of ground truth diffusion functions.

To investigate how the strengths of the extrinsic and intrinsic noise affect our reconstruction of extrinsic-noise-driven SDEs, we conduct an additional test on the reconstruction of circadian clock dynamics. We generate training trajectories from a revised version of Eq (9):

dx = (αx − βy) dt + (σ0 + σ1k1) x dB1,t,
dy = (βx − αy) dt + (σ0 + σ1k2) y dB2,t. (16)

In Eq (16), for each set of (σ0,σ1), we generate 25 groups of (k1,k2) ∈ {0,±0.5,±1}×{0,±0.5,±1}, with each group containing 50 trajectories as training data. To train the neural SDE model, both the state variables (x,y) and (k1,k2) are input into the neural SDE. σ0 characterizes the average level of intrinsic noise while σ1 represents the strength of extrinsic noise, and we use different values of (σ0,σ1). As shown in Fig 5B and 5C, errors in the reconstructed drift function and in the reconstructed diffusion function, averaged over all different sets of (k1,k2), increase with both σ0 and σ1. Specifically, an increase in the intrinsic noise level (σ0) reduces the reconstruction accuracy more than an increase in the extrinsic noise (σ1) does. Further analysis of how variations in intrinsic and extrinsic noise affect the accuracy of the drift and diffusion functions reconstructed by our END-nSDE method is a promising direction.
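Training trajectories for Eq (16) can be generated with a standard Euler-Maruyama scheme; the following is a minimal sketch in which the step size, horizon, and parameter values are illustrative choices rather than the values used in the paper, and (k1,k2) play the role of per-cell extrinsic-noise covariates:

```python
import numpy as np

def simulate_eq16(alpha, beta, sigma0, sigma1, k1, k2,
                  x0=1.0, y0=0.0, dt=1e-3, n_steps=2000, seed=0):
    """Euler-Maruyama integration of Eq (16):
      dx = (alpha*x - beta*y) dt + (sigma0 + sigma1*k1) x dB1,
      dy = (beta*x - alpha*y) dt + (sigma0 + sigma1*k2) y dB2.
    Returns an array of shape (n_steps + 1, 2) holding (x, y) over time.
    """
    rng = np.random.default_rng(seed)
    x, y = x0, y0
    traj = np.empty((n_steps + 1, 2))
    traj[0] = (x, y)
    s1, s2 = sigma0 + sigma1 * k1, sigma0 + sigma1 * k2
    for i in range(n_steps):
        dB1, dB2 = rng.normal(scale=np.sqrt(dt), size=2)
        x, y = (x + (alpha * x - beta * y) * dt + s1 * x * dB1,
                y + (beta * x - alpha * y) * dt + s2 * y * dB2)
        traj[i + 1] = (x, y)
    return traj

# one sample cell: a weakly damped noisy oscillation
traj = simulate_eq16(alpha=-0.1, beta=2 * np.pi, sigma0=0.05,
                     sigma1=0.02, k1=0.5, k2=-0.5)
```

Drawing (k1,k2) once per cell and holding them fixed along the trajectory is what distinguishes this extrinsic noise from the intrinsic Brownian increments dB1,t and dB2,t.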

Fig 5.

Fig 5

Average temporally decoupled squared W2 losses Eq (3) and errors in the reconstructed drift and diffusion functions for different choices of intrinsic noise strength and extrinsic noise strength (σ0,σ1) in Eq (16).

3.2. Accurate approximation of interacting DNA-protein systems with different kinetic parameters

To construct a differentiable surrogate for stochastic simulation algorithms (SSAs), the neural SDE model should be able to take kinetic parameters as additional inputs. Thus, the original W2-distance SDE reconstruction method in [32] can no longer be applied, because the trained neural SDE model cannot take into account extrinsic noise [60], i.e., different values of the kinetic parameters. Specifically, we vary one parameter (the conversion rate k2 from 20nt-mode RPA to 30nt-mode RPA) in the stochastic model and then apply our END-nSDE method, which takes the state variables and the kinetic parameter k2 as inputs. We set k2 ∈ {10^(−4+j/10), j=0,...,25}, with the other parameters taken from experiments [47] (k1 = 10^3 s^−1, k−1 = 10^−6 s^−1, k−2 = 10^−6 s^−1; see Fig 2). For each k2, we generate 100 trajectories and use 50 for the training set and the other 50 for the testing set. Each trajectory encodes the dynamics of the fraction of 20nt-mode DNA-bound RPA, x1(t), and the fraction of 30nt-mode DNA-bound RPA, x2(t).
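The key architectural point, that a single network receives the kinetic parameter alongside the state so one model covers all parameter values, can be sketched as follows. The class name, layer sizes, and random (untrained) weights are illustrative assumptions; in practice the weights are fit by minimizing the W2-type loss:

```python
import numpy as np

class ParamDrift:
    """Tiny MLP drift f(x1, x2; lg_k2): the kinetic parameter is simply
    concatenated with the state, so a single network is shared across
    all values of k2. Weights here are random and untrained."""

    def __init__(self, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.5, size=(3, hidden))  # [x1, x2, lg_k2]
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(scale=0.5, size=(hidden, 2))  # d(x1, x2)/dt
        self.b2 = np.zeros(2)

    def __call__(self, state, lg_k2):
        z = np.concatenate([np.atleast_1d(state), [lg_k2]])
        h = np.tanh(z @ self.W1 + self.b1)
        return h @ self.W2 + self.b2

f = ParamDrift()
out = f([0.3, 0.1], -2.5)  # the same network handles any lg_k2
```

Conditioning on lg k2 rather than k2 itself is a natural choice here because the swept values span several decades.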

When approximating the dynamics underlying the RPA-DNA binding process, we compare our SDE reconstruction method with other benchmark time-series analysis or reconstruction approaches, including the RNN, LSTM, Gaussian process, and the neural ODE model [61,62]. These benchmarks are described in detail in Appendix C in S1 Text.

The extrinsic-noise-driven temporally decoupled squared W2 distance loss Eq (8) between the distribution of the ground truth trajectories and the distribution of the trajectories predicted by our END-nSDE reconstructed model is the smallest among all methods (shown in Table 1). The underlying reason is that an SDE well approximates the continuum limit of the Markov counting process underlying the RPA-DNA binding process [49]. The RNN and LSTM models do not capture the intrinsic fluctuations of the counting process. The neural ODE model is deterministic and cannot capture the stochasticity of the RPA-DNA binding dynamics. Additionally, the Gaussian process can only accurately approximate linear SDEs, which is not an appropriate form for an SDE describing the RPA-DNA binding process.

Table 1. The extrinsic-noise-driven temporally decoupled squared W2 distance Eq (8) between the ground truth and predicted trajectories generated by different models on the testing set.

Model Loss
END-nSDE 0.0006
LSTM 0.062
RNN 0.087
nODE 0.0012
Gaussian Process 0.0010

In Fig 6A and 6B, we plot the predicted trajectories obtained by the trained neural SDE model for two different values, lg k2 = −4 and lg k2 = −1.5. In fact, for all values of k2, trajectories generated by our END-nSDE method match the ground truth trajectories well on the testing set, as the temporally decoupled squared W2 loss remains small for all k2 (shown in Fig 6C). This demonstrates the ability of our method to capture the dependence of the stochastic dynamics on biochemical kinetic parameters.

Fig 6. Reconstructed trajectories of the RPA-DNA binding model.

Fig 6

A. Sample ground truth and reconstructed trajectories evaluated at lg k2 = −4, where we use the convention that lg = log10. B. Sample ground truth and reconstructed trajectories evaluated at lg k2 = −1.5. C. Temporally decoupled squared W2 distances (see Eq (8)) between the ground truth and reconstructed trajectories evaluated at different lg k2 values. In A and B, blue and red trajectories represent the filling fractions of DNA by 20nt-mode and 30nt-mode RPA, respectively. The dashed lines represent the predicted trajectories, and the solid lines represent the ground truth. Throughout the figure, the data are generated by a single neural SDE model that accepts the conversion rate k2 as a parameter and outputs the trajectories.

3.3. Reconstructing high-dimensional NFκB signaling dynamics from simulated and experimental data

Finally, we evaluate the effectiveness of the END-nSDE framework in reconstructing high-dimensional NFκB signaling dynamics under varying noise intensities and investigate its performance in reconstructing experimentally measured noisy NFκB dynamics. The procedure is divided into two parts. First, we train and test our END-nSDE method on synthetic data generated by the NFκB SDE model Eq (10) under different noise intensities (σ1,σ2). Second, we test whether the trained END-nSDE can reproduce the experimentally measured dynamic trajectories.

3.3.1. Reconstructing a 52-dimensional stochastic model for NFκB dynamics.

For training END-nSDE models, we first generated synthetic data from the 52-dimensional SDE model of NFκB signaling dynamics (Eq (10)), which is based on established models [51,56]. The synthetic trajectories are generated under 121 combinations of noise intensity (σ1,σ2) in Eq (10) (see Appendix D of S1 Text). The resulting NFκB trajectories vary depending on noise intensity, with low-intensity noise producing more consistent dynamics across cells (see Fig 7A) and higher-intensity noise yielding more heterogeneous dynamics (see Fig 7B). The simulated ground truth trajectories are split into training and testing datasets (see Appendix E in S1 Text for details). Specifically, we excluded 25 combinations of noise intensities (σ1,σ2) from the training set in order to test the generalizability of the trained neural SDE model to unseen noise intensities.

Fig 7. Reconstruction of NFκB signaling dynamics.

Fig 7

A. Sample trajectories of nuclear NFκB concentration as a function of time with σ1=10^−3.2, σ2=10^−2.5. B. Sample trajectories of nuclear NFκB concentration as a function of time with σ1=10^−2.2, σ2=10^−1.5. C. Reconstructed nuclear NFκB trajectories generated by the trained neural SDE versus the ground truth nuclear NFκB trajectories under noise intensities σ1=10^−3.2, σ2=10^−2.5 in Eq (10). D. Reconstructed nuclear NFκB trajectories generated by the trained neural SDE versus the ground truth nuclear NFκB trajectories under noise intensities σ1=10^−2.2, σ2=10^−1.5. E. The squared W2 distance between the distributions of the predicted trajectories and ground truth trajectories on the training set under different noise strengths (σ1,σ2). For training, we randomly selected 50% of the sample trajectories in 80 combinations of noise strengths (σ1,σ2) as the training dataset. Blank cells indicate that the corresponding parameter set is not included in the training set. F. Validation of the trained model by evaluating the squared W2 distance between the distributions of predicted trajectories and ground truth trajectories on the validation set.

Next, as detailed in Appendix E of S1 Text, we trained a 52-dimensional neural SDE model using our END-nSDE method on synthetic trajectories. The loss function is based on the W2 distance between the distributions of the simulated nuclear IκBα-NFκB complex and nuclear NFκB activities (u5(t) and u10(t) in Eq (10), respectively) and the corresponding END-nSDE predictions. The remaining 50 variables of the NFκB system were treated as latent variables, as they are not directly included in the loss function calculation.

Although the NFκB dynamics vary under different noise intensities (σ1,σ2), the trajectories generated by our trained neural SDE closely align with the ground truth synthetic NFκB dynamics under different noise intensities (σ1,σ2) (see Fig 7C and 7D). The neural SDE model demonstrates greater accuracy in reconstructing NFκB dynamics when the noise in IκBα transcription (σ1) is smaller, as evidenced by the reduced squared W2 distance between the predicted and ground-truth trajectories on both the training and validation sets (see Fig 7E and 7F). The temporally decoupled squared W2 loss Eq (8) on the validation set is close to that on the training set for different values of noise intensities (σ1,σ2). The mean squared W2 distance across all combinations of noise intensities (σ1,σ2) is 0.0013 for the training set, and the validation set shows a mean squared W2 distance of 0.0017.

Since the loss function for this application involves only two variables out of 52, we also tested whether the “full” 52-dimensional NFκB system can be effectively modeled by a two-dimensional neural SDE. After training, we found that the reduced model was insufficient for reconstructing the full 52-dimensional dynamics, as it disregarded the 50 latent variables not included in the loss function (see Fig D in Appendix F of S1 Text). This result underscores the importance of incorporating latent variables from the system, even when they are not explicitly included in the loss function.

3.3.2. Reproducing NFκB data with a trained END-nSDE.

We assessed whether our proposed END-nSDE can accurately reconstruct the experimentally measured NFκB dynamic trajectories. For simplicity and feasibility, we tested the END-nSDE under two assumptions: (1) all cells share the same drift function, and (2) cells whose trajectories deviate similarly from their ODE predictions share the same noise intensities. Based on these assumptions, we developed the following workflow (see Fig 8):

Fig 8. Workflow of reconstructing experimental data via END-nSDE.

Fig 8

Workflow for reconstructing experimental data using the trained parameterized nSDE and the parameter-inference neural network (NN). The boxes on the left outline the steps of the experimental data reconstruction process, while the boxes on the right illustrate the corresponding results at each step.

  1. We used experimentally measured single-cell trajectories of NFκB concentration, obtained through live-cell image tracking of macrophages from mVenus-tagged RelA mice with one frame every five minutes [63], yielding a total of 31 consecutive time points. These trajectories correspond to the sum of the nuclear IκBα-NFκB complex and nuclear NFκB concentrations in the 52D SDE model (u5(t) and u10(t) in Eq (10)).

  2. The experimental dataset was divided into subgroups. Cosine similarity was calculated between the ODE-generated trajectory (representative-cell NFκB dynamics) and the experimental trajectories. The trajectories were then ranked and divided into groups based on their cosine similarity with the trajectory generated by the ODE model [64]. Experimental trajectories with higher similarity to the ODE trajectory are expected to exhibit smaller intrinsic fluctuations, corresponding to lower noise intensities (see Appendix G in S1 Text for details).

  3. Each group of experimental trajectories was input into the trained neural network (see the next paragraph for more details) to infer the corresponding noise intensities (σ1,σ2). For simplicity, we assume that trajectories within each group share the same noise intensities.

  4. The inferred noise intensities were then used as inputs to the trained END-nSDE to simulate NFκB trajectories.

  5. The simulated trajectories were compared with the corresponding experimental data to evaluate the model’s performance.
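Step (2) of this workflow, ranking trajectories by cosine similarity with the representative ODE trajectory and splitting them into groups, can be sketched as follows; the function name, the synthetic data, and the row-wise trajectory layout are illustrative assumptions:

```python
import numpy as np

def group_by_similarity(ode_traj, exp_trajs, group_size=32):
    """Rank experimental trajectories by cosine similarity with the
    representative ODE trajectory, then split the ranking into
    consecutive groups of equal size.

    ode_traj: (n_times,); exp_trajs: (n_cells, n_times).
    Returns the similarity of each cell and a list of index groups,
    most similar group first.
    """
    ref = ode_traj / np.linalg.norm(ode_traj)
    sims = exp_trajs @ ref / np.linalg.norm(exp_trajs, axis=1)
    order = np.argsort(-sims)                      # most similar first
    n_groups = len(order) // group_size
    groups = [order[i * group_size:(i + 1) * group_size]
              for i in range(n_groups)]
    return sims, groups

# synthetic stand-in: an oscillatory "ODE" trace plus per-cell noise
rng = np.random.default_rng(1)
ode = np.sin(np.linspace(0, 2 * np.pi, 31))
cells = ode + rng.normal(scale=0.3, size=(96, 31))
sims, groups = group_by_similarity(ode, cells)
```

Cells in earlier groups track the deterministic trace more closely, which is the basis for expecting them to carry lower intrinsic noise.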

To estimate noise intensities from different groups of experimentally measured single-cell nuclear NFκB trajectories (step (3) in the proposed workflow), we trained another neural network to predict the corresponding IκBα transcription and NFκB translocation noise intensities from the groups of NFκB trajectories in the synthetic training data, similar to the approach taken in [65]. The trained neural network can then be used for predicting noise intensities in the validation set (see Appendix H in S1 Text for technical details).
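As a schematic of this parameter-inference step, each group can be summarized by features that reflect its intrinsic variability and mapped to the noise intensities. The sketch below is an assumption-laden stand-in: it uses per-time-point standard deviations as features and a linear least-squares map in place of the neural network, with purely synthetic data in which two noise sources act on different time segments:

```python
import numpy as np

def group_features(group):
    """group: (group_size, n_times); the per-time-point standard
    deviation across the group summarizes intrinsic fluctuation."""
    return group.std(axis=0)

def fit_noise_regressor(groups, sigmas):
    """Least-squares map from group features to (sigma1, sigma2);
    a linear stand-in for the parameter-inference neural network."""
    X = np.stack([group_features(g) for g in groups])
    X = np.hstack([X, np.ones((len(X), 1))])       # bias column
    coef, *_ = np.linalg.lstsq(X, np.asarray(sigmas), rcond=None)
    return coef

def predict_noise(coef, group):
    return np.append(group_features(group), 1.0) @ coef

# synthetic demo: sigma1 drives the first 16 time points, sigma2 the rest
rng = np.random.default_rng(2)
true = rng.uniform(0.1, 1.0, size=(60, 2))
groups = [np.hstack([rng.normal(scale=s1, size=(32, 16)),
                     rng.normal(scale=s2, size=(32, 15))])
          for s1, s2 in true]
coef = fit_noise_regressor(groups, true)
```

Grouping matters because a single trajectory carries little information about the noise level; pooling 32 cells sharpens the per-time-point spread estimates that the regressor relies on.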

Assessing the impact of group size (number of trajectories) on noise intensity prediction performance, we found that taking a group size of at least two leads to a relative error of around 0.1 (see Fig 9A). Given the high heterogeneity present in experimental data, we took a group size of 32 as the input into the neural network. Under this group size, the relative errors in the predicted noise intensities were 0.021 on the training set and 0.062 on the testing set (see Fig 9B and 9C).

Fig 9. Inferring intrinsic noise intensities and reconstructing experimental data via END-nSDE.

Fig 9

A. Plots showing the mean (solid circles) and variance (error bars) of the relative error in the reconstructed noise intensities (σ^1,σ^2) predicted by the parameter-inference NN for the testing dataset, as a function of the group size of input trajectories. B. Heatmaps showing the relative error in the reconstructed noise intensities for the training dataset. Colored cells represent results from the parameter-inference NN for the training dataset, while blank cells indicate noise strength values not included in the training set. C. Heatmaps showing the relative error in the reconstructed noise intensities for the testing dataset. D. The inferred intensity of IκBα transcription noise (σ1) and NFκB translocation noise (σ2) in different groups of experimental trajectories, plotted against the group's ranking in decreasing similarity with the representative ODE trajectory. E-H. Groups of experimental and nSDE-reconstructed trajectories ranked by decreasing cosine similarity: #1 (E), #4 (F), #16 (G), #29 (H). The squared W2 distances between experimental and SDE-generated trajectories are 0.157 (E), 0.143 (F), 0.212 (G), and 0.236 (H). The inferred noise intensities are (10^−0.49, 10^−0.81) (E), (10^−0.47, 10^−0.78) (F), (10^−0.46, 10^−0.74) (G), and (10^−0.44, 10^−0.71) (H). I. The temporally decoupled squared W2 distance between reconstructed trajectories generated by the trained END-nSDE and groups of experimental trajectories, ordered according to decreasing similarity with the representative ODE trajectory.

Using the trained neural network, we inferred noise intensities for the experimental data, which were grouped based on their cosine similarities with the representative-cell (deterministic ODE) trajectory with a group size of 32. The noise intensities predicted on the experimental dataset are larger than those on the training set, possibly because unmodeled extrinsic noise complicates the inference of noise intensity. The transcription noise of IκBα is predicted to lie within [10^−0.81, 10^−0.71] (see Fig 9D). In addition, the inferred noise for NFκB translocation falls within [10^−0.49, 10^−0.43] (see Fig 9D). These inferred noise intensities were then used as inputs to the END-nSDE to simulate NFκB trajectories.

We compare the reconstructed NFκB trajectories generated by the trained neural SDE model with the experimentally measured NFκB trajectories (see Fig 9E–9I). The trajectories generated using our END-nSDE method successfully reproduce the experimental dynamics for the majority of time points for the top 50% of cell subgroups most correlated with the representative-cell ODE model (see Fig 9E–9G, Fig 9I).

For the top-ranked subgroups (#1 to #16), the heterogeneous nSDE-reconstructed dynamics align well with the experimental data for the first 100 minutes. After 100 minutes, the predicted trajectories deviate more from the experimentally observed trajectories, possibly due to error accumulation and errors in the predicted noise intensities. For experimental subgroups that deviate significantly from the representative-cell ODE model, the END-nSDE struggles to fully capture the heterogeneous trajectories. This limitation likely arises from the assumption that all cells in a group share the same underlying dynamics; in reality, substantial cell-to-cell differences in the underlying dynamics exist due to heterogeneity in the drift term, an aspect not accounted for in END-nSDE because of the high computational cost.

With sufficient data and computational resources, our proposed workflow is able to incorporate extrinsic noise in the drift terms, allowing for further discrimination of experimental trajectories. Our END-nSDE method can partially reconstruct experimental datasets and has the potential to fully capture experimental dynamics. Furthermore, trajectories generated from the trained END-nSDE model can reproduce the intrinsic fluctuations in the observed NFκB signaling dynamics which are inaccessible to the representative-cell ODE model.

4. Discussion

In this work, we used the W2-distance to develop an END-nSDE reconstruction method that takes into account extrinsic noise in gene expression dynamics as observed across various biophysical and biochemical processes such as circadian rhythms, RPA-DNA binding, and NFκB translocation. We first demonstrated that our END-nSDE method can successfully reconstruct a minimal noise-driven fluctuating SDE characterizing the circadian rhythm, showcasing its effectiveness in reconstructing SDE models that contain both intrinsic and extrinsic noise. Next, we used our END-nSDE method to learn a surrogate extrinsic-noise-driven neural SDE, which approximates the RPA-DNA binding process. Molecular binding processes are usually modeled by a Markov counting process and simulated using Monte-Carlo-type stochastic simulation algorithms (SSAs) [48]. Our END-nSDE reconstruction approach can effectively reconstruct the stochastic dynamics of the RPA-ssDNA binding process while also taking into account extrinsic noise (heterogeneity in biological parameters among different cells). Our END-nSDE method outperforms several benchmark methods such as LSTMs, RNNs, neural ODEs, and Gaussian processes.

Finally, we applied our methodology to analyze NFκB trajectories collected from over a thousand cells. Not only did the neural SDE model trained on the synthetic dataset perform well on the validation set, but it also partially recapitulated experimental trajectories of NFκB abundances, particularly for subgroups with dynamics similar to those of the representative cell. These results underscore the potential of neural SDEs in modeling and understanding the role of intrinsic noise in complex cellular signaling systems [6668].

When the experimental trajectories were divided into subgroups, we assumed that all cells across different groups shared the same drift function (as in the representative ODE) and cells within each group shared the same diffusion term. We found that subgroups with dynamics more closely aligned with the deterministic ODE model resulted in better reconstructions. In contrast, for experimental trajectories that deviated significantly from the representative ODE model, their underlying dynamics may differ from those defined by the representative cell’s ODE. Therefore, the assumption that a group shares the same drift function as the representative cell ODE holds only when the trajectories closely resemble the ODE. Incorporating noise into the drift term for training the neural SDE could potentially address this issue. We did not consider this approach due to the high computational cost required for training.

Applying our method to high-dimensional synthetic NFκB datasets, we showed the importance of incorporating latent variables. This necessity arises because the ground-truth dynamics of the measured quantities (nuclear NFκB) are not self-closed and inherently depend on additional variables. Consequently, the 52-dimensional SDE reconstruction requires more variables than just the “observed” dynamics of nuclear NFκB. In this example, the remaining 50 variables in the nSDE were treated as latent variables, even though they were not explicitly included in the loss function.

For high-dimensional systems (e.g., 52 dimensions as in our NFκB example), analyzing stochastic dynamics remains challenging. Even though regulated processes do not follow gradient dynamics in general, imposing a self-consistent energy landscape and adopting lower dimensional projections can provide a valuable framework for studying stochastic dynamics of high-dimensional biological systems [6971]. Once an effective energy landscape is identified, prior knowledge about the system structure can be incorporated into the neural SDE framework through the following formulation:

dX=(F(X)+f^(X;ω))dt+σ^(X;ω)dBt, (17)

where F(X)=−∇E(X) represents the force derived from the prior knowledge of the energy landscape E. The neural networks f^ and σ^ then learn deviations from this prior and the unknown intrinsic noise. Such prior information on the energy landscape could facilitate training and improve the accuracy of the learned model [37]. How imposing a high-dimensional landscape as a constraint affects our W2-distance-based inference, and how this potential should be interpreted, should be explored in more depth. If meaningful and informative, prior results on how landscapes can be used to characterize neural networks across various tasks can be leveraged [72,73].
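The structure of Eq (17) can be sketched for a gradient-type prior F(X) = −∇E(X). In the minimal example below, the landscape is an assumed quadratic bowl with an analytic gradient, and a small untrained tanh network stands in for the learned residual f^(X;ω):

```python
import numpy as np

def grad_E(x):
    """Gradient of an assumed quadratic landscape E(x) = 0.5 * |x|^2."""
    return x

def make_residual(dim, hidden=16, seed=0):
    """Untrained tanh network standing in for f_hat(X; omega)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(dim, hidden))
    W2 = rng.normal(scale=0.1, size=(hidden, dim))
    return lambda x: np.tanh(x @ W1) @ W2

def drift(x, residual):
    """Eq (17) drift: prior landscape force plus learned correction."""
    return -grad_E(x) + residual(x)

res = make_residual(dim=2)
x = np.array([1.0, -0.5])
dx = drift(x, res)
```

Because the prior term already supplies the dominant restoring force, the network only needs to represent the (typically smaller) non-gradient corrections, which is what can ease training.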

Finally, neural SDEs can serve as surrogate models for complex biomedical dynamics [74,75]. Combining such surrogate models with neural control functions [72,76,77] can be useful for tackling complex biomedical control problems. As shown in preceding work [32,37], a larger number of training trajectories leads to a more accurately reconstructed neural SDE. However, in biological experiments, obtaining additional training trajectories can be expensive. It is therefore of practical significance to determine how many training trajectories are necessary for an accurate reconstruction of the intrinsic-noise-aware SDE using our END-nSDE approach, and whether that number can feasibly be obtained in real experiments. It is also worth investigating which biophysical molecular processes require intrinsic fluctuations to be taken into account. In such problems, using our END-nSDE framework to reconstruct the noisy molecular dynamics could yield a more biologically reasonable, noise-aware model than first-principle-based mass-action ODE models.

While our work focuses on gene regulation dynamics, it is important to emphasize that the END-nSDE reconstruction method is general and can potentially be applied to biological systems beyond gene regulation. The method's ability to capture both intrinsic and extrinsic noise makes it suitable for modeling various stochastic biological processes, including, but not limited to, signal transduction networks, metabolic pathways, population dynamics, and developmental processes. The examples we chose (circadian rhythms, RPA-DNA binding dynamics, and NFκB signaling) were selected to demonstrate the method's capabilities and bring the neural SDE approach to the attention of the molecular and cell biology community. Future applications could extend to other domains such as epidemiology, ecology, and systems biology, where stochastic dynamics that could be described by SDEs with heterogeneity among different cells or individuals are prevalent.

Besides better understanding effective energy landscape constraints, there are several promising directions for future research. First, techniques to extract an explicit form of the learned neural network SDEs can be developed. For example, one could employ a polynomial model as the reconstructed drift and diffusion functions in the SDE [78]. Such an explicit functional form of the approximate SDE may facilitate biological interpretation of the underlying model. Recent research has also shed light on directly interpreting trained neural networks using simple functions such as polynomials [79]. Therefore, one can apply such methods to extract the approximate forms from the learned drift and diffusion functions in the neural SDE and interpret their biophysical meaning.
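As an illustration of this first direction, a learned drift can be sampled on a grid and fit with a low-order polynomial by least squares to obtain an explicit form. The one-dimensional sketch below substitutes a known cubic drift for a trained network so the recovered coefficients can be checked:

```python
import numpy as np

def fit_polynomial_drift(drift_fn, x_grid, degree=3):
    """Sample a (learned) scalar drift on a grid and extract an
    explicit polynomial approximation via least squares."""
    y = np.array([drift_fn(x) for x in x_grid])
    # coefficients returned in increasing powers of x
    return np.polynomial.polynomial.polyfit(x_grid, y, degree)

# stand-in for a trained neural drift: f(x) = x - x^3 (bistable)
learned = lambda x: x - x ** 3
coeffs = fit_polynomial_drift(learned, np.linspace(-2, 2, 50))
# expected coefficients ~ [0, 1, 0, -1]
```

For multidimensional drifts, the same idea applies term-by-term with a multivariate monomial basis, at the cost of choosing the basis and regularizing the fit.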

Another promising avenue of investigation is to combine discrete and continuous modeling approaches to account for both mRNA and protein dynamics. Such a hybrid approach would use discrete Markov jump processes for low-abundance species (such as mRNA) while employing SDEs for high-abundance species (such as proteins), thereby addressing the limitations of pure SDE approaches when molecular counts approach zero.
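The hybrid idea above can be sketched with an operator-split step in which mRNA copy numbers evolve by discrete Poisson (tau-leaped) birth-death jumps while the protein follows an Euler-Maruyama SDE driven by the current mRNA count. All rate constants and the noise amplitude below are illustrative assumptions:

```python
import numpy as np

def hybrid_step(m, p, dt, rng, k_tx=2.0, g_m=0.2, k_tl=5.0, g_p=0.1,
                sigma=0.05):
    """One operator-split step of a hybrid discrete/continuous scheme:
    - mRNA count m: discrete birth-death, tau-leaped with Poisson jumps;
    - protein level p: Euler-Maruyama SDE driven by the current m.
    """
    births = rng.poisson(k_tx * dt)            # transcription events
    deaths = rng.poisson(g_m * m * dt)         # mRNA degradation events
    m = max(m + births - deaths, 0)            # copy number stays >= 0
    dB = rng.normal(scale=np.sqrt(dt))
    p = max(p + (k_tl * m - g_p * p) * dt + sigma * p * dB, 0.0)
    return m, p

rng = np.random.default_rng(3)
m, p = 5, 100.0
for _ in range(5000):
    m, p = hybrid_step(m, p, dt=0.01, rng=rng)
```

Keeping m integer-valued preserves the discreteness that matters at low copy numbers, while the SDE remains an adequate continuum description for the abundant protein.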

Finally, the presence of unobserved variables in cellular systems poses a significant challenge for accurate SDE modeling. Many cellular processes involve hidden regulatory mechanisms, unmeasured metabolites, or latent cellular states that influence the observed dynamics but are not directly captured in experimental measurements. This limitation can lead to model misspecification, where the inferred drift and diffusion functions compensate for missing variables, potentially resulting in biased parameter estimates and poor predictive performance. A more realistic scenario occurs when we already know which molecules can affect the dynamics, but experiments can only report a few molecular species. In such cases, we can model the system at its full dimensionality, using a parameterized model to sample the initial values of the unobserved variables; the rest of the training procedure remains the same as in the main text.

Supporting information

S1 Text. Technical appendices.

(PDF)


Acknowledgments

We acknowledge Stefanie Luecke for providing the experimental datasets for NFκB dynamics.

Data Availability

No data was created in this research. All data used in this research are publicly available at https://www.nature.com/articles/s41467-023-39579-y and https://www.embopress.org/doi/full/10.1038/s44320-024-00047-4 and have been properly cited. The simulated datasets, neural SDE model code, and analysis scripts to replicate the study findings are available on GitHub at https://github.com/JianchengZ/Neural-SDE-GeneDynamics.

Funding Statement

XG acknowledges financial support from the UCLA Collaboratory Fellowship. LB acknowledges financial support from hessian.AI and the ARO through grant W911NF-23-1-0129. TC acknowledges inspiring discussions at the "Statistical Physics and Adaptive Immunity" program at the Aspen Center for Physics, which is supported by the National Science Foundation grant PHY-2210452. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. XG and AH acknowledge support from NIH R01AI173214.

References

  • 1. Swain PS, Elowitz MB, Siggia ED. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc Natl Acad Sci U S A. 2002;99(20):12795–800. doi: 10.1073/pnas.162041399
  • 2. Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science. 2002;297(5584):1183–6. doi: 10.1126/science.1070919
  • 3. Sanchez A, Choubey S, Kondev J. Regulation of noise in gene expression. Annu Rev Biophys. 2013;42:469–91. doi: 10.1146/annurev-biophys-083012-130401
  • 4. Foreman R, Wollman R. Mammalian gene expression variability is explained by underlying cell state. Mol Syst Biol. 2020;16(2):e9146. doi: 10.15252/msb.20199146
  • 5. Mitchell S, Hoffmann A. Identifying noise sources governing cell-to-cell variability. Curr Opin Syst Biol. 2018;8:39–45. doi: 10.1016/j.coisb.2017.11.013
  • 6. Thattai M, van Oudenaarden A. Intrinsic noise in gene regulatory networks. Proc Natl Acad Sci U S A. 2001;98(15):8614–9. doi: 10.1073/pnas.151588598
  • 7. Tsimring LS. Noise in biology. Rep Prog Phys. 2014;77(2):026601. doi: 10.1088/0034-4885/77/2/026601
  • 8. Fu AQ, Pachter L. Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. Stat Appl Genet Mol Biol. 2016;15(6):447–71. doi: 10.1515/sagmb-2016-0002
  • 9. Llamosi A, Gonzalez-Vargas AM, Versari C, Cinquemani E, Ferrari-Trecate G, Hersen P, et al. What population reveals about individual cell identity: single-cell parameter estimation of models of gene expression in yeast. PLoS Comput Biol. 2016;12(2):e1004706. doi: 10.1371/journal.pcbi.1004706
  • 10. Dharmarajan L, Kaltenbach H-M, Rudolf F, Stelling J. A simple and flexible computational framework for inferring sources of heterogeneity from single-cell dynamics. Cell Syst. 2019;8(1):15–26.e11. doi: 10.1016/j.cels.2018.12.007
  • 11. Finkenstädt B, Woodcock DJ, Komorowski M, Harper CV, Davis JRE, White MRH. Quantifying intrinsic and extrinsic noise in gene transcription using the linear noise approximation: an application to single cell data. Ann Appl Stat. 2013:1960–82.
  • 12. Dixit PD. Quantifying extrinsic noise in gene expression using the maximum entropy framework. Biophys J. 2013;104(12):2743–50. doi: 10.1016/j.bpj.2013.05.010
  • 13. Fang Z, Gupta A, Kumar S, Khammash M. Advanced methods for gene network identification and noise decomposition from single-cell data. Nat Commun. 2024;15(1):4911. doi: 10.1038/s41467-024-49177-1
  • 14. Raj A, Peskin CS, Tranchina D, Vargas DY, Tyagi S. Stochastic mRNA synthesis in mammalian cells. PLoS Biol. 2006;4(10):e309. doi: 10.1371/journal.pbio.0040309
  • 15. Raser JM, O'Shea EK. Control of stochasticity in eukaryotic gene expression. Science. 2004;304(5678):1811–4. doi: 10.1126/science.1098641
  • 16. Sigal A, Milo R, Cohen A, Geva-Zatorsky N, Klein Y, Liron Y, et al. Variability and memory of protein levels in human cells. Nature. 2006;444(7119):643–6. doi: 10.1038/nature05316
  • 17. Singh A, Razooky BS, Dar RD, Weinberger LS. Dynamics of protein noise can distinguish between alternate sources of gene-expression variability. Mol Syst Biol. 2012;8:607. doi: 10.1038/msb.2012.38
  • 18. Paulsson J. Models of stochastic gene expression. Phys Life Rev. 2005;2(2):157–75.
  • 19. Singh A, Soltani M. Quantifying intrinsic and extrinsic variability in stochastic gene expression models. PLoS One. 2013;8(12):e84301. doi: 10.1371/journal.pone.0084301
  • 20. Wang D-G, Wang S, Huang B, Liu F. Roles of cellular heterogeneity, intrinsic and extrinsic noise in variability of p53 oscillation. Sci Rep. 2019;9(1):5883. doi: 10.1038/s41598-019-41904-9
  • 21. Voit EO, Martens HA, Omholt SW. 150 years of the mass action law. PLoS Comput Biol. 2015;11(1):e1004012. doi: 10.1371/journal.pcbi.1004012
  • 22. Ferner RE, Aronson JK. Cato Guldberg and Peter Waage, the history of the Law of Mass Action, and its relevance to clinical pharmacology. Br J Clin Pharmacol. 2016;81(1):52–5. doi: 10.1111/bcp.12721
  • 23. Bressloff PC, Newby JM. Metastability in a stochastic neural network modeled as a velocity jump Markov process. SIAM J Appl Dyn Syst. 2013;12(3):1394–435.
  • 24. Kurtz TG. Limit theorems and diffusion approximations for density dependent Markov chains. In: Wets RJB, editor. Berlin, Heidelberg: Springer; 1976. p. 67–78.
  • 25. Tian T, Burrage K, Burrage PM, Carletti M. Stochastic delay differential equations for genetic regulatory networks. J Comput Appl Math. 2007;205(2):696–707.
  • 26. Chen K-C, Wang T-Y, Tseng H-H, Huang C-YF, Kao C-Y. A stochastic differential equation model for quantifying transcriptional regulatory network in Saccharomyces cerevisiae. Bioinformatics. 2005;21(12):2883–90. doi: 10.1093/bioinformatics/bti415
  • 27. Xia M, Chou T. Kinetic theories of state- and generation-dependent cell populations. Phys Rev E. 2024;110(6-1):064146. doi: 10.1103/PhysRevE.110.064146
  • 28. Zechner C, Unger M, Pelet S, Peter M, Koeppl H. Scalable inference of heterogeneous reaction kinetics from pooled single-cell recordings. Nat Methods. 2014;11(2):197–202. doi: 10.1038/nmeth.2794
  • 29. Sukys A, Öcal K, Grima R. Approximating solutions of the chemical master equation using neural networks. iScience. 2022;25(9):105010. doi: 10.1016/j.isci.2022.105010
  • 30. Öcal K, Gutmann MU, Sanguinetti G, Grima R. Inference and uncertainty quantification of stochastic gene expression via synthetic models. J R Soc Interface. 2022;19(192):20220153.
  • 31.Cao Z, Chen R, Xu L, Zhou X, Fu X, Zhong W, et al. Efficient and scalable prediction of stochastic reaction-diffusion processes using graph neural networks. Math Biosci. 2024;375:109248. doi: 10.1016/j.mbs.2024.109248 [DOI] [PubMed] [Google Scholar]
  • 32.Xia M, Li X, Shen Q, Chou T. Squared Wasserstein-2 Distance for Efficient Reconstruction of Stochastic Differential Equations; 2024. https://arxiv.org/abs/2401.11354.
  • 33.Li X, Wong TKL, Chen RTQ, Duvenaud D. Scalable gradients for stochastic differential equations. In: International Conference on Artificial Intelligence and Statistics, 2020.
  • 34.Kidger P, Foster J, Li X, Lyons TJ. Neural SDEs as Infinite-Dimensional GANs. In: Proceedings of the 38th International Conference on Machine Learning. 2021. p. 5453–63.
  • 35.Kidger P, Foster J, Li X, Lyons TJ. Neural SDEs as infinite-dimensional GANs. In: Proceedings of the 38th International Conference on Machine Learning. 2021. p. 5453–63.
  • 36.Arjovsky M, Chintala S, Bottou L. Wasserstein GAN. 2017. http://arxiv.org/abs/1701.07875v3
  • 37.Xia M, Li X, Shen Q, Chou T. An efficient Wasserstein-distance approach for reconstructing jump-diffusion processes using parameterized neural networks. Machine Learning: Science and Technology. 2024;5:045052.
  • 38.Xia M, Shen Q. A local squared Wasserstein-2 method for efficient reconstruction of models with uncertainty. 2024. https://arxiv.org/abs/2406.06825
  • 39.Xia M, Shen Q, Maini P, Gaffney E, Mogilner A. A new local time-decoupled squared Wasserstein-2 method for training stochastic neural networks to reconstruct uncertain parameters in dynamical systems. Neural Networks. 2025.
  • 40.Yu R, Wang R. Learning dynamical systems from data: an introduction to physics-guided deep learning. Proc Natl Acad Sci U S A. 2024;121(27):e2311808121. doi: 10.1073/pnas.2311808121
  • 41.Gonze D. Modeling circadian clocks: from equations to oscillations. Open Life Sciences. 2011;6(5):699–711.
  • 42.Westermark PO, Welsh DK, Okamura H, Herzel H. Quantification of circadian rhythms in single cells. PLoS Comput Biol. 2009;5(11):e1000580. doi: 10.1371/journal.pcbi.1000580
  • 43.Dueva R, Iliakis G. Replication protein A: a multifunctional protein with roles in DNA replication, repair and beyond. NAR Cancer. 2020;2(3):zcaa022. doi: 10.1093/narcan/zcaa022
  • 44.Caldwell CC, Spies M. Dynamic elements of replication protein A at the crossroads of DNA replication, recombination, and repair. Crit Rev Biochem Mol Biol. 2020;55(5):482–507. doi: 10.1080/10409238.2020.1813070
  • 45.Nguyen HD, Yadav T, Giri S, Saez B, Graubert TA, Zou L. Functions of Replication Protein A as a Sensor of R Loops and a Regulator of RNaseH1. Mol Cell. 2017;65(5):832-847.e4. doi: 10.1016/j.molcel.2017.01.029
  • 46.Wold MS. Replication protein A: a heterotrimeric, single-stranded DNA-binding protein required for eukaryotic DNA metabolism. Annu Rev Biochem. 1997;66:61–92. doi: 10.1146/annurev.biochem.66.1.61
  • 47.Ding J, Li X, Shen J, Zhao Y, Zhong S, Lai L, et al. ssDNA accessibility of Rad51 is regulated by orchestrating multiple RPA dynamics. Nature Communications. 2023;14:3864.
  • 48.Gillespie DT. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry. 1977;81(25):2340–61.
  • 49.Gillespie DT. The chemical Langevin equation. The Journal of Chemical Physics. 2000;113(1):297–306.
  • 50.Sheu K, Luecke S, Hoffmann A. Stimulus-specificity in the responses of immune sentinel cells. Curr Opin Syst Biol. 2019;18:53–61. doi: 10.1016/j.coisb.2019.10.011
  • 51.Adelaja A, Taylor B, Sheu KM, Liu Y, Luecke S, Hoffmann A. Six distinct NFκB signaling codons convey discrete information to distinguish stimuli and enable appropriate macrophage responses. Immunity. 2021;54(5):916-930.e7. doi: 10.1016/j.immuni.2021.04.011
  • 52.Hoffmann A, Levchenko A, Scott ML, Baltimore D. The IkappaB-NF-kappaB signaling module: temporal control and selective gene activation. Science. 2002;298(5596):1241–5. doi: 10.1126/science.1071914
  • 53.Cheng QJ, Ohta S, Sheu KM, Spreafico R, Adelaja A, Taylor B, et al. NF-κB dynamics determine the stimulus specificity of epigenomic reprogramming in macrophages. Science. 2021;372(6548):1349–53. doi: 10.1126/science.abc0269
  • 54.Sen S, Cheng Z, Sheu KM, Chen YH, Hoffmann A. Gene regulatory strategies that decode the duration of NFκB dynamics contribute to LPS- versus TNF-specific gene expression. Cell Syst. 2020;10(2):169-182.e5. doi: 10.1016/j.cels.2019.12.004
  • 55.Adelaja A, Taylor B, Sheu KM, Liu Y, Luecke S, Hoffmann A. Six distinct NFκB signaling codons convey discrete information to distinguish stimuli and enable appropriate macrophage responses. Immunity. 2021;54(5):916-930.e7. doi: 10.1016/j.immuni.2021.04.011
  • 56.Guo X, Adelaja A, Singh A, Wollman R, Hoffmann A. Modeling heterogeneous signaling dynamics of macrophages reveals principles of information transmission in stimulus responses. Nat Commun. 2025;16(1):5986.
  • 57.Bressloff PC. Stochastic processes in cell biology. Springer; 2014.
  • 58.Spagnolo B, Spezia S, Curcio L, Pizzolato N, Fiasconaro A, Valenti D. Noise effects in two different biological systems. The European Physical Journal B. 2009;69:133–46.
  • 59.Pahle J, Challenger JD, Mendes P, McKane AJ. Biochemical fluctuations, optimisation and the linear noise approximation. BMC Syst Biol. 2012;6:86. doi: 10.1186/1752-0509-6-86
  • 60.Swain PS, Elowitz MB, Siggia ED. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc Natl Acad Sci U S A. 2002;99(20):12795–800. doi: 10.1073/pnas.162041399
  • 61.Kelleher JD. Deep learning. MIT Press; 2019.
  • 62.Saptadi NTS, Kristiawan H, Nugroho AY, Rahayu N, Waseso B, Intan I. Deep Learning: Teori, Algoritma, dan Aplikasi [Deep Learning: Theory, Algorithms, and Applications]. Sada Kurnia Pustaka; 2025.
  • 63.Luecke S, Guo X, Sheu KM, Singh A, Lowe SC, Han M, et al. Dynamical and combinatorial coding by MAPK p38 and NFκB in the inflammatory response of macrophages. Mol Syst Biol. 2024;20(8):898–932. doi: 10.1038/s44320-024-00047-4
  • 64.Nakamura T, Taki K, Nomiya H, Seki K, Uehara K. A shape-based similarity measure for time series data with ensemble learning. Pattern Analysis and Applications. 2013;16(4):535–48.
  • 65.Frishman A, Ronceray P. Learning force fields from stochastic trajectories. Physical Review X. 2020;10(2):021009.
  • 66.Rao CV, Wolf DM, Arkin AP. Control, exploitation and tolerance of intracellular noise. Nature. 2002;420(6912):231–7. doi: 10.1038/nature01258
  • 67.Arias AM, Hayward P. Filtering transcriptional noise during development: concepts and mechanisms. Nat Rev Genet. 2006;7(1):34–44. doi: 10.1038/nrg1750
  • 68.Eling N, Morgan MD, Marioni JC. Challenges in measuring and understanding biological noise. Nat Rev Genet. 2019;20(9):536–48. doi: 10.1038/s41576-019-0130-6
  • 69.Kang X, Li C. A dimension reduction approach for energy landscape: identifying intermediate states in metabolism-EMT network. Adv Sci (Weinh). 2021;8(10):2003133. doi: 10.1002/advs.202003133
  • 70.Lang J, Nie Q, Li C. Landscape and kinetic path quantify critical transitions in epithelial-mesenchymal transition. Biophys J. 2021;120(20):4484–500. doi: 10.1016/j.bpj.2021.08.043
  • 71.Li C, Wang J. Landscape and flux reveal a new global view and physical quantification of mammalian cell cycle. Proc Natl Acad Sci U S A. 2014;111(39):14130–5. doi: 10.1073/pnas.1408628111
  • 72.Böttcher L, Asikis T. Near-optimal control of dynamical systems with neural ordinary differential equations. Machine Learning: Science and Technology. 2022;3(4):045004.
  • 73.Böttcher L, Wheeler G. Visualizing high-dimensional loss landscapes with Hessian directions. Journal of Statistical Mechanics: Theory and Experiment. 2024;2024(2):023401.
  • 74.Fonseca LL, Böttcher L, Mehrad B, Laubenbacher RC. Optimal control of agent-based models via surrogate modeling. PLoS Comput Biol. 2025;21(1):e1012138. doi: 10.1371/journal.pcbi.1012138
  • 75.Böttcher L, Fonseca LL, Laubenbacher RC. Control of medical digital twins with artificial neural networks. Philosophical Transactions of the Royal Society A. 2025;383:20240228.
  • 76.Asikis T, Böttcher L, Antulov-Fantulin N. Neural ordinary differential equation control of dynamics on graphs. Physical Review Research. 2022;4(1):013221.
  • 77.Böttcher L, Antulov-Fantulin N, Asikis T. AI Pontryagin or how artificial neural networks learn to control dynamical systems. Nat Commun. 2022;13(1):333. doi: 10.1038/s41467-021-27590-0
  • 78.Fronk C, Petzold L. Interpretable polynomial neural ordinary differential equations. Chaos. 2023;33(4):043101. doi: 10.1063/5.0130803
  • 79.Morala P, Cifuentes JA, Lillo RE, Ucar I. NN2Poly: a polynomial representation for deep feed-forward artificial neural networks. IEEE Trans Neural Netw Learn Syst. 2025;36(1):781–95. doi: 10.1109/TNNLS.2023.3330328
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013462.r001

Decision Letter 0

Michael A Beer, Padmini Rangamani

7 Jun 2025

PCOMPBIOL-D-25-00636

Reconstructing noisy gene regulation dynamics using extrinsic-noise-driven neural stochastic differential equations

PLOS Computational Biology

Dear Dr. Xia,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

As you will see from the attached reviews, several reviewers found the work of interest, but raised significant concerns about the value of modeling stochastic gene expression with SDEs, specifically, what biological insights are gained from estimating the drift and diffusion terms, and how accurately do such terms need to reflect the underlying stochastic distributions to be confident of such inferences. We would be happy to reconsider the manuscript if revisions can be made to address these concerns.

Please submit your revised manuscript within 60 days, by Aug 07, 2025, 11:59 PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter

We look forward to receiving your revised manuscript.

Kind regards,

Michael A Beer

Academic Editor

PLOS Computational Biology

Padmini Rangamani

Section Editor

PLOS Computational Biology

Journal Requirements:

1) Please ensure that the CRediT author contributions listed for every co-author are completed accurately and in full.

At this stage, the following Authors/Authors require contributions: Jiancheng Zhang, Xiangting Li, Xiaolu Guo, Zhaoyi You, Lucas Böttcher, Alex Mogilner, Alexander Hoffmann, Tom Chou, and Mingtao Xia. Please ensure that the full contributions of each author are acknowledged in the "Add/Edit/Remove Authors" section of our submission form.

The list of CRediT author contributions may be found here: https://journals.plos.org/ploscompbiol/s/authorship#loc-author-contributions

2) We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex. If you are providing a .tex file, please upload it under the item type ‘LaTeX Source File’ and leave your .pdf version as the item type ‘Manuscript’.

3) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines: 

https://journals.plos.org/ploscompbiol/s/figures

4) We notice that your supplementary Figures are included in the manuscript file. Please remove them and upload them with the file type 'Supporting Information'. Please ensure that each Supporting Information file has a legend listed in the manuscript after the references list.

5) We note that your Data Availability Statement is currently as follows: "No data was created in this research. All data used in this research are publicly available and have been properly cited. Authors will make all codes publicly available upon acceptance of this manuscript." Please confirm at this time whether or not your submission contains all raw data required to replicate the results of your study. Authors must share the “minimal data set” for their submission. PLOS defines the minimal data set to consist of the data required to replicate all study findings reported in the article, as well as related metadata and methods (https://journals.plos.org/plosone/s/data-availability#loc-minimal-data-set-definition).

For example, authors should submit the following data: 

1) The values behind the means, standard deviations and other measures reported;

2) The values used to build graphs;

3) The points extracted from images for analysis.

Authors do not need to submit their entire data set if only a portion of the data was used in the reported study.

If your submission does not contain these data, please either upload them as Supporting Information files or deposit them to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of recommended repositories, please see https://journals.plos.org/plosone/s/recommended-repositories.

If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent. If data are owned by a third party, please indicate how others may request data access.

6) Please provide a detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.

1) Please clarify all sources of financial support for your study. List the grants, grant numbers, and organizations that funded your study, including funding received from your institution. Please note that suppliers of material support, including research materials, should be recognized in the Acknowledgements section rather than in the Financial Disclosure

2) State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

3) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

4) If any authors received a salary from any of your funders, please state which authors and which funders.

If you did not receive any funding for this study, please simply state: “The authors received no specific funding for this work.”

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Effective control of cellular signaling and gene expression is essential for sustaining cell operations, growth, and responses to environmental shifts under fluctuations. In this work, Zhang et al. propose a neural stochastic differential equation framework driven by extrinsic noise, named END-nSDE, to learn cell population dynamics from trajectory data. They validated their method on a few simulated datasets, including circadian oscillations, RPA-DNA interactions, and NFκB signaling, and demonstrated that their approach surpasses conventional time-series tools like RNNs and LSTMs in accuracy. By quantifying cell-to-cell variations directly from data, their model can partially replicate experimentally observed stochastic dynamics. I think this may be an interesting method for simulating biological systems when detailed mechanistic models are missing, enabling insights into noise-driven cellular processes. I have the following comments:

1, It’s interesting to see that the authors can reconstruct temporal experimental data (Fig. 8). However, for a high-dimensional system (e.g., the 52 dimensions referred to in this work), how to analyze the stochastic dynamics is a paramount question, even if we already have a dynamical model or neural SDE model, as the authors have emphasized the importance of extrinsic and intrinsic noise. For example, the energy landscape theory provides a way to study stochastic dynamics of high-dimensional biological systems (Kang and Li, Adv. Sci., 8: 2003133 (2021); Li and Wang, PNAS 111, 14130–14135 (2014)). So, I suggest the authors consider adding related discussions.

2, The authors have used Wasserstein distance. Can it be replaced by KL-divergence? If so, what is the advantage and disadvantage for using these two measures regarding current task?
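One practical difference behind the Wasserstein-vs-KL question in point 2 can be made concrete with a small numerical sketch (an editorial illustration, not part of the review; the sample sizes, means, and bin grid are arbitrary): the 1-D Wasserstein distance is computable directly from samples and stays finite even when the two empirical distributions have essentially disjoint supports, whereas a plug-in KL estimate diverges in that regime.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two narrow Gaussian samples with essentially disjoint supports
# (means 0 and 5, standard deviation 0.1; all values arbitrary).
a = rng.normal(0.0, 0.1, 5000)
b = rng.normal(5.0, 0.1, 5000)

# Empirical 1-D Wasserstein-1 distance: for equal-size samples this is
# the mean absolute difference of the sorted values. It remains finite
# (about 5, the separation of the means) despite the disjoint supports.
w = float(np.mean(np.abs(np.sort(a) - np.sort(b))))

# Plug-in KL divergence from shared histogram bins: every bin carrying
# mass under `a` carries none under `b`, so the estimate diverges.
bins = np.linspace(-1.0, 6.0, 71)
p = np.histogram(a, bins=bins)[0].astype(float)
q = np.histogram(b, bins=bins)[0].astype(float)
p /= p.sum()
q /= q.sum()
if np.any((p > 0) & (q == 0)):
    kl = float("inf")  # KL is infinite when supports do not overlap
else:
    m = p > 0
    kl = float(np.sum(p[m] * np.log(p[m] / q[m])))
```

This finiteness, together with the usable gradients the Wasserstein distance provides for barely overlapping distributions (the motivation behind Wasserstein GANs [36]), is the usual argument for preferring it over KL divergence when matching trajectory distributions.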

3, For reconstructing experimental data (Fig. 8), how many time points are there from experiments? And how does the proposed approach depend on the number (and resolution) of input experimental data points? Please elaborate on this point.

Reviewer #2: The paper by Zhang et al reports on a novel method that attempts to reconstruct the stochastic dynamics of gene regulatory networks using neural stochastic differential equations (SDEs). Basically, the method works by approximating the drift and diffusion functions of the SDEs using neural networks. The reconstructed neural SDE is then in principle a surrogate model of single-cell dynamics. The method is tested on 3 different models of intracellular dynamics. Their claim is that the surrogate model can effectively capture both intrinsic and extrinsic noise. The paper is overall well written, and certainly the application of these approaches is promising. I am, however, less convinced by the claims of its accuracy; the limitations of the method are not sufficiently addressed, and the authors seem to have missed a large amount of relevant literature on this topic. If these can be addressed, then I think the paper will present a much stronger and more thorough exposition of what is an interesting and potentially useful method. Here are my more detailed comments:

1. It is practically immediately assumed that stochastic gene expression can be sufficiently modelled by an effective SDE. I strongly disagree with this statement. SDEs can only be derived from first principles in the large system size limit, and gene systems are not in this category because genes and mRNAs are typically present in very low copy numbers per cell, which automatically means that a discrete description in terms of an effective master equation is much more accurate. Having said this, if the genes and mRNA are not explicitly described in the model, i.e., the model describes protein dynamics (proteins are typically quite abundant per cell) and the gene regulatory interactions are modelled implicitly by Hill functions and similar, then yes, an effective SDE approach starts to make sense. To make this point, I note that in mouse cells, it is reported that the median mRNA number is about 17 while the median protein number is about 50,000 (Schwanhäusser et al. "Global quantification of mammalian gene expression control." Nature 473.7347 (2011): 337-342). So the first point is to clarify from the beginning which systems can be suitably approximated by an effective SDE approach. Of course, one can still try to approximate such systems with an SDE approach, but in that case the learnt drift and diffusion functions will not have much physical meaning. Also, I would conjecture that very likely the estimated diffusion function would be quite badly approximated, as it is very difficult to capture the noise properly using a continuum model when the input stochastic process regularly hits zero (which would be the case for most mRNA species due to their very low abundance in most cells, as shown by sequencing studies that regularly report a mean mRNA number between 1 and 5).
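The low-copy-number concern raised in point 1 can be illustrated with a minimal sketch (an editorial illustration, not part of the review; the rate constants are arbitrary) of a birth-death process with stationary mean 2, simulated exactly with Gillespie's direct method [48]; the trajectory repeatedly hits zero, which is the regime in which a continuum SDE description of such a species is strained.

```python
import numpy as np

def gillespie_birth_death(k_on=2.0, k_off=1.0, t_end=100.0, seed=1):
    """Gillespie direct method for 0 --k_on--> X, X --k_off--> 0.
    The stationary distribution is Poisson with mean k_on/k_off."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, 0
    times, states = [t], [x]
    while t < t_end:
        a1, a2 = k_on, k_off * x          # reaction propensities
        a0 = a1 + a2
        t += rng.exponential(1.0 / a0)    # exponential waiting time
        x += 1 if rng.random() < a1 / a0 else -1  # pick birth or death
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)

t, x = gillespie_birth_death()
# With mean copy number k_on/k_off = 2, the trajectory visits x = 0
# frequently; a continuum SDE has no natural way to respect this
# discreteness and non-negativity.
```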

2. I note that machine-learning based approaches to learn surrogate discrete models from low abundance gene expression data already exist: Jiang et al. "Neural network aided approximation and parameter inference of non-Markovian models of gene expression." Nature communications 12.1 (2021): 2618; Cao et al. "Efficient and scalable prediction of stochastic reaction–diffusion processes using graph neural networks." Mathematical biosciences 375 (2024): 109248; Öcal et al. "Inference and uncertainty quantification of stochastic gene expression via synthetic models." Journal of The Royal Society Interface 19.192 (2022): 20220153; Sukys et al. "Approximating solutions of the chemical master equation using neural networks." Iscience 25.9 (2022). Some of these approaches are particularly close in spirit to the present approach in the sense that they learn the propensity functions of an effective master equation (the propensity functions of the master equation determine the drift and diffusion functions in an SDE as discussed in the Chemical Langevin equation paper by D. T. Gillespie). The existing literature on approaches that aim to do the same (or similar) but using master equations should be cited and discussed vis-a-vis the nSDE approach.
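The correspondence invoked in point 2 is the standard chemical Langevin relation [49] (written out here editorially for the reader's convenience): for reactions indexed by $j$ with stoichiometric change $\nu_{ji}$ in species $i$ and propensities $a_j(X)$,

```latex
\mathrm{d}X_i(t) = \sum_{j} \nu_{ji}\, a_j\!\left(X(t)\right) \mathrm{d}t
  + \sum_{j} \nu_{ji} \sqrt{a_j\!\left(X(t)\right)}\, \mathrm{d}W_j(t),
```

so the drift is $\sum_j \nu_{ji}\, a_j(x)$ and the diffusion matrix is $D_{ik}(x) = \sum_j \nu_{ji}\nu_{jk}\, a_j(x)$; in this regime, learning drift and diffusion functions amounts to learning these combinations of the propensity functions.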

3. On Section 3.1 on circadian clock dynamics, it is mentioned that "Overall, our proposed END-nSDE model can accurately reconstruct the minimal stochastic circadian dynamical model". I think this is overstating the accuracy of their method, and similar claims are made in other parts of the paper. In particular, I note that the accuracy is reasonably good for the drift function but not for the diffusion function -- they report relative errors of 0.29 for the Langevin-type diffusion and 0.41 for the linear-type diffusion model, which are really high! The drift is comparatively much more accurately reconstructed, but this is to be expected because it is the deterministic part of the SDE. I also think they should more carefully assess the accuracy of the nSDE using a more systematic approach, particularly to understand how well the reconstruction of the drift and diffusion functions varies as a function of the size of the intrinsic and extrinsic noise in the data (as measured by the coefficient of variation - CV). I would expect to see the errors increase as the CV increases. So, to summarise, it is important that the limitations of the method are properly and thoroughly investigated so the reader can understand when the method can be applied and when other approaches may be more suitable.

Reviewer #3: The authors present a method for reconstructing dynamical system trajectories that involve a combination of intrinsic (random fluctuations in biochemical reactions) and extrinsic (cell-to-cell heterogeneity) noise. Their approach is based on previous Wasserstein distance-based neural stochastic differential equation (SDE) models. They train neural networks to learn the drift and diffusion functions of the SDEs. The method is used on three biological systems: a trivial (linear) model of a damped oscillator that is meant to capture some aspect of circadian rhythms, RPA-DNA binding, and NFκB signaling.

Overall, the paper is interesting. I did not always find it clear; important steps in the methodology are not well explained, and I believe that major changes are needed.

1. The authors claim that the major difference between the present manuscript and reference 29 is the inclusion of extrinsic noise.

2. The Methods section (2.1) left me confused about the “neural SDE reconstruction method” – two SDEs are presented in equations 1 and 2. Is f-hat the same as f? The dimensionality of the states is the same (real d-vectors). Is this not placing significant constraints, since the dimension of any actual biological system is likely to be unknown and much higher than what the model attempts?

3. The notation for $\pi(\mu(t),\hat{\mu}(t))$ is not very clear, as there appears to be a conflict with the definitions in Equation 4 (in one case the two arguments are probability distributions, in the other there is a Borel \sigma-algebra and a linear space).

4. I did not find the cartoon in Figure 1A,B to be sufficient to understand the END-nSDE method. This appears to be the main contribution of the manuscript, and it would be essentially impossible to recreate the results based on the information provided.

5. Model 1: Circadian clock model. I was somewhat confused by the fact that the parameters c and \sigma_0 are inputs to the system. Would this not be something that should be found by the methodology?

6. Model 3: Several observations here. On the plus side, the methodology is being used on real data, which is great. However, the results leave a few questions. The initial training required a detailed model. To what extent is the END-nSDE an improvement over this existing model?

7. This example suggests that there is a lot of pre-processing before the method is used (e.g. the steps on page 12). How generalizable is this?

8. Last major point. If everything works well, the END-nSDE technique recreates trajectories. It is not clear, however, what the practical benefit of this is. Did we learn something new about the biology of the systems? Not that I can see – the drift and diffusion terms obtained are essentially black boxes with no insight into the biology. I think that the argument is that these models could be used to train controllers of some sort. More discussion on this would be welcome.

9. Minor point: What is the subscript “m” in equation 15?

10. The title includes “gene regulation dynamics” but there is nothing intrinsic about the method that would apply only to “gene” dynamics. Why not make it more general?

11. Typo: “fluctuations in the single-cell circadian rhythm is [sic] noise-induced” (p. 5)

12. Typo: “Hyperparameters in the neural network is [sic] initialized” p. 22

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No: Experimental data for NFkB model was not included. Software for methods was not included (although promised following acceptance)

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013462.r003

Decision Letter 1

Michael A Beer, Padmini Rangamani

24 Aug 2025

Dear Prof. Xia,

We are pleased to inform you that your manuscript 'Reconstructing noisy gene regulation dynamics using extrinsic-noise-driven neural stochastic differential equations' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Michael A Beer

Academic Editor

PLOS Computational Biology

Padmini Rangamani

Section Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have fully addressed my previous comments. I recommend the publication of this paper.

Reviewer #2: The authors have done a substantial revision that addresses all of my comments. I recommend the paper's acceptance in its current form.

Reviewer #3: I appreciate the responses received. As in my previous review, I continue to have concerns about the usefulness of the results, as so much information is needed a priori ("relevant molecular species influencing the dynamics are known a priori", "known, parameterized level of extrinsic noise") leading to a blackbox model. However, I can also appreciate that this is a first step which will likely spur further investigation - as such, I can support publication.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code — e.g., participant privacy or use of data from a third party — those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Chunhe Li

Reviewer #2: No

Reviewer #3: No

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text. Technical appendices.

    (PDF)

    pcbi.1013462.s001.pdf (1.7MB, pdf)
    Attachment

    Submitted filename: response1.pdf

    pcbi.1013462.s002.pdf (68.9KB, pdf)

    Data Availability Statement

    No data was created in this research. All data used in this research are publicly available at https://www.nature.com/articles/s41467-023-39579-y and https://www.embopress.org/doi/full/10.1038/s44320-024-00047-4 and have been properly cited. The simulated datasets, neural SDE model code, and analysis scripts to replicate the study findings are available on GitHub at https://github.com/JianchengZ/Neural-SDE-GeneDynamics.

