Skip to main content
Entropy logoLink to Entropy
. 2025 Dec 20;28(1):9. doi: 10.3390/e28010009

A Physics-Informed Neural Network (PINN) Approach to Over-Equilibrium Dynamics in Conservatively Perturbed Linear Equilibrium Systems

Abhishek Dutta 1,*, Bitan Mukherjee 2, Sk Aftab Hosen 2, Meltem Turan 3, Denis Constales 4, Gregory Yablonsky 5
Editor: Miguel Rubi
PMCID: PMC12840258  PMID: 41593916

Abstract

Conservatively perturbed equilibrium (CPE) experiments yield transient concentration extrema that surpass steady-state equilibrium values. A physics-informed neural network (PINN) framework is introduced to simulate these over-equilibrium dynamics in linear chemical reaction networks without reliance on extensive time-series data. The PINN incorporates the reaction kinetics, stoichiometric invariants, and equilibrium constraints directly into its loss function, ensuring that the learned solution strictly satisfies physical conservation laws. Applied to three- and four-species reversible mechanisms (both acyclic and cyclic), the PINN surrogate matches conventional ODE integration results, reproducing the characteristic early concentration extrema (maxima or minima) in unperturbed species and the subsequent relaxation to equilibrium. It captures the timing and magnitude of these extrema with high accuracy while inherently preserving total mass. Through the physics-informed approach, the model achieves accurate results with minimal data and a compact network architecture, highlighting its parameter efficiency.

Keywords: conservatively perturbed equilibrium, cyclic and acyclic mechanisms, over-equilibrium dynamics, physics-informed neural network, finite-time thermodynamics

1. Introduction

Conservatively perturbed equilibrium (CPE) is a kinetic phenomenon that uses the stability of chemical equilibrium at a fixed temperature, in which a reacting mixture is initiated from a specially constructed state that respects all elemental balances but reaches one (or more) species exactly in its equilibrium concentration [1]. The remaining species are “conservatively” displaced from equilibrium while preserving the totals of each chemical element obeying elemental conservation. The ensuing relaxation to equilibrium exhibits an unavoidable, well-defined extremum (maximum or minimum) in the concentration of every “unperturbed” species before the system finally settles at the unique, stable equilibrium composition. This phenomenon and its core properties were initially formulated for general mechanisms in a closed system and then developed for reactor models, providing a rigorous and robust framework that differs fundamentally from small-signal relaxation methods [2] because the perturbations are finite. Operationally, CPE proceeds in five steps: (1) compute the equilibrium composition at the chosen T (and p, if relevant); (2) select at least two species to perturb away from their equilibrium values; (3) select at least one “unperturbed” species whose initial concentration equals its equilibrium value; (4) enforce all elemental balances so that the total number of atoms remain unchanged; and (5) integrate the kinetic model and track the trajectories until equilibrium is reached. These steps apply in batch systems and can be carried over to flow reactors, provided the conservative balances are respected in the inlet mixture.

A linear reversible mechanism AB ⇌ C has a unique equilibrium composition subject to the conservation of total concentration. If the system is initially at equilibrium with respect to the intermediate species B, and the terminal species A and C are perturbed while respecting the conservation constraint, then the time evolution of B(t) exhibits an extremum. A key property of this extremum is that the time at which it occurs depends only on the kinetic rate constants and not on the magnitude or direction of the perturbation. This invariance is the defining feature of CPE. For nonlinear reversible mechanisms (e.g., when mass-action kinetics involve nonlinearities such as bimolecular association, enzyme saturation, or autocatalysis), CPE also applies, but with modifications. If a system at equilibrium is perturbed within its conservation constraints, certain intermediate species can exhibit extrema in their relaxation profiles. However, unlike the linear case, the time at which the extremum occurs generally depends on both the rate constants and the equilibrium composition. The underlying reason is that the Jacobian of the nonlinear system at equilibrium governs the local relaxation dynamics: close to equilibrium the system behaves linearly, so a perturbation-independent extremum emerges in the linear response regime, but nonlinearities introduce amplitude dependence as the perturbation grows larger. CPE extends beyond simple sequences to general cyclic and acyclic networks and, more importantly, to flow reactors such as CSTR and PFR. In this case, the extremum that the unperturbed species experiences can lie above its thermodynamic equilibrium value, i.e., an “over-equilibrium” whose magnitude is tunable by the size of the initial perturbations while still maintaining conservation laws [3,4]. Two mechanistic inferences follow from this extremum. First, if an unperturbed species participates in only one step, the concentration extremum coincides with a momentary (instantaneous) equilibrium of that step; if it participates in multiple steps, the extremum generally does not represent a momentary equilibrium for any single step. Second, in linear (first-order) reversible mechanisms, the position of the CPE extremum is invariant to the size of the conservative perturbation, while its magnitude scales with the perturbation amplitude; these features persist across standard reactor models (batch, CSTR, and PFR) and provide methods for parameter extraction [4,5].

Physics-informed neural networks (PINNs) are a class of function approximators that embed governing equations, typically ordinary or partial differential equations, into the learning objective so that the neural network’s predictions satisfy data and physics simultaneously. A PINN trains by minimizing a composite loss that includes residuals of the target equations evaluated at collocation points, alongside data misfit when observations are available. In their foundational study, Raissi et al. [6] demonstrated how this can solve both forward problems (predicting fields given parameters) and inverse problems (identifying unknown parameters or hidden states), using automatic differentiation to compute the residual terms with respect to network outputs and inputs. Recent overviews emphasize how the “physics regularization” in the loss can be adapted to problem class (e.g., control-volume and entropy-consistent formulations for shocks) and how architectures and priors can be tailored to embed more structure when needed [7]. This structure naturally supports inverse modeling for kinetics: by differentiating the residual with respect to θ during gradient-based training, the network can estimate Arrhenius parameters or effectiveness factors while reconstructing species trajectories. In practice, PINNs learn solutions and parameters jointly, often with quasi-Newton (e.g., L-BFGS) or Adam–L-BFGS sequences, leveraging relatively few measurements because the physics term constrains the solution manifold [8]. This same study also highlights a practical caveat: when experimental data do not cover the full time horizon, the PINN may minimize the residual by collapsing to plateaued solutions in the uncovered region, underscoring the importance of collocation design and data coverage. Stiff-PINN integrates the quasi-steady-state assumption (QSSA) into the architecture and loss: fast (QSS) species are eliminated from the learned outputs and their algebraic closures are enforced in the residual, leaving the network to learn only the slow manifold [9]. Simılarly, the Multiscale PINN (MPINN) also aims to restructure the learning task rather than the physics: it segregates species by time scale, fits each group with a dedicated subnetwork, and uses adaptive loss reweighting keyed to species-specific residual indicators [10]. Thus, operationally the PINN proceeds in three steps: (1) Encode rate laws, stoichiometry, balances, and boundary/initial conditions as hard or soft constraints; choose collocation schemes that resolve fast transients and conserve invariants; and favor non-dimensionalization that handle stiffness [9,11]. (2) Treat optimization as its own design space: monitor and adapt loss weights and diagnose gradient issues early [12]. (3) Respect multiscale structure explicitly: group species, mix subnetworks, or reduce models when justified [9].

Recent studies [13,14,15] show that physics-informed neural networks (PINNs) are becoming a versatile tool across chemical engineering, complementing classical first-principles models in both steady-state and dynamic applications. Beyond their original formulation for solving parametric PDEs, modern PINN variants, including adaptive loss PINNs, domain decomposition PINNs, and gradient-enhanced PINNs, improve robustness for stiff and multiscale systems that commonly arise in chemical processes [16,17]. General PINN-based frameworks now target dynamic process systems governed by differential–algebraic equations, embedding conservation laws and algebraic constraints directly into the loss function to improve extrapolation and data efficiency compared with purely data-driven deep learning models [18]. In transport phenomena, several studies apply PINNs to steady and transient heat transfer problems such as conduction, convection, and conjugate heat transfer, showing accurate temperature and heat flux fields from sparse measurements while directly solving the governing PDEs without meshing [6,19,20]. In reactor engineering, forward and inverse PINNs have been developed for isothermal fixed-bed CO2 methanation reactors, enabling simultaneous solution of the reactor model and parameter identification under realistic operating conditions [21]. Dynamic process applications further include CSTRs, separators, and flowsheets, where PINNs or physics-informed recurrent networks reconstruct unmeasured states and estimate kinetic or transport parameters from limited data [20,22]. At the multiscale level, CFD–PINN workflows have been proposed to accelerate simulations of green ammonia synthesis in axial–radial packed-bed reactors by learning effective source terms or closure relations from high-fidelity CFD data [13]. Finally, mass- and energy-constrained neural networks generalize these ideas to both steady and dynamic unit operations, ensuring exact satisfaction of global balances while learning remaining nonlinearities from process data, which is particularly attractive for flowsheet-level integration of hybrid first-principles/machine learning models [14].

The methodology of PINNs is well-suited to CPE systems because it allows the embedding of the kinetic ODEs/PDEs, stoichiometric conservation laws, and equilibrium relations directly into the loss so the model respects physical laws while fitting sparse data. For CPE specifically, the equilibrium relations and the conservative constraints can be enforced as hard constraints (via parameterization) or soft penalties, enabling the model to learn over-equilibrium deviations and their routes without violating mass or charge balances. The outcome is a physics-consistent model that is data-efficient, accelerates process exploration, and provides a basis for optimization and control of CPE-intensified systems. In this study, a PINN framework has been implemented for linear case studies with three and four species, both cyclic and acyclic, with a proposition of an extended framework for scalability to higher-order systems. The specific contributions of this study are threefold: (1) Application of PINNs to CPE systems, demonstrating that physics-informed learning can capture over-equilibrium dynamics and concentration extrema without dense time-series data; (2) physics-consistent loss function design that enforces stoichiometric invariants, equilibrium constraints, and kinetic ODEs simultaneously, ensuring mass conservation and thermodynamic consistency; and (3) systematic validation across linear mechanisms of increasing complexity (three- and four-species, acyclic and cyclic), establishing parameter efficiency and accuracy benchmarks for extending PINNs to finite-time thermodynamic phenomena.

2. Methodology

2.1. Conservatively Perturbed Equilibrium (CPE)

A closed, isothermal, well-mixed reaction network is considered whose dynamics are linear in the concentrations

dc(t)dt=Mc(t),c(t)RN, (1)

where M is the kinetic generator assembled from elementary forward and reverse rate constants. The equilibrium composition ceq satisfies Mceq=0. Conservatively perturbed equilibrium (CPE) experiments are modeled by constructing an initial state exactly at equilibrium and then applying a finite concentration perturbation to a strict subset of species while leaving at least one species unaltered. Let P{1,,N} denote indices of perturbed species and U its complement (unperturbed). The CPE initial condition is

c(0)=ceq+Δ,Δi=0(iU),ΔC, (2)

where C is the linear subspace of admissible perturbations that preserve all elemental (or other conserved) balances. Writing those invariants as rows of a matrix L, admissibility reads LΔ=0. In the present linear batch setting, the system returns to the same ceq; the transient encodes the mechanism through the interplay of the eigenstructure of M and the choice of Δ. To make these notions concrete, consider first a four-species acyclic chain

Ak1+k1Bk2+k2Ck3+k3D, (3)

with concentrations c=[A,B,C,D]T. Using the mass-action rates r1=k1+Ak1B, r2=k2+Bk2C, r3=k3+Ck3D, the dynamics are A˙=r1, B˙=r1r2, C˙=r2r3, and D˙=r3. Equivalently,

c˙=Mc,M=k1+k100k1+(k1+k2+)k200k2+(k2+k3+)k300k3+k3. (4)

The equilibrium vector ceq is the normalized right null vector of M. Because the network is closed, one may work with normalized concentrations so that 1Tc(t)=1 for all t; i.e.,

1Tc(t)=i=14ci(t)=1,1=[1,1,1,1]T. (5)

In that normalization the admissible perturbations satisfy 1TΔ=0 in addition to any elemental balances encoded in L. A CPE realization on this chain perturbs, for example, the end species A and D while leaving B and C strictly at their equilibrium values: ΔB=ΔC=0. Two immediate technical consequences follow from the structure of the right-hand side. First, any unperturbed species that is not directly connected to a perturbed neighbor has a zero initial time derivative. Indeed, evaluating B˙(0)=r1(0)r2(0) with B(0)=Beq, C(0)=Ceq, A(0)=Aeq+ΔA, and D(0)=Deq+ΔD, one has

r1(0)=k1+(Aeq+ΔA)k1Beq=k1+Aeqk1Beq=0+k1+ΔA=k1+ΔA, andr2(0)=k2+Beqk2Ceq=0 (6)

with B˙(0)=k1+ΔA. If ΔA=0 as well (i.e., neither neighbor of B is perturbed), then B˙(0)=0. The same reasoning applies to C. This “initial shielding” is a sharp, local diagnostic at t=0: it depends only on which species were perturbed and on network connectivity, not on the perturbation magnitude. Second, whenever an end species (such as A or D) is held unperturbed at t=0, any turning point of its trajectory occurs at a momentary balance of the adjacent step. For instance, at an extremum of A(t) one has A˙=0r1=0k1+A=k1B. This equality refers to the local step at the instant of the extremum; it does not imply that the entire network is at detailed equilibrium yet. Taken together, these two properties, i.e., zero initial velocity for shielded unperturbed species and momentary step balance at turning points for end species, supply strict algebraic equalities at specific times that can be enforced or monitored.

The same CPE construction can be written for a three-species acyclic chain ABC by removing the third step and the species D. The generator and admissibility conditions are identical in form, and the consequences just discussed specialize in a simpler way. If C is left unperturbed while A and B are perturbed in a balance-preserving manner, then C˙(0)=k2+B(0)k2C(0)=k2+ΔB; thus C˙(0) is nonzero precisely when its neighbor is perturbed. If instead B is the unperturbed species and only A and C are perturbed, then B˙(0)=k1+ΔA+k2ΔC, showing how the initial drift of the unperturbed middle species is controlled by the two adjacent perturbations. In either selection, the end species obey the same “momentary balance at turning points” condition, e.g., A˙=0k1+A=k1B. For a three-species cyclic network,

Ak1+k1Bk2+k2Ck3+k3A, (7)

the generator M becomes circulant-like, with nonzeros wrapping from C to A. The CPE initialization is defined exactly as above, but a thermodynamic closure is additionally required so that the cycle admits a consistent equilibrium. Writing equilibrium constants Ki=ki+/ki, the standard loop constraint

K1K2K3=1(equivalentlyΔGi=0) (8)

ensures that a positive null vector ceq exists and is unique up to normalization. This equality can be treated as a parameter consistency condition when rate constants are specified a priori. Because each species now participates in two steps, the “end species momentary balance” observation no longer applies; nevertheless, the initial shielding logic still holds: an unperturbed species whose two neighbors are both unperturbed has c˙i(0)=0, while perturbing at least one neighbor introduces a nonzero initial drift via the corresponding adjacent rate.

Across all of these cases, four families of CPE constraints emerge that are used later as hard or soft conditions. First, the dynamic law itself must hold at every time, c˙(t)Mc(t)=0, which is the formal statement of compliance with the chemical mechanism. Second, the equilibrium anchor must be respected, c(t)ceq as t with Mceq=0, and any constraints linking the rate constants on cycles (e.g., K1K2K3=1) must be satisfied. Third, the conservation relations must hold for all t, Lc(t)=Lc(0)=Lceq, including, in normalized units, 1Tc(t)=1. Fourth, the CPE admissibility pattern must be honored at t=0: the chosen unperturbed indices U start exactly at equilibrium (Δi=0 for iU), at least two species are perturbed to encode a genuine finite displacement, and LΔ=0 is satisfied.

2.2. Physics-Informed Neural Network (PINN)

The previous section established the mathematical and physical framework of CPE systems. Building on that foundation, the following section details the physics-informed neural network (PINN) architecture developed to reproduce, without dense numerical data, the temporal evolution of such linear reaction networks. The network acts as a differentiable solver that enforces the physics of the system at every collocation point in time, thereby eliminating the need for large time-series datasets while retaining quantitative fidelity to the chemical kinetics. The PINN is a machine learning technique designed to solve problems involving ordinary/partial differential equations (ODEs and PDEs), which can be given as

Fx,u,dudx,d2udx2,ux,2ux2,,dmudxm,dudtmuxm,ut=0. (9)

Instead of relying on traditional numerical methods like finite difference or finite element methods, PINNs use neural networks to approximate the solution of an ODE by minimizing a loss landscape, which includes data loss and physics loss components. The neural network learns to approximate the solution by minimizing these components, which together guide the network toward satisfying both the governing equations and the physical constraints of the problem; i.e, PINNs transform the problem of solving ODEs into an optimization task, where the goal is to minimize a loss landscape that penalizes deviations from the physical laws (ODEs) and boundary (BC) and initial (IC) conditions.

Figure 1 represents the PINN framework used in this study. The inputs of the PINN are the independent variable x and time t. The output of the PINN is the predicted dependent variable. An artificial neural network (ANN) in PINN u^(x,t;θ) is initially designed as a surrogate for the solution u(x,t). The parameter θ encompasses all the hyperparameters, including the weight matrices W and bias vectors b within the ANN such that

u^(x,t;θ)=i=1nWi·fi1(x,t)+bi, (10)

where fi1(x,t) is the output from the previous layer, with (x,t) being the input to the network.

Figure 1.

Figure 1

Schematic of physics-informed neural network (PINN) framework for solving a linear case CPE system.

Traditional numerical solvers discretize the time domain and march the equations forward with small steps, which requires dense data to capture transient shapes. By contrast, the PINN represents c(t) as a smooth function parameterized by the neural network, enabling differentiation with respect to time via automatic differentiation. This continuous, differentiable representation makes it possible to compute the residuals c˙(t)Mc(t) exactly at arbitrary points in time, enforcing the chemical dynamics throughout the domain without an explicit grid. For a CPE system where the equilibrium composition is known and the trajectory begins from a perturbation thereof, this property is particularly advantageous. The equilibrium vector and the initial perturbation pattern can be inserted as boundary and invariance constraints within the loss function. As a result, the network inherently learns trajectories that (i) conserve total mass, (ii) return to the correct equilibrium, and (iii) respect the local kinetic connectivity discussed before.

Each species’ concentration is learned simultaneously through a single neural architecture. For the systems studied here, the network takes time t[0,T] as input and outputs the four-component concentration vector c^(t)=[A^,B^,C^,D^]. To ensure numerical stability and exact initialization, the model employs a residual-based formulation:

c^(t)=c0+tNθ(t), (11)

where c0 is the known initial concentration vector obtained from the CPE perturbation and Nθ(t) is a fully connected neural subnetwork with trainable parameters θ. This construction automatically satisfies the initial condition at t=0 (c^(0)=c0) and prevents spurious offsets. Furthermore, the factor t in the expression ensures that early-time derivatives remain well-conditioned, avoiding discontinuities at the origin that often destabilize training. The subnetwork Nθ(t) consists of L hidden layers with H neurons per layer:

tLinear(1,H)GELULinear(H,H)GELULinear(H,4). (12)

For the current implementation in this study, five hidden layers (L=5) with 128 neurons each (H=128) were found to balance expressiveness and numerical stability. The Gaussian Error Linear Unit (GELU) activation was selected for its smooth differentiability, which yields more accurate higher-order derivatives compared to ReLU-type activations. All computations are performed in double precision (float64) to mitigate gradient underflow in stiff regions of the dynamics.

The central element of the PINN is the physics-informed loss function, which integrates the CPE constraints derived before. For a randomly sampled set of collocation points {ti}i=1Nf within the time domain, the total loss is defined as follows:

L=Lphys+λinvLinv+λeqLeq, (13)

where each component enforces a specific physical requirement. The physics residual loss is defined as follows:

Lphys=1Nfidc^(ti)dtMc^(ti)2 (14)

which ensures that the learned function satisfies the governing kinetic equations at all collocation points. The time derivative dc^/dt is obtained analytically via automatic differentiation of the neural output with respect to t. The invariant loss is defined as follows:

Linv=1Nfi(1Tc^(ti)1)2 (15)

which enforces the global conservation of total concentration (or any other balance expressed through the operator Lc^=Lc0). Finally, the equilibrium loss penalizes deviations from the known steady state:

Leq=c^(T)ceq2, (16)

with the option to include local equilibrium equalities (e.g., k1+A^=k1B^ at turning points) as additional regularization terms. The scalar weights λinv and λeq modulate the relative influence of each contribution; empirical tuning showed that values near 0.5–1.0 provided stable convergence without biasing toward any single term.

Training Strategy and Optimization

In this study, training proceeds by sampling Nf=2000 collocation points uniformly in the time interval [0,T]. Each point is treated as an “independent enforcement location” of the governing equations, forming a Monte Carlo approximation to the continuous residual norm. The optimizer is the Adam stochastic gradient method with an initial learning rate of 103. To improve convergence stability and prevent overshooting near equilibrium, an exponential learning rate scheduler with decay factor γ=0.99 is applied every 1000 epochs. In total, 2000 epochs are run for each system, which is sufficient for the residuals to reach the 10−5–10−6 range.

The network is implemented in PyTorch (version 2.7.0) using automatic differentiation on double-precision tensors. All operations are GPU-accelerated when available, allowing rapid evaluation of gradients through the full computational graph. To further stabilize training, gradient clipping and zero-bias initialization of the output layer are available, but the reported runs use the default PyTorch initialization; the residual-stable envelope (c^(t)=c0+tNθ(t)) already enforces (c^(0)=c0) exactly, keeping initial predicted rates well behaved near t=0. Each epoch computes the full loss, performs back-propagation, and updates the weights according to Adam’s adaptive moment estimates. The scheduler then gradually decreases the learning rate, refining the parameters toward the equilibrium regime. A distinct advantage of this approach is that, because the derivative operator is embedded analytically, the trained model implicitly satisfies smoothness and differentiability requirements, enabling direct evaluation of higher-order quantities such as c¨(t) without numerical differencing.

3. CPE-PINN Integration

As shown in Figure 1, the implementation can be viewed as a sequence of three tightly coupled stages:

  1. Physical initialization: The equilibrium composition ceq is computed from the kinetic matrix M as its normalized null vector. A valid CPE perturbation Δ satisfying LΔ=0 is applied to construct the initial state c0=ceq+Δ.

  2. Neural approximation and differentiation: The neural architecture c^(t)=c0+tNθ(t) generates continuous predictions for any t; automatic differentiation supplies the exact time derivatives c^˙(t).

  3. Physics-consistent optimization: The total loss L combines differential residuals, invariance preservation, and equilibrium anchoring. Minimizing L drives the network toward full compliance with both the kinetic law and the CPE constraints.

Through this workflow, the PINN effectively solves the chemical kinetic problem as part of its training. No synthetic or experimental time-series data are required beyond the initial equilibrium composition and known rate constants. The resulting neural surrogate is a compact, analytical representation of the entire temporal relaxation trajectory. The constructed PINN for CPE systems merges domain knowledge and machine learning into a unified computational framework. By encoding the kinetic generator M, the equilibrium vector ceq, and the conservation structure directly within its architecture and loss function, the network enforces physical realism at every stage of training. The residual-based formulation guarantees the correct initial conditions, while adaptive optimization and learning rate scheduling ensure smooth convergence toward the physically consistent manifold of solutions. Consequently, the model provides a data-free yet quantitatively faithful reconstruction of CPE transients, fully aligned with the principle that learning is guided by physics rather than by data density. The present study restricts analysis to linear reversible mechanisms where the kinetic generator matrix M is constant and concentration-independent. This assumption is valid for elementary reactions with mass-action kinetics in the dilute limit where activity coefficients approach unity. Extension to nonlinear systems would require several modifications: (1) the PINN residual loss would remain valid, as automatic differentiation can compute residuals exactly; (2) the perturbation invariance property of CPE extrema would generally fail, as the system Jacobian depends on initial composition; and (3) the thermodynamic closure constraint k1k2k3=1 would be replaced by more complex equilibrium relations.

4. Results and Discussion

Based on the workflow mentioned in Section 3, the CPE-PINN integration has been implemented on a simple two-step, first-order, three-species mechanism as follows:

Ak1+k1Bk2+k2C.

In this mechanism, the initial concentration of one species (A, B, or C) is set equal to its equilibrium value, while the other two start away from equilibrium; thus the system initially displays the CPE phenomenon. If two species are initialized at their equilibrium concentrations, mass conservation forces the third species to assume its equilibrium value as well; the system is therefore at equilibrium rather than in a CPE state. More generally, for a mechanism with N chemical species, if N1 species have initial concentrations equal to their equilibrium values, the remaining species is fixed by the total balance and the whole system is at equilibrium; i.e., the CPE phenomenon cannot arise.

Consequently, to observe CPE one must set at least one initial concentration equal to its equilibrium value, but no more than N2 such concentrations. Figure 2 shows two CPE transient regimes (δ=0.5): (a) where the unperturbed species A rises to an unavoidable maximum before relaxing to equilibrium and (b) where the unperturbed species yields a pronounced minimum. At each extremum the local rate balance of the corresponding single-step reaction is zero, so the step reaches a momentary partial equilibrium. The size of the perturbation δ appears as the initial offset from equilibrium for the two perturbed species, while for the linear ABC mechanism the extremum timing is independent of δ. Correspnding to the numerical solution, the PINN has been implemented on the governing ODEs. The PINN reproduces both CPE transient regimes shown in Figure 2: the early extremum (maximum or minimum) and the subsequent relaxation to equilibrium with near-perfect agreement with the reference solution. As can be observed, the predicted concentration curves are visually indistinguishable and the PDE/ODE residuals are negligible. The PINN also preserves overall mass balance throughout the transient (with negligible conservation error) and accurately captures the timing and amplitude of the extrema, demonstrating that a data-efficient, physics-constrained model can reliably reproduce CPE dynamics.

Figure 2.

Figure 2

Concentration profiles of species A, B, and C for δ=0.5, comparing analytical solutions (colour continuous lines) and PINN predictions (black dashed lines). (a) Case corresponding to a maximum in species A, with rate constants k1+=1.75, k1=3.00, k2+=1.50, and k2=0.50s1. (b) Case corresponding to a minimum in species B, with k1+=1.50, k1=0.10, k2+=10.0, and k2=6.50s1.

Species are classified as either “perturbed” or “unperturbed”. While in the previous case, the transient behavior of the unperturbed species in the two-step mechanism was discussed, the following cases discuss the transient behavior of both the perturbed and unperturbed species in the two-step (three-species) case and more complex (four-species) mechanisms. It is important to note that a perturbation affecting a single species cannot, in general, be made conservative (i.e., preserve the total amount of each conserved element), so at least two species must be altered to obtain a conservative perturbation. Consequently, for a mechanism with N species the number of species that can be perturbed conservatively ranges from 2 to N1. The number of independent degrees of freedom of a conservative perturbation equals the number of perturbed species minus the number of independent conservation laws (see [1]). For a linear mechanism with N species, the number of perturbed species, the number of unperturbed species, and the perturbation degrees of freedom are related by simple counting and the number of independent conservation laws. For the three-species case (N=3), a two-species perturbation is considered; hence the number of unperturbed species is 32=1. Assuming a single independent conservation law (for example total mass or an elemental balance), the number of degrees of freedom of the conservative perturbation is 21=1. For the four-species case (N=4), the number of perturbed species could in principle be two or three; in the present study all perturbations involve two species, so that the number of unperturbed species is 42=2 and the degrees of freedom remain 21=1 under the same single-conservation-law assumption.

4.1. Analysis of Perturbed Species in a Three-Species Acyclic Mechanism

The three-species acyclic mechanism represents the simplest linear system in which the CPE phenomenon can be observed. Under such conditions, unperturbed species exhibit a single unavoidable extremum—either a maximum or a minimum—during the transient evolution. This extremum corresponds to a momentary equilibrium if the species participate in only one step. The timing of the extremum is independent of the perturbation magnitude and reveals structural features of the mechanism. The transient response of perturbed species in two-step mechanisms, i.e., through two successive elementary reactions (steps) in a linear sequence, has been examined in previous work [2] as follows:

Ak1+k1Bk2+k2C.

To extend that analysis to perturbed species, a series of numerical simulations covering a wide range of kinetic constants was carried out. From this ensemble, two representative experiments are selected for case study and their parameter values are reported in Table 1.

Table 1.

Two-step, three-species acyclic conservatively perturbed equilibrium (CPE) example with settings.

Experiment Settings Experiment #1 Experiment #2
Kinetic Parameters (s−1): k1+=5, k1=4
k2+=12, k2=6
k1+=16, k1=4
k2+=12, k2=6
Perturbed species: A, B A, B
Unperturbed species: C C

The transient trajectories of the three species in the two experiments along with the PINN results are shown below in Figure 3a–d.

Figure 3.

Figure 3

Figure 3

Two-step, three-species acyclic reaction mechanism. Panels (a,b) correspond to simulation Experiment #1 using PINN, while panels (c,d) correspond to simulation Experiment #2 using PINN. Analytical solutions are shown as colour continuous lines, and PINN predictions are shown as dashed black lines. (a) Conversion of species A; (b) conversion of species B; (c) conversion of species A; and (d) conversion of species B.

4.2. Analysis of Unperturbed Species in a Three-Species Cyclic Mechanism

The three-species cyclic mechanism introduces an additional reaction closing the loop between terminal species, forming a triangular network. Despite this topological change, the unperturbed species still shows a single concentration extremum under CPE conditions, with the same analytical expression for its occurrence time as in the acyclic case. However, the extremum time is generally shorter due to the additional pathway accelerating relaxation. All species in the cycle participate in multiple steps, precluding momentary equilibrium interpretation. A three-species cyclic mechanism displays an added direct connectivity from species A to species C as follows: graphic file with name entropy-28-00009-i001.jpg

The closed topology alters the stoichiometric constraints and the network’s structural properties. Depending on the elementary step kinetics (mass-action vs. autocatalytic or nonlinear steps), cyclic three-species models can exhibit richer dynamics than acyclic chains. The thermodynamic consistency must satisfy the following Onsager condition, which requires that the equilibrium constant of the third reaction equals the product of the equilibrium constants of the first two reactions.

k3+k3=k1+k1k2+k2,

This enforces detailed balance around the three-reaction cycle and guarantees a single, well-defined equilibrium state. From this ensemble, two representative experiments are selected for case study and their parameter values are reported in Table 2.

Table 2.

Three-species cyclic conservatively perturbed equilibrium (CPE) example with settings.

Experiment Settings Value
Kinetic parameters (s−1): k1+=16k1=4k2+=12k2=6k3+=8k3=1.
Perturbed species: A, B
Unperturbed species: C

The experiments used the settings reported in Table 2, and the corresponding transient trajectories of the three species in the two experiments along with the PINN results are shown Figure 4. The first two pairs of kinetic parameters match those of Experiment 2; a third reaction was then added, with its forward and reverse rate constants selected to enforce the Onsager relation between the three reactions.

Figure 4.

Figure 4

Conversion of the unperturbed species C in a two-step, three-species cyclic reaction mechanism. The perturbation is introduced by changing the initial concentration of species A from 0 to 0.36 and that of species B from 0.92 to 0.56. Analytical solutions are shown as colour continuous lines, while PINN predictions are shown as dashed black lines.

4.3. Analysis of Unperturbed Species in a Four-Species Acyclic Mechanism

The four-species acyclic mechanism serves as a baseline case, illustrating the intricacies of CPE-induced dynamics in a linear reaction network. Under these conditions, the concentration of an initially unperturbed species can exhibit two distinct extrema-a transient maximum followed by a minimum separated by an inflection point during relaxation to equilibrium. This striking multiple-extremum trajectory underscores the distinctive transient behavior introduced by CPE conditions and the rich mechanistic insights they provide. The four-species acyclic mechanism is as follows: graphic file with name entropy-28-00009-i002.jpgThe experimental results of this case are summarized in Table 3.

Table 3.

Four-species acyclic conservatively perturbed equilibrium (CPE) example with settings.

Experimental Settings Value
Kinetic parameters (s1): k1+=2    k1=1
k2+=3    k2=1
k3+=1    k3=1
Perturbed species: A, D
Unperturbed species: B, C

The simulation parameters were held constant across all runs, so that the six cases differ only in which species are perturbed; the full parameter list appears in Table 3, and one representative case is plotted in Figure 5. Unperturbed species can exhibit complex transient behavior during CPE, including two concentration extrema (a maximum and a minimum) and an inflection point. This four-species mechanism leads to six distinct perturbation cases, defined by the choice of two unperturbed species. The kinetic parameters and simulation settings for all six combinations are listed in Table 4 and Table 5, while Figure 6 plots two representative cases (Experiment #2 and Experiment #3) that demonstrate this behavior.

Figure 5.

Figure 5

Four-species acyclic mechanism, species B and C unperturbed: (a) conversion of unperturbed species B with PINN; (b) conversion of unperturbed species C with PINN. Analytical solutions are shown as colour continuous lines while PINN predictions are shown as dashed black lines.

Table 4.

Four-species acyclic conservatively perturbed equilibrium (CPE) example with experimental settings.

Experimental Settings Values
Kinetic parameters (s−1): k1+=2    k1=1
k2+=3    k2=1
k3+=1    k3=1

Table 5.

Four-species acyclic conservatively perturbed equilibrium (CPE) example with cases and results.

Experiment Perturbed Species Unperturbed Species Behavior
1 A, B C, D 2 extrema of [C], 1 of [D]
2 A, C B, D 1 extremum of [B], 1 of [D]
3 A, D B, C 2 extrema of [C], 1 of [B]
4 B, C A, D 1 extremum of [A], 1 of [D]
5 B, D A, C 1 extremum of [A], 1 of [C]
6 C, D A, B 1 extremum of [A], 1 of [B]

Figure 6.

Figure 6

(a,b) Experiment #2 with PINN: (a) concentration of the unperturbed species B; (b) concentration of the unperturbed species D. (c,d) Experiment #3 with the PINN: (c) concentration of the unperturbed species B; (d) concentration of the unperturbed species C. Analytical solutions are shown as colour continuous lines, while PINN predictions are shown as dashed black lines.

4.4. Analysis of Unperturbed Species in a Four-Species Cyclic Mechanism

The four-species cyclic mechanism shares the same reaction sequence as the acyclic case, with one additional link closing the network into a loop. Despite this added connectivity, the CPE-induced transient behavior remains qualitatively similar: unperturbed species still exhibit the same pattern of two transient concentration extrema with an intervening inflection point during the approach to equilibrium. Thus, even with a fully connected cyclic network, the hallmark transient features of the CPE phenomenon persist, demonstrating the robustness of this behavior to increased reaction connectivity. The four-species cyclic mechanism is as follows: graphic file with name entropy-28-00009-i003.jpg

The numerical values used for simulations, together with the choices of perturbed species, are listed in Table 6. Representative temporal trajectories obtained from these simulations are presented in Figure 7.

Table 6.

Four-species cyclic conservatively perturbed equilibrium (CPE) example with settings.

Experiment Settings Value
Kinetic parameters (s−1): k1+=2       k1=1
k2+=3       k2=1
k3+=1       k3=1
k4+=1       k4=6
Perturbed species: A, D
Unperturbed species: B, C

Figure 7.

Figure 7

Four-species cyclic mechanism experiment with the PINN, B and C unperturbed: (a) conversion of unperturbed species B; (b) conversion of unperturbed species C. Analytical solutions are shown as colour continuous lines, while PINN predictions are shown as dashed black lines.

Table 7 summarizes the unified PINN setup used for all CPE simulations, including a fixed architecture and hyperparameters. The network architecture (shared across cases) is a fully connected feedforward network with one input (time) and an output for each species (three or four), comprising five hidden layers of 128 neurons with GELU activation. Training was performed with the Adam optimizer (learning rate 103, decaying by 0.99 every 1000 epochs) for about 2000 epochs, using 2000 collocation points per epoch (drawn 50% uniformly and 50% from a Beta-distributed tail-biased sampling to emphasize late-time dynamics). The loss function combined a physics residual term (enforcing the ODE dynamics), a conservation term to maintain constant totals, and a final-time “tail anchor” term to ensure correct equilibrium behavior, with weights of 1.0, 0.5, and 2.0, respectively. This core configuration was applied consistently across all cases, and with double-precision training plus validation against a high-accuracy ODE solver (LSODA, tol 109), it enables the PINN to learn the over-equilibrium CPE dynamics accurately while strictly preserving physical constraints.

Table 7.

Concise form of the main configuration for CPE-PINN systems used in this study.

Category Parameter Value
Network Architecture 11281281281281283/4
Activation GELU
Parameterization c(t)=c0+t·N(t)
Training Optimizer Adam (lr=103)
Scheduler ExponentialLR (γ=0.99, every 1k epochs)
Epochs 2000 (standard) / 2500
Collocation Points (Nf) 2000–2200 per epoch
Sampling Strategy 50% uniform + 50% Beta(3,1) tail-biased
Loss Physics (λphy) 1.0×MSE(dcdtPINN,dcdtODE)
Conservation (λcons) 0.5×MSE(ci,1)
Tail Anchor (λtail) 2.0×MSE(residual@t=T) [256 pts]
Precision dtype torch.float64
Validation Method solve_ivp (LSODA, tol = 109)

Table 8 summarizes the system-specific PINN training configurations for each case study, detailing the number of training epochs, collocation points (Nf), the conservation loss weight (λcons), and any custom adjustments applied in each scenario. Notably, the core PINN architecture and loss formulation remained consistent across all systems, but minor hyperparameter tweaks were introduced to enhance training stability and accuracy for different reaction networks. For example, a higher conservation loss weight (λcons=5.0) was used in one four-species case to enforce elemental balance more strongly, while another scenario employed gradient clipping (limit =1.0) and increased weighting for one species to maintain numerical stability. In addition, a “tail anchor” loss term (with weight 2.0) and biased collocation sampling at late times were incorporated for certain stiff systems to stabilize the long-time behavior of the PINN. These targeted modifications did not alter the fundamental PINN framework, but rather fine-tuned the training setup to the complexity of each system, ensuring robust convergence and accurate dynamics predictions across all case studies. A detailed version of the system-specific PINN training configurations for each case study can be found in Appendix A Table A1. Table 9 quantitatively compares the PINN surrogate against adaptive ODE solvers (RK45 and BDF) across acyclic and cyclic reaction systems of increasing dimensionality. For all cases considered, the PINN achieves L2 errors on the order of 105, comparable to high-accuracy adaptive integration, while strictly preserving mass conservation to within 107. Although the adaptive solvers dynamically refine time steps, particularly in cyclic and stiff systems, the trained PINN provides fixed-cost evaluations that reproduce the full transient dynamics with essentially identical accuracy. Notably, the wall-clock time per evaluation is reduced by more than two orders of magnitude for the PINN, highlighting its suitability for repeated inference, parameter sweeps, and embedding within optimization or inverse-problem workflows.

Table 8.

Concise system-specific configuration for CPE-PINN systems used in this study.

Figure System Epochs Nf λcons Modifications
Figure 3a–c 3-sp cyclic 800 1000 0.5 Standard
Figure 3d 3-sp cyclic 1000 1000 0.5 Standard
Figure 4 3-sp cyclic 2000 1200 0.5 No tail anchor
Figure 5a 4-sp acyclic 2000 2000 5.0 High conservation weight
Figure 5b and Figure 6a–c 4-sp acyclic/cyclic 2000 2000 0.5 Standard
Figure 6d 4-sp acyclic 2500 2200 0.5 +Grad clip (1.0), 1.6× weight for C
Figure 7a,b 4-sp cyclic 2000 2000 0.5 +Tail anchor (2.0), Tail-biased collocation markers

Table 9.

Comparison of PINN and adaptive ODE solvers (RK45 and BDF) for acyclic and cyclic reaction systems with increasing species complexity.

Metric PINN ODE
(RK45, Adaptive)
ODE
(BDF, Adaptive)
Notes
3-Species Acyclic System
L22 Error 1.2×105 1.4×105 1.3×105 Comparable accuracy
Mass conservation
error
<107 <107 <107 All satisfy physical
constraints
Adaptive steps taken 2000
(fixed)
127 145 RK45 refines during transient
Wall-clock time
(single eval)
<1 ms 95 ms 112 ms ODE solver slower for
single-shot inference
3-Species Cyclic System
L22 Error 1.1×105 1.5×105 1.2×105 Comparable accuracy
Mass conservation error <107 <107 <107 All satisfy conservation
Adaptive steps taken 2000
(fixed)
156 168 Higher stiffness
detected
Wall-clock time
(single eval)
<1 ms 115 ms 128 ms ODE slower;
BDF handles stiffness
4-Species Acyclic System
L22 Error 1.4×105 1.6×105 1.5×105 Comparable;
slight BDF advantage
Mass conservation error <107 <107 <107 All maintain physical
validity
Adaptive steps taken 2000
(fixed)
203 218 Increased complexity;
more steps
Wall-clock time
(single eval)
<1 ms 142 ms 155 ms ODE cost increases
with system size
4-Species Cyclic System
L22 Error 1.5×105 1.7×105 1.4×105 BDF superior in
cyclic/stiff case
Mass conservation error <107 <107 <107 All conserve mass
effectively
Adaptive steps taken 2000
(fixed)
187 211 Highest stiffness;
BDF more efficient
Wall-clock time
(single eval)
<1 ms 138 ms 149 ms Comparable to RK45
despite more steps

Table 10 summarizes the complementary strengths of the PINN and classical adaptive ODE solvers from a computational and application-oriented perspective. While adaptive solvers incur no upfront training cost and remain optimal for single-shot trajectory evaluation, their computational expense scales linearly with the number of queries. In contrast, the PINN incurs a one-time training cost but delivers sub-millisecond inference thereafter, leading to a break-even point at approximately 103–104 evaluations. This distinction makes PINNs particularly attractive for inverse problems, uncertainty propagation, and real-time digital twin deployment. Importantly, both approaches achieve comparable residual accuracy, confirming that the efficiency gain of the PINN does not compromise physical fidelity.

Table 10.

Comparison of PINN and adaptive ODE solvers in terms of computational cost, accuracy, and application suitability.

Metric PINN ODE Solver (Adaptive) Advantage
Training time 5–10 min 0 ODE
Single evaluation <1 ms 50–200 ms PINN (50–200×)
Break-even queries ≈4000 PINN
Residual accuracy 105106 105106 None
Storage size <5 MB Model code PINN
Inference accuracy Marginal
extrapolation loss
Stable and bounded ODE
Application fit Inverse problems,
Monte Carlo sampling
Single-shot evaluation Domain-dependent

A critical distinction between CPE and arbitrary perturbations is the perturbation invariance property of the extremum. While any off-equilibrium initial condition produces transient extrema, CPE specifically ensures that the time at which the extremum occurs is independent of the magnitude of the conservative displacement (for linear systems). This fundamental property was theoretically established by Yablonsky in his seminal paper [2] and later verified experimentally in complex esterification reactions. The invariance property enables deterministic process control: the operator can modulate extremum magnitude by varying perturbation size while maintaining predictable timing. In contrast, arbitrary perturbations would yield timing that depends on both kinetics and initial composition, prohibiting such quantitative control. This property is particularly valuable in batch reactors and flow systems (CSTR and PFR), where fine-tuning product concentration at a specific time window is desired without empirical parameter fitting. More recently, Sachs et al. [23] demonstrated that CPE points as over-equilibria represent intriguing opportunities for improving industrial yield, where achieving concentrations above thermodynamic equilibrium can temporarily enhance product selectivity. Furthermore, the conservatively perturbed framework has enabled novel kinetic analysis techniques: thermodynamic and kinetic–thermodynamic invariant expressions can be derived from CPE experiments, allowing extraction of kinetic parameters independently of equilibrium descriptions, which are unavailable from standard arbitrary perturbation experiments.

4.5. Application of PINN to Higher-Order Linear CPE Systems

The above-mentioned methodology can be easily extended from three- and four-species to arbitrary linear, closed reaction networks with N species and general topology. The state is c(t)R0N and the dynamics remain linear, as shown in Equation (2). For any reversible link ij with forward rate kij+ and reverse rate kji, MijMij+kji, MjjMjj(kji), MjiMji+kij+, and MiiMii(kij+), and Mpq is zero otherwise. This rule scales to any number of species and edges without altering the learning framework: only the size, sparsity pattern, and numerical values of M change. A legal perturbation Δ of the equilibrium must satisfy the linear conservation relations, which are compactly written as LΔ=0 for a fixed LRr×N encoding the r independent invariants (e.g., total concentration and elemental balances). The initial state is c0=ceq+Δ, with selected indices left unperturbed by setting Δi=0 on that subset. The immediate kinematic consequences scale cleanly with network size: because Mceq=0, the initial drift is c˙(0)=MΔ. Thus, any unperturbed species that is not graph-adjacent to the perturbed set has c˙i(0)=0 (“initial shielding”), while species directly adjacent to at least one perturbed neighbor inherit a nonzero initial drift through the corresponding off-diagonal rates. No additional casework is required; these statements are purely algebraic and hold for all N. The PINN parameterization also scales without conceptual change. Let a single network (see Equation (11)) map time to an N-vector and retain the residual-stable envelope

c^(t)=c0+g(t)Nθ(t), (17)

with g(t)=t (or any smooth, strictly increasing function with g(0)=0) and a fully connected subnetwork Nθ:[0,T]RN. This guarantees c^(0)=c0 exactly, regardless of N. The time derivative needed for the residual is obtained analytically,

dc^dt(t)=g(t)Nθ(t)+g(t)dNθdt(t), (18)

via automatic differentiation. Scaling the output dimension from 4 to N requires only changing the size of the last linear layer; width and depth can be tuned with N (e.g., H=αN with α[1,4] as a practical rule), but the formulation is unchanged. The loss functional remains the sum of three terms, each now written in fully general form:

L=1Nfi=1Nfdc^dt(ti)Mc^(ti)22+λinvNfi=1NfLc^(ti)Lc022+λeqΦ(c^(·);ceq). (19)

The first term enforces the kinetic law at randomly sampled collocation times {ti}, the second preserves all invariants in L (recovering the familiar 1Tc^(t)=1 when L=1T), and the third optionally anchors the long-time limit. In higher-dimensional, stiff systems it is often preferable to implement the terminal anchor through a “stationarity” penalty local to late times,

Φ(c^(·);ceq)=1NTj=1NTMc^(Tj)22,Tj[T,T], (20)

rather than a direct c^(T)ceq22; this avoids biasing trajectories when multiple slow eigenmodes coexist but still drives the prediction into the equilibrium manifold {c:Mc=0}. Two implementation details make the approach computationally scalable. First, evaluation of Mc^ exploits sparsity. Because M is typically very sparse in large networks, storing it as a compressed sparse matrix and using sparse–dense products reduces the per-batch complexity from O(N2) to O(|E|), where |E| is the number of directed edges. In practice this is realized by constructing M computationally from the reaction list and keeping it in a sparse tensor format, while all neural computations remain dense. Second, time is non-dimensionalized to temper stiffness: choosing τ=αt with α|λmin(M)| (the largest-magnitude nonzero eigenvalue) contracts the eigenvalue spread toward O(1), which improves gradient flow and reduces the number of required collocation points at early times. Collocation can also be biased toward the boundary layers by sampling τ from a Beta distribution or by the mapping t=Tsp with p>1 to densify near t=0 without altering the loss.

From an optimization perspective, the Adam-based training with exponential learning rate decay suffices; only the batch sizes and width/depth may be increased with N. Double precision remains advisable as N grows and eigenvalues separate; the envelope g(t) can be kept as t, although in highly stiff cases a smoothed envelope (e.g., g(t)=log(1+t) scaled to match T) can further stabilize early-time derivatives. Regularizers such as L2-weight decay on the last layer or spectral normalization on hidden layers are optional and mechanism-agnostic; they become useful only when the residual plateaus due to ill-conditioning introduced by extreme rate disparities. A detailed form of the main configuration for CPE-PINN systems used in this study, which can be extended to higher-order linear CPE systems, can be found in Appendix A Table A3.

5. Conclusions

Conservatively perturbed equilibrium (CPE) is not just a computational curiosity. It should be viewed as a principled way to “shape” the transient path to that equilibrium rather than to alter the endpoint. By exploiting conservative initialization, it is possible to (i) create informative extrema that map directly onto mechanistic structures (single-step versus multi-step participation), (ii) extract kinetic parameters with high sensitivity, and (iii) design reactor starts that transiently deliver over-equilibrium performance at earlier times or shorter lengths than conventional protocols would allow. It is also important to note that the PINN formulation is not tied to any specific three- or four-species example. The algebraic structure of linear kinetics that the network enforces is encoded in a sparse, column-sum-zero generator, a set of linear invariants, and a CPE-consistent initialization. The same residual-stable parameterization, the same physics-informed loss, and the same optimizer produce differentiable surrogates for c(t) that remain consistent to the chemical mechanism, thereby providing a general-purpose solver for linear CPE over-equilibrium systems.

Appendix A

Appendix A.1

Table A1 serves as a comprehensive reference for the specific hyperparameters and architectural configurations of the physics-informed neural network (PINN) used to generate the results in Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7. This summary is essential for reproducibility and highlights the system-specific tuning required to achieve accurate simulations for different chemical reaction networks. The table documents the progression of this study, beginning with three-species cyclic systems (Figure 3 and Figure 4) and advancing to more complex four-species acyclic (Figure 5 and Figure 6) and cyclic (Figure 7) mechanisms. For each case, it specifies the model’s training duration (Epochs) and the number of collocation points (Nf) used to enforce the physical constraints. Crucially, the table reveals key variations in the model’s loss function and training strategy. Early experiments (Figure 3 and Figure 4) omitted the tail anchor loss, a feature introduced in later models (Figure 5 and Figure 6) to improve the solution’s stability and accuracy at the final time (t=TEND). Furthermore, the table shows specific tuning of the conservation weight (e.g., increased to 5.0 for Figure 5a) and the implementation of advanced strategies like tail-biased collocation sampling (Figure 6), which concentrates collocation points near the end of the time domain. This fine-tuning, including species-specific weighting of the physics loss (e.g., “1.6× for C” in Figure 6d), underscores the necessity of adapting the PINN architecture and loss formulation to the unique dynamic characteristics of each CPE experiment.

Table A1.

Detailed form of the system-specific variations for CPE-PINN systems used in this study.

Figure System Type Species Epochs Nf Cons. Weight Special Features
Figure 3a–c 3-species cyclic A, B, C 800 1000 0.5 No tail anchor used
Figure 3d 3-species cyclic A, B, C 1000 1000 0.5 No tail anchor used
Figure 4 3-species cyclic A, B, C 2000 1200 0.5 No tail anchor used
Figure 5a 4-species acyclic A, B, C, D 2000 2000 5.0 Higher conservation
penalty; asymmetric δ
Figure 5b 4-species acyclic A, B, C, D 2000 2000 0.5 Standard tail-biased
configuration
Figure 6a 4-species acyclic A, B, C, D 2000 2000 0.5 Tail-biased collocation
Figure 6b 4-species acyclic A, B, C, D 2000 2000 0.5 Tail-biased collocation
Figure 6c 4-species acyclic A, B, C, D 2000 2000 0.5 Tail-biased collocation
Figure 6d 4-species acyclic A, B, C, D 2500 2200 0.5 (1.6× for C) Gradient clipping;
species-weighted physics
Figure 7a,b 4-species cyclic A, B, C, D 2000 2000 0.5 Markers every 60 points
for visualization

Appendix A.2

The physics-informed neural network (PINN) used to model the conservatively perturbed equilibrium (CPE) dynamics is structured as a five-layer feedforward MLP with 128 neurons per layer and a GELU activation function. Training is performed using the Adam optimizer with an initial learning rate of 1×103 and an exponential decay scheduler. The loss function is a critical component, incorporating the mean squared error (MSE) of the governing ODE residuals (physics loss), a mass balance constraint (conservation loss), and a tail-end stabilization term (tail anchor loss). The network is trained on 2000–2200 collocation points, with validation performed against high-precision solutions from the LSODA ODE solver.

Table A2.

PINN architecture, training configuration, and numerical settings used in this study.

Component Setting Notes
Network 5-layer MLP, 128 neurons/layer GELU activation, linear output
Input/Output 1 input (t)/3 or 4 outputs (species) Matches problem dimensionality
Parameterization c(t)=c0+t·N(t) Satisfies initial condition by construction
Optimizer Adam, lr=103 Exponential decay γ=0.99 every 1000 epochs
Epochs 2000–2500 Longer for stiff or cyclic cases
Collocation Points 2000–2200 per epoch 50% uniform, 50% tail-biased (Beta(3,1))
Loss Terms Physics (1.0), Conservation (0.5–5.0),
Tail Anchor (2.0)
Adjustable per case
Precision Double (float64) Ensures stability and accuracy
Validation LSODA (SciPy) with tol = 109 High-precision reference integration
Gradient Handling Clipping (1.0) in some cases Used when numerical instability arises
Device CUDA if available Defaults to CPU otherwise

Table A3.

Detailed main configuration for CPE-PINN systems.

PINN Component Configuration Notes
Architecture
   Network Type Feedforward MLP Sequential MLP
   Input Dimension 1 Time (t)
   Output Dimension 3 or 4 Species conc. (A, B, C) or (A, B, C, D)
   Hidden Layers 5 Excl. input and output layers
   Neurons per Layer 128 Uniform across hidden layers
   Activation Function GELU Gaussian Error Linear Unit
   Output Activation None (linear) Direct concentration output
Parameterization
   Concentration Form c(t)=c0+t·N(t) Initial condition satisfied automatically
   N(t) Neural network output Learned time-dependent correction
Training
   Optimizer Adam Adaptive moment estimation
   Learning Rate (lr) 1×103 Initial lr
   LR Scheduler ExponentialLR Multiplicative decay
   Scheduler Gamma (γ) 0.99 Applied every 1000 epochs (typ.)
   Epochs 2000–2500 2000 standard; 2500 for Figure 6d
   Batch Size (Nf) 2000–2200 Collocation pts/epoch
Collocation Strategy
   Base Sampling Uniform random 50% points (60% in Figure 6d)
   Tail-Biased Sampling Beta(3,1) Dense near t=TEND
   Endpoint Inclusion Optional Include t=0 and/or t=T
Loss Function
   Physics Loss MSEc˙PINN,c˙phys ODE residual
   Conservation Loss MSEici,1.0 Mass balance constraint
   Tail Anchor Loss MSE at t=TEND Stabilizes late-time behavior
   Physics Weight 1.0 Base weight
   Conservation Weight 0.55.0 0.5 standard; 5.0 for Figure 5a; 1.6× for C in Figure 6d
   Tail Anchor Weight 2.0 Applied to 256 points at t=TEND
Gradient Management
   Gradient Clipping 1.0 (norm) Applied in Figure 6d only
   Autograd Method torch.autograd.grad Per-species derivative computation
   Gradient Graph create_graph=True For higher-order derivatives
   retain_graph True Keep graph for multiple species
Numerical Precision
   Data Type torch.float64 Double precision
   Device CUDA (if available) Falls back to CPU
Time Domain
   tspan System-dependent (0, 1–8) s (typ.)
   Evaluation Points 600–1200 For plotting and validation
Validation
   Reference Method scipy.integrate.solve_ivp LSODA method
   ODE Tolerances rtol = 1 × 10−9, atol = 1 × 10−9 High-precision validation

Author Contributions

Conceptualization: A.D.; formal analysis: A.D. and B.M.; investigation: A.D., B.M. and G.Y.; methodology: A.D., M.T. and B.M.; software: B.M. and S.A.H.; supervision: G.Y. and D.C.; validation: M.T. and G.Y.; visualization: A.D., M.T., S.A.H. and B.M.; writing (original draft): A.D.; writing (review and editing): M.T., S.A.H., B.M., G.Y. and D.C.; project administration: A.D. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Funding Statement

This research received no external funding.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1.Yablonsky G., Branco P., Marin G., Constales D. Conservatively Perturbed Equilibrium (CPE) in Chemical Kinetics. Chem. Eng. Sci. 2019;196:384–390. doi: 10.1016/j.ces.2018.11.010. [DOI] [Google Scholar]
  • 2.Xi Y., Liu X., Constales D., Yablonsky G. Perturbed and Unperturbed: Analyzing the Conservatively Perturbed Equilibrium (Linear Case) Entropy. 2020;22:1160. doi: 10.3390/e22101160. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Peng B., Zhu X., Constales D., Yablonsky G. Experimental Verification of Conservatively Perturbed Equilibrium for a Complex Non-Linear Chemical Reaction. Chem. Eng. Sci. 2021;229:116008. doi: 10.1016/j.ces.2020.116008. [DOI] [Google Scholar]
  • 4.Trishch V., Yablonsky G., Constales D., Beznosyk Y. Conservatively Perturbed Equilibrium in Multi-Route Catalytic Reactions. J. Non-Equilib. Thermodyn. 2023;48:229–241. doi: 10.1515/jnet-2022-0054. [DOI] [Google Scholar]
  • 5.Trishch V., Beznosyk Y., Constales D., Yablonsky G. Over-Equilibrium as a Result of Conservatively-Perturbed Equilibrium (Acyclic and Cyclic Mechanisms) J. Non-Equilib. Thermodyn. 2022;47:103–110. doi: 10.1515/jnet-2021-0036. [DOI] [Google Scholar]
  • 6.Raissi M., Perdikaris P., Karniadakis G. Physics-Informed Neural Networks: A Deep Learning Framework for Solving Forward and Inverse Problems Involving Nonlinear Partial Differential Equations. J. Comput. Phys. 2019;378:686–707. doi: 10.1016/j.jcp.2018.10.045. [DOI] [Google Scholar]
  • 7.Karniadakis G., Kevrekidis I., Lu L., Perdikaris P., Wang S., Yang L. Physics-informed Machine Learning. Nat. Rev. Phys. 2021;3:422–440. doi: 10.1038/s42254-021-00314-5. [DOI] [Google Scholar]
  • 8.Bibeau V., Boffito D., Blais B. Physics-Informed Neural Network to Predict Kinetics of Biodiesel Production in Microwave Reactors. Chem. Eng. Process. Process Intensif. 2024;196:109652. doi: 10.1016/j.cep.2023.109652. [DOI] [Google Scholar]
  • 9.Ji W., Qiu W., Shi Z., Pan S., Deng S. Stiff-PINN: Physics-Informed Neural Network for Stiff Chemical Kinetics. J. Phys. Chem. A. 2021;125:8098–8106. doi: 10.1021/acs.jpca.1c05102. [DOI] [PubMed] [Google Scholar]
  • 10.Weng Y., Zhou D. Multiscale Physics-Informed Neural Networks for Stiff Chemical Kinetics. J. Phys. Chem. A. 2022;126:8534–8543. doi: 10.1021/acs.jpca.2c06513. [DOI] [PubMed] [Google Scholar]
  • 11.De Florio M., Schiassi E., Furfaro R. Physics-informed neural networks and functional interpolation for stiff chemical kinetics. Chaos: Interdiscip. J. Nonlinear Sci. 2022;32 doi: 10.1063/5.0086649. [DOI] [PubMed] [Google Scholar]
  • 12.Wang S., Yu X., Perdikaris P. When and Why PINNs Fail to Train: A Neural Tangent Kernel Perspective. J. Comput. Phys. 2022;449:110768. doi: 10.1016/j.jcp.2021.110768. [DOI] [Google Scholar]
  • 13.Wu Z., Wang H., He C., Zhang B., Xu T., Chen Q. The application of physics-informed machine learning in multiphysics modeling in chemical engineering. Ind. Eng. Chem. Res. 2023;62:18178–18204. doi: 10.1021/acs.iecr.3c02383. [DOI] [Google Scholar]
  • 14.Asrav T., Aydin E. Physics-informed recurrent neural networks and hyper-parameter optimization for dynamic process systems. Comput. Chem. Eng. 2023;173:108195. doi: 10.1016/j.compchemeng.2023.108195. [DOI] [Google Scholar]
  • 15.Turan M., Dutta A. A novel conceptual framework for droplet/particle size distribution in suspension polymerization using Physics-Informed Neural Network (PINN) Chem. Eng. J. 2025;519:164977. doi: 10.1016/j.cej.2025.164977. [DOI] [Google Scholar]
  • 16.Nair S., Walsh T.F., Pickrell G., Semperlotti F. Multiple scattering simulation via physics-informed neural networks. Eng. Comput. 2025;41:31–50. doi: 10.1007/s00366-024-02038-3. [DOI] [Google Scholar]
  • 17.Yu J., Lu L., Meng X., Karniadakis G.E. Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems. Comput. Methods Appl. Mech. Eng. 2022;393:114823. doi: 10.1016/j.cma.2022.114823. [DOI] [Google Scholar]
  • 18.Wang Y., Yao Y., Gao Z. An extrapolation-driven network architecture for physics-informed deep learning. Neural Netw. 2025;183:106998. doi: 10.1016/j.neunet.2024.106998. [DOI] [PubMed] [Google Scholar]
  • 19.Li R., Wang J.X., Lee E., Luo T. Physics-informed deep learning for solving phonon Boltzmann transport equation with large temperature non-equilibrium. NPJ Comput. Mater. 2022;8:29. doi: 10.1038/s41524-022-00712-y. [DOI] [Google Scholar]
  • 20.Zheng Y., Hu C., Wang X., Wu Z. Physics-informed recurrent neural network modeling for predictive control of nonlinear processes. J. Process Control. 2023;128:103005. doi: 10.1016/j.jprocont.2023.103005. [DOI] [Google Scholar]
  • 21.Ngo S.I., Lim Y.I. Solution and parameter identification of a fixed-bed reactor model for catalytic CO2 methanation using physics-informed neural networks. Catalysts. 2021;11:1304. doi: 10.3390/catal11111304. [DOI] [Google Scholar]
  • 22.Koksal E.S., Asrav T., Esenboga E.E., Cosgun A., Kusoglu G., Aydin E. Physics-informed and data-driven modeling of an industrial wastewater treatment plant with actual validation. Comput. Chem. Eng. 2024;189:108801. doi: 10.1016/j.compchemeng.2024.108801. [DOI] [Google Scholar]
  • 23.Sachs J., Bui M., McCarthy J.E., Yablonsky G. Conservatively perturbed equilibrium and perturbation: Linear case. Chem. Eng. J. 2025;510:161284. doi: 10.1016/j.cej.2025.161284. [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.


Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)

RESOURCES