Abstract
Recent technological advances have expanded the availability of high-throughput biological datasets, opening the way to the reliable design of digital twins of biomedical systems or patients. Such computational tools represent key chemical reaction networks driving perturbation or drug response and can profoundly guide drug discovery and personalized therapeutics. Yet, their development still depends on laborious data integration by the human modeler, so that automated approaches are critically needed. The successes of data-driven system discovery in Physics, rooted in clean datasets and well-defined governing laws, have fueled interest in applying similar techniques in Biology, which presents unique challenges. Here, we reviewed 177 methodologies for automatically inferring digital twins from biological time series, which mostly involved symbolic or sparse regression, and recapitulated them in a Shiny app. We evaluated algorithms according to eight biological and methodological challenges, associated with integrating noisy/incomplete data, multiple conditions, prior knowledge, or latent variables, and with dealing with high dimensionality, unobserved variable derivatives, candidate library design, and uncertainty quantification. Against these criteria, sparse regression generally outperformed symbolic regression, particularly when using Bayesian frameworks. Deep learning and large language models further emerge as innovative tools to integrate prior knowledge, although their reliability and consistency need to be improved. While no single method addresses all challenges, we argue that progress in learning digital twins will come from hybrid and modular frameworks combining chemical reaction network-based mechanistic grounding, Bayesian uncertainty quantification, and the generative and knowledge integration capacities of deep learning. To support their development, we further highlight key components required for future benchmark development to evaluate methods across all challenges.
Keywords: data-driven discovery of biological networks, machine learning, systems biology, dynamical systems
Introduction
Technological advances have made multi-type datasets increasingly available, documenting biological variables at multiple scales (subcellular, single cell, bulk, patient), over dynamical time windows, and under different experimental conditions. To handle the complexity of integrating such data, statistical and mathematical approaches have been proposed, ranging from mechanism-based models of regulatory networks to black-box statistical models (e.g. neural networks), which generally offer limited explainability. Thus, medical research leans toward mechanistic models that represent the underlying biology and predict the molecular mechanisms driving observed phenotypes. Such digital twins of biological systems constitute a powerful tool for the identification of innovative drug targets and personalized therapies. However, their design may be highly time-consuming for the human modeler as it requires integrating multi-type datasets and prior knowledge; this motivated the development of automatic model learning approaches.
Data-driven model inference was first developed for physics and engineering and allowed successful recovery of governing equations for systems that are generally low-dimensional, fully observable, and grounded in well-established mechanistic laws. Indeed, these domains benefit from high-quality time-resolved data with high signal-to-noise ratio and rich prior knowledge that guides model discovery. In sharp contrast, the knowledge in Biology is often partial or qualitative, data are noisy, sparsely sampled, and heterogeneous, and many relevant components remain unobserved. These challenges hinder the direct application of methods originally developed for physical systems to biological contexts.
Model learning for Biology began with intracellular network discovery, which aims to only identify gene interactions without capturing the system dynamics, a common example being gene regulatory network (GRN) inference from transcriptomics [1]. However, Biology may be better represented by quantitative models describing time-varying intracellular events using nonlinear ordinary differential equations (ODEs), where state variables are genes, proteins, or drug species undergoing chemical reactions governed by specified laws and rate constants [2]. Such a mathematical framework has been successfully applied to biological systems or drug pharmacokinetics-pharmacodynamics (PK/PD), thus populating the fields of systems biology and systems pharmacology [3]. Learning an ODE model implies inferring both (i) the mathematical formulations of species interactions (e.g. law of mass action, Michaelis-Menten [4]) and (ii) the magnitude of the reactions (e.g. rate or enzymatic constants). Traditionally, this process is carried out manually: the modeler reviews the literature, formulates equations, and fits parameters to data. However, data collection and integration may be time-consuming, and such an approach is only feasible when the reaction network is known. Moreover, manual model inference presents reproducibility issues, as different modelers may yield different models. Hence, there is a need for a systematic approach to mechanistic model learning.
In this review, we focus on methodologies inferring ODE-based digital twins from time-resolved biological datasets, which may document different individuals and conditions. We categorized methods as either symbolic or sparse regression algorithms and recapitulated them in a ShinyApp (https://u1331systemspharmacology.shinyapps.io/model_learning_review/) [5]. Emphasis was put on methods amenable to discovering complex systems and on innovative approaches combining classical regression with artificial intelligence (AI) tools [6]. We identified four experimental and four methodological challenges currently associated with automatic learning of digital twins for Biology and Medicine, reported the performance of the most relevant methods upon these criteria and discussed the strategies for further improvements.
Current approaches to data-driven discovery of digital twins
Problem formulation
The aim is to infer ODE-based models of biological systems, capturing not only species interactions but also their precise mathematical terms and quantitative estimation of parameters. Let us consider a system involving $n$ species, whose concentration vector $x(t) \in \mathbb{R}^n$ is experimentally observed on a discrete time grid provided as a dataset $\mathcal{D} = \{y(t_k)\}_{k=1,\dots,N}$. The dynamics of the state variables $x(t)$ are modeled through the following system of ODEs:

$$\frac{dx(t)}{dt} = f(x(t); \theta), \qquad y(t_k) = x(t_k) + \varepsilon(t_k), \qquad (1)$$

where $f$ is a vector field indexed by unknown parameters $\theta$, and $\varepsilon(t_k)$ denotes the measurement error at time $t_k$. The key goal of model inference is to learn the vector field $f$, together with parameters $\theta$, that best fit the provided datasets. Crucially, we search for a mechanistic estimator, meaning that the functional form of $f$ is specified a priori from biochemical principles (e.g. mass-action, Michaelis–Menten, or Hill kinetics), yielding biologically grounded equation terms.
State-of-the-art of digital twin inference from temporal data
Two major computational approaches exist to infer model structure and parameters from time series: symbolic regression (Section 2.2.1) and sparse regression (Section 2.2.2).
Symbolic regression
Symbolic regression simultaneously infers the model structure and parameter values that best describe a dataset $\mathcal{D}$ by exploring the space of mathematical expressions built from basic operators, thus not assuming any predefined model structure [7, 8]. Initially developed for static and low-dimensional problems, symbolic regression has since been extended to time series and dynamical systems, aiming to identify the vector field $f$ in the ODE system (1) directly from observations [9–11]. In the discovery of an $n$-sized system, symbolic regression seeks $n$ symbolic expressions $\hat{f}_1, \dots, \hat{f}_n$ such that $dx_i/dt \approx \hat{f}_i(x)$ for each species $i$, minimizing a loss between model-predicted and data-derived derivatives of each variable independently, thus enabling parallel computations.
These methods typically rely on expression trees whose nodes are either a state variable, a constant, or a basic operator selected from a user-defined set (e.g. $\{+, -, \times, \div\}$), and whose edges indicate the links between those items (Fig. 1). The algorithms iteratively optimize populations of candidate expression trees through perturbations (Algorithm 1), which are typically refined via evolutionary algorithms, Monte Carlo tree search, probabilistic grammars, or neural-symbolic models [12].
Figure 1.
Schematic of a typical GP workflow for symbolic regression, in which candidate expression trees of the current iteration are used to generate those of the next generation through, e.g. mutations or crossovers, with model selection guided by goodness of fit to data so that only the top performers are retained for the subsequent iteration.
Importantly, the general symbolic regression problem is considered NP-hard due to the exponential growth of the search space when increasing the number of variables and considered operators, leading to high computational demands [13]. Moreover, traditional versions struggle to capture variable inter-dependencies as they learn equations separately for each variable, limiting their ability to enforce principles linked to variable interactions such as conservation laws. Another limitation lies in the fact that symbolic regression may overfit the data without suitable regularization or selection strategies [14].
We identified six families of symbolic regression approaches (Fig. 2). First, genetic programming (GP) iteratively evolves candidate mathematical expressions using operators such as mutation and crossover (Fig. 1), which are then selected to maximize the data fit, possibly under structural constraints or complexity penalties. The earliest contribution of this type is by Bongard et al. [25], who interestingly combine (i) partitioning, i.e. modeling each variable independently; (ii) automated model probing, suggesting initial conditions or parameter perturbations to diversify the inferred models; and (iii) snipping, i.e. pruning redundant or low-impact components to enhance interpretability. This method was successfully tested on the 3D lac operon regulatory network in Escherichia coli, with the best inferred model recovering 70% of the target terms. Next, deep learning may be leveraged in neural symbolic regression by typically (i) generating candidate tree expressions via analytical activation functions or symbolic decoders (i.e. neural modules that translate internal representations into symbolic expressions), (ii) employing reinforcement learning to efficiently explore the space of expressions, and (iii) embedding prior knowledge into neural networks to guide candidate function selection. For example, ODEFormer [9] treats equation recovery as a symbolic translation task: it encodes time series trajectories and directly generates the corresponding symbolic expressions using a transformer decoder. It showed strong performance and robustness—even with noisy or limited data—when tested on the Lotka–Volterra system and a four-variable SEIR epidemic model. Large language models (LLMs) further extend this idea by generating candidate equation structures via prompts. Such approaches faithfully recovered tumor growth and pharmacokinetics models [57, 61, 62]. Alternatively, grammar-based methods rely on a formal specification of symbolic expression production rules guiding the generation of new equation terms in the iterative process, such as unit-consistent constraints [75]. Omejc et al. [76] applied this method to successfully recover bacterial respiration and predator–prey systems, even with partial, low-frequency and noisy real-world data. Next, Bayesian symbolic regression infers posterior distributions of expressions by combining structural priors and data likelihoods. For instance, Galagali et al. [11] employed reversible-jump MCMC to successfully identify 75% of a five-variable EGF–BRaf signaling network. Alternatively, mathematical constraint-based methods guide the model search by incorporating structural priors derived from data properties or scientific constraints (e.g. symmetries, conservation laws), first principles (e.g. dimensional analysis), or equation syntax and consistency with trajectory behavior.
Figure 2.
Overview of symbolic regression methods as a tree diagram categorizing existing approaches based on their core modeling principles, such as Genetic Programming and Evolutionary Methods [7, 15–36], Neural Symbolic Regression [9, 10, 12, 37–73], Grammar-based Symbolic Regression [74–78], Bayesian Symbolic Regression [11, 79–81], Structured Search or Mathematical Constraints [82–102], or Application-specific Symbolic Regression [8, 94, 103–105].
Finally, of particular importance for this review, application-specific methods tailor symbolic regression to biological or chemical systems. Schmidt et al. [105] refined ODE models of metabolic pathways by enforcing stoichiometry and conservation laws. Reactmine [94] sequentially identifies chemical reactions based on preponderant changes in species concentrations and correctly inferred six out of seven reactions of a synthetic MAPK phosphorylation cascade, as well as correct three- to five-variable models for the cell cycle and the circadian gene expression regulation, from real-world datasets.
Several benchmarks evaluated symbolic regression methods. Orzechowski et al. [106] focused on GP algorithms inferring low-dimensional models from data, and concluded that EPLEX-1M [36] achieved the best accuracy across 94 real-world datasets. Next, SRBench [12] incorporated neural methods and tests on dynamical systems, and revealed DSR [37] as the best-performing method in terms of accuracy and expression simplicity on both dynamical and algebraic tasks. SRBench++ [107] focused on algebraic equations, further including controlled noise and unified metrics. It concluded that PySR [27] and Bingo [17] achieved the best trade-off between symbolic accuracy, robustness to noise, and expression simplicity, with uDSR [70] showing perfect recovery on simpler tasks. Interestingly, PySR and DSR were also the best-performing methods at recovering chaotic systems, PySR being significantly faster and more robust on average [108]. Next, ODEBench [9] presents the most structured benchmark for ODE modeling, testing 63 systems, yet with only 12 of them involving more than three variables. ODEFormer [9] consistently achieved the highest accuracy while maintaining low inference time and expression complexity.
Finally, LLM-SRBench specifically evaluates LLM-based symbolic regression methods using 128 equations and paired simulated time-series data [109]. LLM-SR methods [62] outperform traditional baselines for symbolic coherence, accuracy, and extrapolation, yet achieving a maximum symbolic accuracy of only 31.5%.
Sparse regression
Sparse regression model learning methods leverage the assumption that biological systems have parsimonious representations, i.e. only a small subset of reactions significantly influences the dynamics [110].
Let us assume access to $M$ longitudinal datasets $\mathcal{D}_1, \dots, \mathcal{D}_M$, describing time-concentration profiles of the $n$ state variables evaluated at $N_m$ time points ($m = 1, \dots, M$), potentially generated for different replicates, initial conditions or perturbations (e.g. gene knockout, drug exposure). These datasets are concatenated to yield the data matrix $X \in \mathbb{R}^{N \times n}$, with $N = \sum_{m=1}^{M} N_m$.

The core idea of sparse regression is to approximate the vector field $f$ in Equation (1) as a linear combination of $p$ candidate functions from a library matrix $\Theta(X) \in \mathbb{R}^{N \times p}$, where each column represents a potential term evaluated at each data time point (Algorithm 2). For instance, a typical library may include constant, linear, polynomial, trigonometric, or nonlinear functions of the state variables:

$$\Theta(X) = \big[\, \mathbf{1} \;\; X \;\; X^2 \;\; \cdots \;\; \sin(X) \;\; \cdots \,\big] \qquad (2)$$

The problem is then cast as solving

$$\dot{X} = \Theta(X)\,\Xi, \qquad (3)$$

where $\dot{X} \in \mathbb{R}^{N \times n}$ is the matrix of time derivatives of state variables and $\Xi \in \mathbb{R}^{p \times n}$ contains the coefficients of each candidate term to be estimated. A sparse regression algorithm is applied to identify the smallest subset of active terms that best explains the system dynamics, thus enforcing that most entries of $\Xi$ are zero, producing parsimonious but accurate and interpretable models. The optimization problem is formulated as minimizing a data-to-model mismatch term plus a regularizer $R(\Xi)$. The mismatch can be defined either at the derivative level, $\|\dot{X} - \Theta(X)\,\Xi\|_2^2$, when $\dot{X}$ is estimated numerically, or at the trajectory level, $\|X - \hat{X}(\Xi)\|_2^2$, when reconstructed trajectories $\hat{X}(\Xi)$ are obtained by integrating the candidate system.
A pioneering method in this field is the Sparse Identification of Nonlinear Dynamics (SINDy) [111], available in the user-friendly and actively maintained PySINDy Python package [112, 113]. SINDy includes sparsity techniques such as Sequential Thresholded Least Squares, which iteratively alternates between least-squares fitting and hard thresholding to eliminate small coefficients. Although powerful and interpretable, it requires derivative estimation, making it sensitive to noise and mainly effective for low-dimensional, clean datasets.
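As an illustration, the core Sequential Thresholded Least Squares loop can be written in a few lines of NumPy. This is a minimal sketch of the sparsification principle only, not the full PySINDy implementation (which additionally handles differentiation, library construction, and model selection):

import numpy as np

def stlsq(Theta, dXdt, threshold=0.1, max_iter=10):
    # Alternate least-squares fits with hard thresholding of small coefficients
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(max_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):   # refit each equation on surviving terms
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], dXdt[:, k], rcond=None)[0]
    return Xi

# Toy example: recover dx/dt = -2x + 0.5xy and dy/dt = y - xy from samples
rng = np.random.default_rng(1)
x, y = rng.uniform(0.5, 2.0, 500), rng.uniform(0.5, 2.0, 500)
dXdt = np.column_stack([-2 * x + 0.5 * x * y, y - x * y])
Theta = np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2])  # library
print(stlsq(Theta, dXdt))   # nonzero entries match the true terms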
We identified five main families of sparse regression approaches (Fig. 3). First, classical sparse regression methods, such as SINDy and its recent extensions, infer governing equations from derivative approximations, using libraries of nonlinear candidate terms, as described above. As an example relevant for biological systems, SINDy-AIC [149] outputs multiple models and selects the best ones using the Akaike Information Criterion to balance complexity and goodness of fit. It fully recovered an SEIR epidemic model (three variables, six terms). Next, sparse reaction network discovery specifically infers biochemical network structures from time series. A representative method is Reactive SINDy [191], which constructs a library of candidate reactions following mass-action kinetics, enforcing chemical mass conservation. It retrieved all reactions of both the E. coli and MAPK pathway networks (respectively, three and nine variables). Alternatively, sparse regression methods without derivative approximation bypass the sensitivity of numerical differentiation to data noise and sparse sampling. These methods reformulate the regression problem to avoid direct estimation of derivatives, using numerical integration or weak formulations of the system. Next, Bayesian sparse regression enables the use of sparsity-enforcing parameter priors [193], and quantifies uncertainty of inferred mathematical terms. For example, Jiang et al. [182] apply regularized horseshoe priors and MCMC sampling, and successfully recovered synthetic biological networks. Finally, neural sparse regression leverages neural networks or LLMs for either estimating derivatives, modeling hidden components, or encoding prior knowledge to guide model search.
Figure 3.
Overview of sparse regression methods as a tree diagram categorizing existing approaches based on their core modeling principles, including Classical Sparse Regression [54, 111, 114–152], Sparse Regression without Derivative Approximation [153–167], Bayesian Sparse Regression [159, 164, 168–183], Neural Sparse Regression [158, 160, 167, 184–188], and Sparse Network or Reaction Discovery [115, 188–192].
Despite many algorithmic variants of the popular SINDy framework, systematic benchmarking was lacking until Kaptanoglu et al. [194] introduced the first large-scale study using 70 low-dimensional polynomial models with full-state observability, though this setting may not capture real-world challenges. Among the five tested methods, STLSQ [111] and MIOSR [133] demonstrated strong performance, while weak SINDy [161] showed high accuracy and stability to noise.
Critically, no benchmark yet evaluates sparse regression methods of all five categories or uses biologically realistic high-dimensional systems documented through irregular data.
Biological challenges
The data-driven inference of real-world biological networks or digital twins involves specific challenges inherent to this field of application. We identified four of them (Fig. 4): (1) managing data irregularity and noise (Section 3.1), (2) accounting for heterogeneous datasets across multiple conditions or individuals (Section 3.2), (3) integrating prior knowledge (Section 3.3), and (4) incorporating unobserved variables (Section 3.4).
Figure 4.
Biological challenges in digital twin discovery: Left: physical systems such as the Lorenz oscillator are typically documented by low-noise, densely and regularly sampled data available for a unique physical context, enabling full observability of a small number of variables; in addition, well-established prior knowledge, already in the form of governing equations, may further guide model search; as data-driven model learning was mostly initiated for problems arising from physics, the resulting methods are not geared toward the unique challenges inherent to Biology; Right: biological data are often highly noisy, sparsely and irregularly sampled, complicating the reliable estimation of system dynamics; measurements are frequently heterogeneous, varying across individuals and experimental conditions; mechanistic prior knowledge (such as known species interactions, kinetic forms, or parameter ranges) is required to guide the model discovery and to constrain the model space in such low-data regimes, but is often uncertain and of various types; moreover, key variables can be experimentally unobserved, introducing hidden dynamics that further hinder model identification; these key differences highlight the unique challenges of applying model learning methods to biological systems.
Handling noisy, sparse, and irregularly sampled biological time series
Biological time series are often irregular due to technical limitations (detection limits, malfunctions) or intrinsic biological variability. This hinders the application of traditional model learning techniques, which assume that all variables are continuously and regularly observed. To address this challenge, two main strategies have been developed: (1) reconstructing the data before model search, and (2) directly integrating data uncertainty into model inference.
First, preprocessing methods aim to reconstruct smooth, continuous trajectories from incomplete or noisy data with simple techniques such as spline-based interpolation and smoothing, low-pass or Savitzky–Golay filters, or kernel regression [54, 87, 103, 117, 153, 166, 170, 180]. More advanced methods rely on Gaussian processes, which combine smoothing, interpolation, and uncertainty quantification [27, 88, 159, 168, 169], or on neural networks trained to denoise or interpolate sparse observations [69, 78, 158, 188]. Other strategies leverage a partially known model to reconstruct plausible trajectories using local Taylor approximations around available data points [32] or first-principle models for trajectory generation [139]. These denoising and imputation procedures may be crucial for improving the performance of model inference but must be handled with care to avoid erasing biologically relevant information or creating artifacts.
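For instance, the simpler preprocessing options can be assembled from standard scientific Python tools. The sketch below uses synthetic data and arbitrary smoothing parameters: a smoothing spline handles the irregular grid directly, while Savitzky–Golay filtering requires regridding first:

import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 10, 40))                     # irregular sampling times
y = np.exp(-0.3 * t) + 0.05 * rng.normal(size=t.size)   # noisy measurements

# Smoothing spline: works on irregular grids; s controls the smoothness
spline = UnivariateSpline(t, y, s=t.size * 0.05 ** 2)
t_dense = np.linspace(0, 10, 200)
y_spline = spline(t_dense)

# Savitzky-Golay filter: needs a regular grid, so interpolate first
y_regular = np.interp(t_dense, t, y)
y_savgol = savgol_filter(y_regular, window_length=21, polyorder=3)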
A second strategy addresses data inconsistencies directly during model inference by adjusting the optimization procedure. For example, Champion et al. [129] proposed a trimming functionality that down-weights data points with large residuals (often outliers or cases with missing replicates), allowing the algorithm to focus on reliable data. Such an approach immediately raises the question of correctly tuning hyperparameters to avoid losing biologically sound data points and associated information. Following another rationale, Omejc et al. [76] restrict the computation of the model error to observed components, at documented time points. North et al. [177] further generalize this by introducing a time-dependent observation mask that indicates which variables are measured at each time point, enabling flexible integration of partial trajectories. Other approaches directly address data uncertainty in the learning process by jointly estimating a clean version of the data and the dynamic model [135, 137].
Leveraging data in multiple experimental conditions or individuals
Biological datasets often include measurements in multiple cell lines, animal strains, or patients, and under varying experimental conditions such as gene knockouts or drug exposure. Fully exploiting this structure is essential for learning accurate and generalizable models in systems biology, where models are of high dimension and data are proportionally scarce. Yet, most model learning methods only process one condition at a time, limiting their ability to extract shared or condition-specific dynamics. However, several recent methods have tried to address this limitation and cleanly separate what is structurally conserved from what varies across conditions, enhancing model robustness and interpretability.
A promising strategy consists in learning a shared model structure while allowing parameters to vary across individuals. Schaeffer et al. [147] propose a group-sparse regression method that enforces a common set of active terms across conditions while permitting coefficients to differ, capturing both invariant dynamics and inter-individual variability. Similarly, INSITE [154] first learns a global ODE model across all patients via sparse regression, then re-estimates coefficients for each patient using their specific time series. Finally, SpReME [155] learns a binary mask for the shared equation structure and condition-specific coefficients in a single optimization.
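A minimal analogue of this shared-structure idea can be obtained with scikit-learn's MultiTaskLasso, whose L2,1 penalty selects the same active library terms for all conditions while letting coefficients differ. This is a didactic sketch of the principle on synthetic data, not the specific algorithms of [147, 154, 155]:

import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
Theta = rng.normal(size=(200, 6))          # shared candidate function library

# Two conditions share active terms (columns 1 and 3) with different magnitudes
dxdt = np.column_stack([
    1.2 * Theta[:, 1] - 0.5 * Theta[:, 3],
    0.8 * Theta[:, 1] - 0.9 * Theta[:, 3],
]) + 0.05 * rng.normal(size=(200, 2))

# The L2,1 penalty zeroes whole rows of the coefficient matrix: a term is
# kept for all conditions or dropped for all, while its value may vary
fit = MultiTaskLasso(alpha=0.05).fit(Theta, dxdt)
print(fit.coef_)                           # shape (n_conditions, n_terms)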
Several frameworks leverage neural networks to discover or generalize shared dynamical structures, though these methods likely require larger amounts of data per patient to ensure accuracy and generalization. NeuralCODE [167] uses Automated Machine Learning (AutoML, i.e. automated neural network architecture search) to learn a common backbone of differential equations, with subject-specific parameters. Next, Gui et al. [195] propose going further via invariant function learning, allowing both parameters and functional forms to vary across conditions. A neural network estimates the invariant component of the ODE, enabling, as a second step, symbolic or sparse regression to recover interpretable models for each subject. Alternatively, MetaPhysiCa [131] casts dynamics learning as a meta-learning problem, identifying physical laws that generalize across experimental setups while efficiently adapting to new conditions. By combining causal inference and invariant risk minimization, it extracts robust, generalizable models even in extrapolation regimes. Next, OASIS [185] pairs SINDy with a neural network that maps input conditions to ODE parameters. Local models are learned offline from condition-specific time series, while the network generalizes across them, enabling rapid adaptation to new regimes via online parameter prediction.
A different approach is proposed by Pantazis et al. [162], who use a multiple shooting approach: each condition-specific trajectory is fitted separately, and the resulting equations are combined into a unified system with common parameter values. Then, sparse regression is used to recover the single dynamics that best fit all data. Finally, Ukorigho et al. [143] designed a competitive learning scheme that trains several models in parallel and softly assigns each data point to the best-fitting one based on prediction error, enabling discovery of distinct mechanistic regimes without prior labeling. However, such late integration strategies, where models are trained independently before being reconciled or selected, may be ill-suited to low-sample regimes where simultaneously leveraging all datasets is crucial for robust generalization. Importantly, despite their potential, most methods listed here remain to be tested on large-scale biological systems with multiple experimental settings.
Integrating prior biological knowledge into model discovery
Prior knowledge on chemical interactions
Prior knowledge about species interactions explicitly involves information regarding the possibility of a reactant $A$ becoming a product $B$. Such knowledge can be incorporated at the whole model level or at the reaction level, as described below.
Model-level priors encode global structural assumptions, such as sparsity—favoring minimal interaction sets—or topological features like scale-free networks where most nodes have few connections and a few act as hubs [110, 196]. Classical sparse regression, such as SINDy [111], uses $\ell_1$ regularization (LASSO [197]), while SR3 [129] applies the sparsity penalty to an auxiliary variable rather than directly to the coefficients, improving optimization stability. Bayesian approaches allow imposing flexible priors on parameters, such as the regularized horseshoe, which shrinks most coefficients to zero while allowing a few large ones, or the spike-and-slab, which explicitly models inclusion via a mixture of near-zero (spike) and unconstrained (slab) components [159, 179, 182]. While accounting for such priors provides convex relaxations of the nonconvex model selection problem, Mixed-Integer Optimization (MIO) offers a more principled alternative by explicitly controlling term inclusion via binary indicators that encode whether a given term is included (1) or excluded (0) from the model. As an important example of such method, MIOSR [133] formulates symbolic model discovery as a constrained optimization problem, enabling provably optimal recovery of ODE systems from noisy data. Alternatively, Reactmine [94] tackles model complexity through a tree-search algorithm that infers reactions sequentially, using a depth limit to enforce parsimony.
Next, chemical interaction-level prior information may be derived from curated biological databases, documenting Protein-Protein Interactions (e.g. STRING [198], SIGNOR [199], HuRI [200], humanDB [201]), biochemical pathways (Reactome [202], KEGG [203], BioCyc [204]), or gene and chemical interactions (BioGRID [205]). These resources provide structured knowledge to constrain or bias model selection. For example, Bayesian Reactive SINDy [182] allows specifying reaction-specific shrinkage priors to favor known interactions. However, most existing inference frameworks do not account for these biological priors, due to inconsistencies across databases, context-specific validity of interactions, and challenges in mapping qualitative interactions to quantitative ODE terms. These obstacles—familiar from GRN inference—extend to ODE modeling, where the integration of prior knowledge remains an open and critical challenge [206].
Prior knowledge on mathematical formulation or parameters of interaction kinetics
Kinetic prior knowledge refers to assumptions on the mathematical expressions of the reaction rates and associated parameter values. This can apply to the whole model or to specific interactions.
Systemic kinetic priors encode broad physical constraints such as conservation laws, dimensional consistency, and symmetries that reduce the candidate model space and enhance biological plausibility. Standard methods like SINDy ignore these, potentially producing unrealistic models (e.g. negative concentrations). Group sparsity methods can help by jointly selecting or discarding related terms [146]. Symbolic regression frameworks like ProGED [75, 76] enforce dimensional consistency, while NSRwH [51] allows conditioning on user-defined hypotheses (e.g. symmetries), guiding expression generation.
But ultimately, for Biology, it is essential to adopt the chemical reaction network (CRN) formalism that offers a structured representation of biochemical processes encoding species interactions, canonical kinetic forms, and interpretable parameter constraints.
Definition 1.
A CRN is a set of chemical reactions, each formally defined as a triple $(R, P, f)$, where $R$ (resp. $P$) is a multiset of reactant (resp. product) species and $f$ is a rate function over reactant concentrations specifying the reaction kinetics, parametrized by $\theta$.
Crucially, casting ODE inference as CRN recovery imposes a strong structural prior: it narrows the model space to reaction-based dynamics, ensuring that the inferred vector field $f$ adheres to key mechanistic principles [207]. For instance, the stoichiometry of the reactions enforces mass conservation and structural constraints. As a result, $f$ is not only mathematically consistent—e.g. positive, smooth, and dimensionally correct—but also interpretable in terms of underlying biochemical processes. Such CRN-based formulations were applied in Reactmine [94], Reactive SINDy [191], its Bayesian variant [182], and other studies [11, 178].
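In practice, a CRN and its kinetics fully determine the ODE right-hand side. The sketch below uses a hypothetical two-reaction toy network with made-up rate constants to show how stoichiometry and mass-action rates induce the vector field $f$:

import numpy as np

# Toy CRN: R1: A + B -> C with rate k1*[A][B]; R2: C -> A + B with rate k2*[C]
S = np.array([[-1,  1],     # net change of A per reaction
              [-1,  1],     # net change of B per reaction
              [ 1, -1]])    # net change of C per reaction
k = np.array([0.8, 0.2])    # rate constants (the parameters theta)

def f(c):
    # Mass-action kinetics: each rate is k_j times the product of its reactants
    rates = np.array([k[0] * c[0] * c[1], k[1] * c[2]])
    return S @ rates        # stoichiometry guarantees mass conservation

print(f(np.array([1.0, 2.0, 0.5])))   # dA/dt, dB/dt, dC/dt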
Beyond global constraints, knowledge on the formulation of specific reactions may be available. A key notion here is the reference model—a predefined set of equation terms and estimated parameters subtracted from the approximated derivatives to perform model inference solely on residual dynamics [8, 94, 192, 208]. Symbolic regression terms can also be restricted, either through inference of partial models or through symbolic neural networks biasing term selection toward plausible forms [17, 85, 105].
Foundation model-aided integration of prior knowledge
The advent of foundation models (FMs) and LLMs offers new opportunities to incorporate more abstract forms of prior knowledge into model inference. Pretrained on large-scale, heterogeneous scientific corpora, these deep learning models internalize a wide range of biological knowledge, as well as mathematical patterns and conventions, which may assist model inference at various levels. This field builds on the recent success of specialized FMs like scGPT [209] and scFoundation [210] that generate single-cell transcriptomics profiles, after having been trained on millions of gene expression profiles, which allowed them to implicitly capture biological structures such as gene co-expression.
First, the generation of candidate models may be assisted by deep learning tools, such as ODEFormer, a Transformer trained on millions of synthetic trajectories from randomly sampled ODEs [9]. It recapitulates common structural features of dynamical systems such as smoothness and variable dependencies and can output symbolic equations, which can serve as strong inductive priors for model inference. Next, Al-Khwarizmi [184] integrates LLMs with symbolic regression, using prompts to ensure known dynamics are included in the learned model. Interactive frameworks like Sym-Q [38], LLM4ED [61], D3 [57], and LLM-SR [62] further incorporate user feedback during the modeling process, e.g. by allowing experts to accept, reject, or modify candidate equations, thereby steering the search toward models that are not only data-consistent but also aligned with established scientific understanding.
In addition, LLMs can assist in incorporating structured knowledge into the optimization problem. The LLM-Lasso framework [211] exemplifies this by introducing adaptive, feature-specific penalties in sparse regression, guided by biological priors. Applied to SINDy, it assigns plausibility scores to candidate terms in the function library $\Theta$ using Retrieval-Augmented Generation, based on scientific databases (e.g. PubMed abstracts, pathway databases) [212]. For example, to evaluate the plausibility of the candidate term $x_i x_j$, the LLM receives a prompt consisting of literature mentioning $x_i$ and $x_j$, along with a query such as "Is the interaction between $x_i$ and $x_j$ biologically plausible?". This setup translates textual and database evidence into quantitative weights guiding kinetics and parameter priors.
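The resulting weighting scheme can be emulated with a standard Lasso by rescaling library columns, so that low-plausibility terms pay a larger L1 price. Below is a minimal sketch with hypothetical plausibility scores, not the actual LLM-Lasso pipeline:

import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical plausibility scores in (0, 1] for four candidate terms,
# e.g. returned by an LLM queried about each interaction
plausibility = np.array([0.9, 0.2, 0.7, 0.05])

rng = np.random.default_rng(0)
Theta = rng.normal(size=(100, 4))                 # candidate function library
dxdt = 1.5 * Theta[:, 0] - 0.8 * Theta[:, 2] + 0.01 * rng.normal(size=100)

# Column rescaling implements feature-specific penalties: a term with
# plausibility p effectively incurs an L1 penalty proportional to 1/p
fit = Lasso(alpha=0.05).fit(Theta * plausibility, dxdt)
print(fit.coef_ * plausibility)                   # coefficients on original scale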
Although promising, these methods come with serious limitations, including high computational costs, risks of propagating biases from training data, poor biological interpretability, and the critical need for a posteriori checks of biological and physical constraints. Indeed, LLMs are particularly prone to hallucinations in underspecified settings [213], which are common in biology, making factual reliability a critical concern. Moreover, their effectiveness often depends on carefully crafted prompts, yet prompt engineering remains a manual, error-prone process [55]. Such limitations may explain why simple baselines continue to outperform LLMs in certain predictive tasks such as perturbation response prediction [214, 215] and call for future benchmark studies, which are nowadays critically lacking in the field of digital twin design.
Dealing with latent or unobserved variables
In Biology, some elements may be unmeasurable due to technical or experimental constraints, resulting in undocumented variables that nonetheless influence the observed dynamics. Most model learning methods assume full observability and cannot account for these hidden variables, thus limiting their real-world applicability. To address this challenge, several approaches have recently been designed.
A first family of methods incorporates user-defined latent variables and jointly optimizes their trajectories with observed ones. Bhouri et al. (GP-NODE [159]) and North et al. [177] propose a probabilistic inference of both observed and hidden components within a Bayesian model. The problem formulation is similar to SINDy, using a candidate function library to express the system's dynamics. The method then jointly infers the full time evolution of all variables (including unobserved ones) and the sparse set of active equation terms, by fitting available observations. This task is performed in an end-to-end manner via either Hamiltonian Monte Carlo or MCMC. Similarly, but in a symbolic regression framework, Omejc et al. [76] treat the initial conditions and parameters of the kinetics of user-specified unobserved variables as free parameters. For each candidate model generated from a probabilistic grammar, the full system of equations is simulated and compared with the available observed data. This allows the method to infer the influence of latent states on the measured dynamics, provided that the available data are sufficient to ensure practical parameter identifiability.
An alternative is offered by the adaptive strategy of Daniels et al. [91], which introduces hidden variables solely when justified by predictive performance. Their method builds a hierarchical structure in an iterative manner: at each step, it evaluates whether adding a new latent variable increases the data fit and, if true, retains it. This approach allows automatically determining both the number and role of latent components essential to accurate forecasting while preserving parsimony.
Alternatively, indirect strategies for encoding latent influence rely on higher order derivatives or integrals of documented variables, which may be employed in a limited number of cases. Somacal et al. [114] include derivatives of observed variables into the dictionary of candidate functions, leveraging the fact that unobserved states may leave signatures in the derivatives of observed variables [216]. In the same vein, Martinelli et al. [208] model the indirect action of a variable $z$ through its time-integrated effect, $\int_0^t z(s)\,ds$, which assumes that its influence is mediated through an intermediate species, provided that the reaction follows the law of mass action. Interestingly, this enables modeling delayed, rather than instantaneous, regulatory effects when only upstream regulators are documented.
A last class of approaches focuses on reconstructing latent states using neural networks. Lu et al. [186] propose a framework where a neural encoder maps temporal sequences of observed variables to estimate the unobserved components. From these reconstructed states, governing equations are then identified by sparse regression. Grigorian et al. [10] similarly train a hybrid neural ODE model on the observed trajectories and apply symbolic regression to the latent part of the learned representation, yielding interpretable equations. These approaches combine the expressive power of neural networks with the interpretability of symbolic models, enabling recovery of hidden dynamics from partial observations. Yet, major challenges remain. Methods that explicitly infer latent trajectories often suffer from identifiability issues, as the available data may not be sufficient to faithfully recover the equations and parameters related to unobserved variables, making inference sensitive to initialization and priors. Neural reconstruction techniques, while flexible, may obscure mechanistic interpretability and can overfit in low-data regimes.
Methodological challenges
Automatically learning ODE-based digital twins from biological time series nowadays faces several critical issues.
We identified four key challenges: (i) high dimensionality of biological systems leading to method scalability and model identifiability issues, (ii) handling unobserved time derivatives in a context of noisy and sparse data, (iii) selecting candidate function libraries, and (iv) quantifying the robustness and uncertainty of inferred models (Fig. 5).
Figure 5.
The four main methodological challenges in digital twin discovery: (1) High dimensionality of biological systems: as the number of variables increases, the size of the candidate library and the number of possible models also grow, leading to major computational challenges, along with an increased need for data to ensure model identifiability; (2) Handling unobserved derivatives: when time derivatives are not directly available from data, one must rely on numerical approximation, alternative formulations of ODEs, or numerically solving the system; (3) Selecting the candidate function library: the choice of candidate functions shapes the search space and impacts the plausibility of inferred models; (4) Quantifying robustness and uncertainties of inferred models: identifying stable model structures across datasets and methodologies, and quantifying uncertainty in parameter estimates, are crucial to ensure robust model inference.
High dimension of biological systems leading to computational and low-data regime challenge
Biological systems involve large networks of interacting species and great structural complexity that poses both computational and identifiability issues. First, discovering large systems requires high computational resources: as model dimension increases, sparse regression algorithms must handle rapidly growing candidate function libraries, while symbolic methods face a combinatorial explosion of possible expressions. For instance, performing sparse regression for a 10-variable system with a function library $\Theta$ containing first- and second-order terms yields 65 possible terms; selecting up to four active terms per equation already produces over 722 000 possible combinations per equation. Second, high dimensionality requires more numerous and diverse datasets to constrain the model and avoid overfitting, which may be challenging considering the cost of biological experiments. However, with the rise of multi-omics and genetic or drug screening technologies, collecting large-scale time-varying datasets is becoming increasingly feasible.
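These counts follow from elementary combinatorics, as the short computation below verifies:

from math import comb

n = 10                               # number of state variables
library_size = n + (comb(n, 2) + n)  # x_i terms, plus x_i*x_j and x_i**2 terms
subsets = sum(comb(library_size, k) for k in range(1, 5))
print(library_size, subsets)         # 65 terms, 722 865 subsets per equation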
Nowadays, only a few model inference methods scale to moderately large systems. Mangan et al. [115] developed a SINDy-based approach to infer a seven-dimensional yeast glycolysis model that required ∼900 temporal trajectories generated from distinct initial conditions. Next, Galagali et al. [11] explored all 1024 sub-networks of a 10-dimensional system using a Bayesian framework, showing that their MCMC-based approach could shift between models involving one or two active pathways, demonstrating the utility of probabilistic methods in navigating medium-scale model spaces.
Other strategies aim to reduce the dimension of the problem. Gelß et al. [152] use low-rank tensor decompositions to efficiently represent function libraries, exploiting multi-linear structures to reduce complexity. However, while this scales to systems from 10 to 100 variables, it assumes that the dynamics admits a compact tensor-product structure, e.g. dynamics expressible as sums of products of univariate functions of the state variables, which rarely holds in biological systems. More recently, Sadria et al. [217] applied SINDy to gene programs extracted from single-cell RNA-seq data. By projecting expression profiles onto a six-dimensional latent space, they enabled SINDy to operate on compressed representations of complex systems. However, the resulting dynamics remain confined to the latent space, lacking explicit correspondence to mechanistic biological variables, thus limiting biological interpretability.
Thus, a key challenge remains to design scalable algorithms that retain mechanistic interpretability in high dimension. This likely requires hybrid approaches that combine structural priors, modularity, and efficient inference to navigate vast model spaces without exhaustive search. Equally crucial is leveraging existing databases for systematically incorporating large-scale datasets, which provide rich biological data constraining dynamics recovery.
Approaches for handling unobserved derivatives
A fundamental challenge in inferring systems of ODEs is the common unavailability of the vector field, as time derivatives are rarely experimentally measured. Consequently, methods fall into three broad categories: those that numerically estimate derivatives, those that approximate the candidate model solution within the optimization procedure, and those that circumvent differentiation through integral reformulation of the problem.
In the first class, derivatives are estimated using either finite differences or smoothing methods like splines or Gaussian Processes [170], and then passed to the inference algorithm. Finite differences are fast but highly sensitive to measurement noise, which can dominate the derivative signal in biological data. Smoothing methods mitigate this by filtering out high-frequency fluctuations before differentiation, thereby producing more stable derivative estimates. However, this denoising step may also obscure sharp or transient biological features. In addition, such two-step processes that fully decouple derivative estimation from model learning rule out the possibility of correcting for the bias of derivative approximation during the model inference phase.
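The two options differ in only a few lines of code. The sketch below contrasts finite differences with Savitzky–Golay differentiation on synthetic noisy data; the window length and polynomial order are arbitrary illustrative choices:

import numpy as np
from scipy.signal import savgol_filter

t = np.linspace(0, 10, 101)
rng = np.random.default_rng(0)
x = np.exp(-0.5 * t) + 0.02 * rng.normal(size=t.size)   # noisy trajectory

dx_fd = np.gradient(x, t)            # finite differences: fast but noise-amplifying
dx_sg = savgol_filter(x, window_length=15, polyorder=3,
                      deriv=1, delta=t[1] - t[0])        # smooth, then differentiate
true = -0.5 * np.exp(-0.5 * t)       # exact derivative, for comparison
print(np.abs(dx_fd - true).mean(), np.abs(dx_sg - true).mean())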
Alternatively, recent approaches numerically solve the candidate models during the inference process, enabling their direct comparison with state variable measurements. For example, He et al. [127] incorporate total variation regularization on the state trajectories during optimization, to encourage piecewise smooth solutions, which implicitly yield noise-robust derivatives without requiring their explicit computation. These regularized trajectories are then used internally for model fitting, tightly coupling trajectory estimation with parameter inference. More expressive frameworks rely on Gaussian Processes or neural ODEs to jointly learn both the latent dynamics and observational noise. A prominent example is GP-NODE [159], which combines sparse identification of dynamics via SINDy, a differentiable neural ODE solver, and a Gaussian Process observation model. At each iteration, candidate dynamics are integrated via a neural ODE module to produce a continuous trajectory. This trajectory is then compared with time series and a Gaussian Process is used to model the mismatch: its mean is given by the neural ODE trajectory, and its covariance captures the residual uncertainty due to observation noise or model misspecification. These methods improve robustness but introduce challenges: higher computational cost, identifiability issues, and sensitivity to hyperparameters such as neural network depth or Gaussian process kernel choice.
The last class of methods avoids derivative estimation by reformulating the inference problem. In integral formulations [156–158, 165], the ODE is rewritten as an integral equation, replacing differentiation with numerical integration. This makes the approach more robust to noise but introduces trade-offs such as increased computational cost and possible error accumulation over long trajectories. Weak formulations [87, 161–164] project the dynamics onto a family of smooth test functions, enhancing stability and noise resilience. A benchmark on chaotic systems found the weak integral formulation to consistently outperform other strategies, even in noiseless cases [161, 194]. However, design choices like selecting appropriate basis functions or integration intervals may significantly impact performance.
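As a minimal illustration of the integral route, one can regress the increments of the observed state on time-integrated library columns, avoiding derivatives altogether. This toy one-dimensional example assumes a linear decay and is a sketch of the principle, not of any specific published method:

import numpy as np
from scipy.integrate import cumulative_trapezoid

rng = np.random.default_rng(0)
t = np.linspace(0, 5, 200)
x = np.exp(-0.7 * t) + 0.01 * rng.normal(size=t.size)   # noisy observations

# Integral form of Eq. (3): x(t) - x(0) = (integral of Theta(x(s)) ds) @ xi
Theta = np.column_stack([np.ones_like(x), x, x ** 2])   # candidate library
Theta_int = cumulative_trapezoid(Theta, t, axis=0, initial=0)
xi = np.linalg.lstsq(Theta_int, x - x[0], rcond=None)[0]
print(xi)   # close to [0, -0.7, 0], i.e. dx/dt = -0.7 x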
Selection, refinement, or generation of candidate function libraries
Candidate functions define the hypothesis space from which governing equations are inferred. In sparse regression, they form a predefined library of features; in symbolic regression, they emerge dynamically from combinations of basic operators. While conceptually distinct, both approaches rely on suitable function sets to capture system dynamics. Overly limited candidate sets can hinder expressiveness, while overly large ones can overwhelm inference. To address this, two strategies based on data-driven or knowledge-based heuristics have emerged: properly selecting the candidate function library before model learning, or refining it during model discovery.
Optimized selection of candidate functions may be performed by using mutual information (MI), a widely used criterion that quantifies how much knowing one variable reduces uncertainty about another one. Applied to data, MI allows for preliminary feature pruning [20, 58, 83]. It may be preceded by entropy filtering, which discards near-constant variables, ensuring selected features are informative [82]. Other methods use structural constraints to reduce candidate sets. He et al. [32] apply Taylor expansions approximated from data to reveal properties such as separability or polynomial degree, and then constrain symbolic regression. Zhang et al. [176] apply dimensional analysis to retain only physically consistent terms. Bendinelli et al. [51] filter terms based on known symmetries (e.g. invariance under reflection or rotation), ensuring consistency with system properties.
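A basic version of this MI-based pruning is directly available in scikit-learn. The sketch below ranks observed variables by their MI with a target derivative before library construction, on synthetic data and with an arbitrary cutoff:

import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))                       # five observed variables
dxdt = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=300)

# Rank variables by how much they reduce uncertainty about the derivative;
# uninformative ones can be dropped before building the candidate library
mi = mutual_info_regression(X, dxdt, random_state=0)
keep = np.argsort(mi)[::-1][:2]                     # retain the two most informative
print(mi.round(3), keep)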
Next, candidate functions may be engineered during model learning through dynamical selection, transformation, or generation of features, which may not only reduce the hypothesis space and accelerate computation, but also facilitate the discovery of more accurate and interpretable models [144]. Some sparse regression methods incorporate internal refinement mechanisms: Naozuka et al. [144] use global sensitivity analysis to rank term importance during training and iteratively exclude the least performing candidate functions. França et al. [118] combine Pearson correlation with an information-theoretic criterion to retain nonredundant, informative features in a two-step filtering process. Bhadriraju et al. [122] use stepwise regression with F-tests to iteratively retain statistically significant terms, while Dropout-SINDy [119] enhances robustness by randomly excluding subsets of candidate terms and aggregating submodels via median coefficients. Symbolic regression methods may also involve candidate function refinement: GP-GOMEA [35] applies linkage learning to evolve and recombine useful expression subtrees; Amir Haeri et al. [21] grow symbolic trees based on statistical heuristics favoring informative variables; and SyMANTIC [83] recursively expands expressions while enforcing low-complexity constraints to maintain interpretability.
Finally, deep learning techniques offer new ways to generate candidate functions by leveraging models pretrained on large corpora of equations and dynamical systems. Some methods map input data directly to symbolic equation terms [9, 40, 43, 44, 71]. LLM-based methods extend this further by treating expression synthesis as a prompt-driven task, refining terms based on residuals of data fit [57, 61, 62]. In sparse regression, Al-Khwarizmi [184] exemplifies this dynamic feature construction by incorporating visual inputs (e.g. annotated plots or schematics) into LLMs to build domain-specific candidate function libraries on the fly. These approaches shift candidate generation from static enumeration to a flexible, context-aware process. However, they face the typical limitations of LLMs, including a lack of reliability and biological relevance, and high dependency on prompt quality [55].
Challenges remain as many of these feature selection methods rely on ad hoc heuristics with limited theoretical grounding, and are based on a one-shot step rather than iterative workflows incorporating data fit residuals or expert feedback.
Quantifying robustness and uncertainty of inferred models
Digital twins are frequently applied in high-stakes domains like pharmacology and medicine, and ultimately serve clinical decision-making, so that the accuracy of model predictions is of utmost importance. In such a context, inferred models must not only reproduce observed data, but also remain robust to data perturbations (e.g. noise, subsampling, shifts in observation windows) and provide uncertainty quantification of parameter estimates, mathematical terms, or full equations. Crucially, the field's historical focus on physical systems has led to the neglect of the true biological robustness that models must capture. Biological networks typically involve multiple species connected through nonlinear, feedback-rich interactions ([218, fig. 6]; [219, fig. 4]) reflecting evolutionary pressure to maintain function despite environmental noise, molecular fluctuations, and structural perturbations [220, 221]. Robustness arises through redundancy [222] and dynamic compensation [223], enabling systems to absorb perturbations while preserving essential behavior. Current inference frameworks largely ignore this resilience, yielding models that may fit data but fail to generalize across biological contexts; to date, few methods integrate these aspects.
First, several empirical strategies assess the stability and confidence of inferred models by repeating the learning phase using synthetically perturbed datasets. Dataset bagging (i.e. bootstrap aggregating) and subsequent sparse regression applied to each batch yields model ensembles, where robust terms are identified by their recurrence across runs [120, 139]. Trajectory subsampling, i.e. repeating inference on different temporal segments, offers complementary insight into dynamic consistency under observational window shifts [102, 153, 158]. Next, for optimization-sensitive neural models whose results may significantly vary across runs, ensemble averaging can identify consistent dynamics across noisy replicates [9], while multiple restarts with different initializations may provide a quantification of inter-run variability [10, 188]. Altogether, these empirical probes serve as practical tools for empirically assessing both model robustness and uncertainty, but standard tests are critically needed to assess model stability under realistic biological variations.
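As a minimal sketch of the bagging strategy, the snippet below refits a sparse regression on bootstrap resamples and reports per-term inclusion frequencies; robust terms are those selected in most resamples (toy data, with Lasso as a stand-in for any sparse regressor):

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
Theta = rng.normal(size=(150, 5))                   # candidate library
dxdt = 2.0 * Theta[:, 0] - 1.0 * Theta[:, 2] + 0.1 * rng.normal(size=150)

n_boot, inclusion = 200, np.zeros(5)
for _ in range(n_boot):
    idx = rng.integers(0, 150, size=150)            # resample with replacement
    coef = Lasso(alpha=0.1).fit(Theta[idx], dxdt[idx]).coef_
    inclusion += np.abs(coef) > 1e-6
print(inclusion / n_boot)                           # per-term inclusion frequency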
As a second approach, Bayesian inference provides a principled framework to quantify uncertainty in both parameter values and model structure. At the parameter level, sparsity-promoting priors [159, 179, 182] yield posteriors that help identify which terms are consistently supported by the data (Section 3.3.2), thereby also yielding insights into robustness. At the structural level, Jin et al. [80] introduce a fully Bayesian symbolic regression framework where priors are placed on both expression trees and coefficients, enabling posterior inference over full equations via MCMC. This allows one to sample diverse candidate models and compute confidence scores of symbolic terms or motifs. Similarly, Dugan et al. [49] and Galagali et al. [11] propose strategies to approximate probability distributions of functional forms or reaction networks, respectively, thereby enabling structural uncertainty quantification. However, Bayesian approaches are computationally costly and scale poorly to large networks. Even when posterior distributions are obtained, they often suffer from overconfidence and poor identifiability.
Synthesis on best-performing model learning methods for biology
Strengths and limitations of existing symbolic regression methods in terms of identified challenges
Symbolic regression methods offer an interpretable framework for discovering compact analytical expressions, yet they struggle to meet the specific requirements of biological modeling, as summarized in Table 1. Among the 48 best-performing methods, only three address more than three of the eight identified challenges. Eleven of the 48 algorithms handle either missing data or unobserved variables. In addition, none of the reviewed methods provides solutions for the challenge of integrating multiple experimental conditions. Similarly, symbolic regression methods rarely meet key methodological criteria such as avoiding derivative estimation (8/48 methods) or quantifying model uncertainty (9/48 methods). Importantly, high-dimensional systems remain particularly challenging: only seven methods have been tested on systems with four or more variables, and they sometimes report severe computational burdens in such settings [27, 78]. These unresolved challenges are compounded by a structural limitation: most symbolic regression frameworks neglect explicit modeling of inter-variable dependencies, making most of them poorly suited for inferring coupled ODE systems. Nonetheless, symbolic regression approaches show remarkable efforts in the generation and selection of candidate functions (22/48 methods, Section 4.3). In parallel, 29 methods interestingly integrate prior knowledge, most often through pretrained neural networks or LLMs.
Table 1.
Comparative overview of symbolic regression methods selected for addressing key challenges in biological model discovery [9–11, 16, 17, 20, 21, 27, 32, 35, 38, 40, 43, 44, 49, 51, 52, 55–64, 67, 69, 71, 75–77, 79–88, 91, 94, 102, 103, 105]: each row corresponds to a method, and each column indicates whether the method tackles specific biological challenges (missing data, multiple experimental conditions, prior knowledge integration, unobserved variables), or specific methodological challenges (avoids numerical derivative estimation, performs uncertainty quantification, has been tested on high-dimensional systems, or includes an optimization of the candidate function set); the table also lists the programming languages available for each method; methods are ordered such that those addressing the largest number of challenges and providing an available implementation appear at the top; the colored disks indicate the symbolic regression category of the method; methods with limited applicability to biological model inference were excluded for clarity. Alt text: Comparative table listing the best symbolic regression methods for biological model discovery, with columns indicating their ability to address one or more biological or methodological challenges
Among existing methods, the neural-based frameworks by d’Ascoli et al. [9] and Grigorian et al. [10] stand out for their innovative contributions, addressing five of the eight challenges identified in this review. d’Ascoli et al. [9] proposed ODEFormer, a Transformer-based model trained on synthetic ODEs that recovers closed-form symbolic dynamics from a single noisy trajectory. Grigorian et al. [10] developed a hybrid method combining a neural ODE for latent trajectory inference with symbolic regression. Both methods deal with prior knowledge integration, applicability in high dimension, unobserved derivatives, and uncertainty quantification, while only ODEFormer includes an advanced mechanism for selecting candidate functions. Only Grigorian et al. [10] can recover partially observed systems, although this could be implemented in ODEFormer via the same approach. Of note, neither method currently handles missing data, which may be addressed through techniques developed in other model learning studies (see Section 3.1). Likewise, neither method allows for the integration of datasets from multiple experimental conditions. While direct incorporation into these symbolic architectures may be complex, a two-step strategy inspired by Gui et al. [195] could offer a feasible and modular alternative: first learning a neural representation that is invariant across conditions, and subsequently performing symbolic regression on the extracted features. Overall, these adaptations would help apply symbolic regression methods in more realistic biological modeling scenarios.
Strengths and limitations of existing sparse regression methods in terms of identified challenges
As compared with symbolic regression approaches, sparse regression methods are generally more effective at addressing the biological and methodological challenges investigated in this review, and thus appear better suited to discovering biological systems (Table 2). Indeed, among the 55 best-performing methods, 18 address more than three challenges (versus 3/48 for symbolic regression). Regarding biological issues, nearly half handle missing data (25/55 methods) or integrate prior knowledge (22/55 methods), most often through prior distributions on parameters. In terms of methodological challenges, 23 methods include uncertainty quantification, typically through posterior distributions or bootstrap procedures (Section 4.4), while 26 algorithms avoid the need for derivative estimation. However, only a limited subset of sparse regression frameworks can integrate multiple experimental conditions (8/55 methods) or unobserved variables (4/55 methods). Nevertheless, the solutions proposed for these challenges, discussed in Sections 3.2 and 3.4, could inspire the development of more comprehensive approaches. Finally, only six methods incorporate candidate function selection or refinement, even though such procedures exist (see Section 4.3) and could be readily implemented [144].
Table 2.
Comparative overview of sparse regression methods selected for addressing key challenges in biological model discovery [54, 114, 117–120, 122, 129, 131, 133, 135, 137, 139, 143, 144, 146, 147, 150, 153–173, 175–186, 188, 190–192]: each row corresponds to a method, and each column indicates whether the method tackles specific biological challenges (missing data, multiple experimental conditions, prior knowledge integration, unobserved variables), or specific methodological challenges (avoids numerical derivative estimation, performs uncertainty quantification, has been tested on high-dimensional systems, or includes an optimization of the candidate function set); the table also lists the programming languages available for each method; methods are ordered such that those addressing the largest number of challenges and providing an available implementation appear at the top; the colored disks indicate the sparse regression category of the method; methods with limited applicability to biological model inference were excluded for clarity. Alt text: Comparative table listing the best sparse regression methods for biological model discovery, with columns indicating their ability to address one or more biological or methodological challenges
Among sparse regression approaches, the methods addressing the highest number of challenges were all based on Bayesian frameworks, underscoring the strength of such approaches for model learning [159, 164, 168, 170, 177, 178, 180, 182]. The methods by Bhouri et al. [159], Jiang et al. [182], and Long et al. [180] emerge as the only pipelines addressing more than five of the eight performance criteria. All three methods (i) embed interpolation or trajectory reconstruction techniques to mitigate the effects of data noise or missing values, (ii) integrate prior knowledge through sparsity-enforcing parameter priors, (iii) were validated on biological systems with at least four dimensions, (iv) directly perform a numerical resolution of the ODE system to avoid the instability of finite difference-based derivative approximations, and (v) quantify model uncertainty through parameter posterior distributions. Bhouri et al. [159] go further by handling unobserved variables (see Section 4.2). To further improve these methods, several directions emerge from this review. For handling multiple experimental conditions, Bayesian sparse regression is compatible with multi-condition model discovery, as in Schaeffer et al. [147] and Park et al. [155]. Another solution, particularly when larger datasets are available, is a two-stage pipeline using the neural representation learning approach of Gui et al. [195] to capture invariant functions, followed by sparse Bayesian inference on those learned functions to extract interpretable mathematical structures. Finally, regarding candidate function selection, most documented sparse regression methods would benefit from incorporating statistical criteria upstream of or during inference (see Section 4.3).
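As an illustration of the multi-condition direction, the sketch below implements a simple group-thresholding loop in which a library term is kept or discarded jointly across all conditions while its coefficient value may differ per condition. This captures the group-sparsity idea underlying [147, 155] but is not their published algorithms; all names and thresholds are illustrative.

```python
# Minimal sketch of multi-condition sparse regression via group sparsity:
# shared support across conditions, condition-specific coefficient values.
import numpy as np

def group_stlsq(Thetas, dXdts, threshold=0.1, n_iter=10):
    """Thetas, dXdts: lists with one (library, derivative) pair per condition."""
    Xis = [np.linalg.lstsq(T, d, rcond=None)[0] for T, d in zip(Thetas, dXdts)]
    for _ in range(n_iter):
        # Group score of each library term = its norm across all conditions
        group_norm = np.sqrt(sum(Xi ** 2 for Xi in Xis))
        keep = group_norm >= threshold
        for c, (T, d) in enumerate(zip(Thetas, dXdts)):
            Xis[c] = np.zeros_like(Xis[c])
            for k in range(d.shape[1]):                  # refit each equation
                idx = keep[:, k]
                if idx.any():
                    Xis[c][idx, k] = np.linalg.lstsq(T[:, idx], d[:, k],
                                                     rcond=None)[0]
    return Xis  # one coefficient matrix per condition, with common sparsity
```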
Discussion
In this review, we surveyed recent advances in data-driven inference of digital twins represented as mechanistic ODEs, focusing on methodologies from two main approaches, symbolic and sparse regression, summarized in our interactive Shiny application (https://u1331systemspharmacology.shinyapps.io/model_learning_review/). We assessed how these methods address four key biological challenges: noisy and irregular data, heterogeneity across conditions, incorporation of prior biological knowledge, and partial observability of system states. We also considered four methodological issues: scalability to high-dimensional systems, unobserved state variable derivatives, candidate function library design, and model uncertainty quantification. Overall, sparse regression emerges as more effective than symbolic regression, particularly in Bayesian formulations, which excel at incorporating prior knowledge and quantifying uncertainty in both parameters and inferred models, crucial features in noisy, underdetermined biological settings. However, no inference method addresses more than six of the eight challenges, which calls for further methodological developments. Scalability to high-dimensional networks remains a main bottleneck, as candidate libraries grow combinatorially and data demands rise with model size.
Of utmost interest, deep learning methods are increasingly combined with sparse and symbolic regression, resulting in improved performance. In sparse regression, neural ODEs help denoise trajectories and capture latent dynamics before term selection [158, 159]. In symbolic regression, hybrid neural ODEs blend neural approximators with domain constraints to guide expression search and improve interpretability [10, 51]. These approaches complement FMs and LLMs, which offer new ways to exploit prior knowledge from the literature, structured databases, or expert input. As an important example, LLMs trained on ODE simulations can predict symbolic equations from incomplete trajectories, which can serve as an educated prior or assign plausibility scores to candidate library terms based on biological context. However, their lack of explicit temporal modeling limits their ability to simulate dynamics, and challenges remain regarding their consistency, factual reliability, and bias, underscoring the need for rigorous evaluation and control.
In conclusion, while Bayesian frameworks and deep learning have pushed the field forward, critical challenges remain before digital twin discovery can be routinely deployed in biological applications. Addressing scalability, establishing rigorous benchmarks, and responsibly integrating AI tools will be key to advancing our understanding of complex biological systems and translating dynamical modeling into clinically meaningful discoveries.
Perspectives
Toward integrated and reliable frameworks
Overall, our analysis shows that no single model learning method currently addresses all the challenges of biological systems, which impairs real-world applicability. Since each challenge has been addressed in at least one study, progress is likely to come less from entirely new algorithms than from integrating existing ones into flexible, modular frameworks. To that end, we advocate a CRN-based approach, which naturally encodes biological structure through stoichiometry, kinetics, and reaction directionality (Section 3.3). Effective frameworks should also prioritize reliability under adverse conditions: avoiding explicit differentiation, dynamically refining candidate functions, and rigorously quantifying model uncertainty. Notably, few methods currently separate epistemic uncertainty, which arises from limited knowledge or model misspecification, from aleatoric noise, limiting interpretability. Future work should assemble these components into general-purpose tools for data-driven modeling of complex living systems.
Looking ahead, the most exciting opportunities lie in combining these approaches with emerging AI models. In this area, promising directions include (i) hybrid pipelines that integrate mechanistic priors with LLM-guided candidate generation, (ii) retrieval-augmented strategies that anchor outputs in curated biological knowledge sources, and (iii) tools that translate expert input or experimental context into executable constraints, making models more adaptive and user-driven.
The need for comprehensive benchmarks
Our review also underscores the critical need for comprehensive benchmarks to objectively evaluate inference methods. We therefore highlight the key components required for the future development of a benchmark capable of evaluating model-learning methods across biological and methodological challenges. In the absence of well-curated pairs of complete real-world data and corresponding models, algorithm evaluation mostly relies on synthetic data generated from a predefined model (the ground truth to be recovered), which serve as inputs for inference. To ensure practical relevance, curated biological ODE systems from repositories such as BioModels [224] should be prioritized. Moreover, alongside classic low- or medium-dimension models (e.g. Lotka-Volterra, yeast glycolysis, or MAPK signaling [94, 115]), larger systems such as the circadian clock [225] must also be included, as they feature feedback loops and oscillatory behavior. These curated models provide biologically plausible testbeds with known structure and parameters for assessing recovery performance, identifiability, and interpretability.
Next, efficient benchmarks must also probe the four biological challenges highlighted in this review (a data-generation sketch follows this paragraph). (i) To evaluate sensitivity to poor data quality, such as noise, sparse sampling, and irregular time grids, benchmarks should systematically vary synthetic data parameters like noise level, observation frequency, and number of initial conditions. (ii) To assess how methods handle biological variability, benchmarks can emulate inter-subject or inter-condition differences by altering model parameters while preserving the underlying network structure. For instance, reaction rates may be perturbed to reflect cell-type differences, or set to zero to simulate knockouts; initial concentrations can be varied to mimic genetic heterogeneity. (iii) To probe the influence of prior knowledge, one should vary both the quantity and reliability of available priors by gradually introducing partial reaction networks or known kinetic forms, or by deliberately including misleading information such as spurious interactions. This reveals how methods balance data-driven discovery with prior constraints. (iv) To test the ability to recover hidden dynamics, benchmarks should vary the fraction of observed species, enforcing inference under partial observability. Since these challenges often interact, combining them will yield more realistic evaluations and enable challenge-aware comparison of inference methods.
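As a minimal illustration of challenges (i) and (ii), the sketch below generates synthetic benchmark data from a ground-truth Lotka-Volterra model while varying noise level, sampling grid, and knockout-style parameter perturbations. The model choice, parameter values, and function names are illustrative, not a prescribed benchmark specification.

```python
# Minimal sketch of a challenge-aware synthetic benchmark generator built on a
# known ground-truth model (Lotka-Volterra here); all settings are ours.
import numpy as np
from scipy.integrate import solve_ivp

def lotka_volterra(t, y, a, b, c, d):
    x, z = y
    return [a * x - b * x * z, c * x * z - d * z]   # prey, predator

def make_dataset(noise_level=0.05, n_obs=50, t_max=20.0,
                 knockout=False, irregular=False, seed=None):
    rng = np.random.default_rng(seed)
    params = dict(a=1.0, b=0.4, c=0.4, d=1.0)
    if knockout:
        params["c"] = 0.0                           # simulate a knocked-out reaction
    t = (np.sort(rng.uniform(0, t_max, n_obs)) if irregular
         else np.linspace(0, t_max, n_obs))         # challenge (i): sampling grid
    sol = solve_ivp(lotka_volterra, (0, t_max), [2.0, 1.0],
                    t_eval=t, args=tuple(params.values()), rtol=1e-8)
    y = sol.y.T
    y_noisy = y + noise_level * y.std(axis=0) * rng.standard_normal(y.shape)
    return t, y_noisy   # challenges (iii)-(iv): vary priors given to the method
                        # and mask columns of y_noisy for partial observability
```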
Methods
Review methodology. This review was conducted using a non-systematic strategy, starting by querying PubMed and arXiv with combinations of “model learning,” “data-driven modeling,” “sparse regression,” “symbolic regression,” “equation discovery,” “system identification,” and “SINDy.” This was complemented by backward snowballing from key references and by screening publications citing them. Articles were selected based on three criteria: (i) ability to learn both the structure and parameters of an ODE system, (ii) relevance to challenges related to biological modeling, and (iii) originality of the method. For symbolic regression, only about 20% of the screened methods met these criteria, whereas the majority of the reviewed sparse regression methods did.
Best performing methods. In Section 5, we considered as best-performing all methods that satisfy at least one biological or methodological challenge identified in this article. This yields 48 symbolic and 55 sparse regression methods, which are those displayed in the summary Tables 1 and 2.
Shiny application. To interactively navigate, filter, and visualize the methods reviewed in this article, we developed a Shiny application (Shiny for Python). Its modular design enables easy updates, offering a dynamic complement to the static review for researchers seeking suitable modeling tools. Each entry is annotated with the following fields (a minimal interface sketch is given after the list):
Bibliographic information: based on Google Scholar.
General method characteristics: method name, method type (symbolic or sparse regression), concise description of the method’s strategy, types of equations identified (ODEs, PDEs, etc.), and biological systems or benchmarks used for evaluation.
Biological and methodological challenges: handling of missing data, handling of unobserved species, incorporation of prior knowledge, handling of multiple experimental conditions or individuals, preprocessing procedures applied, and the types of candidate functions used.
Technical aspects: optimization algorithm, hyperparameter tuning strategy, use of Bayesian inference, use of deep learning architectures, and uncertainty quantification capabilities.
Implementation details: programming language and URL to the public implementation (if available).
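For illustration, a minimal Shiny for Python interface of this kind could look as follows; the data frame and field names below are a toy stand-in for the app's actual annotation table, not its implementation.

```python
# Minimal sketch of a Shiny for Python app filtering a methods table;
# the entries and column names are illustrative placeholders.
import pandas as pd
from shiny import App, render, ui

methods = pd.DataFrame({
    "method": ["ODEFormer", "Ensemble-SINDy"],
    "type": ["symbolic", "sparse"],
    "uncertainty_quantification": [True, True],
    "handles_missing_data": [False, True],
})

app_ui = ui.page_fluid(
    ui.input_select("type", "Method type", ["all", "symbolic", "sparse"]),
    ui.input_checkbox("uq_only", "Only methods with uncertainty quantification",
                      False),
    ui.output_table("table"),
)

def server(input, output, session):
    @output
    @render.table
    def table():
        df = methods
        if input.type() != "all":
            df = df[df["type"] == input.type()]          # filter by category
        if input.uq_only():
            df = df[df["uncertainty_quantification"]]    # filter by challenge
        return df

app = App(app_ui, server)
```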
Key Points.
We review data-driven methods for inferring biological ODE/Chemical Reaction Network (CRN) “digital twins” from time-series data, focusing on symbolic regression and sparse regression.
We organize the field around eight core challenges: four biological (noise/sparsity/irregular sampling, multi-condition heterogeneity, prior knowledge, latent variables) and four methodological (high dimensionality, unobserved derivatives, candidate-library design, uncertainty quantification).
Across these challenges, sparse regression, especially in Bayesian frameworks, generally outperforms symbolic approaches by integrating priors and providing principled uncertainty estimates.
We highlight emerging roles for deep learning and LLM/FMs to denoise trajectories, generate/rank candidate terms, and incorporate literature-derived knowledge while noting reliability and bias concerns.
We advocate hybrid, modular pipelines grounded in CRN structure and Bayesian inference, and propose a challenge-aligned benchmarking guideline to evaluate methods systematically.
Acknowledgements
We sincerely thank Lucie Gaspard-Boulinc (Institut Curie, INSERM U1331) for her contribution to the development of the Shiny application and for her help with visualizing the results of this review.
Contributor Information
Clémence Métayer, Inserm U1331, Institut Curie, PSL Research University, CBIO-Center for Computational Biology, Mines Paris, Cancer Systems Pharmacology team, Saint-Cloud 92210, France.
Annabelle Ballesta, Inserm U1331, Institut Curie, PSL Research University, CBIO-Center for Computational Biology, Mines Paris, Cancer Systems Pharmacology team, Saint-Cloud 92210, France.
Julien Martinelli, Aalto University, ELLIS Institute Finland, Espoo 11000, Finland.
Conflicts of interest: None declared.
Funding
JM was supported by the Research Council of Finland (Flagship programme: Finnish Center for Artificial Intelligence FCAI and decision 341763). CM PhD studentship was funded by Inria, Inserm and Institut Curie (Paris, France). AB was supported by the ATIP-Avenir program (INCA, 2018), INSERM and Institut Curie.
Data availability
No datasets have been utilized in this review paper.
References
- 1. Marku M, Pancaldi V. From time-series transcriptomics to gene regulatory networks: a review on inference methods. PLoS Comput Biol 2023;19:e1011254. 10.1371/journal.pcbi.1011254 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2. Machado D, Costa R, Rocha M. et al. Modeling formalisms in systems biology. AMB Express 2011;1:45. 10.1186/2191-0855-1-45 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Kitano H. Computational systems biology. Nature 2002;420:206–10. 10.1038/nature01254 [DOI] [PubMed] [Google Scholar]
- 4. James K, Sneyd J. Mathematical physiology: II: Systems physiology. New York, NY: Springer New York, 2009. 10.1007/978-0-387-75847-3 [DOI] [Google Scholar]
- 5. North JS, Wikle CK, Schliep EM. A review of data-driven discovery for dynamic systems. Int Stat Rev 2023;91:464–92. 10.1111/insr.12554 [DOI] [Google Scholar]
- 6. Ghadami A, Epureanu BI. Data-driven prediction in dynamical systems: recent developments. Philos Transact A Math Phys Eng Sci 2022;380:20210213. 10.1098/rsta.2021.0213 [DOI] [Google Scholar]
- 7. Augusto DA, Barbosa HJC. Symbolic regression via genetic programming. In: Proceedings. Vol.1. Sixth Brazilian Symposium on Neural Networks, 173–178, Rio de Janeiro, Brazil, 2000. 10.1109/SBRN.2000.889734 [DOI] [Google Scholar]
- 8. Schmidt M, Lipson H. Distilling free-form natural laws from experimental data. Science (New York, NY) 2009;324:81–5. 10.1126/science.1165893 [DOI] [Google Scholar]
- 9. d’Ascoli S, Becker S, Schwaller P. et al. ODEFormer: symbolic regression of dynamical systems with transformers. In: Proceedings of the Twelfth International Conference on Learning Representations (ICLR 2024), Vienna, Austria, 2024. URL https://openreview.net/forum?id=TzoHLiGVMo
- 10. Grigorian G, George SV, Arridge S. Learning governing equations of unobserved states in dynamical systems. Physica D 2025;472:134499. 10.1016/j.physd.2024.134499 https://www.sciencedirect.com/science/article/pii/S0167278924004494 [DOI] [Google Scholar]
- 11. Galagali N, Marzouk YM. Exploiting network topology for large-scale inference of nonlinear reaction models. J. R. Soc. Interface 2019;16:20180766. 10.1098/rsif.2018.0766. URL: https://royalsocietypublishing.org/doi/abs/10.1098/rsif.2018.0766 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. La Cava W, Burlacu B, Virgolin M. et al. Contemporary symbolic regression methods and their relative performance. Ranzato M, Beygelzimer A, Dauphin Y, Liang P, Vaughan JW. (eds), Advances in Neural Information Processing Systems 2021. Curran Associates, Inc., Red Hook, NY, USA, 2021;1–16. [Google Scholar]
- 13. Virgolin M, Pissis SP. Symbolic regression is NP-hard. Transactions on Machine Learning Research, pages 1–11, 2022. URL: https://openreview.net/forum?id=LTiaPxqe2e
- 14. de França FO. Alleviating overfitting in transformation-interaction-rational symbolic regression with multi-objective optimization. Genet Program Evolvable Mach 2023;24:13. 10.1007/s10710-023-09461-3 [DOI] [Google Scholar]
- 15. Burlacu B, Kronberger G, Kommenda M. Operon C++: an efficient genetic programming framework for symbolic regression. In Coello Coello Carlos Artemio (ed), Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, GECCO ‘20, pages 1562–1570, New York, NY, USA, July 2020. Association for Computing Machinery, NY, USA. 10.1145/3377929.3398099 [DOI] [Google Scholar]
- 16. Gaucel S, Keijzer M, Lutton E, Tonda A. Learning dynamical systems using standard symbolic regression. In Nicolau M, Krawiec K, Heywood MI, Castelli M, García-Sánchez P, Merelo JJ, Rivas Santos VM, Sim K (eds), Genetic Programming, 25–36, Berlin, Heidelberg, 2014. Springer. 10.1007/978-3-662-44303-3_3 [DOI] [Google Scholar]
- 17. Randall DL, Townsend TS, Hochhalter JD, and Bomarito GF. Bingo: a customizable framework for symbolic regression with genetic programming. In: Fieldsend JE, Wagner M. (ed), Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO ‘22, 2282–2288, New York, NY, USA, 2022. Association for Computing Machinery. 10.1145/3520304.3534031 [DOI] [Google Scholar]
- 18. Arnaldo I, Krawiec K, O’Reilly U-M. Multiple regression genetic programming. In Igel C. (ed), Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, GECCO ‘14, pages 879–886, New York, NY, USA, July 2014. Association for Computing Machinery, NY, USA. 10.1145/2576768.2598291 [DOI] [Google Scholar]
- 19. Cornforth T, Lipson H. Symbolic regression of multiple-time-scale dynamical systems. In Soule T, Moore JH. (eds), Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation, GECCO ‘12, 735–742, New York, NY, USA, 2012. Association for Computing Machinery, NY, USA. 10.1145/2330163.2330266 [DOI] [Google Scholar]
- 20. Chen Q, Zhang M, Xue B. Feature selection to improve generalization of genetic programming for high-dimensional symbolic regression. IEEE Trans Evol Comput 2017a;21:792–806. 10.1109/TEVC.2017.2683489 [DOI] [Google Scholar]
- 21. Haeri MA, Ebadzadeh MM, Folino G. Statistical genetic programming for symbolic regression. Appl Soft Comput 2017;60:447–69. 10.1016/j.asoc.2017.06.050 [DOI] [Google Scholar]
- 22. Kommenda M, Burlacu B, Kronberger G. et al. Parameter identification for symbolic regression using nonlinear least squares. Genet Program Evolvable Mach 2020;21:471–501. 10.1007/s10710-019-09371-3 [DOI] [Google Scholar]
- 23. Smits GF, Kotanchek M. Pareto-front exploitation in symbolic regression. In O’Reilly U-M, Tina Yu, Rick Riolo, and Bill Worzel (eds), Genetic Programming Theory and Practice II, pages 283–99. Springer US, Boston, MA, 2005. ISBN 978-0-387-23254-6. 10.1007/0-387-23254-0_17 [DOI] [Google Scholar]
- 24. Uy NQ, Hoai NX, O’Neill M. et al. Semantically-based crossover in genetic programming: application to real-valued symbolic regression. Genet Program Evolvable Mach 2011;12:91–119. 10.1007/s10710-010-9121-2 [DOI] [Google Scholar]
- 25. Bongard J, Lipson H. Automated reverse engineering of nonlinear dynamical systems. Proc Natl Acad Sci U S A 2007;104:9943–8. 10.1073/pnas.0609476104 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. La Cava W, Danai K, Spector L. Inference of compact nonlinear dynamic models by epigenetic local search. Eng Appl Artif Intel 2016a;55:292–306. 10.1016/j.engappai.2016.07.004 [DOI] [Google Scholar]
- 27. Cranmer M. Interpretable machine learning for science with PySR and SymbolicRegression.Jl. 2023. URL: http://arxiv.org/abs/2305.01582. arXiv:2305.01582 [astro-ph].
- 28. Icke I, Bongard JC. Improving genetic programming based symbolic regression using deterministic machine learning. In Tan KC. (ed), Proceedings of the 2013 IEEE Congress on Evolutionary Computation (CEC 2013), 1763–1770, 2013. Institute of Electrical and Electronics Engineers (IEEE), Piscataway, New Jersey, USA. 10.1109/CEC.2013.6557774 [DOI] [Google Scholar]
- 29. Zhong J, Liang F, Cai W. et al. Multifactorial genetic programming for symbolic regression problems. IEEE Trans Syst Man Cybern Syst 2020;50:4492–505. 10.1109/TSMC.2018.2853719. URL: https://ieeexplore.ieee.org/document/8419217 [DOI] [Google Scholar]
- 30. Davidson JW, Savic DA, Walters GA. Symbolic and numerical regression: experiments and applications. Inform Sci 2003;150:95–117. 10.1016/S0020-0255(02)00371-7 [DOI] [Google Scholar]
- 31. Stinstra E, Rennen G, Teeuwen G. Metamodeling by symbolic regression and pareto simulated annealing. Struct Multidiscip Optim 2008;35:315–26. 10.1007/s00158-007-0132-4 [DOI] [Google Scholar]
- 32. He B, Lu Q, Yang Q, Luo J, Wang Z. Taylor genetic programming for symbolic regression. In Fieldsend JE. (ed), Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ‘22, 946–954, New York, NY, USA, 2022. Association for Computing Machinery. 10.1145/3512290.3528757 [DOI] [Google Scholar]
- 33. Kronberger G, de Franca FO, Burlacu B. et al. Shape-constrained symbolic regression—improving extrapolation with prior knowledge. Evol Comput 2022;30:75–98. 10.1162/evco_a_00294 [DOI] [PubMed] [Google Scholar]
- 34. Virgolin M, Alderliesten T, Bosman PAN. Linear scaling with and within semantic backpropagation-based genetic programming for symbolic regression. In López-Ibáñez M, Auger A, Stützle. et al. (eds), Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ‘19, 1084–1092, New York, NY, USA, 2019. Association for Computing Machinery. 10.1145/3321707.3321758 [DOI] [Google Scholar]
- 35. Virgolin M, Alderliesten T, Witteveen C. et al. Improving model-based genetic programming for symbolic regression of small expressions. Evol Comput 2021;29:211–37. 10.1162/evco_a_00278 [DOI] [PubMed] [Google Scholar]
- 36. La Cava W, Lee S, Danai K. Epsilon-lexicase selection for regression. In Friedrich T, Neumann F, Sutton AM. (eds), Proceedings of the Genetic and Evolutionary Computation Conference 2016, GECCO ‘16, pp. 741–748, New York, NY, USA, 2016b. Association for Computing Machinery, NY, USA. 10.1145/2908812.2908898 [DOI] [Google Scholar]
- 37. Petersen BK, Landajuela M, Mundhenk TN. et al. Deep symbolic regression: recovering mathematical expressions from data via risk-seeking policy gradients. In Proceedings of the International Conference on Learning Representations (ICLR 2021), 2021. URL: http://arxiv.org/abs/1912.04871. arXiv:1912.04871.
- 38. Tian Y, Zhou W, Viscione M. et al. Interactive symbolic regression with co-design mechanism through offline reinforcement learning. Nat Commun 2025;16:3930. 10.1038/s41467-025-59288-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39. Xu Y, Liu Y, Sun H. RSRM: reinforcement symbolic regression machine. 2023. URL: http://arxiv.org/abs/2305.14656. arXiv:2305.14656 [cs].
- 40. Biggio L, Bendinelli T, Neitz A. et al. Neural symbolic regression that scales. In Meila M, Zhang T. (eds), Proceedings of the 38th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 139, pp. 936–945, 2021. PMLR, Brookline, MA, USA. URL: https://proceedings.mlr.press/v139/biggio21a.html
- 41. Li W, Li W, Sun L. et al. Transformer-based model for symbolic regression via joint supervised learning. In: The Eleventh International Conference on Learning Representations, Kigali, Rwanda, 2023. URL: https://openreview.net/forum?id=ULzyv9M1j5
- 42. Vastl M, Kulhánek J, Kubalík J. et al. SymFormer: end-to-end symbolic regression using transformer-based architecture. IEEE Access 2024;12:37840–9. 10.1109/ACCESS.2024.3374649 [DOI] [Google Scholar]
- 43. Becker S, Klein M, Neitz A. et al. Predicting ordinary differential equations with transformers. In Krause A, Brunskill E, Cho K. et al. (eds), Proceedings of the 40th International Conference on Machine Learning. 2002 (ed), p. 1978. PMLR, Brookline, MA, USA, 2023. URL: https://proceedings.mlr.press/v202/becker23a.html
- 44. Holt S, Qian Z, van der Schaar M. Deep generative symbolic regression. 2023. URL: http://arxiv.org/abs/2401.00282. arXiv:2401.00282 [cs].
- 45. Kim S, Lu PY, Mukherjee S. et al. Integration of neural network-based symbolic regression in deep learning for scientific discovery. IEEE Trans Neural Networks Learn Syst 2021;32:4166–77. 10.1109/TNNLS.2020.3017010 [DOI] [Google Scholar]
- 46. Martius G, Lampert CH. Extrapolation and learning equations. 2016. URL: http://arxiv.org/abs/1610.02995. arXiv:1610.02995 [cs].
- 47. Sahoo S, Lampert C, Martius G. Learning equations for extrapolation and control. In Dy J, Krause A, (eds), International Conference on Machine Learning, pp. 4442–50. PMLR, Brookline, MA, USA, 2018. [Google Scholar]
- 48. Werner M, Junginger A, Hennig P, Martius G. Informed equation learning. 2021. arXiv preprint arXiv:2105.06331 [cs.LG]. URL: http://arxiv.org/abs/2105.06331 [Google Scholar]
- 49. Dugan O, Dangovski R, Costa A. et al. OccamNet: a fast neural model for symbolic regression at scale. 2023. URL: http://arxiv.org/abs/2007.10784
- 50. Kubalík J, Derner E, Babuška R. Toward physically plausible data-driven models: a novel neural network approach to symbolic regression. IEEE Access 2023;11:61481–501. 10.1109/ACCESS.2023.3287397 [DOI] [Google Scholar]
- 51. Bendinelli T, Biggio L, Kamienny P-A. Controllable neural symbolic regression. In Krause A, Brunskill E, Cho K. et al. (eds), Proceedings of the 40th International Conference on Machine Learning, pp. 2063–77. PMLR, Brookline, MA, USA, 2023.. URL: https://proceedings.mlr.press/v202/bendinelli23a.html [Google Scholar]
- 52. Pervez A, Locatello F, Gavves E. Mechanistic neural networks for scientific machine learning. In Salakhutdinov R, Kolter Z, Heller K. et al. (eds), Proceedings of the 41st International Conference on Machine Learning (ICML 2024), Proceedings of Machine Learning Research, vol. 235, pp. 40484–40501. PMLR, Brookline, MA, USA. 2024. URL: http://arxiv.org/abs/2402.13077 [Google Scholar]
- 53. Garmaev S, Mishra S, Fink O. NOMTO: Neural Operator-based symbolic Model approximaTion and discOvery. 2025. URL: http://arxiv.org/abs/2501.08086. arXiv:2501.08086 [cs].
- 54. Sun F, Yang L, Wang Q. et al. PiSL: Physics-informed Spline Learning for data-driven identification of nonlinear dynamical systems. Mech Syst Signal Process 2023;191:110165. 10.1016/j.ymssp.2023.110165. https://www.sciencedirect.com/science/article/pii/S0888327023000729 [DOI] [Google Scholar]
- 55. Sharlin S, Josephson TR. In context learning and reasoning for symbolic regression with large language models. arXiv preprint arXiv:2410.17448 [cs.CL], 2024. URL: http://arxiv.org/abs/2410.17448
- 56. Grayeli A, Sehgal A, Reyes OC. et al. Symbolic regression with a learned concept library. In Globerson A, Mackey L, Belgrave D. et al. (eds), Advances in Neural Information Processing Systems 2024;37:44678–709. Neural Information Processing Systems Foundation, Inc., Vancouver, BC, Canada. [Google Scholar]
- 57. Holt S, Qian Z, Liu T. et al. Data-driven discovery of dynamical systems in pharmacology using large language models. In Globerson A, Mackey L, Belgrave D. et al. (eds), Advances in Neural Information Processing Systems 2024;37:96325–66. Neural Information Processing Systems Foundation, Inc., Vancouver, BC, Canada. URL: https://proceedings.neurips.cc/paper_files/paper/2024/hash/aea8bdc42d8ba3a67a69b3f18be93f69-Abstract-Conference.html [Google Scholar]
- 58. Song Z, Ju M, Ren C. et al. LLM-Feynman: leveraging large language models for universal scientific formula and theory discovery. 2025. URL: https://arxiv.org/abs/2503.06512 arXiv preprint arXiv:2503.06512 [cs.LG].
- 59. Ma P, Wang T-H, Guo M. et al. LLM and simulation as bilevel optimizers: a new paradigm to advance physical scientific discovery. In Salakhutdinov R, Kolter Z, Heller K. et al. (eds), Proceedings of the 41st International Conference on Machine Learning (ICML 2024), Proceedings of Machine Learning Research, vol. 235, pp. 33940–33962. PMLR, Brookline, MA, USA. 2024. URL: https://proceedings.mlr.press/v235/ma24m.html [Google Scholar]
- 60. Khanghah KN, Patel A, Malhotra R. et al. Large language models for extrapolative modeling of manufacturing processes. Journal of Intelligent Manufacturing, 36. Springer Nature, Berlin, Heidelberg, Germany. 2025. 10.1007/s10845-025-02638-w [DOI] [Google Scholar]
- 61. Mengge D, Chen Y, Wang Z. et al. Large language models for automatic equation discovery of nonlinear dynamics. Phys Fluids 2024;36:097121. 10.1063/5.0224297 [DOI] [Google Scholar]
- 62. Shojaee P, Meidani K, Gupta S. et al. LLM-SR: scientific equation discovery via programming with large language models. 2024. URL: http://arxiv.org/abs/2404.18400. arXiv:2404.18400 [cs].
- 63. Li Y, Li W, Yu L. et al. ChatSR: Multimodal Large Language Models for Scientific Formula Discovery. 2024. URL: https://arxiv.org/abs/2406.05410. arXiv:2406.05410 [cs.AI].
- 64. Merler M, Haitsiukevich K, Dainese N. et al. In-context symbolic regression: leveraging large language models for function discovery. In Fu X, Fleisig E. (eds), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pp. 589–606, Bangkok, Thailand: Association for Computational Linguistics, Zhuhai, China. 2024. [Google Scholar]
- 65. Elaarabi M, Borzacchiello D, Le Bot. et al. Adaptive parameters identification for nonlinear dynamics using deep permutation invariant networks. Mach Learn 2025;114:22. 10.1007/s10994-024-06732-7 [DOI] [Google Scholar]
- 66. Kamienny P-A, Lample G, Lamprier S. et al. Deep generative symbolic regression with Monte-Carlo-tree-search. In Krause A, Brunskill E, Cho K. et al. (eds), Proceedings of the 40th International Conference on Machine Learning (ICML 2023), Proceedings of Machine Learning Research, vol. 202, pp. 15655–15668. PMLR, Brookline, MA, USA. 2023.. URL: https://proceedings.mlr.press/v202/kamienny23a.html [Google Scholar]
- 67. Yu Z, Ding J, Li Y. Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length. 2025a. URL: http://arxiv.org/abs/2411.03753. arXiv:2411.03753 [cs].
- 68. Mundhenk T, Landajuela M, Glatt R. et al. Symbolic regression via deep reinforcement learning enhanced genetic programming seeding. In Ranzato M'A, Beygelzimer A, Dauphin YN, Liang PS, Vaughan JW. (eds), Advances in Neural Information Processing Systems, Vol. 34, pp. 24912–23. Neural Information Processing Systems Foundation, Inc., Curran Associates, Inc., Vancouver, BC, Canada, 2021.. URL: https://proceedings.neurips.cc/paper/2021/hash/d073bb8d0c47f317dd39de9c9f004e9d-Abstract.html [Google Scholar]
- 69. Qiu H, Liu S, Yao Q. Neural symbolic regression of complex network dynamics. 2024. URL: http://arxiv.org/abs/2410.11185. arXiv:2410.11185 [cs].
- 70. Landajuela M, Lee CS, Yang J. et al. A unified framework for deep symbolic regression. In Koyejo S, Mohamed S, Agarwal A. et al. (eds), Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS ‘22, pp. 33985–98, Red hook. NY, USA: Neural Information Processing Systems Foundation, Inc.; Curran Associates, Inc., New Orleans, LA, USA, 2022. [Google Scholar]
- 71. Meidani K, Shojaee P, Reddy CK. et al. SNIP: bridging mathematical symbolic and numeric realms with unified pre-training. 2024. URL: http://arxiv.org/abs/2310.02227. arXiv:2310.02227 [cs].
- 72. Cranmer M, Cui C, Fielding DB. et al. Disentangled sparsity networks for explainable AI. In workshop on sparse neural networks. Sparsity in Neural Networks Workshop 2021 (ICLR workshop). 2021;7. URL: https://astroautomata.com/data/sjnn_paper.pdf [Google Scholar]
- 73. Cranmer M, Gonzalez AS, Battaglia P. et al. Discovering symbolic models from deep learning with inductive biases. In Larochelle H, Ranzato M'A, Hadsell R, Balcan M-F, Lin H-T, (eds), Advances in Neural Information Processing Systems, Vol. 33, pp. 17429–42. Neural Information Processing Systems Foundation, Inc.; Curran Associates, Inc., Red Hook, NY, USA, 2020.. URL: https://proceedings.neurips.cc/paper_files/paper/2020/hash/c9f2f917078bd2db12f23c3b413d9cba-Abstract.html [Google Scholar]
- 74. Brence J, Todorovski L, Džeroski S. Probabilistic grammars for equation discovery. Knowledge-Based Syst 2021;224:107077. 10.1016/j.knosys.2021.107077 [DOI] [Google Scholar]
- 75. Brence J, Džeroski S, Todorovski L. Dimensionally-consistent equation discovery through probabilistic attribute grammars. Inform Sci 2023;632:742–56. 10.1016/j.ins.2023.03.073 [DOI] [Google Scholar]
- 76. Omejc N, Gec B, Brence J. et al. Probabilistic Grammars for Modeling Dynamical Systems from Coarse, Noisy, and Partial Data. Machine Learning. 113:7689–7721. Springer Science, Berlin/Heidelberg, Germany. 2024. 10.1007/s10994-024-06522-1 [DOI] [Google Scholar]
- 77. Kusner MJ, Paige B, Hernández-Lobato JM. Grammar variational autoencoder. In Precup D, Teh YW, (eds), Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Proceedings of Machine Learning Research, vol. 70, pp. 1945–1954. PMLR, Brookline, MA, USA. 2017. [Google Scholar]
- 78. Yu K, Chatzi E, Kissas G. Grammar-based ordinary differential equation discovery. Mech Syst Signal Process 2025;240:113395. [Google Scholar]
- 79. Guimerà R, Reichardt I, Aguilar-Mogas A. et al. A Bayesian machine scientist to aid in the solution of challenging scientific problems. Sci Adv 2020;6:eaav6971. 10.1126/sciadv.aav6971 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80. Jin Y, Fu W, Kang J. et al. Bayesian symbolic regression. 2019. https://arxiv.org/abs/1910.08892. arXiv:1910.08892 [stat.ME].
- 81. Bomarito GF, Leser PE, Strauss NCM. et al. Bayesian model selection for reducing bloat and overfitting in genetic programming for symbolic regression. In Trautmann H, Doerr C, Moraglio A. et al. (eds), Proceedings of the 2022 Genetic and Evolutionary Computation Conference Companion (GECCO ’22), pp. 526–529. Association for Computing Machinery (ACM), New York, NY, USA. [Google Scholar]
- 82. AlMomani AARR, Sun J, Bollt E. How entropic regression beats the outliers problem in nonlinear system identification. Chaos 2020;30:013107. 10.1063/1.5133386 [DOI] [Google Scholar]
- 83. Muthyala MR, Sorourifar F, Peng Y. et al. SyMANTIC: an efficient symbolic regression method for interpretable and parsimonious model discovery in science and beyond. Ind Eng Chem Res 2025;64:3354–69. 10.1021/acs.iecr.4c03503 [DOI] [Google Scholar]
- 84. Liu J, Long Z, Wang R. et al. RODE-net: learning ordinary differential equations with randomness from data. 2020. URL: http://arxiv.org/abs/2006.02377. arXiv:2006.02377 [math.NA].
- 85. Ivanchik E, Hvatov A. Knowledge-aware differential equation discovery with automated background knowledge extraction. Inform Sci 2025;712:122131. 10.1016/j.ins.2025.122131 [DOI] [Google Scholar]
- 86. He M, Narayanaswamy A, Riley P. et al. Evolving symbolic density functionals. Sci Adv 2022;8:eabq0279. 10.1126/sciadv.abq0279 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87. Qian Z, Kacprzyk K, van der Schaar M. D-CODE: Discovering Closed-Form ODEs from Observed Trajectories. In Proceedings of the International Conference on Learning Representations (ICLR 2022). 2022.
- 88. Atkinson S, Subber W, Wang L. et al. Data-driven discovery of free-form governing differential equations. 2019. https://arxiv.org/abs/1910.05117. arXiv preprint arXiv:1910.05117 [cs.CE].
- 89. Ly DL, Lipson H. Learning symbolic representations of hybrid dynamical systems. J Mach Learn Res 2012;13:3585–618. URL: http://jmlr.org/papers/v13/ly12a.html [Google Scholar]
- 90. Tohme T, Liu D, Youcef-Toumi K. GSR: a generalized symbolic regression approach. In Charlin L, Kamath G, Murray N, Shah NB, (eds), Transactions on Machine Learning Research. Journal of Machine Learning Research Inc., New York, NY, USA. 2023. https://openreview.net/forum?id=lheUXtDNvP [Google Scholar]
- 91. Daniels BC, Nemenman I. Automated adaptive inference of phenomenological dynamical models. Nat Commun 2015;6:8133. 10.1038/ncomms9133 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92. Vladislavleva EJ, Smits GF, Den Hertog. Order of nonlinearity as a complexity measure for models generated by symbolic regression via pareto genetic programming. IEEE Trans Evol Comput 2009;13:333–49. 10.1109/TEVC.2008.926486 [DOI] [Google Scholar]
- 93. Haider C, de Franca FO, Burlacu B. et al. Shape-constrained multi-objective genetic programming for symbolic regression. Appl Soft Comput 2023;132:109855. 10.1016/j.asoc.2022.109855 [DOI] [Google Scholar]
- 94. Martinelli J, Grignard J, Soliman S. et al. Reactmine: a statistical search algorithm for inferring chemical reactions from time series data. 2022. https://arxiv.org/abs/2209.03185. arXiv:2209.03185 [q-bio.QM].
- 95. Udrescu S-M, Tegmark M. AI Feynman: a physics-inspired method for symbolic regression. Sci Adv 2020;6:eaay2631. 10.1126/sciadv.aay2631 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96. Udrescu S-M, Tan A, Feng J. et al. AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity. In Larochelle H, Ranzato M'A, Hadsell R, Balcan M-F, Lin H-T, (eds), Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020 (NeurIPS 2020), pages 4860–4871. Curran Associates, Inc., Red Hook, NY, USA. [Google Scholar]
- 97. Weilbach J, Gerwinn S, Weilbach C. et al. Inferring the structure of ordinary differential equations. 2021. https://arxiv.org/abs/2107.07345. arXiv:2107.07345 [cs.LG].
- 98. McRee RK. Symbolic regression using nearest neighbor indexing. In Gustafson S, Kotanchek M, (eds), Proceedings of the 12th Annual Conference Companion on Genetic and Evolutionary Computation (GECCO ’10), pp. 1983–1990. 2010. Association for Computing Machinery (ACM), New York, NY, USA. Conference held in Portland, Oregon, USA. [Google Scholar]
- 99. McConaghy T. FFX: fast, scalable, deterministic symbolic regression technology. In Riolo R, Vladislavleva E, Moore JH (eds), Genetic Programming Theory and Practice IX, pages 235–60. Springer, New York, NY, 2011. ISBN 978-1-4614-1770-5. 10.1007/978-1-4614-1770-5_13 [DOI] [Google Scholar]
- 100. Chen C, Luo C, Jiang Z. Elite bases regression: a real-time algorithm for symbolic regression. In Liu Y, Zhao L, Cai G. et al. (eds), 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD 2017), pp. 529–535. Institute of Electrical and Electronics Engineers (IEEE), Guilin, China; IEEE Circuits and Systems Society. 2017b. 10.1109/FSKD.2017.8393325 [DOI] [Google Scholar]
- 101. de França. A greedy search tree heuristic for symbolic regression. Inform Sci 2018;442-443:18–32. 10.1016/j.ins.2018.02.040 [DOI] [Google Scholar]
- 102. Kartelj A, Djukanović M. RILS-ROLS: robust symbolic regression via iterated local search and ordinary least squares. J Big Data 2023;10:71. 10.1186/s40537-023-00743-2 [DOI] [Google Scholar]
- 103. Bansal M, Gatta GD, di Bernardo D. Inference of gene regulatory networks and compound mode of action from time course gene expression profiles. Bioinformatics 2006;22:815–22. 10.1093/bioinformatics/btl003 [DOI] [Google Scholar]
- 104. Sakamoto E, Iba H. Inferring a system of differential equations for a gene regulatory network by using genetic programming. In Kim J-H, Zhang B-T, Fogel GB, (eds), Proceedings of the 2001 Congress on Evolutionary Computation (CEC 2001), Vol. 1, pp. 720–726. IEEE Press, Piscataway, NJ, USA/Seoul, South Korea: IEEE Computational Intelligence Society. 10.1109/CEC.2001.934462 [DOI] [Google Scholar]
- 105. Schmidt MD, Vallabhajosyula RR, Jenkins JW. et al. Automated refinement and inference of analytical models for metabolic networks. Phys Biol 2011;8:055011. 10.1088/1478-3975/8/5/055011 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106. Orzechowski P, La Cava W, Moore JH. Where are we now? A large benchmark study of recent symbolic regression methods. In Hartmann T, O’Reilly U-M, Ochoa G, (eds), Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’18), pp. 1183–1190. Association for Computing Machinery, ACM SIGEVO, ACM Press, New York, NY, USA, 2018.
- 107. de Franca FO, Virgolin M, Kommenda M. et al. SRBench++: principled benchmarking of symbolic regression with domain-expert interpretation. IEEE Transactions on Evolutionary Computation, 29:1127–37. 10.1109/TEVC.2024.3423681 [DOI] [Google Scholar]
- 108. Gilpin W. Chaos as an interpretable benchmark for forecasting and data-driven modelling. In Vanschoren J, Yeung-Levy S, (eds), Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS 2021). Neural Information Processing Systems Foundation, Inc., published by Curran Associates, Inc., Red Hook, NY, USA, 2021.
- 109. Shojaee P, Nguyen N-H, Meidani K. et al. LLM-SRBench: a new benchmark for scientific equation discovery with large language models. In Balcan M-F, Weinberger KQ, Alayrac J-B, Smith V, (eds), Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), Proceedings of Machine Learning Research (PMLR), Vol. 267. Proceedings of Machine Learning Research, Vancouver, BC, Canada, 2025. [Google Scholar]
- 110. Ouma WZ, Pogacar K, Grotewold E. Topological and statistical analyses of gene regulatory networks reveal unifying yet quantitatively different emergent properties. PLoS Comput Biol 2018;14:e1006098–17. 10.1371/journal.pcbi.1006098 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111. Brunton SL, Proctor JL, Kutz JN. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc Natl Acad Sci U S A 2016a;113:3932–7. 10.1073/pnas.1517384113 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112. Kaptanoglu AA, de Silva, Fasel U. et al. PySINDy: a comprehensive python package for robust sparse system identification. Journal of Open Source Software, 2021a;7:3994. [Google Scholar]
- 113. de Silva, Champion K, Quade M. et al. PySINDy: a python package for the sparse identification of nonlinear dynamical systems from data. J Open Source Software 2020;5:2104. 10.21105/joss.02104 [DOI] [Google Scholar]
- 114. Somacal A, Barrera Y, Boechi L. et al. Uncovering differential equations from data with hidden variables. Phys Rev E 2022;105:054209. 10.1103/PhysRevE.105.054209 [DOI] [PubMed] [Google Scholar]
- 115. Mangan NM, Brunton SL, Proctor JL. et al. Inferring biological networks by sparse identification of nonlinear dynamics. IEEE Trans Mol Biol Multi-Scale Commun 2016;2:52–63. 10.1109/TMBMC.2016.2633265 [DOI] [Google Scholar]
- 116. Brunton SL, Proctor JL, Kutz JN. Sparse identification of nonlinear dynamics with control (SINDYc). IFAC-PapersOnLine 2016b;49:710–5. 10.1016/j.ifacol.2016.10.249 [DOI] [Google Scholar]
- 117. Delahunt CB, Kutz JN. A toolkit for data-driven discovery of governing equations in high-noise regimes. IEEE access 2022;10:31210–34. 10.1109/ACCESS.2022.3159335 [DOI] [Google Scholar]
- 118. França T, Braga AMB, Ayala HVH. Feature engineering to cope with noisy data in sparse identification. Expert Syst Appl 2022;188:115995. 10.1016/j.eswa.2021.115995 [DOI] [Google Scholar]
- 119. Abdullah F, Alhajeri MS, Christofides PD. Modeling and control of nonlinear processes using sparse identification: Using dropout to handle noisy data. Ind Eng Chem Res 2022a;61:17976–92. 10.1021/acs.iecr.2c02639 [DOI] [Google Scholar]
- 120. Fasel U, Kutz JN, Brunton BW, Brunton SL. Ensemble-SINDy: robust sparse model discovery in the low-data, high-noise limit, with active learning and control. Proc R Soc A Math Phys Eng Sci, 478: 20210904, 2022. 10.1098/rspa.2021.0904 [DOI] [Google Scholar]
- 121. Cortiella A, Park K-C, Doostan A. Sparse identification of nonlinear dynamical systems via reweighted ℓ1-regularized least squares. Comput Methods Appl Mech Eng 2021;376:113620. [Google Scholar]
- 122. Bhadriraju B, Narasingam A, Kwon JS-I. Machine learning-based adaptive model identification of systems: application to a chemical process. Chem Eng Res Des 2019;152:372–83. [Google Scholar]
- 123. Quade M, Abel M, Kutz JN. et al. Sparse identification of nonlinear dynamics for rapid model recovery. Chaos (Woodbury, NY) 2018;28:063116. 10.1063/1.5027470 [DOI] [Google Scholar]
- 124. Schaeffer H, Tran G, Ward R. Extracting sparse high-dimensional dynamics from limited data. SIAM J Appl Math 2018;78:3279–95. 10.1137/18M116798X [DOI] [Google Scholar]
- 125. Wu K, Xiu D. Numerical aspects for approximating governing equations using data. J Comput Phys 2019;384:200–21. 10.1016/j.jcp.2019.01.030 [DOI] [Google Scholar]
- 126. Wentz J, Doostan A. Derivative-based SINDy (DSINDy): addressing the challenge of discovering governing equations from noisy data. Comput Methods Appl Mech Eng 2023;413:116096. [Google Scholar]
- 127. He X, Sun ZK. Sparse identification of dynamical systems by reweighted l1-regularized least absolute deviation regression. Commun Nonlinear Sci Numer Simul, 2024;131:107813. 10.1016/j.cnsns.2023.107813 [DOI] [Google Scholar]
- 128. Jiang F, Lin D, Yang F. et al. Regularized least absolute deviation-based sparse identification of dynamical systems. Chaos 2023;33:013103. 10.1063/5.0130526 [DOI] [PubMed] [Google Scholar]
- 129. Champion K, Zheng P, Aravkin AY. et al. A unified sparse optimization framework to learn parsimonious physics-informed models from data. IEEE Access, 2020;8:169259–71. 10.1109/ACCESS.2020.3023625 [DOI] [Google Scholar]
- 130. Tran G, Ward R. Exact recovery of chaotic systems from highly corrupted data. Multiscale Model Simul 2017;15:1108–29. 10.1137/16M1086637 [DOI] [Google Scholar]
- 131. Mouli SC, Alam MA, Ribeiro B. MetaPhysiCa: OOD robustness in physics-informed machine learning. 2023. https://arxiv.org/abs/2303.03181. arXiv:2303.03181 [cs.LG].
- 132. Lemus J, Herrmann B. Multi-objective SINDy for parameterized model discovery from single transient trajectory data. Nonlinear Dyn 2025;113:10911–27. 10.1007/s11071-024-10825-2 [DOI] [Google Scholar]
- 133. Bertsimas D, Gurnee W. Learning sparse nonlinear dynamics via mixed-integer optimization. Nonlinear Dyn 2023;111:6585–604. 10.1007/s11071-022-08178-9 [DOI] [Google Scholar]
- 134. Carderera A, Pokutta S, Schütte C. et al. CINDy: conditional gradient-based identification of non-linear dynamics–noise-robust recovery. 2021. https://arxiv.org/abs/2101.02630. arXiv:2101.02630 [math.DS].
- 135. Hokanson JM, Iaccarino G, Doostan A. Simultaneous identification and denoising of dynamical systems. SIAM J Sci Comput 2023;45:A1413–37. 10.1137/22M1486303 [DOI] [Google Scholar]
- 136. Zheng P, Askham T, Brunton SL. et al. A unified framework for sparse relaxed regularized regression: SR3. IEEE Access 2018;7:1404–23. [Google Scholar]
- 137. Kaheman K, Brunton SL, Nathan Kutz J. et al. Automatic differentiation to simultaneously identify nonlinear dynamics and extract noise probability distributions from data. Machine Learn 2022;3:015031. 10.1088/2632-2153/ac567a [DOI] [Google Scholar]
- 138. Kaheman K, Kutz JN, Brunton SL. SINDy-PI: a robust algorithm for parallel implicit sparse identification of nonlinear dynamics. Proc R Soc A Math Phys Eng Sci 2020;476:20200279. 10.1098/rspa.2020.0279 [DOI] [Google Scholar]
- 139. Abdullah F, Zhe W, Christofides PD. Handling noisy data in sparse model identification using subsampling and co-teaching. Comput Chem Eng 2022b;157:107628. 10.1016/j.compchemeng.2021.107628 [DOI] [Google Scholar]
- 140. Kaiser E, Kutz JN, Brunton SL. Sparse identification of nonlinear dynamics for model predictive control in the low-data limit, Proc R Soc A Math Phys Eng Sci. 2018;474:20180335. 10.1098/rspa.2018.0335 [DOI] [Google Scholar]
- 141. Champion KP, Brunton SL, Kutz JN. Discovery of nonlinear multiscale systems: sampling strategies and embeddings. SIAM J Appl Dyn Syst 2019a;18:312–33. 10.1137/18M1188227 [DOI] [Google Scholar]
- 142. Kaptanoglu AA, Callaham JL, Aravkin A. et al. Promoting global stability in data-driven models of quadratic nonlinear dynamics. Phys Rev Fluids 2021b;6:094401. 10.1103/PhysRevFluids.6.094401 [DOI] [Google Scholar]
- 143. Ukorigho OF, Owoyele OO. A competitive learning approach for specialized models: an approach to modelling complex physical systems with distinct functional regimes. Proc R Soc A Math Phys Eng Sci 2025;481:20240124. 10.1098/rspa.2024.0124 [DOI] [Google Scholar]
- 144. Naozuka GT, Rocha HL, Silva RS. et al. SINDy-SA framework: enhancing nonlinear system identification with sensitivity analysis. Nonlinear Dyn 2022;110:2589–609. 10.1007/s11071-022-07755-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145. Dong X, Bai Y-L, Lu Y. et al. An improved sparse identification of nonlinear dynamics with Akaike information criterion and group sparsity. Nonlinear Dyn 2023;111:1485–510. 10.1007/s11071-022-07875-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 146. Maddu S, Cheeseman BL, Müller CL. et al. Learning physically consistent differential equation models from data using group sparsity. Phys Rev E 2021;103:042310. 10.1103/PhysRevE.103.042310 [DOI] [PubMed] [Google Scholar]
- 147. Schaeffer H, Tran G, Ward R. Learning dynamical systems and bifurcation via group sparsity. 2017. https://arxiv.org/abs/1709.01558. arXiv:1709.01558 [math.NA].
- 148. Lu Y, Xu W, Jiao Y. et al. Sparse identification of nonlinear dynamical systems via non-convex penalty least squares. Chaos 2022a;32:023113. 10.1063/5.0076334 [DOI] [PubMed] [Google Scholar]
- 149. Mangan NM, Kutz JN, Brunton SL. et al. Model selection for dynamical systems via sparse regression and information criteria. Proc R Soc A Math Phys Eng Sci 2017;473:20170009. 10.1098/rspa.2017.0009 [DOI] [Google Scholar]
- 150. Gennemark P, Wedelin D. ODEion—a software module for structural identification of ordinary differential equations. J Bioinform Comput Biol 2014;12:1350015. 10.1142/S0219720013500157 [DOI] [PubMed] [Google Scholar]
- 151. Mangan NM, Askham T, Brunton SL. et al. Model selection for hybrid dynamical systems via sparse regression. Proc R Soc A Math Phys Eng Sci 2019;475:20180534. 10.1098/rspa.2018.0534 [DOI] [Google Scholar]
- 152. Gelß P, Klus S, Eisert J. et al. Multidimensional approximation of nonlinear dynamical systems. J Comput Nonlinear Dyn 2019;14:061006. 10.1115/1.4043148 [DOI] [Google Scholar]
- 153. Lejarza F, Baldea M. Data-driven discovery of the governing equations of dynamical systems via moving horizon optimization. Sci Rep 2022;12:11836. 10.1038/s41598-022-13644-w [DOI] [PMC free article] [PubMed] [Google Scholar]
- 154. Kacprzyk K, Holt S, Berrevoets J. et al. ODE discovery for longitudinal heterogeneous treatment effects inference. 2024.https://arxiv.org/abs/2403.10766. arXiv:2403.10766 [cs.LG].
- 155. Park MJ, Choi Y, Lee N. et al. SpReME: sparse regression for multi-environment dynamic systems. 2023. https://arxiv.org/abs/2302.05942. arXiv:2302.05942 [cs.LG].
- 156. Goyal P, Benner P. Discovery of nonlinear dynamical systems using a Runge–Kutta inspired dictionary-based sparse regression approach. Proc R Soc A Math Phys Eng Sci 2022;478:20210883. 10.1098/rspa.2021.0883
- 157. Anvari M, Marasi H, Kheiri H. Implicit Runge–Kutta based sparse identification of governing equations in biologically motivated systems. Sci Rep 2025;15:32286. 10.1038/s41598-025-10526-9
- 158. Wu X, McDermott ML, MacLean AL. Data-driven model discovery and model selection for noisy biological systems. PLoS Comput Biol 2025;21:e1012762. 10.1371/journal.pcbi.1012762
- 159. Bhouri MA, Perdikaris P. Gaussian processes meet NeuralODEs: a Bayesian framework for learning the dynamics of partially observed systems from scarce and noisy data. Philos Trans A Math Phys Eng Sci 2022;380:20210201. 10.1098/rsta.2021.0201
- 160. Lee K, Trask N, Stinis P. Structure-preserving sparse identification of nonlinear dynamics for data-driven modeling. In Dong B, Li Q, Wang L, Xu Z-QJ, (eds), Proceedings of Mathematical and Scientific Machine Learning (Proceedings of Machine Learning Research, Vol. 190), pp. 65–80. PMLR, Vancouver, British Columbia, Canada, 2022.
- 161. Messenger DA, Bortz DM. Weak SINDy: Galerkin-based data-driven model selection. Multiscale Model Simul 2021;19:1474–97. 10.1137/20M1343166
- 162. Pantazis Y, Tsamardinos I. A unified approach for sparse dynamical system inference from temporal measurements. Bioinformatics 2019;35:3387–96. 10.1093/bioinformatics/btz065
- 163. Nicolaou ZG, Huo G, Chen Y. et al. Data-driven discovery and extrapolation of parameterized pattern-forming dynamics. Phys Rev Res 2023;5:L042017. 10.1103/PhysRevResearch.5.L042017
- 164. Fung L, Fasel U, Juniper M. Rapid Bayesian identification of sparse nonlinear dynamics from scarce and noisy data. Proc R Soc A Math Phys Eng Sci 2025;481:20240200. 10.1098/rspa.2024.0200
- 165. Schaeffer H, McCalla SG. Sparse model selection via integral terms. Phys Rev E 2017;96:023302. 10.1103/PhysRevE.96.023302
- 166. Wei B. Sparse dynamical system identification with simultaneous structural parameters and initial condition estimation. Chaos Solitons Fractals 2022;165:112866. 10.1016/j.chaos.2022.112866
- 167. Huang Y, Wang H, Liu G. et al. NeuralCODE: neural compartmental ordinary differential equations model with AutoML for interpretable epidemic forecasting. ACM Trans Knowl Discov Data 2025a;19:1–18. 10.1145/3694688
- 168. Course K, Nair PB. State estimation of a physical system with unknown governing equations. Nature 2023;622:261–7. 10.1038/s41586-023-06574-8
- 169. Meng Y, Qiu Y. Sparse discovery of differential equations based on multi-fidelity Gaussian process. J Comput Phys 2025;523:113651. 10.1016/j.jcp.2024.113651
- 170. Sun L, Huang D, Sun H. et al. Bayesian spline learning for equation discovery of nonlinear dynamics with quantified uncertainty. In Koyejo S, Mohamed S, Agarwal A. et al. (eds), Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp. 6927–40. Neural Information Processing Systems Foundation, Inc., New Orleans, LA, USA, 2022.
- 171. Pan W, Yuan Y, Gonçalves J. et al. A sparse Bayesian approach to the identification of nonlinear state-space systems. IEEE Trans Automat Contr 2016;61:182–7. 10.1109/TAC.2015.2426291
- 172. Zhang S, Lin G. SubTSBR to tackle high noise and outliers for data-driven discovery of differential equations. J Comput Phys 2021;428:109962. 10.1016/j.jcp.2020.109962
- 173. Niven RK, Mohammad-Djafari A, Cordier L. et al. Bayesian identification of dynamical systems. In von Toussaint U, Preuss R, (eds), Proceedings of the 39th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering (MaxEnt 2019), Proceedings 2019;33:33. MDPI, Basel, Switzerland, 2020. 10.3390/proceedings2019033033
- 174. Kroll TW, Kamps O. Sparse identification of evolution equations via Bayesian model selection. 2025. https://arxiv.org/abs/2501.01476. arXiv:2501.01476 [physics.data-an].
- 175. Fuentes R, Nayek R, Gardner P. et al. Equation discovery for nonlinear dynamical systems: a Bayesian viewpoint. Mech Syst Signal Process 2021;154:107528. 10.1016/j.ymssp.2020.107528
- 176. Zhang S, Lin G. Robust data-driven discovery of governing physical laws with error bars. Proc R Soc A Math Phys Eng Sci 2018;474:20180305. 10.1098/rspa.2018.0305
- 177. North JS, Wikle CK, Schliep EM. A Bayesian approach for data-driven dynamic equation discovery. J Agric Biol Environ Stat 2022;27:728–47. 10.1007/s13253-022-00514-1
- 178. Foo YS, Zanca A, Flegg JA. et al. Quantifying structural uncertainty in chemical reaction network inference. 2025. https://arxiv.org/abs/2505.15653. arXiv:2505.15653 [stat.ME].
- 179. Nayek R, Fuentes R, Worden K. et al. On spike-and-slab priors for Bayesian equation discovery of nonlinear dynamical systems via sparse linear regression. Mech Syst Signal Process 2021;161:107986. 10.1016/j.ymssp.2021.107986
- 180. Long D, Xing W, Krishnapriyan A. et al. Equation discovery with Bayesian spike-and-slab priors and efficient kernels. In Dasgupta S, Mandt S, Li Y, (eds), Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research, Vol. 238), pp. 2413–21. PMLR, Valencia, Spain, 2024.
- 181. Hirsh SM, Barajas-Solano DA, Kutz JN. Sparsifying priors for Bayesian uncertainty quantification in model discovery. R Soc Open Sci 2022;9:211823. 10.1098/rsos.211823
- 182. Jiang R, Singh P, Wrede F. et al. Identification of dynamic mass-action biochemical reaction networks using sparse Bayesian methods. PLoS Comput Biol 2022;18:e1009830. 10.1371/journal.pcbi.1009830
- 183. Yang Y, Aziz Bhouri M, Perdikaris P. et al. Bayesian differential programming for robust systems identification under uncertainty. Proc R Soc A Math Phys Eng Sci 2020;476:20200290. 10.1098/rspa.2020.0290
- 184. Mower CE, Bou-Ammar H. Al-Khwarizmi: discovering physical laws with foundation models. 2025. https://arxiv.org/abs/2502.01702. arXiv:2502.01702 [cs.LG].
- 185. Bhadriraju B, Bangi MSF, Narasingam A. et al. Operable adaptive sparse identification of systems: application to chemical processes. AIChE J 2020;66:e16980.
- 186. Lu PY, Bernad JA, Soljačić M. Discovering sparse interpretable dynamics from partial observations. Commun Phys 2022b;5:1–7. 10.1038/s42005-022-00987-z
- 187. Champion K, Lusch B, Kutz JN. et al. Data-driven discovery of coordinates and governing equations. Proc Natl Acad Sci U S A 2019b;116:22445–51. 10.1073/pnas.1906995116
- 188. Nardini JT, Lagergren JH, Hawkins-Daarud A. et al. Learning equations from biological data with limited time samples. Bull Math Biol 2020;82:119. 10.1007/s11538-020-00794-z
- 189. Bonneau R, Reiss DJ, Shannon P. et al. The Inferelator: an algorithm for learning parsimonious regulatory networks from systems-biology data sets de novo. Genome Biol 2006;7:R36. 10.1186/gb-2006-7-5-r36
- 190. Zhang J, Zhu W, Wang Q. et al. Differential regulatory network-based quantification and prioritization of key genes underlying cancer drug resistance based on time-course RNA-seq data. PLoS Comput Biol 2019;15:e1007435. 10.1371/journal.pcbi.1007435
- 191. Hoffmann M, Fröhner C, Noé F. Reactive SINDy: discovering governing reactions from concentration data. J Chem Phys 2019;150:025101. 10.1063/1.5066099
- 192. Massonis G, Villaverde AF, Banga JR. Distilling identifiable and interpretable dynamic models from biological data. PLoS Comput Biol 2023;19:e1011014. 10.1371/journal.pcbi.1011014
- 193. Piironen J, Vehtari A. Sparsity information and regularization in the horseshoe and other shrinkage priors. Electron J Stat 2017;11:5018–51. 10.1214/17-EJS1337SI
- 194. Kaptanoglu AA, Zhang L, Nicolaou ZG. et al. Benchmarking sparse system identification with low-dimensional chaos. Nonlinear Dyn 2023;111:13143–64. 10.1007/s11071-023-08525-4
- 195. Gui S, Li X, Ji S. Discovering physics laws of dynamical systems via invariant function learning. In Azizzadenesheli K, Ribeiro B, Zhang H, (eds), Proceedings of the 42nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 267), pp. 20662–93. PMLR, Brookline, MA, USA, 2025.
- 196. Barabási A-L, Albert R. Emergence of scaling in random networks. Science 1999;286:509–12. 10.1126/science.286.5439.509
- 197. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc B Methodol 1996;58:267–88. 10.1111/j.2517-6161.1996.tb02080.x
- 198. Szklarczyk D, Kirsch R, Koutrouli M. et al. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res 2022;51:D638–46. 10.1093/nar/gkac1000
- 199. Lo Surdo P, Iannuccelli M, Contino S. et al. SIGNOR 3.0, the SIGnaling Network Open Resource 3.0: 2022 update. Nucleic Acids Res 2023;51:D631–7.
- 200. Luck K, Kim D-K, Lambourne L. et al. A reference map of the human binary protein interactome. Nature 2020;580:402–8. 10.1038/s41586-020-2188-x
- 201. Peri S, Navarro JD, Kristiansen TZ. et al. Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res 2004;32:D497–501. 10.1093/nar/gkh070
- 202. Milacic M, Beavers D, Conley P. et al. The Reactome pathway knowledgebase 2024. Nucleic Acids Res 2023;52:D672–8. 10.1093/nar/gkad1025
- 203. Kanehisa M, Furumichi M, Sato Y. et al. KEGG: biological systems database as a model of the real world. Nucleic Acids Res 2024;53:D672–7. 10.1093/nar/gkae909
- 204. Karp PD, Billington R, Caspi R. et al. The BioCyc collection of microbial genomes and metabolic pathways. Brief Bioinform 2017;20:1085–93. 10.1093/bib/bbx085
- 205. Oughtred R, Rust J, Chang C. et al. The BioGRID database: a comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci 2021;30:187–200. 10.1002/pro.3978
- 206. Stock M, Losert C, Zambon M. et al. Leveraging prior knowledge to infer gene regulatory networks from single-cell RNA-sequencing data. Mol Syst Biol 2025;21:214–30. 10.1038/s44320-025-00088-3
- 207. Fages F, Gay S, Soliman S. Inferring reaction systems from ordinary differential equations. Theor Comput Sci 2015;599:64–78. 10.1016/j.tcs.2014.07.032
- 208. Martinelli J, Dulong S, Li X-M. et al. Model learning to identify systemic regulators of the peripheral circadian clock. Bioinformatics 2021;37:i401–9. 10.1093/bioinformatics/btab297
- 209. Cui H, Wang C, Maan H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat Methods 2024;21:1470–80. 10.1038/s41592-024-02201-0
- 210. Hao M, Gong J, Zeng X. et al. Large-scale foundation model on single-cell transcriptomics. Nat Methods 2024;21:1481–91. 10.1038/s41592-024-02305-7
- 211. Zhang E, Goto R, Sagan N. et al. LLM-Lasso: a robust framework for domain-informed feature selection and regularization. 2025. https://arxiv.org/abs/2502.10648. arXiv:2502.10648 [cs.LG].
- 212. Gao Y, Xiong Y, Gao X. et al. Retrieval-augmented generation for large language models: a survey. 2024. https://arxiv.org/abs/2312.10997. arXiv:2312.10997 [cs.CL].
- 213. Huang L, Yu W, Ma W. et al. A survey on hallucination in large language models: principles, taxonomy, challenges, and open questions. ACM Trans Inf Syst 2025b;43:1–55. 10.1145/3703155
- 214. Wenteler A, Occhetta M, Branson N. et al. PertEval-scFM: benchmarking single-cell foundation models for perturbation effect prediction. In Singh A, Fazel M, Hsu D. et al. (eds), Proceedings of the 42nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 267). ML Research Press, Vancouver, Canada, 2025.
- 215. Wong DR, Hill AS, Moccia R. Simple controls exceed best deep learning algorithms and reveal foundation model effectiveness for predicting genetic perturbations. Bioinformatics 2025;41:btaf317.
- 216. Takens F. Detecting strange attractors in turbulence. In Rand DA, Young L-S, (eds), Dynamical Systems and Turbulence, Warwick 1980: Proceedings of a Symposium Held at the University of Warwick 1979/80 (Lecture Notes in Mathematics, Vol. 898, pp. 366–381). Springer-Verlag, Berlin & Heidelberg, Germany. 2006.
- 217. Sadria M, Swaroop V. Discovering governing equations of biological systems through representation learning and sparse model discovery. NAR Genom Bioinform 2025;7:lqaf048. 10.1093/nargab/lqaf048
- 218. Kohn KW. Molecular interaction map of the mammalian cell cycle control and DNA repair systems. Mol Biol Cell 1999;10:2703–34. 10.1091/mbc.10.8.2703
- 219. Reinke H, Asher G. Crosstalk between metabolism and circadian clocks. Nat Rev Mol Cell Biol 2019;20:227–41. 10.1038/s41580-018-0096-9
- 220. Whitacre JM. Biological robustness: paradigms, mechanisms, and systems principles. Front Genet 2012;3:67. 10.3389/fgene.2012.00067
- 221. Young JT, Hatakeyama TS, Kaneko K. Dynamics robustness of cascading systems. PLoS Comput Biol 2017;13:e1005434. 10.1371/journal.pcbi.1005434
- 222. Hunter P. Understanding redundancy and resilience. EMBO Rep 2022;23:e54742. 10.15252/embr.202254742
- 223. Kitano H. Biological robustness. Nat Rev Genet 2004;5:826–37. 10.1038/nrg1471
- 224. Malik-Sheriff RS, Glont M, Nguyen TVN. et al. BioModels—15 years of sharing computational models in life science. Nucleic Acids Res 2020;48:D407–15. 10.1093/nar/gkz1055
- 225. Hesse J, Martinelli J, Aboumanify O. et al. A mathematical model of the circadian clock and drug pharmacology to optimize irinotecan administration timing in colorectal cancer. Comput Struct Biotechnol J 2021;19:5170–83. 10.1016/j.csbj.2021.08.051
Data Availability Statement
No datasets were used in this review.