CPT: Pharmacometrics & Systems Pharmacology. 2024 Jul 11;13(8):1289–1296. doi: 10.1002/psp4.13149

Bridging pharmacology and neural networks: A deep dive into neural ordinary differential equations

Idris Bachali Losada 1, Nadia Terranova 1
PMCID: PMC11330178  PMID: 38992975

Abstract

The advent of machine learning has led to innovative approaches for dealing with clinical data. Among these, Neural Ordinary Differential Equations (Neural ODEs), hybrid models that merge mechanistic and deep learning models, have shown promise in accurately modeling continuous dynamical systems. Although initial applications of Neural ODEs in model-informed drug development and clinical pharmacology are becoming evident, applying these models to actual clinical trial datasets, characterized by sparse and irregularly timed measurements, poses several challenges. Traditional models often have limitations with sparse data, highlighting the urgent need to address this issue, potentially through the use of assumptions. This review examines the fundamentals of Neural ODEs, their ability to handle sparse and irregular data, and their applications in model-informed drug development.

INTRODUCTION

In the domain of clinical pharmacology, the modeling of pharmacological processes and systems presents unique challenges and opportunities. Population modeling, for instance, uses different methodologies to simulate patient clinical outcomes, with statistical models, thanks to their interpretability, providing strong support for decision-making in drug discovery and development. 1 As an example, nonlinear mixed-effects (NLME) models are carefully calibrated by making pharmacological assumptions and incorporating a restricted but meaningful set of covariates that explain part of the observed variability. Additionally, combining NLME models with ordinary differential equations (ODEs) has contributed significantly to drug development 2 by including mechanistic and well-understood dynamics based on biological assumptions. Nevertheless, such models are unable to handle a high-dimensional covariate space as input, in contrast to traditional machine learning (ML) models. Moreover, ML approaches such as ensemble methods, mainly tree-based and gradient-boosting models, have proven able to capture complex dynamics and to be extremely valuable in model-informed drug development. 3

Within the ML field, deep learning (DL) architectures form a subcategory of more advanced methods based on the concept of neural networks (NNs). NNs consist of layers connected to each other by neurons and are able to process both static and longitudinal input data over time. NN architectures can be composed of dense and hidden layers, where nonlinear transformations are applied within each neuron. Furthermore, a hidden function within NNs introduces nonlinearity into the network's decision-making process during training. More specifically, for NNs dealing with longitudinal data or time series, this function is computed sequentially between consecutive time steps, summarizing the most relevant information retained over time from the longitudinal input data. However, when it comes to handling sparse and irregularly sampled data, NNs present several limitations.

Recent work has explored solutions for dealing with sparsity and irregular sampling of data points by combining DL models with ODEs. Neural ODEs 4 represent a promising advanced solution. 5 Moreover, they offer a unique interpretation of the hidden state as a function of the previous hidden state and the longitudinal input data over time, as shown in Equation (1) below:

$h_t = f(W \cdot h_{t-1} + U \cdot x_t + b)$, (1)

where $h_t$ is the hidden state, a working-memory mechanism that carries information from the sequential input data $x_t$ (e.g., a list of covariates) across previous events and is overwritten at every step $t$; $W$ and $U$ are weight matrices learned during training; $b$ is a bias vector; and $f$ is an activation function, that is, a nonlinear function (e.g., sigmoid, tanh) that decides whether a neuron should be activated within the NN.
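
As a minimal illustration of Equation (1), the following sketch implements the recurrent hidden-state update in PyTorch; the dimensions, the tanh activation, and the random weights are illustrative choices, not values from any specific model.

```python
import torch

# Dimensions chosen only for illustration.
input_dim, hidden_dim = 8, 16                    # size of x_t and of h_t
W = 0.1 * torch.randn(hidden_dim, hidden_dim)    # recurrent weight matrix
U = 0.1 * torch.randn(hidden_dim, input_dim)     # input weight matrix
b = torch.zeros(hidden_dim)                      # bias vector

def step(h_prev, x_t):
    """One application of Equation (1): h_t = f(W*h_{t-1} + U*x_t + b)."""
    return torch.tanh(W @ h_prev + U @ x_t + b)  # f = tanh in this sketch

h = torch.zeros(hidden_dim)                      # initial hidden state
for x_t in torch.randn(5, input_dim):            # a toy sequence of 5 visits
    h = step(h, x_t)                             # h is overwritten at each step t
```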

Compared with standard NN models, Neural ODEs 4 incorporate the ODE concept to generate the dynamics of the hidden function. For this purpose, a fully connected layer architecture, which connects every input neuron to every output neuron, is integrated within Neural ODEs. 3 This provides a surrogate function $g_{nn}$ that mimics the ODE dynamics, as illustrated by Equation (2):

$\frac{dh_t}{dt} = g_{nn}(h_t, t, \theta)$, (2)

where $g_{nn}$ approximates the ODE and replicates its dynamics, $\theta$ represents the model parameters learned during training, and $h_t$ contains the relevant patterns learned from the input data. $g_{nn}$ is mainly used to solve the different states of the hidden function over time in order to predict future measurements at the individual level.
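
A minimal sketch of Equation (2) follows, using the torchdiffeq library released alongside reference 4; the network sizes and the irregular time grid are assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # solver library released with reference 4

class ODEFunc(nn.Module):
    """g_nn in Equation (2): a fully connected network approximating dh/dt."""
    def __init__(self, hidden_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, 64), nn.Tanh(), nn.Linear(64, hidden_dim)
        )

    def forward(self, t, h):                     # odeint expects f(t, h)
        return self.net(h)

h0 = torch.zeros(1, 16)                          # initial hidden state
t = torch.tensor([0.0, 0.5, 1.0, 2.0, 4.5])      # irregular observation times
h_t = odeint(ODEFunc(), h0, t)                   # hidden states at each time point
```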

Within the quantitative pharmacology community, the explainability and interpretability of DL models are fundamental. There is a constant need to innovate with advanced models that bring understandable and quantifiable concepts, such as combining NNs with ODEs. Neural ODEs 4 offer several advantages and applications compared with standard NN models. For example, explainable DL, which provides an understanding of the factors influencing predictions, has been shown to be relevant for linking tumor dynamics and overall survival. 6 By deriving an explainable "kinetic rate" variable describing tumor behavior at the patient level, it is possible to interpret how the kinetic rate influences the patient's survival outcome. Another context illustrating the interpretability of Neural ODEs is the field of PK, where complex body-related behaviors are mechanistically translated into ODEs based on multiple variables. Indeed, low-dimensional Neural ODEs can be beneficial in simplifying the complexity of the ODE dynamics, offering advanced tools for application in PK modeling. 7

On the other hand, Neural ODEs present new challenges, including computational complexity and difficulties in training arising from their continuous nature. We underscore the need for further investigation, specifically in clinical pharmacology use cases. In this mini-review, we outline the primary workflow of these models and highlight some applications in clinical pharmacology.

THEORETICAL

Different variations of the basic Neural ODE structure have been proposed in the literature (Table 1). Stochastic Neural ODEs 8 introduce stochasticity based on stochastic differential equations, 9 supporting the modeling of data with random noise, biological variability among individuals, or uncertainties mainly arising from approximations or assumptions made during model development. Controlled Neural ODEs 10 integrate a control function, learned by the model, which dictates how the ODE dynamics evolve over time. Finally, Latent Neural ODEs 11 are a specific application of Neural ODEs for modeling sparse and irregularly sampled time series using longitudinal data as input. They are classified as probabilistic models and are mainly based on the concept of Gaussian processes, a nonparametric supervised learning method used to solve regression and probabilistic classification problems. The main purpose of a Latent Neural ODE is to learn and approximate a surrogate model for the underlying ODE by deriving latent representations of the longitudinal input data, called "latent variables." As shown in Figure 1, Latent Neural ODEs incorporate an input neural network (encoder) that processes the longitudinal input data into a dense representation within the hidden layers and converges to Gaussian processes 13 for approximating ODEs. 14 From this step, latent trajectories are created at a low-dimensional level for each patient. These correspond to the different states of the latent variables, estimated by Gaussian sampling from the encoder, 13 through the following sampling procedure:

$z_{t_0} \sim p(z_{t_0}) = \mathcal{N}(0, 1)$, (3)

$\frac{dz_t}{dt} = f_{nn}(z_t, t, \theta_{f_{nn}})$, (4)

where $z_{t_1}, z_{t_2}, \ldots, z_{t_N} = \text{ODESolve}(z_{t_0}, f_{nn}, \theta_{f_{nn}}, (t_0, \ldots, t_N))$.
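
The sampling procedure of Equations (3) and (4) can be sketched as follows; the latent dimension, network architecture, and time grid are again illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint

latent_dim = 4
f_nn = nn.Sequential(                            # f_nn with parameters theta_fnn
    nn.Linear(latent_dim, 32), nn.Tanh(), nn.Linear(32, latent_dim)
)

z_t0 = torch.randn(1, latent_dim)                # Equation (3): z_t0 ~ N(0, 1)
t_grid = torch.tensor([0.0, 1.0, 2.5, 7.0])      # t_0 ... t_N, irregularly spaced

# Equation (4): ODESolve integrates dz/dt = f_nn(z_t) from z_t0 over t_grid;
# the returned trajectory contains z at every time point, including t_0.
z_traj = odeint(lambda t, z: f_nn(z), z_t0, t_grid)
```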

TABLE 1. A summary of the different categories of Neural ODEs cited in the scientific literature.

Neural ODE
  Characteristics: Continuous-time modeling of the data using ODEs; deterministic dynamics, meaning the output is fully determined by the input and the learned parameters.
  Opportunities: Automatically adapts the time steps during integration, which allows handling of data with irregular time intervals; memory-efficient because intermediate hidden states need not be stored for each time step.
  Challenges: Training can be computationally intensive, especially for large datasets; may fail to capture fast-changing dynamics.
  Main reference: Chen RTQ, et al. "Neural ordinary differential equations." Advances in Neural Information Processing Systems 31 (2018).

Stochastic Neural ODE
  Characteristics: Introduces stochasticity into the model, allowing probabilistic modeling and uncertainty estimation.
  Opportunities: Suitable for tasks where uncertainty is relevant; provides uncertainty estimates for model predictions.
  Challenges: Incorporating stochasticity adds complexity to the model, making it harder to train.
  Main reference: Liu X, et al. "Neural SDE: stabilizing neural ODE networks with stochastic noise." arXiv preprint arXiv:1906.02355 (2019).

Controlled Neural ODE
  Characteristics: Allows the inclusion of explicit control inputs, making it suitable for control and optimization tasks.
  Opportunities: Adapts control inputs in real time to respond to changing external conditions; designed with the ability to control and manipulate the system's dynamics.
  Challenges: Adding control inputs increases the complexity of the model.
  Main reference: Kidger P, et al. "Neural controlled differential equations for irregular time series." Advances in Neural Information Processing Systems 33 (2020): 6696–6707.

Latent Neural ODE
  Characteristics: Introduces latent variables to capture hidden or unobservable features in the data.
  Opportunities: Suited to generative modeling, allowing the generation of data points similar to the training data, and to tasks where understanding underlying structures or generative processes is essential.
  Challenges: Latent variables can increase complexity and make training more challenging; their interpretation and impact on model behavior can be less intuitive than in traditional models.
  Main reference: Rubanova Y, Chen RTQ, Duvenaud D. "Latent ordinary differential equations for irregularly sampled time series." Advances in Neural Information Processing Systems 32 (2019).

FIGURE 1. Latent Neural ODE 10 illustration for PK time-course prediction for each individual patient. The input includes TFDS, the time in hours between each dose; TIME, the time in hours since the start of treatment; AMT, the dosing amount in milligrams; CYCL, the current dosing cycle number; and PK_Cycle1, the first cycle of PK observations. 12 The final hidden state summarizes the key information learned. Gaussian sampling derives a latent variable from this state, offering insights into individual patient behavior for precise clinical outcome predictions. Dosing details and the first 20 PK values are added in the Neural ODE decoder, enhancing model accuracy.

$z_{t_0}$ is the initial condition of the latent variable $z_t$ at time step $t_0$ and is sampled from a normal distribution with mean $\mu = 0$ and standard deviation $\sigma = 1$; $f_{nn}$ is a fully connected layer network. The latter is part of the encoder and computes $\frac{dz_t}{dt}$, which represents the surrogate ODE, with the mean and standard deviation parameters being estimated. 11, 13 $\theta_{f_{nn}}$ are the weights and biases learned by $f_{nn}$.

Finally, an output neural network (decoder) transforms the evolved latent states back into the data space where the prediction is produced. This process allows Latent Neural ODEs 11 to interpolate (fit) or extrapolate (predict) time series. In this review, we explore the Latent Neural ODE, 11 renowned for its high predictive power, to make predictions at the patient level using longitudinal clinical trial datasets.
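
A minimal decoder sketch, assuming a small fully connected network and a one-dimensional outcome (e.g., a PK concentration), shows how solved latent states are mapped back to the data space:

```python
import torch
import torch.nn as nn

latent_dim, obs_dim = 4, 1                       # one-dimensional outcome assumed
decoder = nn.Sequential(                         # latent space -> data space
    nn.Linear(latent_dim, 32), nn.Tanh(), nn.Linear(32, obs_dim)
)

z_traj = torch.randn(10, 1, latent_dim)          # solved latent states at 10 times
y_pred = decoder(z_traj)                         # predicted observations over time
```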

RNNs AS ENCODERS

Latent Neural ODEs process the longitudinal input data using encoders. These are typically based on recurrent neural networks (RNNs), 15 a DL architecture trained to process sequential data, encoding the temporal, heterogeneous, and sparse representation of input features or covariates over time for multivariate time-series prediction. RNN encoders are composed of different layers in which longitudinal data are processed sequentially over time. 15 The encoder provides a global representation of the most relevant patterns learned across the whole patient population. Furthermore, these RNN architectures handle informative missingness 16, 17 and the sparsity of clinical data with fewer assumptions than traditional pharmacological models, where forward-fill, backward-fill, and mean imputation are mainly utilized. Indeed, RNNs manage sparsity by using a binary mask, 16 an indicator that tells the model whether and when covariates were measured. This provides a clearer view of the clinical trial dataset structure, giving the model a more thorough understanding of the data, and opens opportunities to employ such methods in real clinical data contexts, where sparsity represents the main complexity for time-series modeling. Indeed, RNNs are extensively employed in time-series modeling, with gated recurrent units (GRUs) 18 and gated recurrent units with decay (GRU-D) able to include missingness as a feature when sparsity is ubiquitous across the data. GRUs are particularly adept at time-series modeling because they incorporate gating mechanisms that regulate information flow, as shown in Figure 2, thereby allowing the retention of relevant information over longer sequences. This ultimately enables adaptability to the irregular time intervals commonly seen in sparse data. 19, 20
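
A minimal sketch of this masking scheme, assuming a toy dataset with NaN-coded missing covariates, is shown below; concatenating the binary mask with the zero-filled covariates is one common way to expose missingness to a GRU encoder.

```python
import torch
import torch.nn as nn

n_visits, n_cov, hidden_dim = 6, 5, 16
x = torch.randn(1, n_visits, n_cov)              # covariates; NaN marks missing
x[0, 2, 3] = float("nan")                        # simulate one missing value

mask = (~torch.isnan(x)).float()                 # binary mask: 1 observed, 0 missing
x = torch.nan_to_num(x, nan=0.0)                 # zero-fill; the mask carries the info

# Concatenating covariates and mask doubles the input size of the GRU encoder.
gru = nn.GRU(input_size=2 * n_cov, hidden_size=hidden_dim, batch_first=True)
_, h_final = gru(torch.cat([x, mask], dim=-1))   # h_final: (1, 1, hidden_dim)
```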

FIGURE 2. Depiction of a gated recurrent unit (GRU) 17 in our RNN encoder for irregular data. The GRU assigns weights to covariates. Accurate predictions adjust these weights using stochastic gradient descent, reducing the loss function. A high loss may lead to certain covariates receiving reduced weights or being deemed irrelevant. The RNN receives as input a covariate vector X and the hidden state function at visit t-k. The architecture consists of a reset gate (pink) to discard nonessential data, an update gate (green), and a candidate hidden state gate (purple) to retain pivotal covariates for predicting future clinical outcomes. Different mathematical steps are computed within the GRU architecture, where activation functions such as the hyperbolic tangent (tanh) and the sigmoid (σ) are applied.

GRU-D, on the other hand, explicitly models the decay of information over time, making it especially effective in scenarios where data are not only sparse but also missing at random, a typical situation in healthcare and other contexts. The GRU-D model incorporates the time elapsed since the last observation as a structural input feature used to update the hidden state computed at each time step. This encoder helps preserve the time-related patterns within the data, even when dealing with missing values. 17
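
The following sketch captures the core GRU-D idea, assuming only the hidden-state decay term for brevity (the full GRU-D of reference 17 also decays inputs toward their empirical means):

```python
import torch
import torch.nn as nn

n_cov, hidden_dim = 5, 16
cell = nn.GRUCell(n_cov, hidden_dim)
W_decay = nn.Linear(1, hidden_dim)               # learns how fast memory decays

def grud_step(h_prev, x_t, delta_t):
    """One step: decay h by the time elapsed since the last observation."""
    gamma = torch.exp(-torch.relu(W_decay(delta_t)))  # decay factor in (0, 1]
    return cell(x_t, gamma * h_prev)

h = torch.zeros(1, hidden_dim)
visit_times = torch.tensor([0.0, 24.0, 96.0])    # hours; note the irregular gaps
x_seq = torch.randn(3, 1, n_cov)                 # one covariate vector per visit
prev_t = torch.zeros(1, 1)
for t_i, x_t in zip(visit_times, x_seq):
    h = grud_step(h, x_t, t_i.view(1, 1) - prev_t)
    prev_t = t_i.view(1, 1)
```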

USING AN ODE SOLVER TO RECONSTRUCT THE TRAJECTORY OF AN UNDERLYING DISEASE VARIABLE

The challenge of solving the complex dynamics behind ODEs is also fundamental, particularly for Neural ODEs. Different solutions have been conceived, such as adaptive solvers and augmented Neural ODEs, 12 which extend the model state with additional dimensions to learn more complex ODEs, solved in dimension $d + p$ instead of $d$, where every data point is concatenated with a vector of zeros. 12 This method simplifies the resolution of ODEs by avoiding intersections in the ODE flow 12 within the original $d$-dimensional space. This continuous-time modeling allows Neural ODEs to adapt to the irregularity of the data, representing a more flexible approach to sparse and irregular data than traditional ML models.
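
A sketch of the augmentation step, under the assumption of a four-dimensional state padded with two zeros, illustrates the idea:

```python
import torch

d, p = 4, 2                                      # original and added dimensions
h0 = torch.randn(1, d)                           # data point in R^d
h0_aug = torch.cat([h0, torch.zeros(1, p)], dim=-1)  # state in R^(d+p)
# h0_aug is what the ODE solver integrates; the p extra dimensions give the
# learned flow room to avoid trajectory crossings that cannot occur in R^d.
```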

INTERPRETABILITY OF THE LATENT NEURAL ODE

The interpretability of the latent dynamics as explainable 6 variables is crucial, as it can effectively reveal the intricate relationship between covariates and the clinical outcome. The dynamics of the latent variables for each individual patient are identified by solving the ODE, interpolating across the first observations of the patient outcome, and then extrapolating those dynamics beyond the observable horizon. In theory, Latent Neural ODEs 11 are complex DL architectures coupled with ODEs, where the latent variable $z_t$ is defined by sampling using Gaussian processes. 13, 14 The mean and standard deviation parameters are estimated from the final layer of an RNN encoder, as shown in Figure 3, by computing the posterior probability distribution $q$ to obtain the parameters of the distribution, as given by Equations (5) and (6):

$q(z_{t_0} \mid \{x_{t_i}, t_i\}_i, \theta_{f_{nn}}) = \mathcal{N}(z_{t_0} \mid \mu_{z_{t_0}}, \sigma_{z_{t_0}})$, (5)

where $\mu_{z_{t_0}}$ and $\sigma_{z_{t_0}}$ come from the hidden state of $\text{RNN}(\{x_{t_i}, t_i\}_i, \theta_{f_{nn}})$,

$z_{t_0} \sim q(z_{t_0} \mid \{x_{t_i}, t_i\}_i)$. (6)
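
Equations (5) and (6) are commonly implemented with the reparameterization trick; the sketch below assumes a linear layer mapping the final RNN hidden state to the mean and log standard deviation of $q$:

```python
import torch
import torch.nn as nn

hidden_dim, latent_dim = 16, 4
to_params = nn.Linear(hidden_dim, 2 * latent_dim)    # h_final -> (mu, log sigma)

h_final = torch.randn(1, hidden_dim)                 # final hidden state of the RNN
mu, log_sigma = to_params(h_final).chunk(2, dim=-1)  # Equation (5): parameters of q

# Equation (6): reparameterized sample z_t0 ~ N(mu, sigma)
z_t0 = mu + torch.exp(log_sigma) * torch.randn_like(mu)
```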

FIGURE 3. Outlined steps: 1. The RNN encoder 17 processes sequences of covariates for all patients, calculating a hidden state between consecutive time steps. 2. The final hidden function captures crucial information from the patient population, identifying key covariates at the population level. 3. Gaussian sampling introduces variability, creating patient-specific scenarios to refine individual predictions and generating initial conditions for the latent variable $z_t$. 10, 12 4. These initial conditions help determine individual latent trajectories for each patient, enabling patient-level clinical outcome predictions.

The initial condition $z_{t_0}$ is estimated from the latent variable for each patient to solve the ODE and fully reconstruct individual latent trajectories, with the aim of extrapolating clinical outcomes to future clinical measurements, as shown in Figure 4. In essence, through the strategic combination of RNN-based encoders and the inherent capabilities of ODEs, these models effectively handle the sparsity and heterogeneity of clinical data, providing a robust and flexible framework for modeling complex dynamics within these challenging contexts.

FIGURE 4. The Gaussian sampling of the final hidden state sets the initial condition $z_{t_0}$ for each patient's latent variable $z_t$. 10, 12 The ODE solver determines individual latent trajectories for patient-level clinical outcome predictions. These outcomes are refitted over the blue dots and extrapolated across the yellow dots for multiple visits per patient. 10, 12

APPLICATION OF NEURAL ODEs IN CLINICAL PHARMACOLOGY

The mathematical modeling of drug kinetics, PK and pharmacodynamics (PD), is inherently complex because of the many interacting physiological variables. NLME models, frequently used in pharmacometrics, are limited in the amount of covariate information they can incorporate into the modeling, making it harder to identify the most relevant predictors from high-dimensional input data and to predict the clinical outcome itself. Neural ODEs offer a fresh perspective by dealing with large datasets, specifically heterogeneous, sparse, and multimodal data. Case studies involving the application of Neural ODEs in these areas have demonstrated promising results, although more research is required to validate these findings in diverse clinical settings.

In an example from the clinical pharmacology field, 21 Neural ODEs are used to predict PK for individual patients. Drug dosing and PK measurements in patients are collected at irregular time intervals, adding complexity compared with traditional NN architectures, where time-varying gaps between consecutive visits across different patients are hard to assimilate and learn. 15 To address this, the Latent Neural ODE 11 model is a potential DL solution for fitting irregularly sampled data points over time when additional information is provided continuously to make predictions beyond the observable data. 21 For example, the main objective of PK Neural ODE-based models is to predict PK values at the individual level by adding specific dose regimen information linearly 21 into the ODE, as shown in Figure 1 and sketched below. This approach enhances the model's flexibility when the treatment changes or is stopped, 21 enabling more tailored predictions at the individual level. It yields promising results compared with ML techniques such as LightGBM, a tree-boosting model, and long short-term memory (LSTM), a type of RNN. 21 Indeed, the Latent Neural ODE 11 can accurately reproduce and extrapolate PK trajectories for individual patients even if the dose is discontinued. 21 Advantages of this Latent Neural ODE 11 over NLME models are its ability to approximate and learn the ODE at the patient level 11 and to include high-dimensional input covariates in the modeling.
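
The sketch below conveys the idea of linearly adding dosing information into the latent dynamics; it is a simplified illustration under assumed names and a hypothetical zero-order infusion schedule, not the exact implementation of reference 21:

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint

latent_dim = 4
f_nn = nn.Sequential(nn.Linear(latent_dim, 32), nn.Tanh(),
                     nn.Linear(32, latent_dim))  # learned latent dynamics
A = nn.Linear(1, latent_dim, bias=False)         # linear dose effect (assumed)

def dose_rate(t):
    """Hypothetical zero-order infusion: 10 mg/h for the first 2 hours."""
    return torch.where(t < 2.0, torch.tensor(10.0), torch.tensor(0.0)).view(1, 1)

def dynamics(t, z):
    return f_nn(z) + A(dose_rate(t))             # dose enters the ODE linearly

z0 = torch.zeros(1, latent_dim)
t = torch.linspace(0.0, 24.0, 25)                # hourly grid over one day
# A fixed-step solver is used here because the dose input is discontinuous.
z_traj = odeint(dynamics, z0, t, method="rk4", options={"step_size": 0.1})
```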

An extension of this work was the development of a pharmacology-informed Neural ODE-based model, illustrating a more complex approach in which both PK and PD data are utilized 22 to simulate patient responses to untested dosing regimens by bridging PK dynamics with PD predictions for each individual patient. This pharmacology-guided NN encompasses all the relevant mechanistic pharmacological aspects needed to accurately predict PD values within an easy-to-use workflow. Furthermore, the use of Neural ODEs, 4 compared with NLME modeling, simplifies the set of multiple mechanistic and pharmacological assumptions into a low-dimensional dynamical system where ODEs are solved at the patient level, reducing complexity. 7

In practice, pharmacological models can describe the dynamics of carefully chosen variables through complex ODE systems. However, these models can incorporate a limited set of meaningful, interpretable variables whose behaviors are often not observable in clinical environments but are observable in laboratory settings; 23 these variables are called "expert variables." This problem is addressed in reference 23, where the authors developed a pharmacology-informed Neural ODE for COVID-19 disease progression modeling, integrating the dynamics described by the pharmacological model for dexamethasone (Dex) with five expert variables 23 (i.e., innate immune response, Dex concentration in lung, Dex concentration in plasma, viral load, and adaptive immune response). This approach aims to provide guidance and prior knowledge to the model by incorporating both expert and latent variables in a clinical environment through ODEs. This use case highlights the opportunity to customize the Latent Neural ODE using a mechanistic approach represented by expert variables 23 and illustrates the utility of employing a pharmacology-informed DL-based model within a clinical environment to improve clinical decisions.

Regarding technical limitations in the context of Neural ODEs, 3 a frequently encountered issue is overfitting, 24 particularly when the dataset size is limited. Overfitting happens when a model that is too intricate relative to its training data ends up learning patterns specific to those data and fails to accurately predict new, unseen data. One effective countermeasure is pretraining the Neural ODE: training the model first on a broad and varied dataset equips it with a more general understanding, laying a solid foundation before it is refined for a specific task. Moreover, employing data augmentation 25 methods, which increase the size of the training dataset through the generation of altered data versions, proves beneficial in preventing overfitting, especially when acquiring additional data is impractical.
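
As a simple illustration of data augmentation for longitudinal trajectories, the sketch below replicates each training trajectory with small Gaussian jitter; the noise scale and number of copies are hypothetical hyperparameters, not values from the review or its references.

```python
import torch

def jitter(trajectories, sigma=0.05, n_copies=4):
    """Replicate each trajectory with small Gaussian noise added.

    sigma and n_copies are illustrative hyperparameters chosen for
    this sketch only.
    """
    copies = trajectories.repeat(n_copies, 1)    # stack n_copies replicas
    noisy = copies + sigma * torch.randn_like(copies)
    return torch.cat([trajectories, noisy])      # original + augmented data

pk = torch.randn(20, 10)                         # toy matrix: 20 patients x 10 times
augmented = jitter(pk)                           # 100 training trajectories
```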

CONCLUSION

Neural ODEs hold great promise for transforming healthcare data analytics, but several unanswered questions remain. These models suffer from overfitting when the sample size is small. Searching for methods that leverage pretraining of the Neural ODE is essential to combat overfitting and build a sustainable AI framework, whether by using data augmentation or by seeking richer datasets. Opportunities to further explore and leverage applications of these methods within the clinical pharmacology field are also concrete. As an example, these methods can open new avenues in clinical oncology by enhancing current approaches for predicting total tumor size and even individual target lesion sizes (iTLs) over time. The use of pharmacology-informed models combining laboratory values, biomarkers, and treatment plans for the prediction of clinical outcomes and longitudinal endpoints plays a significant role in advancing personalized medicine and data-driven decision-making. Despite several challenges, Neural ODEs offer a promising new avenue for the value-driven application of AI in healthcare, particularly in complex domains such as clinical pharmacology and model-informed drug development.

FUNDING INFORMATION

No funding was received for this work.

CONFLICT OF INTEREST STATEMENT

Idris Bachali Losada is an employee of Randstad and contributed as a paid contractor for the Merck Quantitative Pharmacology, Ares Trading SA (an affiliate of Merck KGaA, Darmstadt, Germany), Lausanne, Switzerland. Nadia Terranova is an employee of Merck Quantitative Pharmacology, Ares Trading SA (an affiliate of Merck KGaA, Darmstadt, Germany), Lausanne, Switzerland.

Losada IB, Terranova N. Bridging pharmacology and neural networks: A deep dive into neural ordinary differential equations. CPT Pharmacometrics Syst Pharmacol. 2024;13:1289‐1296. doi: 10.1002/psp4.13149

REFERENCES

• 1. Song L, He CY, Yin NG, Liu F, Jia YT, Liu Y. A population pharmacokinetic model for individualised dosage regimens of vancomycin in Chinese neonates and young infants. Oncotarget. 2017;8(62):105211-105221.
• 2. Tornøe CW, Agersø H, Jonsson EN, Madsen H, Nielsen HA. Non-linear mixed-effects pharmacokinetic/pharmacodynamic modelling in NLME using differential equations. Comput Methods Prog Biomed. 2004;76(1):31-40.
• 3. Keutzer L, You H, Farnoud A, et al. Machine learning and pharmacometrics for prediction of pharmacokinetic data: differences, similarities and challenges illustrated with rifampicin. Pharmaceutics. 2022;14(8):1530.
• 4. Chen RTQ, Rubanova Y, Bettencourt J, Duvenaud D. Neural ordinary differential equations. Adv Neural Inf Process Syst. 2018;31. doi: 10.48550/arXiv.1806.07366
• 5. Terranova N, Renard D, Shahin MH, et al. Artificial intelligence for quantitative modeling in drug discovery and development: an Innovation & Quality (IQ) consortium perspective on use cases and best practices. Clin Pharmacol Ther. 2023;115:658-672. doi: 10.1002/cpt.3053
• 6. Laurie M, Lu J. Explainable deep learning for tumor dynamic modeling and overall survival prediction using neural-ODE. NPJ Syst Biol Appl. 2023;9(1):58.
• 7. Bräm DS, Nahum U, Schropp J, Pfister M, Koch G. Low-dimensional neural ODEs and their application in pharmacokinetics. J Pharmacokinet Pharmacodyn. 2023;14:1-8.
• 8. Liu X, Xiao T, Si S, Cao Q, Kumar S, Hsieh CJ. Neural SDE: stabilizing neural ODE networks with stochastic noise. arXiv preprint arXiv:1906.02355. 2019.
• 9. Leander J, Jirstrand M, Eriksson UG, Palmér R. A stochastic mixed effects model to assess treatment effects and fluctuations in home-measured peak expiratory flow and the association with exacerbation risk in asthma. CPT Pharmacometrics Syst Pharmacol. 2022;11(2):212-224.
• 10. Kidger P, Morrill J, Foster J, Lyons T. Neural controlled differential equations for irregular time series. Adv Neural Inf Process Syst. 2020;33:6696-6707.
• 11. Rubanova Y, Chen RTQ, Duvenaud D. Latent ODEs for irregularly-sampled time series. Adv Neural Inf Process Syst. 2019;32. doi: 10.48550/arXiv.1907.03907
• 12. Dupont E, Doucet A, Teh YW. Augmented neural ODEs. Adv Neural Inf Process Syst. 2019;32. doi: 10.48550/arXiv.1904.01681
• 13. Yang G. Wide feedforward or recurrent neural networks of any architecture are Gaussian processes. Adv Neural Inf Process Syst. 2019;32. doi: 10.48550/arXiv.1910.12478
• 14. Raissi M, Perdikaris P, Karniadakis GE. Multistep neural networks for data-driven discovery of nonlinear dynamical systems. arXiv preprint arXiv:1801.01236. 2018.
• 15. Che Z, Purushotham S, Cho K, Sontag D, Liu Y. Recurrent neural networks for multivariate time series with missing values. arXiv preprint arXiv:1606.01865. 2016.
• 16. Rodenburg FJ, Yoshihide S, Nobuhiro H. Improving RNN performance by modelling informative missingness with combined indicators. Appl Sci. 2019;9(8):1623.
• 17. Che Z, Purushotham S, Cho K, Sontag D, Liu Y. Recurrent neural networks for multivariate time series with missing values. Sci Rep. 2018;8:6085.
• 18. Dey R, Salem FM. Gate-variants of gated recurrent unit (GRU) neural networks. IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS). 2017. doi: 10.48550/arXiv.1701.05923
• 19. Pachal S, Achar A. Sequence prediction under missing data: an RNN approach without imputation. Proceedings of the 31st ACM International Conference on Information and Knowledge Management. Association for Computing Machinery; 2022:1605-1614.
• 20. Kim YJ, Chi M. Temporal belief memory: imputing missing data during RNN training. Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI). 2018:2326-2332.
• 21. Lu J, Deng K, Zhang X, Liu G, Guan Y. Neural-ODE for pharmacokinetics modeling and its advantage to alternative machine learning models in predicting new dosing regimens. iScience. 2021;24(7):102804.
• 22. Lu J, Bender B, Jin JY, Guan Y. Deep learning prediction of patient response time course from early data via neural-pharmacokinetic/pharmacodynamic modelling. Nat Mach Intell. 2021;3(8):696-704.
• 23. Qian Z, Zame WR, Fleuren LM, Elbers P, van der Schaar M. Integrating expert ODEs into neural ODEs: pharmacology and disease progression. Adv Neural Inf Process Syst. 2021;34:11364-11383.
• 24. Zhang H, Zhang L, Jiang Y. Overfitting and underfitting analysis for deep learning based end-to-end communication systems. 2019 11th International Conference on Wireless Communications and Signal Processing (WCSP). IEEE; 2019:1-6.
• 25. Lashgari E, Liang D, Maoz U. Data augmentation for deep-learning-based electroencephalography. J Neurosci Methods. 2020;346:108885.
