Abstract
Differential equations-based epidemic compartmental models and deep neural networks-based artificial intelligence (AI) models are powerful tools for analyzing and fighting the transmission of COVID-19. However, the capability of compartmental models is limited by the challenges of parameter estimation, while AI models fail to discover the evolutionary pattern of COVID-19 and lack explainability. This paper aims to provide a novel method (called Epi-DNNs) by integrating compartmental models and deep neural networks (DNNs) to model the complex dynamics of COVID-19. In the proposed Epi-DNNs method, the neural network is designed to express the unknown parameters in the compartmental model and the Runge–Kutta method is implemented to solve the ordinary differential equations (ODEs) so as to give the values of the ODEs at a given time. Specifically, the discrepancy between predictions and observations is incorporated into the loss function, then the defined loss is minimized and applied to identify the best-fitted parameters governing the compartmental model. Furthermore, we verify the performance of Epi-DNNs on the real-world reported COVID-19 data on the Omicron epidemic in Shanghai covering February 25 to May 27, 2022. The experimental findings on the synthesized data have revealed its effectiveness in COVID-19 transmission modeling. Moreover, the inferred parameters from the proposed Epi-DNNs method yield a predictive compartmental model, which can serve to forecast future dynamics.
MSC: 34A34, 68T07
Keywords: Compartmental models, Deep neural networks, Parameter estimation, Runge–Kutta method, COVID-19
Highlights
• Compartmental models with time-varying parameters capture COVID-19 dynamics well.
• Modeling COVID-19 dynamics by coupling the compartmental model and neural networks.
• Using neural networks to express time-varying parameters in compartmental models.
• Applying Fourier transformation to reduce the stochasticity and noise of real-world data.
• Analyzing the effect of intervention policies and providing predictions.
1. Introduction
Epidemic compartmental models categorize the population into different compartments based on disease status to analyze the transmission dynamics of infectious diseases, serving as a powerful tool for detecting, understanding, and combating outbreaks [1]. In 1927, Kermack and McKendrick constructed the fundamental susceptible–infected–recovered (SIR) compartmental model to study the transmission dynamics of epidemics such as plague [2]. Since the start of COVID-19, the SIR compartmental model and its variants, including Susceptible–Exposed–Infectious–Removed (SEIR) [3], Susceptible–Infected–Recovered–Death (SIRD) [4], [5], [6], Susceptible–Exposed–Infectious–Quarantine–Removed (SEIQR) [7], and Susceptible–Exposed–Infectious–Hospitalized–Removed (SEIHR) [8], have been at the forefront of studying the transmission dynamics of the COVID-19 outbreak and the impact of various non-pharmaceutical interventions. These compartmental models are formulated as systems of ordinary differential equations (ODEs), characterized by a set of parameters that are not known a priori and must be identified from data. Therefore, parameter estimation methods are frequently required to compute the rates of flow from one compartment to another. In addition, compartmental models usually assume these rates to be constant to reduce the complexity of modeling. Many research efforts focus on parameter estimation of compartmental models using maximum likelihood estimation, Markov Chain Monte Carlo (MCMC)-based Bayesian inference [9], [10], [11], and finite element methods [12]. However, these methods suffer from significant limitations that hinder their application. One limitation is that the computational cost of numerical simulations increases exponentially with the complexity of the parameters and models. Another is that these parameter estimation methods are only suited to time-constant parameters, which fail to reflect the complex dynamics of an infectious disease over time in real-world scenarios.
Artificial intelligence (AI) models, especially deep neural networks (DNNs), have also played an important role in analyzing and fighting the transmission of the COVID-19 epidemic [13], [14]. Although AI models have great power to fit data and provide short-term predictions, two main weaknesses hinder their practical application. First, they cannot discover the patterns of the disease transmission process and may produce unreasonable predictions because they ignore biological reality. Second, they depend heavily on the quality and quantity of data, so a model may not be useful if the data do not reasonably capture reality. Compartmental models and AI models have both proven to be reliable tools in fighting the COVID-19 pandemic, each with its own limitations. Therefore, exploring how to combine compartmental models and AI models to enhance their performance is a promising research topic. Recently, physics-informed neural networks (PINNs) have shown success in embedding differential equations into neural networks so that the networks satisfy the equations while accurately fitting the data [15]. That is, neural networks are used to model nonlinear systems, while prior knowledge such as a system of differential equations reduces the required data and constrains the model's search space. Since then, DNN-based models have been widely used as non-linear function approximators and have shown strong potential for addressing scientific computing tasks in many fields. Additionally, several research efforts have applied the PINNs framework to modeling and analyzing the dynamics of COVID-19 [16], [17], [18], [19], [20].
The concept of PINNs was first proposed for time-dependent partial differential equations (PDEs) and provides a flexible computational framework for addressing various scientific computing tasks. Inspired by PINNs, we propose the Epi-DNNs method to model the complex outbreak dynamics of COVID-19 by integrating real-world data, epidemic transmission laws, and numerical ODE solvers into DNNs. Specifically, we build DNNs to express the unknown parameters in the compartmental model and introduce a numerical ODE solver to solve the corresponding ODEs so as to give the values of the ODEs at a given time. The discrepancy between predictions and observations is formulated as the loss function, which is minimized to identify the best-fitted parameters governing the compartmental model. We verify the effectiveness of the proposed Epi-DNNs method on the COVID-19 reported data in the real world across several regions. The findings of the simulation experiments demonstrate that the proposed Epi-DNNs method robustly performs data-driven parameter estimation for COVID-19 transmission modeling. The main contributions of this paper are as follows:
• To respond efficiently to the complexity of infectious disease transmission dynamics in the real world, we propose a method that combines mathematical modeling and neural network modeling. The proposed method treats the coefficients of the epidemic compartmental model as time-varying parameters, which allows it to capture transmission dynamics accurately and provide reliable predictions.
• We build a separate neural network for each time-varying parameter in the epidemic compartmental model and perform a Fourier transformation of the input data to reduce the inherent stochasticity and noise of real-world data. The proposed method offers a feasible way to estimate time-varying parameters efficiently, instead of handling them by dividing the time axis into separate intervals, as conventional parameter estimation methods are limited to doing.
• We apply the proposed method to real-world reported COVID-19 data to validate its effectiveness. More importantly, the proposed Epi-DNNs approach can be easily adapted to other compartmental models, providing a convenient way to model and analyze the dynamics of infectious disease transmission in any region.
The remainder of this paper is organized as follows. In Section 2, we briefly present related work on COVID-19 transmission modeling. In Section 3, we introduce the Fourier-induced neural networks, the SIRD compartmental model, and an overview of the proposed Epi-DNNs method together with its implementation details. In Section 4, we present simulation results based on the real-world reported data. In Section 5, we present some discussions and suggestions. Finally, a brief conclusion is given in Section 6.
2. Related works
The coronavirus disease 2019 (COVID-19) and its related impact have emerged as one of the most complex and threatening public health challenges ever encountered. Given uncertainties in the transmission of COVID-19, and the impact of infections, hospitalizations, and deaths, the infectious disease transmission models have been widely used since the outbreak to answer a number of questions for decision-makers. Modeling approaches for studying infectious disease transmission primarily include compartmental models, statistical models, ensemble models, and individual models.
Compartmental models allow as much complexity in the model as is necessary and can represent non-linear processes and feedback. Li et al. applied a networked dynamic meta-population model and Bayesian inference to infer the proportion of undetected individuals among early COVID-19 infections and to analyze their contribution to virus spread [21]. Tian et al. performed a quantitative analysis of the effectiveness of control measures between December 31, 2019 and February 19, 2020, using a data set that included case reports, human movement, and public health interventions [22]. Wei et al. proposed an extended SEIR model to evaluate how the implementation of clinical diagnostic criteria and a universal symptom survey contributed to COVID-19 control in Wuhan, China [23]. Wang et al. used a modeling approach to reconstruct the full-spectrum dynamics of COVID-19 in Wuhan between January 1 and March 8, 2020 across 5 periods defined by events and interventions, and identified the high covertness and high transmissibility of the outbreak [24]. Liu et al. took the conversion rate between asymptomatic infections and reported/unreported symptomatic infections into account and proposed an infectious dynamics model adapted to all-people testing (APT). The model suits APT-based prevention in densely populated metropolises, where it gave more reasonable results and more accurate epidemic predictions [25]. The primary limitation of these compartmental models is that they are subject to assumptions about the transmission process and the parameters.
Statistical models depend heavily on the quality and quantity of the historical data used to make the prediction. Ensemble models are a compilation of multiple model outputs, which mitigates the risk of relying on just one model. Individual models incorporate each individual in the population as a separate agent in the model, with their own individual assumptions and parameters. Each of these four categories of model structures has advantages and limitations. AI technologies have been intensively applied to modeling COVID-19, including daily infection prediction, medical imaging, health and clinic records, protein sequences, and drug discovery. AI plays a significant role in controlling the COVID-19 pandemic: the Intelligent Systems and Methods to Combat Covid-19 collection [26] categorizes and summarizes different intelligent systems and methods for preventing further COVID-19 spread and provides a detailed description of various application scenarios. Chimmula et al. applied long short-term memory (LSTM) networks to model the spread of infectious diseases in Canada and predict the severity of COVID-19 [27]. Jayanthi et al. used the Auto-regressive Integrated Moving Average (ARIMA) model, LSTM, Stacked LSTM, and Prophet [28] models to analyze and predict the global cumulative number of confirmed cases, death cases, and recovered cases [29]. Nabi et al. studied four deep learning models, namely LSTM, gated recurrent unit (GRU) networks, convolutional neural networks (CNN), and multivariate CNN, to forecast the future dynamics of COVID-19 [30].
Physics-informed machine learning introduces a learning bias by directly embedding prior knowledge to achieve more accurate and robust performance. Recent studies have successfully applied physics-informed machine learning to study the complex outbreak dynamics of COVID-19 by integrating advanced epidemiology models into deep neural networks. Kharazmi et al. analyzed several epidemiological models through the lens of PINNs to identify time-dependent parameters and data-driven fractional differential operators [18]. Long et al. proposed a variant of PINNs to identify the time-varying parameters of the Susceptible–Infectious–Recovered–Deceased model for the spread of COVID-19 by fitting daily reported cases [20]. Nascimento et al. proposed an approach for implementing hybrid models that combine physics-informed and data-driven kernels, where the latter are used to reduce the gap between predictions and observations [31]. Cai et al. adopted a Caputo–Hadamard fractional derivative to refine the classical susceptible–exposed–infected–removed model, and then inferred the fractional order, the time-dependent parameters, and the unobserved dynamics of the fractional SEIR model via fractional physics-informed neural networks [32].
3. Modeling methodology
3.1. Neural network modeling
Deep neural networks.
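Mathematically, a deep neural network (DNN) defines a mapping of the form

$$\mathcal{F}: \mathbf{x} \in \mathbb{R}^{d} \longmapsto \mathbf{y} = \mathcal{F}(\mathbf{x}) \in \mathbb{R}^{c},$$

where $d$ and $c$ are the dimensions of the input and output, respectively. Generally, a standard neural unit of a DNN receives an input $\mathbf{x} \in \mathbb{R}^{d}$ and produces an output $\mathbf{y} \in \mathbb{R}^{m}$, i.e., $\mathbf{y} = \sigma(\mathbf{W}\mathbf{x} + \mathbf{b})$, with $\mathbf{W} \in \mathbb{R}^{m \times d}$ and $\mathbf{b} \in \mathbb{R}^{m}$ being the weight matrix and bias vector, respectively. The function $\sigma(\cdot)$, which is referred to as the activation function, is designed to add element-wise non-linearity to the model.
A DNN with hidden layers can be regarded as a nested composition of sequential standard neural units. For convenience, we denote the output of the DNN by $\mathbf{y}(\mathbf{x}; \boldsymbol{\theta})$, with $\boldsymbol{\theta}$ standing for the set of all weights and biases. Specifically, the $i$th neuron in layer $l$ can be formulated as

$$y_i^{[l]} = \sigma^{[l]}\left(\sum_{j=1}^{n^{[l-1]}} w_{ij}^{[l]}\, y_j^{[l-1]} + b_i^{[l]}\right), \tag{1}$$

where $y_i^{[l]}$ represents the value of the $i$th neuron in the $l$th layer, $n^{[l-1]}$ represents the number of neurons in the $(l-1)$th layer, $\sigma^{[l]}$ is the activation function of the $l$th layer, $w_{ij}^{[l]}$ is the weight between the $j$th neuron in the $(l-1)$th layer and the $i$th neuron in the $l$th layer, and $b_i^{[l]}$ is the bias of the $i$th neuron in the $l$th layer.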
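As a minimal sketch of Eq. (1), the following pure-Python snippet evaluates a small fully connected network neuron by neuron. The layer sizes, the hand-picked weights, and the helper names `dense_layer` and `forward` are illustrative assumptions, not the configuration used in this paper.

```python
import math


def dense_layer(y_prev, weights, biases, activation=math.tanh):
    """Eq. (1): y_i = sigma(sum_j w_ij * y_prev_j + b_i) for every neuron i."""
    return [
        activation(sum(w_ij * y_j for w_ij, y_j in zip(row, y_prev)) + b_i)
        for row, b_i in zip(weights, biases)
    ]


def forward(x, layers):
    """Nested composition of layers; each entry is (weights, biases, activation)."""
    y = x
    for weights, biases, activation in layers:
        y = dense_layer(y, weights, biases, activation)
    return y


def identity(z):
    return z


# Hypothetical 1 -> 2 -> 1 network (weights chosen by hand for illustration).
layers = [
    ([[1.0], [-1.0]], [0.0, 0.0], math.tanh),  # hidden layer with 2 neurons
    ([[0.5, -0.5]], [0.1], identity),          # linear output layer
]
output = forward([0.3], layers)
```

With these weights the two hidden neurons produce `tanh(0.3)` and `-tanh(0.3)`, so the output reduces to `tanh(0.3) + 0.1`, which makes the composition easy to check by hand.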
ResNet block.
The Residual Network architecture (ResNet) was proposed to solve the problem of vanishing/exploding gradients in deep convolutional neural networks for computer vision tasks [33]. The key idea of ResNet is the skip connection, which provides an alternate shortcut path for the gradient to flow through and enables the model to learn identity functions, so that a higher layer performs at least as well as a lower layer. Owing to these advantages, ResNet methods have been widely used in DNNs for solving PDEs and have shown extraordinary performance in approximating the solution and high-order derivatives of PDEs [34], [35]. The architecture of ResNet is depicted in Fig. 1, where a ResNet block with a one-step connection produces a filtered version $\mathbf{y}^{[l+1]}$ of the input $\mathbf{y}^{[l]}$ as follows:

$$\mathbf{y}^{[l+1]} = \mathbf{y}^{[l]} + \sigma\left(\mathbf{W}^{[l+1]}\mathbf{y}^{[l]} + \mathbf{b}^{[l+1]}\right). \tag{2}$$
Fig. 1. The architecture of ResNet.
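The skip connection of Eq. (2) can be sketched in a few lines of plain Python. This is only an illustration under the assumption that the block keeps the layer width unchanged (square weight matrix); the function name `residual_block` and the numbers are hypothetical.

```python
import math


def residual_block(y, weights, biases, activation=math.tanh):
    """One-step residual block: y_next = y + sigma(W y + b).

    Assumes W is square so the transformed vector has the same length as y.
    """
    transformed = [
        activation(sum(w * v for w, v in zip(row, y)) + b)
        for row, b in zip(weights, biases)
    ]
    return [skip + t for skip, t in zip(y, transformed)]


y = [0.5, -1.0]

# With zero weights and biases the block reduces to the identity map,
# because tanh(0) = 0 and only the shortcut path contributes.
zero_w = [[0.0, 0.0], [0.0, 0.0]]
zero_b = [0.0, 0.0]
unchanged = residual_block(y, zero_w, zero_b)

# With identity weights the block returns y + tanh(y).
eye_w = [[1.0, 0.0], [0.0, 1.0]]
shifted = residual_block(y, eye_w, zero_b)
```

The zero-weight case shows why a residual block can fall back to the identity function: the shortcut alone carries the input forward.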
Fourier mapping.
The activation function is one of the critical factors in designing the architecture of DNNs. As a non-linear transformation that bounds the value of the input data, it directly affects the performance of DNN models in practical applications. Several different types of activation functions have been used in DNNs, such as ReLU and tanh.
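A well-documented phenomenon is that DNNs show a spectral bias or frequency preference, that is, DNNs first capture the low-frequency components of the input data [36]. In the sense of spectral bias and Fourier approximation, a given real function $f(x)$ can be expressed by the following sine and cosine expansion:

$$f(x) \approx \sum_{n=0}^{N} \Big[\cos(\omega_n x)\, S_n(x; \boldsymbol{\theta}_n) + \sin(\omega_n x)\, T_n(x; \boldsymbol{\theta}_n)\Big],$$

where $S_n$ and $T_n$ represent DNN modules and $\{\omega_0, \omega_1, \ldots, \omega_N\}$ are the frequencies of interest in the target function, where the zero frequency $\omega_0 = 0$ will always be included.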
Indeed, recent works have shown that using a Fourier feature mapping as an activation function can remarkably improve the capacity of DNNs [37], [38], [39], [40], [41]. Therefore, based on spectral bias and Fourier approximation, a novel activation function can be expressed as Eq. (3). It mitigates the pathology of spectral bias and enables networks to learn the target function well [37], [38]:

$$\zeta(\mathbf{x}) = \begin{bmatrix} \cos(\boldsymbol{\kappa}\mathbf{x}) \\ \sin(\boldsymbol{\kappa}\mathbf{x}) \end{bmatrix}, \tag{3}$$

where $\boldsymbol{\kappa}$ is a user-specified vector (not trainable) whose length is consistent with the number of neural units in the first hidden layer of the DNN. By performing a Fourier feature mapping of the input data, the input points in $\Omega$ are mapped to the range $[-1, 1]$. The following layers of the neural network can then process the feature information in Fourier space efficiently. The neural network architecture of the Epi-DNNs method is shown in Fig. 2.
Fig. 2. Illustration of the representation of the Fourier basis in DNNs with a ResNet block. The weights labeled in the figure belong to the individual hidden units in the first hidden layer.
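To make the Fourier mapping of Eq. (3) concrete, the sketch below maps a scalar time input to cosine and sine features. The frequency vector `kappa` is a hypothetical choice for illustration; the frequencies actually used in the experiments are not stated here.

```python
import math


def fourier_features(t, kappa):
    """Map scalar t to [cos(k_1 t), ..., cos(k_m t), sin(k_1 t), ..., sin(k_m t)]."""
    return [math.cos(k * t) for k in kappa] + [math.sin(k * t) for k in kappa]


# Hypothetical fixed (non-trainable) frequencies.
kappa = [1.0, 2.0, 4.0]

features = fourier_features(0.75, kappa)
at_origin = fourier_features(0.0, kappa)
```

Whatever the scale of the raw input, every feature lies in [-1, 1], which is the bounding property described above.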
3.2. Compartmental models
Compartmental models enable the simulation of multi-state population transitions by incorporating domain knowledge and mathematical assumptions to characterize the dynamics of infectious diseases. Transmission dynamics models of infectious diseases are generally represented as the following non-linear dynamical system:

$$\frac{d\mathbf{U}(t)}{dt} = \mathcal{F}\big(t, \mathbf{U}(t); \boldsymbol{\Xi}\big), \qquad \mathbf{U}(t_0) = \mathbf{U}_0, \tag{4}$$

where $\mathbf{U}(t) \in \mathbb{R}^{n}$ (typically $n \gg 1$) is the state variable, $t$ is the time, $\mathbf{U}_0$ is the initial state, and $\boldsymbol{\Xi}$ stands for the parameters of the dynamical system.
The basic Susceptible–Infected–Recovered–Death (SIRD) model extends the SIR model. It describes the interaction of the virus with the host during transmission and divides the population into four types: susceptible, infected, recovered, and deceased. The SIRD model can be described by the following ordinary differential equations:

$$\begin{aligned}
\frac{dS(t)}{dt} &= -\frac{\beta S(t) I(t)}{N},\\
\frac{dI(t)}{dt} &= \frac{\beta S(t) I(t)}{N} - \gamma I(t) - \mu I(t),\\
\frac{dR(t)}{dt} &= \gamma I(t),\\
\frac{dD(t)}{dt} &= \mu I(t),
\end{aligned} \tag{5}$$

where $S(t)$, $I(t)$, $R(t)$, and $D(t)$ denote the numbers of susceptible, infected, recovered, and deceased individuals over time, respectively, and $N$ is the total population. $\beta$ represents the transmission rate of the disease. $\gamma$ represents the recovery rate, which is the proportion of infected individuals that recover from the disease per unit of time. $\mu$ is the death rate. The model is initialized at some conventional $t_0$ with values $S(t_0)$, $I(t_0)$, $R(t_0)$, and $D(t_0)$. $R(t) + D(t)$ denotes the removed individuals, i.e., those removed from the susceptible compartment due to immunization or death.
In the basic SIRD model, the three parameters, namely the transmission rate $\beta$, recovery rate $\gamma$, and death rate $\mu$, are considered constant in time. However, the long duration of the pandemic, the interventions implemented by authorities, mutations of the virus, and other factors cause the parameters of the SIRD model to change over time. Accordingly, compartmental models require time-varying parameters to model the dynamics of COVID-19 accurately and effectively, including time-varying infection, recovery, and mortality rates. The time-varying SIRD model takes the transmission rate, recovery rate, and death rate as functions of time, $\beta(t)$, $\gamma(t)$, and $\mu(t)$, and the differential equations are rewritten as follows:

$$\begin{aligned}
\frac{dS(t)}{dt} &= -\frac{\beta(t) S(t) I(t)}{N},\\
\frac{dI(t)}{dt} &= \frac{\beta(t) S(t) I(t)}{N} - \gamma(t) I(t) - \mu(t) I(t),\\
\frac{dR(t)}{dt} &= \gamma(t) I(t),\\
\frac{dD(t)}{dt} &= \mu(t) I(t).
\end{aligned} \tag{6}$$

Here the four variables $S(t)$, $I(t)$, $R(t)$, and $D(t)$ have the same meaning as in Eq. (5). If we assume that the total population $N = S(t) + I(t) + R(t) + D(t)$ is constant, then the changes in the compartments sum to zero, namely, $\frac{dS(t)}{dt} + \frac{dI(t)}{dt} + \frac{dR(t)}{dt} + \frac{dD(t)}{dt} = 0$.
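The right-hand side of the time-varying SIRD model can be sketched as a plain Python function, which also makes the conservation property easy to verify numerically. The incidence term is assumed here to be normalized by the total population, and the parameter functions `beta`, `gamma`, and `mu` below are hypothetical stand-ins for the neural-network surrogates, not fitted values.

```python
def sird_rhs(t, state, beta, gamma, mu):
    """Time derivatives (dS, dI, dR, dD) of the time-varying SIRD model.

    beta, gamma and mu are callables of t; the incidence term is assumed
    to be beta(t) * S * I / N with N = S + I + R + D.
    """
    s, i, r, d = state
    n = s + i + r + d
    new_infections = beta(t) * s * i / n
    recoveries = gamma(t) * i
    deaths = mu(t) * i
    return [-new_infections, new_infections - recoveries - deaths, recoveries, deaths]


# Hypothetical parameter functions, used only to exercise the function.
def beta(t):
    return 0.4 / (1.0 + 0.05 * t)  # transmission declining over time


def gamma(t):
    return 0.1


def mu(t):
    return 0.01


derivs = sird_rhs(10.0, [9000.0, 800.0, 150.0, 50.0], beta, gamma, mu)
```

Because every outflow from one compartment is an inflow to another, the four derivatives sum to zero up to rounding error, which is the constant-population statement above.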
3.3. Overview of Epi-DNNs
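Here, DNNs with input $t$, parameterized by a set of parameters $\boldsymbol{\theta}$, serve as the hypothesis space and are implemented as data-driven surrogates of the unknown time-varying parameters. For the given differential Eqs. (4), the task is to find the value of the unknown function $\mathbf{U}$ at a given point $t$. The Runge–Kutta method is an effective and widely used numerical method for solving differential equations, with its numerical accuracy determined by its order. The classical fourth-order Runge–Kutta method is the most commonly used numerical ODE solver. The formula below is used to compute the next value $\mathbf{U}_{n+1}$ from the previous value $\mathbf{U}_n$ for $n = 0, 1, 2, \ldots$, where $h$ is the step height and $t_{n+1} = t_n + h$:

$$\begin{aligned}
\mathbf{k}_1 &= \mathcal{F}(t_n, \mathbf{U}_n),\\
\mathbf{k}_2 &= \mathcal{F}\left(t_n + \tfrac{h}{2}, \mathbf{U}_n + \tfrac{h}{2}\mathbf{k}_1\right),\\
\mathbf{k}_3 &= \mathcal{F}\left(t_n + \tfrac{h}{2}, \mathbf{U}_n + \tfrac{h}{2}\mathbf{k}_2\right),\\
\mathbf{k}_4 &= \mathcal{F}\left(t_n + h, \mathbf{U}_n + h\mathbf{k}_3\right),\\
\mathbf{U}_{n+1} &= \mathbf{U}_n + \tfrac{h}{6}\left(\mathbf{k}_1 + 2\mathbf{k}_2 + 2\mathbf{k}_3 + \mathbf{k}_4\right).
\end{aligned} \tag{7}$$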
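For readers who prefer code, one step of the classical fourth-order Runge–Kutta scheme in Eq. (7) can be sketched as follows. The test problem $dy/dt = y$ with $y(0) = 1$ is an assumption chosen only because its exact solution $y(1) = e$ is known; the helper names `rk4_step` and `integrate` are illustrative.

```python
import math


def rk4_step(f, t, y, h):
    """Advance the vector-valued ODE dy/dt = f(t, y) by one RK4 step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def integrate(f, t0, y0, h, n_steps):
    """Apply rk4_step repeatedly, starting from (t0, y0)."""
    t, y = t0, list(y0)
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


# Test problem with a known solution: dy/dt = y, y(0) = 1, so y(1) = e.
approx = integrate(lambda t, y: [y[0]], 0.0, [1.0], 0.1, 10)
error = abs(approx[0] - math.e)
```

With ten steps of size 0.1 the error stays well below 1e-5, in line with the fourth-order global accuracy of the scheme.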
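The formula computes the next value $\mathbf{U}_{n+1}$ from the current value $\mathbf{U}_n$ plus the weighted average of four increments. The time-varying parameters $\beta(t)$, $\gamma(t)$, and $\mu(t)$ of the SIRD model can then be obtained by minimizing the following loss function:

$$Loss = \omega_S Loss_S + \omega_I Loss_I + \omega_R Loss_R + \omega_D Loss_D + \lambda\, Reg(\boldsymbol{\theta}), \tag{8}$$

with

$$Loss_X = \frac{1}{N_t}\sum_{i=1}^{N_t}\left|\hat{X}(t_i) - X_i^{obs}\right|^2, \qquad X \in \{S, I, R, D\},$$

where the observed data for $S$, $I$, $R$, and $D$ at $t_i$ with a given time interval are denoted as $S_i^{obs}$, $I_i^{obs}$, $R_i^{obs}$, and $D_i^{obs}$, respectively. The predictions $\hat{S}(t_i)$, $\hat{I}(t_i)$, $\hat{R}(t_i)$, and $\hat{D}(t_i)$ are produced by $\mathcal{K}$, which stands for the nonlinear mapping of the fourth-order Runge–Kutta method for given input data and parameters. It should be noted that the fourth-order Runge–Kutta method has local and global errors of $O(h^5)$ and $O(h^4)$, respectively, with $h$ being the step size. In addition, we introduce four positive relaxing factors $\omega_S$, $\omega_I$, $\omega_R$, and $\omega_D$ to balance the contributions of $Loss_S$, $Loss_I$, $Loss_R$, $Loss_D$, and the regularization sum $Reg(\boldsymbol{\theta})$ of the network parameters in the loss function.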
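The structure of the loss in Eq. (8) can be illustrated with a short, hedged sketch: a weighted sum of per-compartment mean squared discrepancies plus a penalty on the network parameters. The squared-error form, the L2 penalty, and all numbers below are assumptions made for illustration; the function names `mse` and `epi_loss` are hypothetical.

```python
def mse(pred, obs):
    """Mean squared discrepancy between a predicted and an observed series."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def epi_loss(pred, obs, weights, theta, reg_factor):
    """Weighted data misfit over compartments plus an (assumed) L2 penalty on theta."""
    data_term = sum(weights[name] * mse(pred[name], obs[name]) for name in obs)
    reg_term = reg_factor * sum(p * p for p in theta)
    return data_term + reg_term


# Toy numbers, not real data: only the I series differs between pred and obs.
obs = {"S": [980.0, 960.0], "I": [12.0, 18.0], "R": [1.0, 2.0], "D": [0.0, 1.0]}
pred = {"S": [980.0, 960.0], "I": [10.0, 20.0], "R": [1.0, 2.0], "D": [0.0, 1.0]}
weights = {"S": 1.0, "I": 1.0, "R": 1.0, "D": 1.0}

total = epi_loss(pred, obs, weights, theta=[1.0, 2.0], reg_factor=0.0005)
```

In this toy case the misfit of the I series is 4.0 and the penalty is 0.0005 × 5 = 0.0025, so the total is 4.0025. In practice the relaxing factors would be tuned so that no single compartment dominates the sum.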
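To obtain the ideal network parameters $\boldsymbol{\theta}$, optimization methods such as gradient descent (GD) or stochastic gradient descent (SGD) are required to update the parameters of the DNNs during training. In this context, SGD is given by

$$\boldsymbol{\theta}^{k+1} = \boldsymbol{\theta}^{k} - \alpha^{k}\, \nabla_{\boldsymbol{\theta}^{k}} Loss(\boldsymbol{\theta}^{k}),$$

where the learning rate $\alpha^{k}$ decreases with increasing $k$.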
Algorithm 1 describes the workflow of the proposed Epi-DNNs method for solving nonlinear dynamical Eqs. (5).
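The overall fit-by-simulation idea can be sketched end to end on a deliberately simplified problem. The snippet below is not the authors' Algorithm 1: it assumes a constant scalar transmission rate instead of a neural network, an SIR model instead of SIRD, synthetic observations generated from a known rate, and a finite-difference gradient with a simple accept/reject step-size rule instead of backpropagation with Adam. All names and numbers are hypothetical.

```python
def sir_step(state, beta, gamma, h):
    """One RK4 step of a constant-parameter SIR model (stand-in for SIRD)."""
    def rhs(s, i, r):
        n = s + i + r
        return (-beta * s * i / n, beta * s * i / n - gamma * i, gamma * i)

    def shifted(k, c):
        return tuple(x + c * dx for x, dx in zip(state, k))

    k1 = rhs(*state)
    k2 = rhs(*shifted(k1, h / 2))
    k3 = rhs(*shifted(k2, h / 2))
    k4 = rhs(*shifted(k3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate_infected(beta, gamma, state0, n_steps, h=1.0):
    """Integrate forward and return the infected series I(t_1), ..., I(t_n)."""
    state, infected = state0, []
    for _ in range(n_steps):
        state = sir_step(state, beta, gamma, h)
        infected.append(state[1])
    return infected


def loss(beta, observed, gamma, state0):
    """Mean squared discrepancy between simulated and observed infections."""
    pred = simulate_infected(beta, gamma, state0, len(observed))
    return sum((p - o) * (p - o) for p, o in zip(pred, observed)) / len(observed)


# Synthetic "observations" generated with a known beta (not real data).
GAMMA, STATE0, BETA_TRUE = 0.1, (990.0, 10.0, 0.0), 0.4
observed = simulate_infected(BETA_TRUE, GAMMA, STATE0, 20)

beta, lr, eps = 0.2, 1e-8, 1e-5
initial_loss = loss(beta, observed, GAMMA, STATE0)
current = initial_loss
for _ in range(200):
    # Central finite-difference estimate of d(loss)/d(beta).
    grad = (loss(beta + eps, observed, GAMMA, STATE0)
            - loss(beta - eps, observed, GAMMA, STATE0)) / (2 * eps)
    candidate = beta - lr * grad
    candidate_loss = loss(candidate, observed, GAMMA, STATE0)
    if candidate_loss < current:   # accept the step, enlarge the step size
        beta, current, lr = candidate, candidate_loss, lr * 1.5
    else:                          # reject the step, shrink the step size
        lr *= 0.5
final_loss = current
```

Steps are only accepted when they reduce the loss, so the loss never increases during this toy fit. In the method described above, the scalar rate would be replaced by neural-network outputs and the gradient by automatic differentiation.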
4. Numerical simulations
4.1. Experimental setting
Data.
The simulation is based on the real-world COVID-19 data published on the official website of the Shanghai Municipal Health Commission. The data set includes the time series $I(t)$, $R(t)$, and $D(t)$ covering February 25 to May 27, 2022. Here $I(t)$ refers to the sum of confirmed and asymptomatic infections (asymptomatic infections are assumed to transition to recovered status after 6 days [42]). These time series are smoothed with a 7-day moving average to reduce reporting errors in the data. Key events and their dates, listed below, are taken into account to better understand the evolution of COVID-19 transmission in Shanghai:
• February 25, 2022: First asymptomatic case.
• March 2, 2022: Closure of partial public places.
• March 12, 2022: Closure of kindergartens, primary, secondary and vocational schools; limiting movement within city borders.
• March 14, 2022: Public transportation closures of the long-distance bus.
• March 15, 2022: Closure of universities; limiting gatherings.
• March 15–16, 2022: Lockdown and mass PCR screening within high-risk areas.
• March 18–20, 2022: Mass PCR screening within non-high-risk areas.
• March 23, 2022: Adoption of makeshift isolation hospitals; lockdown and mass PCR screening within high-risk areas.
• March 26–27, 2022: Lockdown and mass PCR screening within high-risk areas.
• March 28, 2022: Lockdown of eastern Shanghai.
• March 28–30, 2022: Mass PCR screening of eastern Shanghai.
• April 1, 2022: Lockdown of western Shanghai; mass PCR screening of western Shanghai.
• April 4, April 10, April 26, 2022: Mass PCR screening of all Shanghai.
• May 1–7, 2022: Mass PCR screening of all Shanghai.
• June 1, 2022: Lift lockdown.
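The 7-day moving average applied to the reported time series can be sketched as below. Whether the original smoothing used a trailing or a centered window is not stated, so the trailing window here is an assumption, as are the function name `moving_average` and the toy series.

```python
def moving_average(series, window=7):
    """Trailing moving average.

    The first window-1 points are averaged over the data available so far,
    so the output has the same length as the input.
    """
    smoothed = []
    for idx in range(len(series)):
        start = max(0, idx - window + 1)
        chunk = series[start:idx + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Toy daily counts (not real data).
daily = [0, 7, 14, 21, 28, 35, 42, 49]
smoothed = moving_average(daily)
```

For this linear toy series the 7-day average on day 7 is 21.0 and on day 8 is 28.0. On real data the same averaging damps day-of-week reporting effects.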
Framework.
Each neural network implemented in this paper comprises 5 layers, each with its own weight matrix and bias vector. In this numerical experiment, all neural networks are trained with the Adam optimizer, and the learning rate decays at a rate of 95% every 1000 epochs. In addition, the regularization factor is set to 0.0005, and the maximum number of epochs is set to 100 000.
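One common reading of such a schedule is a step decay in which the learning rate is multiplied by 0.95 after every 1000 epochs. The sketch below follows that reading; the interpretation itself, the function name `step_decay_lr`, and the initial learning rate of 0.01 are assumptions made for illustration.

```python
def step_decay_lr(epoch, initial_lr, decay=0.95, every=1000):
    """Learning rate after `epoch` epochs under step decay (assumed schedule)."""
    return initial_lr * decay ** (epoch // every)


# Hypothetical initial learning rate.
INITIAL_LR = 0.01

lr_start = step_decay_lr(0, INITIAL_LR)
lr_before_first_drop = step_decay_lr(999, INITIAL_LR)
lr_after_two_drops = step_decay_lr(2000, INITIAL_LR)
lr_at_end = step_decay_lr(100_000, INITIAL_LR)
```

Under this reading, 100 000 epochs correspond to 100 decay steps, so the final learning rate is the initial one multiplied by 0.95 to the power 100.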
4.2. Result analyses
Data fitting.
We observe in Fig. 3 that the value of the loss function decreases over the course of training and gradually stabilizes within a range of small values. According to Eq. (8), the loss measures the difference between the real-world data and the model predictions; a small, stable loss therefore indicates a good fit between the data and the model.
Fig. 3. Data fitting performance: Loss of S, I, R, D and total loss during training. (a) Loss of S. (b) Loss of I. (c) Loss of R. (d) Loss of D. (e) Total loss.
Inferences.
We are interested in inferring the parameters $\beta(t)$, $\gamma(t)$, and $\mu(t)$ by solving the inverse problem of the SIRD model. Fig. 4 shows the inferred time-varying parameters $\beta(t)$, $\gamma(t)$, and $\mu(t)$ from February 25 to May 20, 2022. $\mathcal{R}_t$ represents the effective reproduction number; $\mathcal{R}_t$ less than 1 indicates that the transmission of the infectious disease will gradually die out. Further, observe that the time behavior of the fitted parameters is consistent with their expected dynamics. Owing to the high transmissibility and immune escape properties of the Omicron variant, infections increased sharply following the first case. The authorities of Shanghai started imposing the closure of partial public places on March 2, followed by a series of interventions to combat the Omicron outbreak. These interventions achieved a certain success, as demonstrated by a significant reduction in the transmission rate $\beta(t)$ and in $\mathcal{R}_t$. However, the outbreak was not yet under control ($\mathcal{R}_t > 1$). On March 16, grid precise management was implemented, but with limited effectiveness until March 26. As can be seen, the transmission rate $\beta(t)$ oscillates, with $\mathcal{R}_t$ consistently greater than 1. Only after the lockdown of eastern Shanghai on March 28 and the lockdown of western Shanghai on April 1 did the transmission rate and the effective reproduction number show a continuously decreasing trend, with $\mathcal{R}_t$ gradually approaching 0, and the outbreak was curbed. On the other hand, the recovery rate $\gamma(t)$ and the death rate $\mu(t)$ are expected to increase and decrease, respectively, thanks to the use of more effective treatments for the disease.
Fig. 4. Epi-DNNs results: Inferences of model parameters based on the available data from February 25 to May 20, 2022. (a) Transmission rate $\beta(t)$. (b) Recovery rate $\gamma(t)$. (c) Death rate $\mu(t)$. (d) Effective reproduction number $\mathcal{R}_t$.
Forecasting.
The non-linear ODE system requires specified initial conditions and model parameters to make predictions. As the initial conditions can be obtained from the training data and the model parameters are already calibrated, we can forecast the epidemic dynamics by solving the forward problem. In the prediction part, the values of $\beta$, $\gamma$, and $\mu$ are assumed to equal their final values in the training time window. Fig. 5 depicts the data fitting and the prediction obtained with the identified time-varying model and the parameters given above. The close match between the predictions and the observations indicates that the parameters inferred by the learned network are plausible and demonstrates the generalization ability of the model.
Fig. 5. 7-day predictions based on the time-varying SIRD model. The gray vertical line divides the fitting and prediction window. We have included the newly available data for the prediction period that was not used in the fitting to show the generalization ability of the model. (a) Current infections. (b) Cumulative recovery. (c) Cumulative deaths.
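The forecasting step described above, in which each parameter is frozen at its last fitted value and the SIRD system is integrated forward, can be sketched as follows. The state vector, the parameter series, and the population-normalized incidence term are hypothetical illustrations, not the fitted Shanghai values.

```python
def sird_rhs(state, beta, gamma, mu):
    """Constant-parameter SIRD right-hand side (incidence assumed as beta*S*I/N)."""
    s, i, r, d = state
    n = s + i + r + d
    new_inf = beta * s * i / n
    return [-new_inf, new_inf - (gamma + mu) * i, gamma * i, mu * i]


def rk4_step(state, h, beta, gamma, mu):
    """One classical RK4 step of the SIRD system."""
    def nudged(k, c):
        return [x + c * dx for x, dx in zip(state, k)]

    k1 = sird_rhs(state, beta, gamma, mu)
    k2 = sird_rhs(nudged(k1, h / 2), beta, gamma, mu)
    k3 = sird_rhs(nudged(k2, h / 2), beta, gamma, mu)
    k4 = sird_rhs(nudged(k3, h), beta, gamma, mu)
    return [x + h / 6 * (a + 2 * b + 2 * c + e)
            for x, a, b, c, e in zip(state, k1, k2, k3, k4)]


def forecast(last_state, beta_series, gamma_series, mu_series, horizon, h=1.0):
    """Freeze each parameter at its last fitted value and integrate forward."""
    beta, gamma, mu = beta_series[-1], gamma_series[-1], mu_series[-1]
    state, path = list(last_state), []
    for _ in range(horizon):
        state = rk4_step(state, h, beta, gamma, mu)
        path.append(state)
    return path


# Hypothetical end-of-training state (S, I, R, D) and fitted parameter series.
last_state = [2.4e7, 5.0e4, 5.0e5, 500.0]
path = forecast(last_state,
                beta_series=[0.3, 0.1, 0.02],
                gamma_series=[0.05, 0.1, 0.15],
                mu_series=[1e-4, 5e-5, 2e-5],
                horizon=7)
```

Because the frozen transmission rate in this toy setting is far below the removal rate, the forecast shows current infections declining over the 7-day horizon while the total population stays constant.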
4.3. Evaluation metrics
The performance of the proposed Epi-DNNs method can be evaluated by comparing the forecasting results with the observations. We use four evaluation metrics to make fair and effective comparisons: mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), and relative error (REL). Their equations are given in Eqs. (9), (10), (11), and (12), respectively.
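$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|, \tag{9}$$

$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right| \times 100\%, \tag{10}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}, \tag{11}$$

$$REL = \frac{\sqrt{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}}{\sqrt{\sum_{i=1}^{n} y_i^2}}, \tag{12}$$

where $y_i$ and $\hat{y}_i$ denote the observed and predicted values on day $i$, and $n$ is the number of days in the prediction window.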
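The four metrics can be computed with a few lines of Python. This sketch uses the standard definitions of MAE, MAPE, and RMSE; the relative error is assumed here to be the relative L2 error, which may differ from the exact definition used for Table 1. The observed and predicted values are toy numbers.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def rel(obs, pred):
    """Relative L2 error (assumed definition of REL)."""
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum(o ** 2 for o in obs)
    return math.sqrt(num / den)


# Toy values, not taken from the experiments.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

scores = {
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "REL": rel(observed, predicted),
}
```

MAE and RMSE carry the units of the data (numbers of individuals), whereas MAPE and REL are scale-free, which is why both kinds are useful when comparing compartments of very different sizes.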
To test the prediction performance of the proposed Epi-DNNs method, we ran 3-day, 5-day, and 7-day forecasting experiments. The experimental results presented in Table 1 show the forecasting accuracy of the proposed Epi-DNNs method.
Table 1. The prediction performance in 3 days, 5 days and 7 days after May 20, 2022.
Metrics | 3-days | 5-days | 7-days |
---|---|---|---|
MAE of I | 29.57 | 148.82 | 160.27 |
MAE of R | 282.75 | 813.39 | 1495.28 |
MAE of D | 0.07 | 0.38 | 0.39 |
MAPE of I | | | |
MAPE of R | | | |
MAPE of D | | | |
RMSE of I | 39.16 | 222.27 | 213.63 |
RMSE of R | 330.02 | 1079.20 | 1943.36 |
RMSE of D | 0.09 | 0.55 | 0.53 |
REL of I | | | |
REL of R | | | |
REL of D | | | |
5. Discussion
The global COVID-19 pandemic has severely affected the lives of most people around the world and has caused the loss of numerous lives. COVID-19 has reshaped the focus of global scientific attention and effort, and researchers across the world have done much work to analyze the dynamics of COVID-19. Among these efforts, combining mathematical modeling with emerging AI technology to capture the complex outbreak dynamics of COVID-19 is a promising research topic. In this paper, we proposed the Epi-DNNs method, which combines deep learning with a compartmental model to model the real-time dynamics of COVID-19. The experimental results demonstrate that the time-varying parameters of the compartmental model identified by the proposed Epi-DNNs method are consistent with expectations.
The transmission rate $\beta$ determines the dynamics of the epidemic, and the time-varying $\beta(t)$ estimated by the proposed Epi-DNNs method captures the changes in government interventions and individual behaviors. The recovery rate and the death rate are expected to increase and decrease, respectively, thanks to more effective treatments for the disease. The $\gamma(t)$ and $\mu(t)$ identified by the proposed Epi-DNNs method also agree with the improved capacity of the healthcare system to fight COVID-19. The effective reproduction number $\mathcal{R}_t$ characterizes the transmission process of the virus: it represents the average number of secondary infections caused by one infected individual at time $t$. Chen et al. divided the epidemic into three phases to describe the epidemiological characteristics and spatiotemporal transmission dynamics of the Omicron outbreak in Shanghai and estimated the dynamics of $\mathcal{R}_t$ [43]. Lou et al. constructed an extended compartmental model to retrospectively analyze the epidemic in Shanghai from 26 February 2022 to 31 May 2022 across four periods defined by related interventions and estimated $\mathcal{R}_t$. As shown in Fig. 4(d), the value of $\mathcal{R}_t$ estimated by the proposed Epi-DNNs method is consistent with those given by other researchers [44]. More importantly, when the estimated parameters are applied to the compartmental model to depict the dynamics of COVID-19, the close agreement between model predictions and observed data also indicates that the parameters fit well.
Different research scenarios require compartmental models with different compartments, for example separating asymptomatic and symptomatic cases, adding virus mutations, or adding a vaccination campaign. The proposed Epi-DNNs method is easy to implement without any background knowledge of numerical analysis (for example, stability conditions). To apply the Epi-DNNs method to other compartmental models, practitioners only need to redefine the transformation matrix for each compartment according to the equations and build DNNs with the help of libraries that implement deep neural networks. Therefore, the proposed Epi-DNNs method is applicable to parameter estimation for other compartmental models, other regions of the world, and future infectious diseases. Although the proposed Epi-DNNs method provides an important modeling approach for infectious disease transmission, it also has some limitations. Because it is impossible to build a compartmental model that represents all scenarios of COVID-19, we selected only the SIRD model as an example to test the performance of Epi-DNNs. In addition, the data we use are official statistics, which will differ somewhat from the actual situation in the real world. Furthermore, the proposed Epi-DNNs method employs fully connected networks to build the model, which may not be the optimal approach. In future work, we will try to combine other neural networks, such as RNNs and LSTMs, with the compartmental model for COVID-19 modeling.
6. Conclusion
In this paper, we proposed a novel Epi-DNNs method to identify the time-varying parameters of the epidemic compartmental model in order to accurately depict the dynamics of COVID-19. Incorporating domain knowledge, mathematical modeling, and AI techniques, we analyzed the dynamics of COVID-19 using the compartmental model and identified its parameters with neural networks and real-world observations. Experimental results showed that the proposed Epi-DNNs method calibrates the parameters of the compartmental model accurately and effectively. Based on the estimated parameters, predictions were made to validate the feasibility and predictive ability of the proposed approach for modeling the dynamics of COVID-19. We emphasize that our method can be implemented without background knowledge of numerical analysis (for example, stability conditions); it only requires familiarity with a library for implementing neural networks. Therefore, the proposed Epi-DNNs method is applicable to parameter estimation for other compartmental models, other regions of the world, and future infectious diseases.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
Xiao Ning, Linlin Jia, Yongyue Wei, Xi-An Li and Feng Chen declare that they have no conflict of interest or financial conflicts to disclose.
Funding
The study was supported by the National Natural Science Foundation of China (82041024 to Feng Chen and 81973142 to Yongyue Wei). This study was also partially supported by the Bill & Melinda Gates Foundation (INV-006371).
References
- 1. Wei Y., Sha F., Zhao Y., Jiang Q., Hao Y., Chen F. Better modelling of infectious diseases: lessons from covid-19 in China. BMJ. 2021;375 doi: 10.1136/bmj.n2365.
- 2. Kermack W.O., McKendrick A.G. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. Ser. A, Contain. Pap. Math. Phys. Charact. 1927;115(772):700–721.
- 3. He S., Peng Y., Sun K. SEIR modeling of the COVID-19 and its dynamics. Nonlinear Dynam. 2020;101(3):1667–1680. doi: 10.1007/s11071-020-05743-y.
- 4. Calafiore G.C., Novara C., Possieri C. A time-varying SIRD model for the COVID-19 contagion in Italy. Annu. Rev. Control. 2020;50:361–372. doi: 10.1016/j.arcontrol.2020.10.005.
- 5. Lalwani S., Sahni G., Mewara B., Kumar R. Predicting optimal lockdown period with parametric approach using three-phase maturation SIRD model for COVID-19 pandemic. Chaos Solitons Fractals. 2020;138 doi: 10.1016/j.chaos.2020.109939.
- 6. Rubio-Herrero J., Wang Y. A flexible rolling regression framework for the identification of time-varying SIRD models. Comput. Ind. Eng. 2022;167
- 7. Youssef H., Alghamdi N., Ezzat M.A., El-Bary A.A., Shawky A.M. Study on the SEIQR model and applying the epidemiological rates of COVID-19 epidemic spread in Saudi Arabia. Infect. Dis. Model. 2021;6:678–692. doi: 10.1016/j.idm.2021.04.005.
- 8. Niu R., Chan Y.-C., Wong E.W., van Wyk M.A., Chen G. A stochastic SEIHR model for COVID-19 data fluctuations. Nonlinear Dynam. 2021;106(2):1311–1323. doi: 10.1007/s11071-021-06631-9.
- 9. Jagan M., DeJonge M.S., Krylova O., Earn D.J. Fast estimation of time-varying infectious disease transmission rates. PLoS Comput. Biol. 2020;16(9) doi: 10.1371/journal.pcbi.1008124.
- 10. Ge Y., Zhang W.-B., Wu X., Ruktanonchai C.W., Liu H., Wang J., Song Y., Liu M., Yan W., Yang J., et al. Untangling the changing impact of non-pharmaceutical interventions and vaccination on European Covid-19 trajectories. Nature Commun. 2022;13(1):1–9. doi: 10.1038/s41467-022-30897-1.
- 11. Xue L., Jing S., Miller J.C., Sun W., Li H., Estrada-Franco J.G., Hyman J.M., Zhu H. A data-driven network model for the emerging COVID-19 epidemics in Wuhan, Toronto and Italy. Math. Biosci. 2020;326 doi: 10.1016/j.mbs.2020.108391.
- 12. Viguerie A., Lorenzo G., Auricchio F., Baroli D., Hughes T.J., Patton A., Reali A., Yankeelov T.E., Veneziani A. Simulating the spread of COVID-19 via a spatially-resolved susceptible–exposed–infected–recovered–deceased (SEIRD) model with heterogeneous diffusion. Appl. Math. Lett. 2021;111 doi: 10.1016/j.aml.2020.106617.
- 13. Shorten C., Khoshgoftaar T.M., Furht B. Deep Learning applications for COVID-19. J. Big Data. 2021;8(1):1–54. doi: 10.1186/s40537-020-00392-9.
- 14. Clement J.C., Ponnusamy V., Sriharipriya K., Nandakumar R. A survey on mathematical, machine learning and deep learning models for COVID-19 transmission and diagnosis. IEEE Rev. Biomed. Eng. 2021;15:325–340. doi: 10.1109/RBME.2021.3069213.
- 15. Raissi M., Perdikaris P., Karniadakis G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019;378:686–707.
- 16. Linka K., Schafer A., Meng X., Zou Z., Karniadakis G.E., Kuhl E. 2022. Bayesian Physics-Informed Neural Networks for real-world nonlinear dynamical systems. arXiv preprint arXiv:2205.08304.
- 17. Nguyen L., Raissi M., Seshaiyer P. Modeling, Analysis and Physics Informed Neural Network approaches for studying the dynamics of COVID-19 involving human-human and human-pathogen interaction. Comput. Math. Biophys. 2022;10(1):1–17.
- 18. Kharazmi E., Cai M., Zheng X., Zhang Z., Lin G., Karniadakis G.E. Identifiability and predictability of integer-and fractional-order epidemiological models using physics-informed neural networks. Nat. Comput. Sci. 2021;1(11):744–753. doi: 10.1038/s43588-021-00158-0.
- 19. Ning X., Li X.-A., Wei Y., Chen F. Euler iteration augmented physics-informed neural networks for time-varying parameter estimation of the epidemic compartmental model. Front. Phys. 2022;10:1300.
- 20. Long J., Khaliq A., Furati K.M. Identification and prediction of time-varying parameters of COVID-19 model: a data-driven deep learning approach. Int. J. Comput. Math. 2021;98(8):1617–1632.
- 21. Li R., Pei S., Chen B., Song Y., Zhang T., Yang W., Shaman J. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science. 2020;368(6490):489–493. doi: 10.1126/science.abb3221.
- 22. Tian H., Liu Y., Li Y., Wu C.-H., Chen B., Kraemer M.U., Li B., Cai J., Xu B., Yang Q., et al. An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China. Science. 2020;368(6491):638–642. doi: 10.1126/science.abb6105.
- 23. Wei Y., Wei L., Jiang Y., Shen S., Zhao Y., Hao Y., Du Z., Tang J., Zhang Z., Jiang Q., et al. Implementation of clinical diagnostic criteria and universal symptom survey contributed to lower magnitude and faster resolution of the COVID-19 epidemic in Wuhan. Engineering. 2020;6(10):1141–1146. doi: 10.1016/j.eng.2020.04.008.
- 24. Hao X., Cheng S., Wu D., Wu T., Lin X., Wang C. Reconstruction of the full transmission dynamics of COVID-19 in Wuhan. Nature. 2020;584(7821):420–424. doi: 10.1038/s41586-020-2554-8.
- 25. Liu X.-X., Yang J., Fong S., Dey N., Millham R.C., Fiaidhi J. All-people-test-based methods for COVID-19 infectious disease dynamics simulation model: Towards citywide COVID testing. Int. J. Environ. Res. Public Health. 2022;19(17):10959. doi: 10.3390/ijerph191710959.
- 26. Joshi A., Dey N., Santosh K. Springer; 2020. Intelligent Systems and Methods To Combat Covid-19.
- 27. Chimmula V.K.R., Zhang L. Time series forecasting of COVID-19 transmission in Canada using LSTM networks. Chaos Solitons Fractals. 2020;135 doi: 10.1016/j.chaos.2020.109864.
- 28. Taylor S.J., Letham B. Forecasting at scale. Amer. Statist. 2018;72(1):37–45.
- 29. Devaraj J., Elavarasan R.M., Pugazhendhi R., Shafiullah G., Ganesan S., Jeysree A.K., Khan I.A., Hossain E. Forecasting of COVID-19 cases using deep learning models: Is it reliable and practically significant? Results Phys. 2021;21 doi: 10.1016/j.rinp.2021.103817.
- 30. Nabi K.N., Tahmid M.T., Rafi A., Kader M.E., Haider M.A. Forecasting COVID-19 cases: A comparative analysis between recurrent and convolutional neural networks. Results Phys. 2021;24 doi: 10.1016/j.rinp.2021.104137.
- 31. Nascimento R.G., Fricke K., Viana F.A. A tutorial on solving ordinary differential equations using Python and hybrid physics-informed neural network. Eng. Appl. Artif. Intell. 2020;96
- 32. Cai M., Em Karniadakis G., Li C. Fractional SEIR model and data-driven predictions of COVID-19 dynamics of Omicron variant. Chaos. 2022;32(7) doi: 10.1063/5.0099450.
- 33. He K., Zhang X., Ren S., Sun J. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:770–778.
- 34. Yu B., et al. The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. Commun. Math. Statist. 2018;6(1):1–12.
- 35. Zou Z., Zhang H., Guan Y., Zhang J. Deep residual neural networks resolve quartet molecular phylogenies. Mol. Biol. Evol. 2020;37(5):1495–1507. doi: 10.1093/molbev/msz307.
- 36. Xu Z.-Q.J., Zhang Y., Luo T., Xiao Y., Ma Z. 2019. Frequency principle: Fourier analysis sheds light on deep neural networks. arXiv preprint arXiv:1901.06523.
- 37. Wang S., Wang H., Perdikaris P. On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks. Comput. Methods Appl. Mech. Engrg. 2021;384
- 38. Tancik M., Srinivasan P.P., Mildenhall B., Fridovich-Keil S., Raghavan N., Singhal U., Ramamoorthi R., Barron J.T., Ng R. 2020. Fourier features let networks learn high frequency functions in low dimensional domains. arXiv preprint arXiv:2006.10739.
- 39. Xu Z.-Q.J., Zhang Y., Luo T., Xiao Y., Ma Z. Frequency principle: Fourier analysis sheds light on deep neural networks. Commun. Comput. Phys. 2020;28(5):1746–1767.
- 40. Rahaman N., Arpit D., Baratin A., Draxler F., Lin M., Hamprecht F.A., Bengio Y., Courville A. International Conference on Machine Learning. 2019. On the spectral bias of deep neural networks.
- 41. Li X., Xu Z.J., Zhang L. 2021. Subspace Decomposition based DNN algorithm for elliptic-type multi-scale PDEs. arXiv preprint arXiv:2112.06660.
- 42. Cai J., Deng X., Yang J., Sun K., Liu H., Chen Z., Peng C., Chen X., Wu Q., Zou J., et al. Modeling transmission of SARS-CoV-2 omicron in China. Nat. Med. 2022:1–8. doi: 10.1038/s41591-022-01855-7.
- 43. Chen Z., Deng X., Fang L., Sun K., Wu Y., Che T., Zou J., Cai J., Liu H., Wang Y., et al. Epidemiological characteristics and transmission dynamics of the outbreak caused by the SARS-CoV-2 Omicron variant in Shanghai, China: a descriptive study. Lancet Reg. Health-Western Pac. 2022;29 doi: 10.1016/j.lanwpc.2022.100592.
- 44. Lou L., Zhang L., Guan J., Ning X., Nie M., Wei Y., Chen F. Retrospective modeling of the Omicron epidemic in Shanghai, China: Exploring the timing and performance of control measures. Trop. Med. Infect. Dis. 2023;8(1):39. doi: 10.3390/tropicalmed8010039.