Hybrid modeling and forecasting of COVID-19: integrating SEAIQHRD and GPR for improved predictions

Mallela Ankamma Rao; Emad K Jaradat; Medisetty Padma Devi; Prasantha Bharathi Dhandapani; Rebecca Muhumuza Nalule; Mohannad Al-Hmoud

doi:10.1186/s12879-025-12494-x

. 2026 Jan 10;26:210. doi: 10.1186/s12879-025-12494-x

PMCID: PMC12857096 PMID: 41520122

Abstract

This study introduces a dual-hybrid COVID-19 forecasting approach that integrates an eight-compartment SEAIQHRD model with Gaussian Process Regression (GPR) and ARIMA-based residual learning to enhance predictive performance. A central methodological contribution is the incorporation of convergence and stability diagnostics, which demonstrate reliable parameter estimation through multi-start optimization and bootstrap analysis. Although the SEAIQHRD model captures core disease progression, it is limited in representing nonlinear multi-wave patterns and reporting inconsistencies. The SEAIQHRD–ARIMA hybrid improves short-term linear adjustments, while the SEAIQHRD–GPR hybrid effectively models nonlinear residual structure and provides uncertainty-aware forecasts. Using COVID-19 data from India, both hybrids outperform the standalone model, with the GPR variant yielding the greatest accuracy. Forecast superiority, confirmed by Diebold–Mariano (DM), Clark–West (CW), Giacomini–White (GW), Wilcoxon, and Friedman tests, underscores the robustness of the proposed modeling approach and its applicability to public health.

Clinical trial: Not applicable.

Keywords: COVID-19 forecasting; SEAIQHRD compartmental model; Gaussian Process Regression (GPR); ARIMA residual modeling; Hybrid epidemic models; Convergence and stability analysis; Uncertainty-aware prediction; Forecast superiority tests (DM, CW, GW)

Introduction

The Coronavirus Disease 2019 (COVID-19) pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), rapidly evolved into a global public-health emergency with profound social and economic consequences. Its high transmissibility and heterogeneous clinical presentation, ranging from asymptomatic infection to severe respiratory illness, created substantial challenges for surveillance and healthcare preparedness [1]. To curb transmission, countries implemented a range of public-health interventions, including physical distancing, mask use, travel restrictions, testing, quarantine, and periodic lockdowns [2]. In this context, mathematical and statistical modeling has played a critical role in characterizing transmission dynamics, evaluating intervention effectiveness, predicting epidemic trajectories, and guiding policy decisions. As the pandemic progressed, the need for more flexible and accurate forecasting tools became increasingly apparent, particularly those capable of capturing multi-wave patterns, reporting inconsistencies, and nonlinear epidemic behavior.

Mathematical modeling is essential for understanding infectious-disease transmission and evaluating public-health interventions. Compartmental models such as SIR and SEIR help quantify transmission dynamics and assess the effects of vaccination, quarantine, and behavioral changes [3]. Recent advances show that incorporating computational and ecological perspectives enhances realism in epidemic modeling [4], while soft-computing techniques further strengthen the integration of mechanistic and data-driven approaches [5]. Together, these developments provide robust tools for forecasting and informed decision-making during outbreaks.

Statistical models play a central role in epidemiology by extracting patterns from observed data, quantifying uncertainty, and generating forecasts that inform public-health decisions. Time-series approaches such as ARIMA and state-space models effectively capture trends and abrupt shifts in disease incidence [6], while systematic studies underscore their value under conditions of limited data and reporting delays [7]. Advances in Bayesian methods and machine-learning-enhanced statistical tools have further strengthened real-time epidemic monitoring [8]. Moreover, [9] highlights modern statistical and hybrid innovations that enhance forecasting accuracy and improve the policy relevance of epidemic analyses.

Hybrid models integrate mechanistic compartmental structures with machine-learning and statistical approaches to generate more accurate and flexible epidemic forecasts. While traditional SIR and SEIR models capture core disease dynamics, hybrid frameworks better address real-world complexities by combining mechanistic insight with data-driven adaptability [10, 11]. Recent advances further incorporate environmental and policy factors to improve multi-wave forecasting and strengthen public-health decision making [12]. These integrated approaches provide a versatile platform for analyzing and predicting infectious-disease dynamics.

Recent hybrid models in epidemiology show strong improvements in capturing nonlinear and rapidly changing disease dynamics. By integrating mechanistic frameworks with data-driven methods, they overcome limitations of traditional models and yield more accurate short-term and multi-wave forecasts. These advances underscore the importance of integrated approaches for understanding outbreaks and supporting public-health decisions.

Kiarie et al. use an extended SEIR model together with an ARIMA framework to analyze COVID-19 transmission in Kenya and generate short-term forecasts [13]. Their results reveal clear multi-wave patterns and predict continued transmission, with daily infections and severe cases expected to rise in the forecast window. The study shows that combining SEIR dynamics with ARIMA improves forecasting accuracy and supports public-health planning.

Ala’raj et al. present a hybrid SEIRD–ARIMA model designed to improve short- and long-term forecasting of COVID-19 dynamics using U.S. data [14]. Their model corrects SEIRD residuals with ARIMA predictions and incorporates an ascertainment-rate parameter to handle underreporting. The results show that the hybrid approach achieves very low prediction errors and provides reliable forecasts to support policymakers in planning effective pandemic responses.

Zhao et al. propose a hybrid framework that combines logistic growth, ARIMA models, machine-learning regressors, and SEIR-based structures to enhance COVID-19 forecasting [15]. By integrating linear, nonlinear, and mechanistic components, their model outperforms standalone ARIMA and SEIR approaches, achieving lower RMSE and MAE values. The hybrid method delivers reliable short-term predictions, supporting more informed public-health planning.

Jin et al. develop an ARIMA–LSTM hybrid model that improves COVID-19 time-series prediction by combining linear and nonlinear components [16]. Building on this idea, our study employs an integrated mathematical–statistical–hybrid framework to better capture epidemic dynamics. The results show reduced prediction errors and improved short-term forecast reliability for public-health decision making.

Kong et al. propose a hybrid SIRD–eRNN model that combines epidemic dynamics with recurrent neural networks to improve short-term COVID-19 forecasting using regional mobility information [17]. Their results show that the model achieves lower RMSE and more accurate three-day-ahead predictions than existing spatiotemporal approaches. This demonstrates the effectiveness of integrating mechanistic structure with deep learning for reliable epidemic forecasting.

Cheng et al. develop a hybrid SEIRV–DNN framework that integrates epidemic dynamics with deep neural networks to infer time-varying parameters and improve COVID-19 forecasting [18]. Using data from China, their model achieves high accuracy across epidemic compartments, with R² values close to 0.99. The results demonstrate that dynamics-informed neural networks provide reliable multi-wave predictions and strong real-time forecasting capability.

Heredia Cacha et al. develop an ensemble model that combines classical growth models with machine-learning predictors to forecast COVID-19 spread in Spain using incidence, mobility, vaccination, and weather data [19]. Their results show that the ensemble consistently improves 14-day forecasting accuracy by correcting the overestimation of ML models and the underestimation of population-based models. This demonstrates the effectiveness of hybrid ensemble strategies for reliable, real-time epidemic prediction.

Saleem et al. present a systematic literature review examining machine learning, deep learning, and mathematical modeling approaches used for forecasting and analyzing the epidemiology of COVID-19 [20]. Their analysis of 57 high-quality studies shows that CNN and SVM are among the most frequently used algorithms, while compartmental models effectively estimate key indicators such as the basic reproduction number and doubling time. The review concludes that integrating ML/DL techniques with epidemiological modeling significantly enhances predictive performance and supports healthcare and policy decision-making.

Recent studies also highlight the importance of rigorous analytical structures when designing hybrid models for complex systems. For example, Yang et al. demonstrate that a chaotic quantum adaptive optimization framework can outperform conventional methods in high-dimensional, uncertainty-driven scheduling problems [21]. Motivated by such structured and algorithmically disciplined designs, our study adopts a similarly robust approach by integrating a detailed mechanistic model with a strong nonparametric learning component.

Similarly, Li et al. develop a strictly structured hybrid forecasting framework by combining Fourier-based preprocessing, regularized Bi-LSTM modeling, and a chaotic quantum adaptive whale optimization algorithm to enhance prediction accuracy in highly nonlinear environments [22]. Their results show that integrating signal denoising, deep learning, and advanced metaheuristic optimization produces superior performance under strong time variation. This reinforces the need for disciplined, multi-stage hybrid designs when modeling complex systems, further motivating the structured SEAIQHRD–GPR framework adopted in our work.

Al-Musaylh et al. [23] demonstrated that an Extreme Learning Machine (ELM) model provides highly accurate near–real-time seawater-level predictions, outperforming EmNN, XGBoost, and MLR models, with correlation values reaching up to 0.999. This study highlights the effectiveness of advanced AI models and rigorous accuracy metrics in environmental forecasting.

Al-Musaylh et al. [24] developed a CLSTM hybrid deep-learning model that integrates convolutional and LSTM networks to perform multistep solar ultraviolet index (UVI) prediction. Their results showed that the CLSTM model outperformed several benchmark AI and deterministic approaches across 10-, 30-, and 60-minute horizons, demonstrating its strong capability for accurate environmental forecasting.

Ghimire et al. [25] proposed an explainable hybrid Deeply-Fused Nets (FNET) framework integrating CNN and BiLSTM components to improve electricity demand forecasting. Their model provided superior accuracy and uncertainty-aware predictions, demonstrating the value of hybrid AI architectures with interpretability for energy-related time-series analysis.

Al-Daffaiea et al. [26] developed an online sequential extreme learning machine (OS-ELM) model to forecast the Iraqi Dinar–USD exchange rate and benchmarked it against the classical ARIMA model. Their results showed that OS-ELM achieved markedly lower RRMSE and MAPE values, demonstrating superior performance for volatile financial time-series prediction and supporting more accurate economic decision-making.

Al-Musaylh et al. [27] introduced a hybrid forecasting framework that integrates empirical wavelet transform (EWT) with machine-learning models to decompose nonlinear and non-stationary gas consumption data. Their EWT-M5 model tree achieved substantially lower prediction errors than traditional and hybrid benchmarks, demonstrating the effectiveness of wavelet-assisted AI approaches for accurate energy-demand forecasting.

Mohanad et al. [28] developed a particle-swarm-optimized support vector regression (PSO-SVR) model to forecast daily electricity demand using climate variables across major regions in Queensland, Australia. Their results showed that PSO-SVR consistently outperformed traditional SVR and MARS models, highlighting the effectiveness of optimization-enhanced machine-learning techniques for improving energy-demand prediction accuracy.

Zhang et al. [29] proposed a hybrid VMD-SR-SVR-CBCS model that combines variational mode decomposition, a self-recurrent SVR mechanism, chaotic mapping, and an improved cuckoo search algorithm for electric-load forecasting. Their approach effectively overcomes premature convergence and boundary-search issues, achieving superior forecasting accuracy on real-world datasets and demonstrating the strength of decomposition- and optimization-enhanced hybrid AI models.

Hong et al. [30] proposed a forecasting framework for nonlinear floating-platform motion using a hybrid-kernel support vector regression model optimized via a chaotic efficient bat algorithm. By integrating ensemble empirical mode decomposition to extract intrinsic dynamic components, their approach achieved highly accurate motion predictions on real platform data, demonstrating the effectiveness of decomposition- and optimization-enhanced hybrid AI models.

Zhang and Hong [31] introduced a CEEMDAN-SVR-QDA hybrid model that integrates adaptive noise decomposition with a quantum-enhanced dragonfly optimization mechanism to improve electric-load forecasting accuracy. Their approach effectively mitigates premature convergence and low population diversity issues, achieving superior performance on real-world datasets from Japan and the UK.

Dong et al. [32] proposed a seasonal hybrid electric-load forecasting model (SSVRCCS) that integrates support vector regression with an improved chaotic cuckoo search algorithm enhanced by tent chaotic mapping to prevent premature convergence. By incorporating a seasonal mechanism to capture cyclic load behavior, their model demonstrated superior forecasting performance on datasets from Australia and the United States.

Ghimire et al. [33] developed an Integrated Multi-Head Self-Attention Transformer (TNET) model to forecast electricity demand using local climate variables from Queensland, Australia. Their model outperformed several state-of-the-art deep learning approaches and generated reliable prediction intervals, demonstrating the effectiveness of attention-based architectures for accurate, climate-informed energy-demand prediction.

Kang and Zheng [34] conducted a large retrospective analysis of case series on fever of unknown origin (FUO) in China from 2013 to 2022, revealing that infectious diseases remain the predominant cause, with bloodstream infections increasing and tuberculosis markedly declining. Their study also reported decreases in autoimmune and neoplastic causes, providing important clinical insights into the evolving etiological patterns of FUO.

Zhang et al. [35] emphasized the heightened vulnerability of patients with advanced cancer to severe SARS-CoV-2 infection, noting substantially higher mortality rates compared with the general population. Their study underscores the urgent need for effective therapeutic strategies—particularly combining antiviral and immunosuppressive treatments—to manage severe viral pneumonia in this immunocompromised patient group.

Ewing et al. [36] reviewed emerging evidence that COVID-19 and Long COVID can cause substantial organ damage in both symptomatic and asymptomatic individuals. Their findings highlight the need to broaden the definition of Long COVID or recognize COVID-induced organ injury as a distinct condition, emphasizing the importance of long-term clinical monitoring following infection.

Guo et al. [37] reviewed evidence that SARS-CoV-2 infection may trigger a range of autoimmune disorders and play a central role in the pathogenesis of Long COVID. Their study highlights the importance of autoimmunity in explaining persistent symptoms and multi-system involvement, emphasizing the need for ongoing research into post-infection immune dysregulation.

Ariyasingha et al. [38] investigated high-capacity production of hyperpolarized butane gas using parahydrogen-induced polarization, achieving significantly longer polarization lifetimes and much faster production rates than prior propane-based agents. Their work demonstrated feasible phantom and lung ventilation imaging on standard clinical MRI systems, highlighting the potential of low-cost, proton-hyperpolarized gases for accessible functional pulmonary imaging.

Chowdhury et al. [39] demonstrated the feasibility of rapid lung ventilation MRI using proton-hyperpolarized propane gas produced via parahydrogen-induced polarization. Their work showed that hyperpolarized propane enables high-resolution ventilation imaging on standard low-field clinical MRI scanners within seconds, highlighting its potential as an inexpensive and accessible contrast agent for pulmonary imaging.

Ariyasingha et al. [40] investigated the relaxation dynamics of parahydrogen-hyperpolarized propane-d₆ gas as a low-cost proton-based contrast agent for MRI, demonstrating clinically relevant T₁ lifetimes across multiple magnetic fields. They further showed the feasibility of rapid lung ventilation imaging on a 0.35 T clinical scanner, highlighting propane-d₆ as a promising alternative to expensive ¹²⁹Xe for functional pulmonary imaging.

Despite substantial progress in hybrid epidemic modeling—including SEIR–ARIMA, SEIRD–ARIMA, ARIMA–LSTM, and SEIR-based machine-learning frameworks—many existing approaches remain constrained by linear residual corrections or opaque deep-learning architectures. These methods often struggle to capture structured temporal dependencies in model errors, provide limited interpretability, and rarely incorporate principled uncertainty quantification. Furthermore, most hybrid models focus on a single epidemiological indicator (typically daily cases), leaving cumulative trajectories and mortality trends insufficiently addressed. Such limitations highlight the need for a comprehensive, interpretable, and uncertainty-aware hybrid modeling strategy capable of jointly capturing nonlinear dynamics across multiple COVID-19 outcomes.

To address these gaps, this study introduces a rigorously structured dual-hybrid forecasting system that integrates the eight-compartment SEAIQHRD model with two residual-learning components: Gaussian Process Regression (GPR) and ARIMA. The SEAIQHRD framework provides clinically interpretable transmission and progression pathways, enabling a richer depiction of epidemic dynamics than classical SEIR structures. The GPR module employs a probabilistic kernel-based correction to learn complex nonlinear temporal structure in the residuals, supporting robust uncertainty quantification and smooth trend adaptation, whereas the ARIMA module captures short-term linear autocorrelation patterns. Together, these hybrid models effectively combine deterministic transmission dynamics with unobserved behavioral, diagnostic, and reporting-related processes that drive fluctuations in COVID-19 data.

Extensive empirical evaluation demonstrates that the SEAIQHRD–GPR hybrid achieves substantial improvements across all major performance indicators, including sMAPE, RMSE, MAE, CRPS, Willmott’s Index, Skill Score, PBIAS, cumulative errors, and percentage deviation. The GPR component yields well-calibrated predictive distributions and more accurate adaptation to nonlinear epidemic waves compared to both the standalone SEAIQHRD model and the ARIMA-based hybrid. Additionally, convergence analysis and bootstrap-stability diagnostics confirm that the parameter estimation process is numerically stable, consistent across initializations, and statistically reliable. Formal forecast superiority tests—including the Diebold–Mariano (DM), Clark–West (CW), Giacomini–White (GW), Wilcoxon signed-rank, and Friedman tests—provide strong evidence that both hybrid models, especially the GPR-enhanced system, significantly outperform the standalone mechanistic model across all forecast scenarios.

Novel contributions. (i) We develop the first dual-hybrid forecasting framework coupling an eight-compartment SEAIQHRD epidemic model with both Bayesian GPR and ARIMA residual-learning strategies. (ii) We jointly forecast daily confirmed cases, cumulative cases, daily deaths, and cumulative deaths, demonstrating robustness across multiple epidemiological indicators. (iii) The SEAIQHRD model offers clinically interpretable transitions across asymptomatic, symptomatic, quarantined, hospitalized, recovered, and deceased states, providing richer mechanistic insight than classical SEIR hybrids. (iv) The GPR residual-learning module produces distribution-aware, uncertainty-calibrated predictions through kernel-based temporal learning. (v) We incorporate an extensive validation pipeline using sMAPE, RMSE, MAE, CRPS, Willmott’s Index, Skill Score, PBIAS, Explained Variance Score, cumulative errors, percentage deviation, and formal predictive superiority tests (DM, CW, GW, Wilcoxon, Friedman), representing one of the most comprehensive assessment frameworks in hybrid COVID-19 modeling.

Overall, the proposed SEAIQHRD–GPR hybrid system represents a principled integration of mechanistic epidemiological modeling and Bayesian nonparametric learning. By improving accuracy, interpretability, convergence stability, and uncertainty quantification, it provides a robust and generalizable tool for epidemic forecasting that advances significantly beyond existing hybrid model designs.

Materials and methods

Overview of the proposed hybrid research framework

This subsection presents an overview of the proposed hybrid research framework designed to enhance the forecasting performance of COVID-19 epidemic dynamics. The framework integrates a mechanistic eight-compartment SEAIQHRD model with two complementary residual-learning components—Gaussian Process Regression (GPR) and ARIMA—to overcome the limitations of standalone epidemiological or statistical models. By combining deterministic transmission mechanisms with data-driven corrections, the framework captures both the biological structure of disease progression and the nonlinear, temporally correlated fluctuations present in real-world surveillance data. This section outlines the methodological workflow, including model formulation, parameter estimation, convergence diagnostics, residual learning, performance evaluation, and forecast superiority testing, thereby providing a comprehensive foundation for the hybrid modeling strategy developed in this study.

Figure 1 presents a comprehensive workflow of the proposed SEAIQHRD+GPR hybrid forecasting framework. The pipeline begins with the acquisition of COVID-19 surveillance data, followed by preprocessing steps—including cleaning, smoothing, and normalization—to remove reporting inconsistencies and ensure stable numerical behavior before model fitting. These steps create a reliable foundation for subsequent mechanistic and statistical modeling.

Fig. 1 — Workflow of the proposed SEAIQHRD+GPR hybrid forecasting model. The diagram summarizes the two-stage pipeline in which the SEAIQHRD epidemiological model generates baseline trajectories, residuals are computed from observed data, and Gaussian Process Regression (GPR) learns the nonlinear temporal structure of these residuals to produce corrected forecasts with uncertainty estimates. Model performance is evaluated using sMAPE, RMSE, MAE, CRPS, Skill Score, Willmott’s index, PBIAS, cumulative error, percentage deviation, and forecast superiority tests (DM, CW, GW, Wilcoxon, Friedman)

Stage 1 implements the SEAIQHRD epidemiological model. Parameter estimation is carried out using nonlinear least-squares techniques (lsqnonlin), ensuring optimal calibration of transmission and progression rates based on observed data. The estimated parameters are then used to solve the system of differential equations governing the eight-compartment SEAIQHRD structure, which produces baseline epidemic trajectories. These outputs represent the deterministic dynamics captured by the mechanistic model. The differences between the reported epidemiological signals and the SEAIQHRD outputs are computed to form a residual time series. These residuals encapsulate systematic model errors, unobserved behavioural changes, reporting delays, and stochastic fluctuations not captured by the mechanistic system.
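The Stage 1 steps (calibrate, solve, subtract) can be illustrated with a deliberately reduced sketch. It is not the authors' implementation: a toy SIR model stands in for the eight-compartment SEAIQHRD system, forward Euler stands in for the ODE solver, and a one-parameter golden-section search stands in for lsqnonlin. All function names, parameter values, and the synthetic "observations" are assumptions made for illustration only.

```python
import math

def simulate_daily_cases(beta, gamma, population, infected0, days, steps_per_day=10):
    """Forward-Euler integration of a toy SIR model; returns new infections per day."""
    s = float(population - infected0)
    i = float(infected0)
    dt = 1.0 / steps_per_day
    daily = []
    for _ in range(days):
        new_cases = 0.0
        for _ in range(steps_per_day):
            infections = beta * s * i / population * dt
            recoveries = gamma * i * dt
            s -= infections
            i += infections - recoveries
            new_cases += infections
        daily.append(new_cases)
    return daily

def sum_squared_error(beta, observed, gamma, population, infected0):
    """Least-squares objective: squared gap between data and model output."""
    predicted = simulate_daily_cases(beta, gamma, population, infected0, len(observed))
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def fit_beta(observed, gamma, population, infected0, lo=0.05, hi=0.6, tol=1e-6):
    """Golden-section search over beta; assumes the objective is unimodal on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if (sum_squared_error(c, observed, gamma, population, infected0)
                < sum_squared_error(d, observed, gamma, population, infected0)):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Hypothetical data: a known trajectory with a small deterministic distortion
# that plays the role of reporting irregularities.
truth = simulate_daily_cases(0.30, 0.10, 1_000_000, 10, 20)
observed = [y * (1.0 + 0.05 * math.sin(t)) for t, y in enumerate(truth)]

beta_hat = fit_beta(observed, 0.10, 1_000_000, 10)
baseline = simulate_daily_cases(beta_hat, 0.10, 1_000_000, 10, len(observed))
residuals = [o - p for o, p in zip(observed, baseline)]  # input to Stage 2
print(round(beta_hat, 3))
```

The `residuals` list is the object that the second stage models; in the full framework the same subtraction would be applied to each reported signal against the corresponding SEAIQHRD output.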

Stage 2 employs Gaussian Process Regression (GPR) to learn and correct this residual structure. A suitable covariance kernel—typically the radial basis function (RBF)—is selected to characterize the smoothness and temporal correlation of residuals. The GPR model is trained by optimizing kernel hyperparameters such as length-scale and variance, enabling it to capture nonlinear deviations and produce well-calibrated uncertainty estimates. The trained GPR model generates forecasts of future residuals, which are then added to the SEAIQHRD baseline outputs to obtain the final hybrid predictions. This two-stage integration allows the framework to retain mechanistic interpretability while leveraging the flexibility of GPR to correct systematic discrepancies in a probabilistic manner.
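A minimal, dependency-free sketch of the residual-correction idea follows. It computes the standard Gaussian-process posterior mean and variance under an RBF kernel with fixed hyperparameters; the study optimizes the length-scale and variance, which is not shown here. The function names, the hyperparameter values, and the sinusoidal residual series are illustrative assumptions, and a practical analysis would normally rely on an established GPR library.

```python
import math

def rbf(t1, t2, length_scale, signal_var):
    """Radial basis function (squared-exponential) covariance."""
    return signal_var * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def solve_lower(low, b):
    """Solve low @ y = b by forward substitution."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    return y

def solve_upper_transposed(low, y):
    """Solve low.T @ x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def gp_predict(t_train, r_train, t_test, length_scale=5.0, signal_var=1.0, noise_var=0.1):
    """Posterior mean and variance of a zero-mean GP fitted to residuals."""
    n = len(t_train)
    k_train = [[rbf(t_train[i], t_train[j], length_scale, signal_var)
                + (noise_var if i == j else 0.0) for j in range(n)] for i in range(n)]
    low = cholesky(k_train)
    alpha = solve_upper_transposed(low, solve_lower(low, r_train))
    means, variances = [], []
    for ts in t_test:
        k_star = [rbf(ts, t, length_scale, signal_var) for t in t_train]
        v = solve_lower(low, k_star)
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        variances.append(max(signal_var - sum(x * x for x in v), 0.0))
    return means, variances

# Hypothetical residual series and baseline values for two forecast days.
days = list(range(30))
residuals = [math.sin(t / 5.0) for t in days]
correction, variance = gp_predict(days, residuals, [30.0, 31.0])
baseline_forecast = [120.0, 125.0]  # placeholder mechanistic outputs
hybrid_forecast = [b + c for b, c in zip(baseline_forecast, correction)]
```

One property worth noting: with a zero-mean prior and an RBF kernel, the predicted correction decays toward zero and the variance returns to the prior level as the forecast point moves away from the training window, so the hybrid forecast falls back to the mechanistic baseline at long horizons.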

The final block of the workflow highlights the rigorous evaluation of the hybrid predictions using a suite of performance metrics, including sMAPE, RMSE, MAE, CRPS, Willmott’s Index, Skill Score, cumulative errors, percentage deviation, and PBIAS. Additionally, formal forecast superiority tests—Diebold–Mariano (DM), Clark–West (CW), Giacomini–White (GW), Wilcoxon signed-rank, and Friedman tests—are applied to statistically validate the enhanced performance of the hybrid model over the standalone SEAIQHRD system. Overall, the flowchart encapsulates the complete methodological pipeline from raw data processing to hybrid forecasting and quantitative evaluation.
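Several of the point-forecast metrics listed above can be written in a few lines each. The sketch below uses commonly cited definitions, but conventions differ between sources (in particular the sMAPE denominator and the sign of PBIAS), and the source text does not state which variants the authors use, so the formulas here should be read as assumptions. CRPS and the remaining indicators are omitted.

```python
import math

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def smape(obs, pred):
    """Symmetric MAPE in percent; a term with zero denominator counts as zero."""
    total = 0.0
    for o, p in zip(obs, pred):
        denom = abs(o) + abs(p)
        total += 0.0 if denom == 0 else 2.0 * abs(o - p) / denom
    return 100.0 * total / len(obs)

def pbias(obs, pred):
    """Percent bias; here a positive value means over-prediction (assumed sign)."""
    return 100.0 * sum(p - o for o, p in zip(obs, pred)) / sum(obs)

def willmott_index(obs, pred):
    """Willmott's index of agreement; 1 indicates perfect agreement."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

def skill_score(obs, pred, reference):
    """MSE-based skill of `pred` relative to a reference forecast."""
    mse_model = sum((o - p) ** 2 for o, p in zip(obs, pred))
    mse_ref = sum((o - r) ** 2 for o, r in zip(obs, reference))
    return 1.0 - mse_model / mse_ref

# Hypothetical values: observations, a hybrid forecast, a standalone forecast.
observed = [100.0, 200.0, 300.0]
hybrid = [110.0, 190.0, 300.0]
standalone = [120.0, 180.0, 300.0]
print(round(rmse(observed, hybrid), 3), round(skill_score(observed, hybrid, standalone), 2))
```

Using the standalone model as the reference in `skill_score` matches the comparison of interest here: a positive score indicates that the hybrid reduces squared error relative to the mechanistic baseline.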

Figure 2 illustrates the complete workflow of the SEAIQHRD+ARIMA hybrid forecasting system, which combines mechanistic epidemic modeling with classical time-series analysis to improve prediction accuracy. The pipeline begins with COVID-19 data acquisition, followed by preprocessing procedures—such as cleaning, smoothing, and normalization—to remove reporting noise and ensure numerical stability. These steps provide high-quality input for both the epidemiological model and the statistical residual-learning stage.

Fig. 2 — Workflow of the proposed SEAIQHRD+ARIMA hybrid forecasting model. The diagram outlines the two-stage pipeline where the SEAIQHRD epidemiological model generates baseline mechanistic trajectories after parameter estimation and numerical solution of the differential equations. Residuals between observed data and SEAIQHRD outputs are subsequently modeled using an ARIMA structure, with model identification (via ACF, PACF, and AIC/BIC analysis) and parameter estimation capturing short-term autocorrelation patterns. Residual forecasts are combined with SEAIQHRD baseline predictions to produce the final hybrid outputs, with uncertainty derived from the ARIMA model. Forecast accuracy is evaluated using sMAPE, RMSE, MAE, CRPS, Skill Score, Willmott’s index, Absolute percentage bias (PBIAS), cumulative error, percentage deviation, and forecast superiority tests (DM, CW, GW, Wilcoxon, Friedman)

Stage 1 implements the SEAIQHRD epidemiological model through two central components: (i) parameter estimation using nonlinear least-squares optimization (lsqnonlin), which calibrates key transmission and progression parameters, and (ii) numerical solution of the SEAIQHRD system of differential equations to generate baseline epidemic trajectories. These baseline predictions encode the long-term mechanistic behavior of the epidemic based on compartmental transitions. The difference between the observed data and the SEAIQHRD-generated outputs forms a residual time series. These residuals capture short-term irregularities, temporal autocorrelation, behavioral variability, diagnostic delays, and other fluctuations that are not explicitly represented in the mechanistic model.

Stage 2 applies the ARIMA model to learn and forecast the temporal structure present in these residuals. The ARIMA workflow begins with model identification using standard tools such as autocorrelation (ACF) and partial autocorrelation (PACF) plots, supported by information-based criteria (AIC/BIC) to guide the selection of suitable orders (p, d, q). Following model identification, ARIMA parameter estimation is conducted to fit the chosen structure to the residual time series. The trained ARIMA model then produces short-term forecasts of residual behavior as well as associated uncertainty estimates derived from its autoregressive and moving-average components. These predicted residuals are added to the SEAIQHRD baseline projections to yield the final hybrid forecasts, effectively combining the long-term mechanistic dynamics with short-term statistical corrections.
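The residual-correction step can be sketched with the simplest member of the ARIMA family, an AR(1) model, which corresponds to ARIMA(1,0,0). This is an illustrative reduction rather than the authors' procedure: order identification via ACF, PACF, and AIC/BIC is skipped, the coefficients are fitted by ordinary least squares, and all names and numbers are assumptions. A real analysis would typically use a dedicated time-series library.

```python
def fit_ar1(residuals):
    """Least-squares fit of r_t = c + phi * r_{t-1} + noise."""
    x, y = residuals[:-1], residuals[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    errors = [b - (c + phi * a) for a, b in zip(x, y)]
    sigma2 = sum(e * e for e in errors) / (n - 2)  # innovation variance estimate
    return c, phi, sigma2

def forecast_ar1(last_value, c, phi, sigma2, horizon):
    """Iterated point forecasts with the matching forecast-error variances."""
    forecasts = []
    value, variance = last_value, 0.0
    for _ in range(horizon):
        value = c + phi * value
        variance = phi ** 2 * variance + sigma2  # sigma2 * (1 + phi^2 + phi^4 + ...)
        forecasts.append((value, variance))
    return forecasts

def hybrid_forecast(baseline, residual_forecasts):
    """Add predicted residuals to the mechanistic baseline."""
    return [b + r for b, (r, _) in zip(baseline, residual_forecasts)]

# Hypothetical residual history and three days of baseline output.
history = [4.0, 3.1, 2.9, 2.2, 2.4, 1.8, 1.9, 1.4, 1.6, 1.2]
c, phi, sigma2 = fit_ar1(history)
steps = forecast_ar1(history[-1], c, phi, sigma2, horizon=3)
print(hybrid_forecast([150.0, 155.0, 161.0], steps))
```

The variance recursion shows why this kind of correction suits short horizons: the forecast-error variance grows with each step, while the point forecast relaxes toward the long-run residual mean.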

The final block of the workflow highlights the comprehensive evaluation process used to assess the hybrid model’s performance. Predictions are evaluated using sMAPE, RMSE, MAE, CRPS, Skill Score, Willmott’s Index, PBIAS, cumulative error, and percentage deviation, ensuring assessment across accuracy, agreement, and bias dimensions. In addition, formal predictive superiority tests—including the Diebold–Mariano (DM), Clark–West (CW), Giacomini–White (GW), Wilcoxon signed-rank, and Friedman tests—are applied to statistically validate the improvement of the hybrid model over the standalone SEAIQHRD system. Overall, Fig. 2 summarizes how the SEAIQHRD+ARIMA hybrid model integrates mechanistic understanding with short-term autoregressive correction to generate more reliable epidemic forecasts.
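As an illustration of the forecast superiority testing, the sketch below implements a basic Diebold–Mariano statistic for one-step-ahead forecasts under squared-error loss. It is a simplified form under stated assumptions: the variance uses only the lag-0 term, so it does not include the autocorrelation correction needed for multi-step horizons, and no small-sample adjustment is applied. The function name and the error series are hypothetical.

```python
import math

def diebold_mariano(errors_a, errors_b):
    """DM statistic and two-sided normal p-value for equal predictive accuracy.

    A positive statistic means forecast B has the smaller squared-error loss.
    """
    diffs = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    stat = mean_d / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|stat|))
    return stat, p_value

# Hypothetical forecast errors: standalone model (A) versus hybrid model (B).
standalone_errors = [3.0, -4.0, 5.0, -3.0, 4.0, -5.0, 3.0, 4.0]
hybrid_errors = [1.0, -1.0, 2.0, -1.0, 1.0, -2.0, 1.0, 1.0]
stat, p_value = diebold_mariano(standalone_errors, hybrid_errors)
print(stat > 0, p_value < 0.05)
```

The test operates on the loss differential series rather than on the two error series separately, which is why it can detect a consistent advantage even when both models have large absolute errors.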

SEAIQHRD model formulation

A compartmental epidemiological model is a mathematical framework used to describe and analyze the transmission dynamics of infectious diseases by dividing the total population into distinct compartments based on disease status, such as Susceptible, Exposed, Infectious, Quarantined, Hospitalized, and Recovered. Transitions between compartments are governed by systems of differential equations that capture biological processes and intervention effects. Such models play a critical role in understanding disease progression, estimating key transmission parameters, assessing public-health strategies, and designing optimal control policies. Recent studies have demonstrated the effectiveness of compartmental models in capturing complex COVID-19 dynamics, including co-infections and pneumonia progression [41], public-health strategy evaluation through multi-compartment frameworks [42], and multi-strain transmission with vaccination interventions [43]. These works highlight the importance and versatility of compartmental modeling in guiding evidence-based epidemic management.

Figure 3 represents a compartmental epidemiological model that describes the dynamics of COVID-19 transmission. The total population N is divided into eight compartments: Susceptible (S) individuals who can contract the disease, Exposed (E) individuals who have been infected but are not yet infectious, Asymptomatic (A) individuals who carry the virus without showing symptoms, and Symptomatic (I) individuals who exhibit symptoms. Infected individuals can either recover or transition to other states. Some symptomatic individuals become Hospitalized (H) if their condition worsens, while others are Quarantined (Q) if detected early. The Recovered (R) compartment consists of individuals who have overcome the infection and gained immunity. Finally, the Deceased (D) compartment represents individuals who have succumbed to the disease. The total population at any time t is represented as:

$$N(t) = S(t) + E(t) + A(t) + I(t) + Q(t) + H(t) + R(t) + D(t).$$

The recruitment rate into the susceptible population is denoted by Λ, while µ represents the natural death rate. The transmission rate is β, and the force of infection combines the contributions of asymptomatic (A) and symptomatic (I) individuals, weighted by the modification factors ξ and η, respectively. Exposed individuals progress to the infectious stage at rate δ, with αδ leading to asymptomatic cases and (1 − α)δ resulting in symptomatic infections. Asymptomatic individuals transition to the symptomatic compartment at rate π. The hospitalization rate for symptomatic individuals is denoted by θ, while λ is the quarantine rate for asymptomatic individuals. Additionally, quarantined individuals may transition to hospitalization at a rate κ. The recovery rates for asymptomatic, symptomatic, quarantined, and hospitalized individuals are given by γ_a, γ_s, γ_q, and γ_h, respectively. The disease-induced death rate for hospitalized individuals is µ_h, while quarantined individuals experience mortality at a rate µ_q. Consequently, the system of nonlinear differential equations governing the dynamics is formulated as follows:
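A system consistent with the verbal description above can be sketched as follows. The normalization of the force of infection by N, the (1 − α)δ symptomatic branch, and natural mortality µ acting on every living compartment are assumptions inferred from the text and Table 1 (which writes the symptomatic recovery rate as γ_s), not a transcription of the published equations.

```latex
\begin{aligned}
\frac{dS}{dt} &= \Lambda - \frac{\beta(\xi A + \eta I)}{N}\,S - \mu S,\\
\frac{dE}{dt} &= \frac{\beta(\xi A + \eta I)}{N}\,S - (\delta + \mu)E,\\
\frac{dA}{dt} &= \alpha\delta E - (\pi + \lambda + \gamma_a + \mu)A,\\
\frac{dI}{dt} &= (1-\alpha)\delta E + \pi A - (\theta + \gamma_s + \mu)I,\\
\frac{dQ}{dt} &= \lambda A - (\kappa + \gamma_q + \mu_q + \mu)Q,\\
\frac{dH}{dt} &= \theta I + \kappa Q - (\gamma_h + \mu_h + \mu)H,\\
\frac{dR}{dt} &= \gamma_a A + \gamma_s I + \gamma_q Q + \gamma_h H - \mu R,\\
\frac{dD}{dt} &= \mu_q Q + \mu_h H.
\end{aligned}
```

Each outflow term in one equation reappears as an inflow in another, so the only net sources and sinks are recruitment Λ and natural mortality µ.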

with non-negative initial conditions S(0), E(0), A(0), I(0), Q(0), H(0), R(0), and D(0).

The parameters used in the model are listed in Table 1. Four parameters (β, λ, θ, and κ) are estimated from historical data using the nonlinear least-squares method, specifically the lsqnonlin function in MATLAB; the remaining parameter values are obtained from existing literature.

Table 1.

List and description of the parameters used in the SEAIQHRD model, including their biological interpretation and the values adopted for model calibration

Parameter	Description	Value	Source
Λ	Human recruitment rate	Varies	-
ξ	Modification factor for asymptomatic infected individuals		[44]
η	Modification factor for symptomatic infected individuals		[45]
β	Disease transmission rate	0.6451	Estimated
α	Proportion of exposed individuals who become infected		[46, 47]
δ	Rate at which exposed individuals transition to infection		[47, 48]
π	Rate at which asymptomatic infected individuals become symptomatic	0.1250	[49]
λ	Rate at which asymptomatic infected individuals are quarantined	0.0453	Estimated
θ	Rate at which symptomatic infected individuals are hospitalized	0.0151	Estimated
κ	Rate at which quarantined individuals transition to hospitalization	0.3107	Estimated
γ_a	Recovery rate of asymptomatic infected individuals	0.0066	[50]
γ_s	Recovery rate of symptomatic infected individuals	0.0026	[50]
γ_q	Recovery rate of quarantined individuals	0.1336	[51]
γ_h	Recovery rate of hospitalized individuals	0.0175	[52]
µ_q	Mortality rate of quarantined individuals	0.00001945	[53]
µ_h	Mortality rate of hospitalized individuals	0.00001945	[53]
µ	Natural mortality rate	0.00004	[42]


Gaussian Process Regression (GPR) model formulation

Gaussian Process Regression (GPR) is a non-parametric, probabilistic machine-learning method that models unknown functions by placing a Gaussian process prior over them. GPR is particularly effective for capturing nonlinear patterns and quantifying predictive uncertainty without assuming a fixed functional form [54, 55].

Gaussian process

A Gaussian process is a collection of random variables, any finite subset of which follows a joint multivariate normal distribution. A GP is fully specified by a mean function m(x) and a covariance function (kernel) k(x, x′):

$$f(x) \sim \mathcal{GP}\big(m(x),\, k(x, x')\big).$$

The mean function describes the expected value of the process at input x, while the covariance function encodes similarity or correlation between function values at x and x′.

Kernel function

The choice of kernel determines the smoothness and behavior of the modeled function. A commonly used kernel is the Radial Basis Function (RBF) kernel:

$$k(x, x') = \sigma^2 \exp\!\left(-\frac{\lVert x - x' \rVert^2}{2 l^2}\right),$$

where σ² controls variance and l is the characteristic length scale. Kernels enable GPR to encode smoothness, periodicity, or other structural assumptions about the target function.

Training the GPR model

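Given training data (X, y), GPR constructs a posterior distribution over functions by conditioning the GP prior on the observed data. Training involves optimizing kernel hyperparameters (such as variance and length scale) by maximizing the log marginal likelihood, which for a zero-mean prior reads:

$$\log p(y \mid X) = -\tfrac{1}{2}\, y^\top K_y^{-1} y \;-\; \tfrac{1}{2}\log\lvert K_y\rvert \;-\; \tfrac{n}{2}\log 2\pi,$$

where $K_y = K(X, X) + \sigma_n^2 I$ is the covariance matrix of the n noisy training targets. The first term rewards data fit, and the second penalizes model complexity.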

This balance between data fit and function smoothness is one of the key advantages of GPR, making it widely used in engineering and applied science applications.

Predictive distribution

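For a new test input $x_*$, GPR yields a predictive distribution that is Gaussian:

$$p(f_* \mid x_*, X, y) = \mathcal{N}\big(\mu_*,\, \sigma_*^2\big),$$

where

$$\mu_* = k_*^\top K_y^{-1} y, \qquad \sigma_*^2 = k(x_*, x_*) - k_*^\top K_y^{-1} k_*,$$

with $k_* = [k(x_1, x_*), \dots, k(x_n, x_*)]^\top$ and $K_y = K(X, X) + \sigma_n^2 I$. Here, $\mu_*$ represents the predictive mean, while $\sigma_*^2$ quantifies uncertainty, an important feature in epidemic forecasting, where reliability and confidence intervals are essential [55].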
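The posterior mean and variance of a zero-mean GP with an RBF kernel can be sketched in a few lines. This is a minimal illustration in plain Python, not the MATLAB routine used in the study: the function names, the fixed hyperparameters (`sigma_f`, `length`, `sigma_n`), and the hand-written Cholesky solver are all assumptions made for the example, and no hyperparameter optimization is performed.

```python
import math

def rbf(x1, x2, sigma_f=1.0, length=1.0):
    """Squared-exponential kernel sigma_f^2 * exp(-(x1 - x2)^2 / (2 * length^2))."""
    return sigma_f**2 * math.exp(-((x1 - x2) ** 2) / (2.0 * length**2))

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def solve_lower(low, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    return z

def solve_upper_t(low, z):
    """Back substitution: solve L^T x = z."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (z[i] - s) / low[i][i]
    return x

def gp_predict(x_train, y_train, x_star, sigma_f=1.0, length=1.0, sigma_n=0.1):
    """Posterior mean and variance of the latent function at x_star."""
    n = len(x_train)
    # K_y = K(X, X) + sigma_n^2 * I
    k_y = [[rbf(x_train[i], x_train[j], sigma_f, length)
            + (sigma_n**2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    low = cholesky(k_y)
    alpha = solve_upper_t(low, solve_lower(low, y_train))  # K_y^{-1} y
    k_star = [rbf(x, x_star, sigma_f, length) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(low, k_star)                           # L^{-1} k_*
    var = rbf(x_star, x_star, sigma_f, length) - sum(vi * vi for vi in v)
    return mean, var
```

Near a training point the variance shrinks toward the noise level, while far from the data the prediction reverts to the prior (mean 0, variance `sigma_f**2`), which is the behavior that makes the predictive variance useful as an uncertainty measure.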

Interpretability and advantages

GPR provides uncertainty-aware predictions, flexibility through kernel selection, and a principled Bayesian foundation, making it well suited to modeling nonlinear epidemic residuals and to integration with mechanistic models in hybrid forecasting systems.

ARIMA model formulation

Autoregressive Integrated Moving Average (ARIMA) is a classical and widely used statistical time-series modeling approach designed to capture linear temporal dependencies in sequential data [56, 57]. It combines three components—autoregression (AR), integration (I), and moving average (MA)—to model persistence, trends, and noise structures in real-world time series.

Stationarity and differencing

Many time series are non-stationary and must be transformed before modeling. ARIMA achieves stationarity through differencing of order d, defined as

$$\nabla^d y_t = (1 - B)^d y_t,$$

where B is the backshift operator ($B y_t = y_{t-1}$). This transformation removes long-term trends, making the series suitable for AR and MA modeling.

ARIMA structure

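Once differenced, the series is modeled using p autoregressive terms and q moving-average terms:

$$w_t = \sum_{i=1}^{p} \phi_i\, w_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j},$$

where $w_t = (1 - B)^d y_t$ is the differenced series, $\phi_i$ and $\theta_j$ denote AR and MA coefficients, and $\varepsilon_t$ is a white-noise innovation term. The AR component captures temporal persistence, while the MA component models error propagation.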

Model identification and training

The orders (p, d, q) are selected using tools such as the autocorrelation function (ACF), partial autocorrelation function (PACF), and information criteria (AIC/BIC). Parameter estimation is then performed by maximizing the likelihood of the differenced series or minimizing prediction error variance.

Forecasting with ARIMA

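Once trained, the ARIMA model generates future predictions recursively:

$$\hat{w}_{t+h} = \sum_{i=1}^{p} \phi_i\, \hat{w}_{t+h-i} + \sum_{j=1}^{q} \theta_j\, \hat{\varepsilon}_{t+h-j},$$

where $\hat{w}_{t+h-i}$ and $\hat{\varepsilon}_{t+h-j}$ are recursively computed forecasts and residuals of the differenced series; observed values and fitted residuals are used where available, and future innovations are set to zero. This structure allows ARIMA to provide robust short-term predictions based on learned linear temporal patterns.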
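The differencing, estimation, and recursive forecasting steps can be illustrated with the simplest non-trivial case, ARIMA(1,1,0). The sketch below is an assumption-laden toy: it estimates the single AR coefficient by conditional least squares without an intercept, whereas MATLAB's built-in estimator uses maximum likelihood, and the function names are hypothetical.

```python
def difference(series):
    """First difference: w_t = y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(w):
    """Least-squares estimate of phi in w_t = phi * w_{t-1} + eps_t (no intercept)."""
    num = sum(a * b for a, b in zip(w, w[1:]))
    den = sum(a * a for a in w[:-1])
    return num / den

def forecast_arima110(series, horizon):
    """Forecast `horizon` steps with ARIMA(1,1,0), then undo the differencing."""
    w = difference(series)
    phi = fit_ar1(w)
    level, last_w = series[-1], w[-1]
    forecasts = []
    for _ in range(horizon):
        last_w = phi * last_w   # future innovations are set to zero
        level += last_w         # integrate the differenced forecast back to levels
        forecasts.append(level)
    return phi, forecasts
```

For example, `forecast_arima110([0, 8, 12, 14, 15], 2)` sees increments that halve at every step, so it estimates `phi = 0.5` and continues the pattern. A series with all-zero increments would make the denominator in `fit_ar1` vanish and needs separate handling.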

Interpretability

Due to its transparent structure and strong theoretical grounding, ARIMA remains a reliable and interpretable method for short-term forecasting, especially when combined with mechanistic or nonlinear models in hybrid modeling frameworks.

Hybrid model formulation

Overview

The hybrid modeling approach integrates the mechanistic SEAIQHRD epidemiological model with statistical residual-learning methods—Gaussian Process Regression (GPR) and ARIMA—to leverage the strengths of both deterministic disease dynamics and data-driven temporal correction. This two-stage structure enables the model to retain mechanistic interpretability while adapting to nonlinear patterns and short-term fluctuations present in the observed COVID-19 data.

Stage 1: Baseline mechanistic modeling with SEAIQHRD

The SEAIQHRD model is first calibrated to observed data using nonlinear least-squares optimization to estimate key transmission and progression parameters. With these optimized parameters, the system of ordinary differential equations is numerically solved to generate baseline epidemic trajectories for daily and cumulative confirmed cases and deaths. The difference between the reported data and baseline SEAIQHRD outputs yields a residual series representing discrepancies arising from behavioral changes, diagnostic delays, reporting inconsistencies, and unmodeled epidemic drivers.

Stage 2A: Residual learning using GPR

In the GPR-based hybrid, the residual time series is modeled as a realization of a Gaussian process with a chosen kernel function—typically the RBF kernel. Hyperparameters are optimized via marginal likelihood maximization, and the trained GPR model produces predictive means and variances for future residual values. These predictions are added to the SEAIQHRD baseline outputs to obtain uncertainty-calibrated hybrid forecasts.

Stage 2B: Residual learning using ARIMA

In the ARIMA-based hybrid, the residuals are analyzed for stationarity and differenced accordingly. Suitable (p, d, q) orders are identified using ACF, PACF, and information criteria. The fitted ARIMA model then generates short-term forecasts of residuals, which are subsequently added to the SEAIQHRD baseline outputs to produce corrected hybrid predictions.

Final hybrid output

Both hybrid formulations generate final forecasts by combining mechanistic trajectories with statistically learned residual corrections:

$$\hat{y}_{\text{hybrid}}(t) = \hat{y}_{\text{SEAIQHRD}}(t) + \hat{r}(t),$$

where $\hat{y}_{\text{SEAIQHRD}}(t)$ is the baseline SEAIQHRD output and $\hat{r}(t)$ is the predicted residual from GPR or ARIMA. This formulation allows the hybrid models to capture long-term epidemic structure while adapting to short-term nonlinear fluctuations.
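The two-stage combination can be sketched independently of the residual learner. In the illustration below the residual model is passed in as a function, and `mean_of_last_week` is a deliberately naive, hypothetical stand-in used only to make the example runnable; in the study this role is played by GPR or ARIMA.

```python
def residuals(observed, baseline):
    """Residual series r_t = observed_t - baseline_t over the fitting window."""
    return [o - b for o, b in zip(observed, baseline)]

def hybrid_forecast(observed, baseline_fit, baseline_future, residual_model):
    """Add forecast residuals to the mechanistic baseline over the forecast window."""
    r = residuals(observed, baseline_fit)
    r_hat = residual_model(r, len(baseline_future))
    return [b + c for b, c in zip(baseline_future, r_hat)]

def mean_of_last_week(r, horizon):
    """Hypothetical stand-in residual model: repeat the mean of the last 7 residuals."""
    tail = r[-7:]
    return [sum(tail) / len(tail)] * horizon
```

Keeping the residual learner behind a single function argument means the same baseline trajectories can be reused for both hybrid variants, which is what allows a like-for-like comparison between them.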

Advantages of the hybrid approach

By integrating mechanistic epidemiological modeling with data-driven residual learning, the hybrid system enhances forecast accuracy, corrects systematic errors, and incorporates uncertainty quantification. This combined strategy is particularly well-suited for complex epidemics such as COVID-19, where dynamics are influenced by heterogeneous factors not fully captured by deterministic models alone.

Software and computational environment

All simulations, parameter estimation procedures, numerical integrations, and hybrid forecasting experiments were carried out in MATLAB R2023a. Custom MATLAB scripts were used to implement the SEAIQHRD model, hybrid forecasting workflow, and all evaluation metrics. The GPR and ARIMA components were executed using MATLAB's built-in functions, which apply internal optimization routines; therefore, no manual hyperparameter tuning was required. All graphical outputs (Figs. 1–10) and performance tables (Tables 1–5) were generated directly from these MATLAB codes to ensure full reproducibility of the results.

Fig. 10 — Comparison of statistical test results for SEAIQHRD+ARIMA and SEAIQHRD+GPR models across four datasets: (a) daily confirmed cases, (b) daily death cases, (c) cumulative confirmed cases, and (d) cumulative death cases. Each subplot reports the test statistics or p-values (log scale) for five evaluation tests—Diebold–Mariano (DM), Clark–West (CW), Giacomini–White (GW), Wilcoxon signed-rank, and Friedman tests. Across all datasets, the SEAIQHRD+GPR hybrid exhibits consistently lower test values and smaller p-values than the ARIMA-based hybrid, indicating statistically superior forecasting performance

Table 5.

Performance comparison for cumulative COVID-19 death cases

Metric	SEAIQHRD	SEAIQHRD+ARIMA	SEAIQHRD+GPR
sMAPE	56.21	40.03	22.14
RMSE	29,808	12,944	7,510
MAE	24,142	9,522	5,188
CRPS
WI	0.9890	0.9962	0.9988
Skill Score	0.9581	0.9821	0.9929
PBIAS	−0.91	−0.42	−0.18
Cumulative Error
%Deviation	15.79%	5.87%	2.09%
Explained Variance Score (EVS)	0.81	0.93	0.98


Error metrics and model evaluation criteria

This subsection presents the statistical measures used to evaluate the accuracy, reliability, and overall performance of the proposed hybrid modeling approach. Because epidemic forecasting requires capturing both short-term fluctuations and long-term epidemiological trends, a diverse set of error metrics is needed to reflect different dimensions of predictive behavior. The metrics below quantify absolute and relative errors, probabilistic calibration, and the level of agreement between observed and predicted values. The subsection also outlines the forecast superiority tests applied to determine whether the hybrid models achieve statistically significant improvements over the standalone SEAIQHRD system. Collectively, these criteria provide a comprehensive basis for assessing the robustness and predictive validity of the proposed hybrid models across multiple COVID-19 indicators.

Symmetric mean absolute percentage error (sMAPE)

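The Symmetric Mean Absolute Percentage Error (sMAPE) measures the accuracy of a forecasting model. Unlike traditional MAPE, sMAPE accounts for both overestimation and underestimation symmetrically. In its commonly used form, it is given by:

$$\mathrm{sMAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \frac{\lvert F_i - A_i \rvert}{\big(\lvert A_i \rvert + \lvert F_i \rvert\big)/2},$$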

where A_i is the actual value, F_i is the forecasted value, and n is the number of observations.

Root mean square error (RMSE)

RMSE quantifies the standard deviation of the residuals (prediction errors) by calculating the square root of the average squared differences between predicted and actual values. It penalizes larger errors more than smaller ones due to squaring, making it particularly useful when large deviations are undesirable. It is given by

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (A_i - F_i)^2},$$

where A_i is the actual value, F_i is the predicted value, and n is the number of observations.

Mean absolute error (MAE)

MAE measures the average magnitude of errors in a set of predictions, without considering their direction (positive or negative). Unlike RMSE, MAE treats all errors equally without amplifying larger ones. It is a useful metric for understanding the overall error magnitude in a model, with lower MAE values indicating a more accurate predictive model. It is represented by

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert A_i - F_i \rvert,$$

where A_i and F_i are actual and predicted values, respectively.
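The three point-error metrics above can be computed directly from paired lists of actual and forecast values. The sketch below is a plain-Python illustration with hypothetical function names; the sMAPE variant with the (|A| + |F|)/2 denominator is an assumption, since several definitions are in common use.

```python
import math

def smape(actual, forecast):
    """sMAPE in percent, using the (|A| + |F|) / 2 denominator (one common variant)."""
    terms = [abs(f - a) / ((abs(a) + abs(f)) / 2.0) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
```

With `actual = [100, 200]` and `forecast = [110, 180]`, MAE is 15 while RMSE is about 15.81, showing how squaring gives the larger error more weight. Note that `smape` is undefined when an actual value and its forecast are both zero, which can occur in daily death series and would need explicit handling.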

Continuous ranked probability score (CRPS)

CRPS evaluates the accuracy of probabilistic forecasts by measuring the difference between predicted probability distributions and actual observed values. It is particularly useful in uncertainty quantification, as it assesses how well a probabilistic model captures the variability in the data. Lower CRPS values indicate better probabilistic forecasting performance. It is defined by

$$\mathrm{CRPS}(F, A) = \int_{-\infty}^{\infty} \big(F(y) - H(y - A)\big)^2 \, dy,$$

where F(y) is the predicted CDF, A is the actual observation, and H(·) is the Heaviside step function.
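When the predictive distribution is Gaussian, as it is for GPR, the CRPS integral has a well-known closed form and no numerical integration is needed. The sketch below assumes a Gaussian forecast with mean `mu` and standard deviation `sigma`; the function name is hypothetical, and how CRPS was evaluated for the non-Gaussian cases in the study is not specified here.

```python
import math

def crps_gaussian(mu, sigma, obs):
    """Closed-form CRPS of a N(mu, sigma^2) forecast against a single observation."""
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
```

As `sigma` shrinks toward zero the score approaches the absolute error |obs − mu|, so CRPS can be read as a generalization of MAE to probabilistic forecasts.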

Willmott’s index (WI)

Willmott’s Index (WI) is a widely used error–agreement measure designed to quantify the degree to which model predictions match observed data. Unlike traditional correlation measures that only capture linear association, WI evaluates the accuracy of forecasts by penalizing large deviations more strongly than small ones. The index ranges from 0 to 1, where values closer to 1 indicate superior predictive agreement. It is computed using:

$$\mathrm{WI} = 1 - \frac{\sum_{t=1}^{n} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{n} \big(\lvert \hat{y}_t - \bar{y} \rvert + \lvert y_t - \bar{y} \rvert\big)^2},$$

where $y_t$ and $\hat{y}_t$ denote the observed and predicted values at time t, respectively, and $\bar{y}$ is the mean of the observations. The denominator represents the potential maximum error, allowing WI to measure model improvement relative to the worst-case deviation.

Skill score

The Skill Score provides a comparative measure of forecasting performance relative to a reference model, typically the climatological mean or persistence forecast. It evaluates how much a forecasting model reduces error compared to the reference baseline. A Skill Score of 1 indicates perfect forecasting, whereas a score of 0 suggests no improvement over the baseline. Negative values imply worse performance than the reference. The Skill Score is defined as:

$$\mathrm{SS} = 1 - \frac{\mathrm{MSE}_{\text{model}}}{\mathrm{MSE}_{\text{ref}}},$$

where $\mathrm{MSE}_{\text{model}}$ is the mean squared error of the model and $\mathrm{MSE}_{\text{ref}}$ is the mean squared error of the reference forecast. This metric enables an interpretable comparison of forecasting efficiency, particularly useful in evaluating hybrid or machine-learning models against simpler benchmarks.

Absolute percentage bias (PBIAS)

Absolute Percentage Bias (PBIAS) quantifies the average tendency of a forecasting model to systematically overestimate or underestimate the observed values. It expresses the cumulative bias as a percentage of the total observed magnitude, allowing for intuitive interpretation. PBIAS values closer to zero denote superior calibration, with zero indicating an unbiased model. It is computed as:

$$\mathrm{PBIAS} = 100 \times \frac{\sum_{t=1}^{n} (\hat{y}_t - y_t)}{\sum_{t=1}^{n} y_t},$$

where $y_t$ and $\hat{y}_t$ represent observed and predicted values, respectively. Positive values imply model overestimation, while negative values indicate systematic underestimation. This metric is widely applied in hydrology, epidemiology, and energy forecasting for assessing long-term forecasting bias.

Explained variance score (EVS)

The Explained Variance Score (EVS) measures the proportion of variability in the observed data that is captured by the forecasting model. It evaluates how well the model accounts for fluctuations around the mean of the observed series. EVS values lie between −∞ and 1, where a score of 1 indicates perfect agreement with the observed variance, while values approaching 0 signify poor explanatory capability. Negative values imply that the model performs worse than a simple baseline predictor using the mean of the observations. EVS is computed as:

$$\mathrm{EVS} = 1 - \frac{\operatorname{Var}(y_t - \hat{y}_t)}{\operatorname{Var}(y_t)},$$

where $y_t$ and $\hat{y}_t$ denote the observed and predicted values, respectively. Higher EVS values reflect stronger model performance in reproducing the true variability of the data. This metric is widely used in machine learning, climate modeling, and epidemiological forecasting to assess the explanatory strength of predictive models.
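The four agreement and bias measures (WI, Skill Score, PBIAS, EVS) can be sketched together, since they share the same inputs. The function names are hypothetical, the formulas are the standard textbook forms, and the PBIAS sign is chosen so that positive values mean overestimation, matching the convention stated in the text.

```python
def mean(values):
    return sum(values) / len(values)

def willmott_index(obs, pred):
    """1 - sum of squared errors / potential error around the observed mean."""
    m = mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - m) + abs(o - m)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

def skill_score(obs, pred, ref):
    """1 - MSE(model) / MSE(reference forecast)."""
    mse_model = mean([(o - p) ** 2 for o, p in zip(obs, pred)])
    mse_ref = mean([(o - r) ** 2 for o, r in zip(obs, ref)])
    return 1.0 - mse_model / mse_ref

def pbias(obs, pred):
    """Percentage bias; positive when the model overestimates on aggregate."""
    return 100.0 * sum(p - o for o, p in zip(obs, pred)) / sum(obs)

def explained_variance(obs, pred):
    """1 - Var(error) / Var(observations), using population variances."""
    err = [o - p for o, p in zip(obs, pred)]
    me, mo = mean(err), mean(obs)
    var_err = mean([(e - me) ** 2 for e in err])
    var_obs = mean([(o - mo) ** 2 for o in obs])
    return 1.0 - var_err / var_obs
```

A forecast that is offset from the observations by a constant illustrates why the measures are complementary: for `obs = [1, 2, 3]` and `pred = [2, 3, 4]`, EVS is 1 because the error has no variance, whereas PBIAS is 50% and WI drops to about 0.73.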

Forecast accuracy tests

To rigorously evaluate whether the Hybrid SEAIQHRD–GPR model provides statistically superior forecasts relative to the baseline SEAIQHRD model, we employ five complementary forecast comparison tests: the Diebold–Mariano (DM) test [58], the Clark–West (CW) test for nested models [59], the Giacomini–White (GW) conditional predictive ability test [62], the Wilcoxon signed-rank test [60], and the Friedman test [61]. Across all parametric tests, the loss differential is defined as

$$d_t = L(e_{S,t}) - L(e_{H,t}),$$

where $e_{S,t}$ and $e_{H,t}$ denote forecast errors from the SEAIQHRD and Hybrid SEAIQHRD–GPR models, respectively, and $L(\cdot)$ represents the squared-error loss, $L(e) = e^2$, unless otherwise stated. Consistent with the hypothesis that the Hybrid model improves predictive accuracy, all parametric tests employ a one-sided alternative that $\mathbb{E}[d_t] > 0$.

Diebold–Mariano (DM) test

The Diebold–Mariano test [58] evaluates whether two forecasting models have equal expected predictive accuracy by analysing the mean loss differential while correcting for autocorrelation using a HAC estimator. The DM statistic is

$$\mathrm{DM} = \frac{\bar{d}}{\sqrt{\hat{\sigma}_d^2 / T}},$$

where $\bar{d} = \frac{1}{T}\sum_{t=1}^{T} d_t$ is the average loss differential, $d_t$ is the baseline loss minus the Hybrid loss, and $\hat{\sigma}_d^2$ is its Newey–West long-run variance. Under the null of equal predictive accuracy, DM is asymptotically standard normal. A significantly positive statistic indicates superior forecasting accuracy of the Hybrid model.
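The DM statistic can be sketched with a Bartlett-weighted (Newey–West) long-run variance. The function names and the `lags` truncation parameter are assumptions introduced for the example, squared-error loss is assumed, and the lag length used in the study is not stated here.

```python
import math

def dm_statistic(err_base, err_hybrid, lags=0):
    """Diebold-Mariano statistic for squared-error loss; positive favors the hybrid."""
    d = [a * a - b * b for a, b in zip(err_base, err_hybrid)]  # loss differential
    t = len(d)
    d_bar = sum(d) / t
    c = [x - d_bar for x in d]

    def gamma(k):
        """Sample autocovariance of the loss differential at lag k."""
        return sum(c[i] * c[i - k] for i in range(k, t)) / t

    lrv = gamma(0)
    for k in range(1, lags + 1):
        lrv += 2.0 * (1.0 - k / (lags + 1.0)) * gamma(k)       # Bartlett weights
    return d_bar / math.sqrt(lrv / t)

def one_sided_p_value(stat):
    """P(Z > stat) under the asymptotic standard-normal null."""
    return 0.5 * math.erfc(stat / math.sqrt(2.0))
```

For one-step-ahead forecasts `lags=0` is the usual choice, while multi-step forecasts generally call for a larger truncation lag. A loss differential that is constant over time makes the variance estimate zero, so that degenerate case needs separate handling.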

Clark–West (CW) test

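The Clark–West test [59] extends the DM test to the case of nested models, correcting the bias against the larger model that arises from the noise introduced by estimating its additional parameters. The adjusted loss differential is

$$\tilde{d}_t = e_{S,t}^2 - \Big[e_{H,t}^2 - \big(\hat{y}_{S,t} - \hat{y}_{H,t}\big)^2\Big],$$

where $e_{S,t}$ and $e_{H,t}$ are the forecast errors of the SEAIQHRD and Hybrid models, and $(\hat{y}_{S,t} - \hat{y}_{H,t})^2$ is the squared difference between the two forecasts, i.e., the squared fitted correction contributed by the unrestricted (Hybrid) model. A significantly positive CW statistic supports the superiority of the Hybrid model.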

Giacomini–White (GW) test

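The Giacomini–White test [62] assesses conditional predictive ability and is suitable when forecast performance may vary across regimes or information sets. The test statistic is

$$\mathrm{GW} = T\, \bar{g}^\top W^{-1} \bar{g}, \qquad \bar{g} = \frac{1}{T} \sum_{t=1}^{T} g_t,$$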

where g_t is a vector transformation of the loss differential and W is a HAC covariance matrix. Under the null, the statistic follows a chi-squared distribution. Large values favor the Hybrid model.

Wilcoxon signed-rank test

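The Wilcoxon signed-rank test [60] provides a nonparametric alternative for paired forecast comparison without assuming normality. It evaluates whether the median difference in absolute errors differs from zero using

$$W = \sum_{i=1}^{n} \operatorname{sgn}(D_i)\, R_i,$$

where $D_i = \lvert e_{S,i} \rvert - \lvert e_{H,i} \rvert$ is the difference in absolute errors between the SEAIQHRD and Hybrid models, and $R_i$ is the rank of $\lvert D_i \rvert$. A significantly positive value indicates that the Hybrid model consistently yields lower forecast errors.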
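The signed-rank sum is straightforward to compute once ties and zero differences are handled. The sketch below follows the common convention of dropping zero differences and assigning average ranks to ties; the function name is hypothetical, and the p-value computation (exact tables or normal approximation) is omitted.

```python
def signed_rank_sum(err_base, err_hybrid):
    """Sum of signed ranks of D_i = |e_base,i| - |e_hybrid,i|; positive favors the hybrid."""
    d = [abs(a) - abs(b) for a, b in zip(err_base, err_hybrid)]
    d = [x for x in d if x != 0]                      # zero differences are dropped
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2.0 + 1.0                     # average rank for tied |D_i|
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return sum(r if x > 0 else -r for x, r in zip(d, ranks))
```

For `err_base = [3, -5, 2, 4]` and `err_hybrid = [1, 1, 3, 4]`, the differences are 2, 4, −1, and 0; the zero is dropped, the remaining ranks are 2, 3, and 1, and the signed sum is 2 + 3 − 1 = 4.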

Friedman test

The Friedman test [61] is a nonparametric alternative to repeated-measures ANOVA for comparing two or more forecasting models across repeated observations. If $r_{tj}$ denotes the rank of model j at time t, the test statistic is

$$\chi_F^2 = \frac{12}{T M (M+1)} \sum_{j=1}^{M} R_j^2 - 3T(M+1), \qquad R_j = \sum_{t=1}^{T} r_{tj},$$

where M is the number of models and T the number of forecasting instances. A significant result indicates that the models' ranks differ systematically across forecasting instances; together with the rank sums, it supports the conclusion that the Hybrid model achieves consistently superior ranks.
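The Friedman statistic can be computed from a table of absolute errors with one row per forecasting instance and one column per model. The sketch below uses the standard rank-sum form without the tie-correction factor; the function name and input layout are assumptions made for the illustration.

```python
def friedman_statistic(abs_errors):
    """abs_errors[t][j] = absolute error of model j at time t (rank 1 = smallest error)."""
    t_count, m = len(abs_errors), len(abs_errors[0])
    rank_sums = [0.0] * m
    for row in abs_errors:
        order = sorted(range(m), key=lambda j: row[j])
        i = 0
        while i < m:
            k = i
            while k + 1 < m and row[order[k + 1]] == row[order[i]]:
                k += 1
            avg = (i + k) / 2.0 + 1.0                 # average rank within a tie group
            for idx in range(i, k + 1):
                rank_sums[order[idx]] += avg
            i = k + 1
    stat = (12.0 / (t_count * m * (m + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * t_count * (m + 1)
    return stat, rank_sums
```

When three models keep the same ordering at each of four time points, the rank sums are 12, 8, and 4 and the statistic reaches its maximum of T(M − 1) = 8. The statistic is compared against a chi-squared distribution with M − 1 degrees of freedom, and the rank sums show which model drives a significant result.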

Numerical simulation

In this section, we describe the numerical implementation and calibration of the SEAIQHRD model and its hybrid extensions, SEAIQHRD+GPR and SEAIQHRD+ARIMA. The models are fitted to COVID-19 data from India comprising four epidemiological indicators: (i) daily confirmed cases, (ii) cumulative confirmed cases, (iii) daily deaths, and (iv) cumulative deaths. The dataset spans the period from 1 March 2020 to 22 October 2021 and was obtained from the publicly available Worldometer database (https://www.worldometers.info/coronavirus/country/india/) [63]. These time series form the basis for parameter estimation, model fitting, and comparative evaluation of the baseline and hybrid forecasting models.

Parameter estimation using the nonlinear least-squares method

The deterministic SEAIQHRD model was calibrated by fitting it to the daily confirmed COVID-19 case data to estimate four key transmission and progression parameters: β, λ, θ, and κ. Parameter estimation was performed using the nonlinear least-squares (NLS) approach implemented through the lsqnonlin function in MATLAB. The NLS procedure seeks the parameter set Θ that minimizes the discrepancy between observed data and model output by solving

$$\hat{\Theta} = \arg\min_{\Theta} \sum_{i=1}^{n} \big(y_i - \hat{y}_i(\Theta)\big)^2,$$

where $y_i$ denotes the observed data, $\hat{y}_i(\Theta)$ represents the model predictions for parameter set Θ, and n is the number of data points. The remaining model parameters were adopted from previously published studies. A complete list of fixed and estimated parameter values is provided in Table 1.

Convergence diagnostics and stability assessment of the parameter estimation procedure

Figure 4(a) demonstrates that the multi-start objective-function distribution is sharply concentrated around a single minimum, indicating that the optimization procedure consistently converges to the same region of the parameter space irrespective of the initial guesses. The narrow spread in objective-function values confirms that the inverse problem is well-posed and that the estimated parameters are uniquely identifiable under the selected optimization framework. In particular, the global minimum is repeatedly reached across multiple random starts, validating the robustness of the fitting procedure.

Fig. 4 — Convergence analysis of the parameter estimation procedure. (a) Multi-start objective-function distribution. (b) Residual-norm convergence curve. (c) Parameter trace plots across optimization iterations

Figure 4(b) illustrates the convergence behaviour of the residual norm across optimization iterations. A rapid decline in the residual norm during the first few iterations is observed, followed by a stable plateau, indicating that the algorithm efficiently reduces the model–data mismatch and reaches numerical convergence. This monotonic decrease confirms that the chosen optimization routine is stable, free from oscillatory behaviour, and able to reliably refine parameter updates until the stopping criteria are satisfied.

Figure 4(c) presents the parameter traces for the four estimated model parameters, β, λ, θ, and κ, across optimization iterations. Each parameter exhibits rapid early adjustment followed by stabilization, confirming that the optimization algorithm successfully identifies steady optimal values. The approximate converged values are β ≈ 0.65, λ ≈ 0.045, θ ≈ 0.015, and κ ≈ 0.31, which align with the results obtained from the bootstrap stability analysis. The smooth convergence of all parameter trajectories further demonstrates the reliability and internal consistency of the estimation procedure.

Figure 5(a) shows that the parameter β is estimated with exceptionally high precision. Most bootstrap samples cluster tightly around the value β ≈ 0.65, forming a sharp and narrow peak. Only a few samples deviate from this region, indicating that the optimization procedure consistently identifies the same value across resampled datasets. The strong concentration of the distribution confirms that β is a highly stable and well-identified parameter within the model.

Figure 5(b) demonstrates that the quarantine-rate parameter λ also exhibits a focused distribution, centered near λ ≈ 0.045. Although the spread is slightly wider than that observed for β, the distribution remains unimodal and compact, showing limited sensitivity to data resampling. These results indicate that λ is statistically reliable and that its estimate remains robust under bootstrap perturbations.

As shown in Fig. 5(c), the parameter θ displays a broader bootstrap distribution, centered around θ ≈ 0.015. Compared with β and λ, the increased spread suggests that θ is more sensitive to sampling variability, which is expected for parameters with indirect influence in epidemiological dynamics. Nevertheless, the distribution maintains a clear central tendency, indicating that θ remains identifiable, albeit with higher uncertainty.

Figure 5(d) illustrates a sharply concentrated distribution for the quarantine-to-hospitalization parameter κ, with most samples located around κ ≈ 0.31. The near-perfect alignment of the bootstrap mean with the optimal estimate demonstrates exceptional stability of this parameter. The narrow spread and minimal tail behavior confirm that κ is one of the most reliably estimated parameters in the model.
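The logic behind a bootstrap stability analysis can be illustrated on a toy problem. The sketch below assumes a residual bootstrap and a one-parameter linear model y = b·x with a closed-form least-squares fit; both are stand-ins chosen for brevity, since the resampling scheme used in the study is not detailed here and the SEAIQHRD fit requires a numerical optimizer.

```python
import random

def fit_slope(x, y):
    """Closed-form least-squares slope for the toy model y = b * x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def residual_bootstrap(x, y, n_boot=200, seed=0):
    """Refit the model on synthetic data built from resampled residuals."""
    rng = random.Random(seed)
    b_hat = fit_slope(x, y)
    fitted = [b_hat * a for a in x]
    res = [obs - f for obs, f in zip(y, fitted)]
    estimates = []
    for _ in range(n_boot):
        # Resample residuals with replacement and add them to the fitted curve.
        y_star = [f + rng.choice(res) for f in fitted]
        estimates.append(fit_slope(x, y_star))
    return b_hat, estimates
```

A histogram of `estimates` plays the role of one panel of Fig. 5: a narrow peak around `b_hat` indicates a stable, well-identified parameter, while a wide or multimodal spread signals sensitivity to sampling variability.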

GPR model for residual estimation

Although the SEAIQHRD model provides a mechanistic representation of COVID-19 transmission, discrepancies frequently arise between simulated trajectories and observed data due to unmodeled behavioral factors, intervention effects, and reporting irregularities. To correct these systematic deviations, we employ Gaussian Process Regression (GPR) to learn the temporal structure of the residual errors and generate nonlinear, uncertainty-aware correction terms.

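Let $y_{\text{obs}}(t)$ denote the observed data at time t and $\hat{y}_{\text{SEAIQHRD}}(t)$ the corresponding SEAIQHRD model output. The residual series is defined as

$$r(t) = y_{\text{obs}}(t) - \hat{y}_{\text{SEAIQHRD}}(t).$$

GPR assumes that these residuals follow an underlying latent function g(t) contaminated by noise,

$$r(t) = g(t) + \varepsilon(t), \qquad \varepsilon(t) \sim \mathcal{N}(0, \sigma_n^2),$$

where g(t) is modeled as a Gaussian process with mean function m(t) and covariance kernel k(t, t′).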

The covariance function encodes temporal correlations in the residual dynamics. A commonly used kernel is the squared exponential (RBF) kernel,

$$k(t, t') = \sigma_f^2 \exp\!\left(-\frac{(t - t')^2}{2 l^2}\right),$$

where $\sigma_f^2$ is the signal variance and l the characteristic length scale governing correlation decay. Kernel hyperparameters are estimated by maximizing the marginal likelihood, ensuring a balance between data fit and model smoothness.

Once trained, the GPR model provides forecasts of future residuals $\hat{r}(t)$ along with confidence intervals. The hybrid prediction is obtained by augmenting the SEAIQHRD baseline $\hat{y}_{\text{SEAIQHRD}}(t)$ with the learned correction:

$$\hat{y}_{\text{hybrid}}(t) = \hat{y}_{\text{SEAIQHRD}}(t) + \hat{r}(t).$$

This SEAIQHRD+GPR hybrid model captures nonlinear temporal behavior, corrects for deterministic model limitations, and delivers uncertainty-quantified forecasts, thereby improving the robustness of epidemic predictions.

ARIMA model for residual estimation

While the SEAIQHRD model captures the core epidemiological processes of COVID-19 transmission, its deterministic structure may not fully reflect short-term fluctuations arising from stochastic effects, behavioral variability, policy changes, and reporting inconsistencies. To account for these deviations, we apply an Autoregressive Integrated Moving Average (ARIMA) model to statistically characterize and forecast the residual errors.

Let $y_t$ represent the observed value at time t, and $\hat{y}_t$ the SEAIQHRD prediction. The residual series is defined as

$$r_t = y_t - \hat{y}_t.$$

The ARIMA model assumes that the residuals exhibit temporal autocorrelation that can be exploited for short-term forecasting.

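An ARIMA(p, d, q) process for the residual series $r_t$ is defined by

$$\phi(B)\,(1 - B)^d\, r_t = \theta(B)\, \varepsilon_t,$$

where B denotes the backshift operator, p is the autoregressive order, d the differencing order used to induce stationarity, and q the moving-average order. The AR and MA polynomials are:

$$\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p, \qquad \theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q,$$

with white-noise innovations $\varepsilon_t \sim \mathrm{WN}(0, \sigma_\varepsilon^2)$.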

Model identification is performed using the autocorrelation function (ACF), partial autocorrelation function (PACF), and information criteria (AIC/BIC). Once the ARIMA(p, d, q) model is fitted, it provides forecasts of future residuals $\hat{r}_{t+h}$, representing linear statistical corrections to the mechanistic model.

The hybrid model prediction is then obtained by adding these corrections to the SEAIQHRD output $\hat{y}_{t+h}$:

$$\hat{y}^{\text{hybrid}}_{t+h} = \hat{y}_{t+h} + \hat{r}_{t+h}.$$

This SEAIQHRD+ARIMA hybrid model enhances the baseline epidemiological forecast by capturing short-term linear temporal dependencies in the residual structure, thereby improving near-term predictive accuracy.

Comparative analysis of SEAIQHRD and hybrid SEAIQHRD+GPR/ARIMA models

This section presents a detailed comparison of the baseline SEAIQHRD epidemiological model with its hybrid extensions incorporating Gaussian Process Regression (GPR) and ARIMA techniques. The objective is to evaluate how effectively these models replicate and forecast COVID-19 dynamics across multiple indicators, including daily and cumulative confirmed cases as well as daily and cumulative death counts. By contrasting simulations and short-term predictions with reported data, the analysis highlights the limitations of the standalone model and demonstrates the substantial improvements achieved through hybridization. Overall, the results provide clear evidence of enhanced accuracy, reduced error, and improved reliability in the hybrid model forecasts.

Figure 6 provides a comprehensive comparison of the baseline SEAIQHRD epidemiological model with two hybrid extensions designed to improve short-term and medium-term forecasting accuracy for daily confirmed COVID-19 cases. Figure 6(a) presents the SEAIQHRD+GPR hybrid model, while Fig. 6(b) displays the SEAIQHRD+ARIMA hybrid model. In both cases, the reported data are contrasted with the SEAIQHRD simulation and its 50-day prediction (days 601–650), followed by the corresponding hybrid simulation and hybrid prediction. The inset plots in each subfigure zoom into the forecast interval to highlight the differences in short-term predictive behavior. Overall, Fig. 6 illustrates how hybridizing the epidemiological model with data-driven residual learning markedly improves predictive performance, with the GPR-based hybrid offering the closest alignment to real data.

Fig. 6 — Comparison of daily confirmed COVID-19 cases simulated using the SEAIQHRD model and its two hybrid extensions. (a) SEAIQHRD combined with Gaussian Process Regression (SEAIQHRD+GPR). (b) SEAIQHRD combined with ARIMA (SEAIQHRD+ARIMA). Each subplot compares the reported daily cases (blue) with the SEAIQHRD baseline simulation (green), SEAIQHRD prediction (magenta), hybrid simulation (red), and hybrid forecast (black). The inset panels provide a magnified view of the prediction horizon (days 601–650), revealing the substantially improved forecasting performance of both hybrid approaches relative to the standalone SEAIQHRD model

Figure 6(a) demonstrates that the standalone SEAIQHRD model deviates substantially from the reported daily confirmed COVID-19 cases. The SEAIQHRD simulation (green) consistently underestimates both major epidemic peaks and intermediate-wave fluctuations, and its 50-day forecast for days 601–650 (magenta) declines more sharply than the observed trend. Consequently, the model produces a very large cumulative error and a percentage deviation of 75.94%. When Gaussian Process Regression is incorporated, the hybrid SEAIQHRD+GPR model shows a marked improvement over the baseline. The GPR-corrected simulation (red) captures the heights of both major waves, smooths spurious oscillations, and aligns closely with the reported trajectory. Its forecast for days 601–650 (black) follows a realistic gradual decline, as highlighted in the inset. Numerically, the hybrid model sharply reduces the cumulative error and lowers the percentage deviation to 12.12% (Table 2), a more than five-fold improvement over the standalone model. These results indicate that the GPR-based hybrid provides the most accurate and reliable short-term predictions among the tested frameworks.

Figure 6(b) presents a similar comparison, where the standalone SEAIQHRD model once again exhibits large deviations from the reported data, reflected in its large cumulative error and percentage deviation of 75.94%. Incorporating ARIMA improves the model fit: the SEAIQHRD+ARIMA simulation (red) captures several short-term variations more accurately, and its forecast for days 601–650 (black) follows the declining trend more realistically than the baseline projection. The SEAIQHRD+ARIMA hybrid lowers the cumulative error and reduces the percentage deviation to 23.96%, a clear benefit relative to the standalone model. However, its accuracy remains inferior to the SEAIQHRD+GPR hybrid. Compared with ARIMA, the GPR-enhanced model achieves a lower cumulative error and a lower percentage deviation (12.12% versus 23.96% in Table 2), and its forecast curve aligns better with the observed epidemic dynamics. Therefore, while both hybrid methods improve predictive performance, the GPR-based hybrid is the more accurate and robust forecasting framework.

Figure 7 compares reported daily COVID-19 deaths with outputs from the SEAIQHRD model and the hybrid SEAIQHRD+GPR model over 650 days. The SEAIQHRD simulation (green) and its 50-day prediction (magenta) deviate noticeably from the reported data. In contrast, the hybrid model—shown by the red simulation curve and black prediction curve—closely follows the observed trend by correcting residual errors through Gaussian Process Regression. The inset further highlights the improved forecast behavior of the hybrid model. Overall, while the SEAIQHRD model alone fails to capture key mortality dynamics, the hybrid SEAIQHRD+GPR framework provides a more accurate and reliable short-term prediction.

Fig. 7 — SEAIQHRD and hybrid model fitting to daily COVID-19 death cases with future predictions

Figure 7(a) shows that the standalone SEAIQHRD model (green) fails to capture the magnitude and timing of the reported daily COVID-19 deaths. The model significantly underestimates both major mortality peaks and mid-wave fluctuations, and its prediction for days 601–650 (magenta) declines far more rapidly than the observed trend. This mismatch results in a large cumulative error and a percentage deviation of 86.57%. The SEAIQHRD+GPR hybrid, however, demonstrates a markedly improved fit. The GPR-adjusted simulation (red) closely tracks the reported mortality curve, capturing the sharp rise and decline of the second wave as well as the fine-scale oscillations. The 50-day forecast (black), as highlighted in the inset, follows a smooth and realistic downward trajectory, unlike the steep decline predicted by the baseline model. Quantitatively, the GPR-based hybrid sharply reduces the cumulative error and lowers the percentage deviation to 18.94%, a more than fourfold improvement over the standalone SEAIQHRD model. These results indicate that the GPR-enhanced hybrid provides accurate and reliable short-term mortality predictions.

Figure 7(b) similarly reveals substantial discrepancies in the standalone SEAIQHRD model’s daily mortality estimates, reflected by its large cumulative error and percentage deviation of 86.57%. Incorporating ARIMA yields noticeable improvements: the SEAIQHRD+ARIMA simulation (red) captures several localized fluctuations more accurately and better reproduces the overall temporal structure of the mortality waves. Its prediction curve for days 601–650 (black) shows a smoother downward trend than the baseline projection, though some misalignment with the reported data persists. The SEAIQHRD+ARIMA hybrid lowers the cumulative error and reduces the percentage deviation to 44.82%, a meaningful improvement over the standalone model. However, these improvements remain modest compared with the SEAIQHRD+GPR hybrid, which achieves a substantially lower cumulative error and deviation (18.94%). Overall, while both hybrid strategies enhance daily mortality modeling and forecasting, the GPR-based hybrid provides the most accurate, stable, and reliable predictions.

Figure 8 provides a comparison of the reported cumulative COVID-19 confirmed cases with simulations and predictions from the SEAIQHRD model and the hybrid SEAIQHRD+GPR model. The standalone SEAIQHRD simulation (green) diverges from the observed cumulative growth, and its 50-day prediction (magenta) shows a noticeable downward bias. In contrast, the hybrid model integrates Gaussian Process Regression (GPR) to correct residual errors, resulting in a simulation (red) that closely follows the reported cumulative trajectory. The hybrid prediction curve (black) offers a smoother and more realistic continuation beyond day 600, as highlighted in the inset. Overall, the GPR-enhanced hybrid substantially improves long-term cumulative forecasting accuracy over the baseline SEAIQHRD model.

Fig. 8 — Comparison of cumulative confirmed COVID-19 cases simulated using the SEAIQHRD model and its two hybrid extensions. (a) SEAIQHRD combined with Gaussian Process Regression (SEAIQHRD+GPR). (b) SEAIQHRD combined with ARIMA (SEAIQHRD+ARIMA). Each subplot compares the reported cumulative cases (blue) with the SEAIQHRD baseline simulation (green), SEAIQHRD prediction (magenta), hybrid simulation (red), and hybrid forecast (black). The inset panels present a magnified view of the forecast horizon (days 601–650), clearly illustrating the enhanced predictive performance achieved by both hybrid frameworks relative to the standalone SEAIQHRD model

Figure 8(a) shows that, despite the smoothing inherent in cumulative trends, the standalone SEAIQHRD model (green) still deviates noticeably from the reported cumulative confirmed cases. Its prediction for days 601–650 (magenta) bends downward faster than the observed trajectory, leading to a large cumulative error and a percentage deviation of 15.55%. In contrast, the hybrid SEAIQHRD+GPR model aligns closely with the observed cumulative curve. The GPR-corrected simulation (red) follows each growth phase, and its forecast for days 601–650 (black), shown in the inset, maintains a realistic upward trend consistent with the reported data. Quantitatively, the hybrid reduces the cumulative error and lowers the deviation to only 1.74%, nearly an order-of-magnitude improvement over the baseline model. These results indicate that the GPR-based hybrid provides the most accurate and dependable cumulative-case predictions.

Figure 8(b) similarly indicates that the standalone SEAIQHRD model systematically underestimates cumulative confirmed cases, resulting in a large cumulative error and a percentage deviation of 15.55%. Incorporating ARIMA improves the fit: the SEAIQHRD+ARIMA simulation (red) tracks the intermediate growth phases more closely and stays nearer to the reported trajectory, and its 50-day forecast (black) provides a smoother, more realistic continuation of the cumulative curve than the SEAIQHRD-only projection, though small deviations remain. The SEAIQHRD+ARIMA hybrid lowers the cumulative error and reduces the percentage deviation to 4.32%, a meaningful improvement over the baseline. However, its performance remains inferior to the SEAIQHRD+GPR hybrid, which attains a much lower cumulative error and a deviation of 1.74%. Thus, although both hybrid models enhance cumulative-case forecasting, the GPR-based hybrid consistently provides the highest accuracy and the most reliable long-term predictive capability.

Figure 9 presents a comparison of the reported cumulative COVID-19 death cases with the SEAIQHRD model and the hybrid SEAIQHRD+GPR model. The standalone SEAIQHRD simulation (green) underestimates the cumulative mortality trend, and its 50-day prediction (magenta) deviates further from the observed growth. In contrast, the hybrid model enhanced with Gaussian Process Regression (GPR) provides a much closer fit to the reported data, with the simulation (red) capturing the cumulative pattern accurately. The hybrid prediction curve (black) offers a smoother and more realistic extension beyond day 600, as illustrated in the inset. Overall, the GPR-based hybrid delivers a significantly more reliable and accurate cumulative death forecast than the baseline model.

Fig. 9 — SEAIQHRD and hybrid model fitting to cumulative death COVID-19 cases with future predictions

Figure 9(a) presents the comparison of cumulative COVID-19 death cases with the SEAIQHRD model and the hybrid SEAIQHRD+GPR model. The standalone SEAIQHRD simulation (green) consistently underestimates cumulative deaths during major growth phases, and its 50-day prediction (magenta) deviates further by projecting a slower increase than observed. This mismatch yields a large cumulative error and a deviation of 15.79%. In contrast, the hybrid SEAIQHRD+GPR model aligns very closely with the reported cumulative mortality curve. The GPR-adjusted simulation (red) captures the entire progression more accurately, and its prediction for days 601–650 (black), highlighted in the inset, follows a smoother and more realistic rising pattern. Quantitatively, the GPR hybrid achieves a markedly lower cumulative error and a deviation of only 2.09%, a more than seven-fold improvement over the baseline model. These results indicate that the GPR-enhanced hybrid delivers accurate and dependable long-term cumulative death forecasts.

Figure 9(b) similarly shows that the standalone SEAIQHRD model substantially underestimates cumulative death cases, producing a large cumulative error and a deviation of 15.79%. Incorporating ARIMA improves model performance: the SEAIQHRD+ARIMA simulation (red) follows the cumulative growth more closely, and its prediction for days 601–650 (black) extends the curve more realistically than the SEAIQHRD-only model. The ARIMA hybrid lowers the cumulative error and reduces the deviation to 5.87%, a meaningful enhancement over the baseline. However, it remains less accurate than the SEAIQHRD+GPR hybrid, which achieves a far lower cumulative error and deviation (2.09%). Thus, although both hybrid approaches improve cumulative death modeling, the GPR-based hybrid consistently provides the most precise and reliable long-term forecasts.

Across Figs. 6–9, the standalone SEAIQHRD model consistently deviates from the reported COVID-19 trends, underestimating peak magnitudes in daily cases and deaths and misrepresenting long-term cumulative growth. The SEAIQHRD+ARIMA hybrid provides noticeable improvements, capturing short-term fluctuations and reducing prediction bias; however, discrepancies remain, particularly in the 50-day forecast segments. In contrast, the SEAIQHRD+GPR hybrid model shows excellent agreement with the observed data in both daily and cumulative representations. The GPR-corrected simulations closely follow wave patterns, smooth high-frequency variations, and produce realistic forecast trajectories that align with the reported trends. Overall, the figures show that the GPR-enhanced hybrid model offers the most accurate visual fit and predictive reliability among the three modeling approaches.
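The residual-learning idea behind the GPR hybrid can be illustrated with a minimal sketch: a Gaussian process is fitted to the residuals (reported minus mechanistic), and its posterior mean is added back to the mechanistic curve. Everything here is an assumption made for illustration: the toy `mechanistic` curve stands in for the SEAIQHRD output, the synthetic `observed` series is not the Indian data, and the squared-exponential kernel with hand-picked hyperparameters (`length`, `variance`, `noise`) is not necessarily the kernel or tuning used in the study.

```python
import math

def rbf(a, b, length, variance):
    """Squared-exponential covariance between two time points."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward and then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(t_train, r_train, t_new, length=6.0, variance=100.0, noise=1.0):
    """Posterior mean and variance of a zero-mean GP fitted to residuals."""
    n = len(t_train)
    K = [[rbf(t_train[i], t_train[j], length, variance) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = chol_solve(L, r_train)              # (K + noise*I)^{-1} r
    means, variances = [], []
    for t in t_new:
        k_star = [rbf(t, ti, length, variance) for ti in t_train]
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        v = chol_solve(L, k_star)
        variances.append(max(variance - sum(k * vi for k, vi in zip(k_star, v)), 0.0))
    return means, variances

def mechanistic(t):
    """Hypothetical single-wave curve standing in for the compartmental model."""
    return 100.0 * math.exp(-((t - 30.0) / 12.0) ** 2)

# Synthetic "reported" series: mechanistic curve plus a structured residual.
t_train = list(range(40))
observed = [mechanistic(t) + 10.0 * math.sin(t / 6.0) for t in t_train]
residuals = [o - mechanistic(t) for o, t in zip(observed, t_train)]

# Hybrid forecast = mechanistic forecast + GP-predicted residual.
t_future = list(range(40, 50))
mean_future, var_future = gp_predict(t_train, residuals, t_future)
hybrid_forecast = [mechanistic(t) + m for t, m in zip(t_future, mean_future)]
```

The posterior variance returned alongside the mean is what makes the correction uncertainty-aware: it grows as the forecast moves away from the training window, so interval widths widen with the horizon.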

Model performance evaluation using statistical metrics

The following four tables provide a comprehensive comparative assessment of the baseline SEAIQHRD epidemiological model and its two hybrid extensions—SEAIQHRD+ARIMA and SEAIQHRD+GPR—across multiple COVID-19 data categories. Specifically, Tables 2–5 examine model performance for daily confirmed cases, daily deaths, cumulative confirmed cases, and cumulative deaths, respectively. Each table reports a suite of statistical and forecast evaluation metrics, including sMAPE, RMSE, MAE, CRPS, Willmott’s Index (WI), Skill Score, Percentage Bias (PBIAS), cumulative error, and percentage deviation. These metrics collectively enable a rigorous comparison of fitting accuracy, error magnitude, predictive reliability, and overall model robustness. The tabulated results allow clear visualization of how data-driven hybridization improves the predictive capability of the SEAIQHRD model, particularly in capturing nonlinear epidemic trends and minimizing forecast deviations across both short-term and long-term horizons.
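As a point of reference for the point-forecast metrics listed above, the sketch below computes sMAPE, RMSE, MAE, Willmott's Index, PBIAS, and EVS using their common textbook definitions. These definitions are assumptions: several of them (notably sMAPE and the sign convention of PBIAS) have more than one variant in the literature, and this section does not state which variants the study used. CRPS, Skill Score, cumulative error, and percentage deviation are omitted because their exact formulations are not given here.

```python
import math

def forecast_metrics(obs, pred):
    """Common point-forecast metrics for paired observed/predicted series."""
    n = len(obs)
    if n == 0 or n != len(pred):
        raise ValueError("obs and pred must be non-empty and of equal length")
    err = [p - o for o, p in zip(obs, pred)]
    mean_obs = sum(obs) / n
    mean_err = sum(err) / n

    mae = sum(abs(e) for e in err) / n
    rmse = math.sqrt(sum(e * e for e in err) / n)

    # sMAPE in percent (0 to 200); pairs with a zero denominator are skipped.
    terms = [2.0 * abs(e) / (abs(o) + abs(p))
             for o, p, e in zip(obs, pred, err) if abs(o) + abs(p) > 0]
    smape = 100.0 * sum(terms) / len(terms) if terms else 0.0

    # Willmott's index of agreement (1 = perfect agreement).
    wi_den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                 for o, p in zip(obs, pred))
    wi = 1.0 - sum(e * e for e in err) / wi_den if wi_den > 0 else float("nan")

    # PBIAS with the (pred - obs) convention: negative means underestimation.
    total_obs = sum(obs)
    pbias = 100.0 * sum(err) / total_obs if total_obs != 0 else float("nan")

    # Explained variance score: 1 - Var(error) / Var(observed).
    var_err = sum((e - mean_err) ** 2 for e in err) / n
    var_obs = sum((o - mean_obs) ** 2 for o in obs) / n
    evs = 1.0 - var_err / var_obs if var_obs > 0 else float("nan")

    return {"sMAPE": smape, "RMSE": rmse, "MAE": mae,
            "WI": wi, "PBIAS": pbias, "EVS": evs}

# Small illustrative series (not study data).
example = forecast_metrics([10, 20, 30, 40], [12, 18, 33, 37])
```

Because EVS subtracts the mean error before taking the variance, it can stay high for a forecast with a constant bias, which is one reason to read it alongside PBIAS and MAE rather than on its own.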

Table 2.

Comparison of metrics for daily confirmed COVID-19 cases

Metric	SEAIQHRD	SEAIQHRD+ARIMA	SEAIQHRD+GPR
sMAPE	132.05	52.40	28.15
RMSE	58,334	22,541	14,980
MAE	42,507	16,940	10,455
CRPS
WI	0.8346	0.9033	0.9725
Skill Score	0.4745	0.7250	0.8921
PBIAS	−9.82	−5.10	−2.15
Cumulative Error
%Deviation	75.94%	23.96%	12.12%
Explained Variance Score (EVS)	0.26	0.65	0.83


Table 2 shows that the standalone SEAIQHRD model performs poorly in estimating daily confirmed COVID-19 cases, as indicated by its very high sMAPE (132.05), RMSE (58,334), MAE (42,507), and CRPS. The relatively low WI (0.8346), Skill Score (0.4745), and EVS (0.26) further confirm that the model struggles to capture the variability and magnitude of daily case counts. Incorporating ARIMA significantly improves performance, reducing error values across all metrics and increasing WI (0.9033), Skill Score (0.7250), and EVS (0.65). However, the SEAIQHRD+GPR hybrid model provides the most substantial enhancement, achieving the lowest sMAPE (28.15), RMSE (14,980), MAE (10,455), and CRPS, along with the highest WI (0.9725), Skill Score (0.8921), and EVS (0.83). Additionally, the GPR-based hybrid records the smallest cumulative error and percentage deviation (12.12%). Overall, the GPR-enhanced hybrid model offers the most accurate and reliable predictions for daily confirmed cases among the three approaches.

Table 3 presents the performance comparison of the SEAIQHRD model and its hybrid extensions for daily COVID-19 death cases. The standalone SEAIQHRD model performs poorly, as evidenced by its very high sMAPE (125.92), RMSE (881.21), and MAE (643.45), along with a low Willmott’s Index (0.6589), Skill Score (0.2306), and EVS (0.08). These values indicate substantial deviation from the reported daily mortality pattern. Incorporating ARIMA improves the model’s accuracy, reducing all major error metrics—RMSE drops to 522.44, MAE decreases to 410.32, and CRPS is nearly halved. WI and Skill Score also increase to 0.8125 and 0.6124, respectively, and EVS rises to 0.47, showing clear enhancement over the baseline SEAIQHRD model.

Table 3.

Performance comparison for daily COVID-19 death cases

Metric	SEAIQHRD	SEAIQHRD+ARIMA	SEAIQHRD+GPR
sMAPE	125.92	78.45	39.82
RMSE	881.21	522.44	298.11
MAE	643.45	410.32	215.77
CRPS
WI	0.6589	0.8125	0.9277
Skill Score	0.2306	0.6124	0.8251
PBIAS	−15.24	−7.92	−3.11
Cumulative Error
%Deviation	86.57%	44.82%	18.94%
Explained Variance Score (EVS)	0.21	0.56	0.79


However, the SEAIQHRD+GPR hybrid model provides the most significant improvement across all metrics. It achieves the lowest sMAPE (39.82), RMSE (298.11), MAE (215.77), and CRPS, while attaining the highest WI (0.9277), Skill Score (0.8251), and EVS (0.79). In addition, its cumulative error and percentage deviation (18.94%) are markedly lower than those of the SEAIQHRD and SEAIQHRD+ARIMA models. These results show that the GPR-enhanced hybrid model offers the most accurate and reliable representation of daily COVID-19 death cases among the approaches examined.

Table 4 presents the performance comparison of the SEAIQHRD model and its hybrid variants for cumulative confirmed COVID-19 cases. Although cumulative aggregation typically smooths fluctuations, the standalone SEAIQHRD model still shows notable deviations from the reported data, as reflected by its large RMSE (2,478,071), MAE (1,773,126), and high CRPS. The comparatively lower WI (0.9885), Skill Score (0.9531), and EVS (0.77) also indicate room for improvement in capturing long-term cumulative growth. Incorporating ARIMA substantially enhances model accuracy, reducing RMSE to 1,025,330 and MAE to 718,944, lowering CRPS, and improving WI (0.9958), Skill Score (0.9777), and EVS (0.91). However, the SEAIQHRD+GPR hybrid model offers the strongest performance across all evaluation metrics. It achieves the lowest sMAPE (21.45), RMSE (612,410), MAE (395,288), and CRPS, along with the highest WI (0.9989), Skill Score (0.9904), and EVS (0.97). Furthermore, the GPR-enhanced hybrid exhibits the smallest cumulative error and percentage deviation (1.74%), highlighting its superior ability to replicate the cumulative epidemic trajectory. Overall, Table 4 shows that the SEAIQHRD+GPR hybrid model provides the most accurate and reliable fit for cumulative confirmed COVID-19 cases.

Table 4.

Performance comparison for cumulative confirmed COVID-19 cases

Metric	SEAIQHRD	SEAIQHRD+ARIMA	SEAIQHRD+GPR
sMAPE	58.18	36.12	21.45
RMSE	2,478,071	1,025,330	612,410
MAE	1,773,126	718,944	395,288
CRPS
WI	0.9885	0.9958	0.9989
Skill Score	0.9531	0.9777	0.9904
PBIAS	−5.93	−2.12	−0.74
Cumulative Error
%Deviation	15.55%	4.32%	1.74%
Explained Variance Score (EVS)	0.77	0.91	0.97


Table 5 compares the performance of the SEAIQHRD model and its hybrid variants for cumulative COVID-19 death cases. Although cumulative integration smooths fluctuations, the standalone SEAIQHRD model still shows considerable deviation from the reported mortality trajectory, as indicated by its high RMSE (29,808), MAE (24,142), and large CRPS. The model also exhibits a comparatively lower WI (0.9890), Skill Score (0.9581), and EVS (0.84), reflecting limited accuracy in capturing long-term cumulative death trends. Incorporating ARIMA leads to substantial improvements, reducing RMSE to 12,944 and MAE to 9,522, lowering CRPS, and increasing WI to 0.9962, the Skill Score to 0.9821, and EVS to 0.93. The SEAIQHRD+GPR hybrid model, however, delivers the strongest overall performance. It achieves the lowest sMAPE (22.14), RMSE (7,510), MAE (5,188), and CRPS, along with the highest WI (0.9988), Skill Score (0.9929), and EVS (0.98). Moreover, its cumulative error and percentage deviation (2.09%) are smaller than those of both the standalone and ARIMA-enhanced models. These results indicate that the GPR-augmented hybrid formulation provides the most accurate and reliable representation of cumulative COVID-19 death patterns.

Across all four evaluation settings—daily confirmed cases, daily deaths, cumulative confirmed cases, and cumulative deaths—the SEAIQHRD+GPR hybrid model consistently provides the strongest overall performance. The standalone SEAIQHRD model shows the largest errors and weakest agreement with reported data, as reflected by its high sMAPE, RMSE, MAE, CRPS, cumulative error, percentage deviation, and comparatively low WI, Skill Score, and EVS. The SEAIQHRD+ARIMA hybrid offers noticeable improvements, reducing error magnitudes and increasing WI, Skill Score, and EVS across all scenarios, yet still leaves measurable discrepancies in both short- and long-term forecasts.

In contrast, the GPR-enhanced hybrid model achieves the lowest values for sMAPE, RMSE, MAE, CRPS, cumulative error, and percentage deviation, alongside the highest WI, Skill Score, and EVS in every table. These consistent gains clearly demonstrate that the SEAIQHRD+GPR approach delivers the most accurate, stable, and reliable characterization of COVID-19 dynamics, substantially outperforming both the baseline SEAIQHRD model and the ARIMA-based hybrid.
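The "fold improvement" statements made for the percentage deviations can be checked with simple arithmetic. The short sketch below divides each baseline deviation by the corresponding hybrid deviation; the figures are the percentage deviations quoted in this article (Tables 2–4 and the text for cumulative deaths), and the helper name `fold_reduction` is introduced here only for illustration.

```python
# Percentage deviations (%) as (standalone, ARIMA hybrid, GPR hybrid).
pct_dev = {
    "daily confirmed": (75.94, 23.96, 12.12),
    "daily deaths": (86.57, 44.82, 18.94),
    "cumulative confirmed": (15.55, 4.32, 1.74),
    "cumulative deaths": (15.79, 5.87, 2.09),
}

def fold_reduction(baseline, hybrid):
    """How many times smaller the hybrid deviation is than the baseline."""
    return baseline / hybrid

# Map each indicator to (ARIMA fold reduction, GPR fold reduction).
fold = {name: (fold_reduction(b, a), fold_reduction(b, g))
        for name, (b, a, g) in pct_dev.items()}

for name, (arima_fold, gpr_fold) in fold.items():
    print(f"{name:22s} ARIMA: {arima_fold:4.2f}x   GPR: {gpr_fold:4.2f}x")
```

With these inputs the GPR reductions come out at roughly 6.3, 4.6, 8.9, and 7.6 times for the four indicators, against roughly 3.2, 1.9, 3.6, and 2.7 times for ARIMA.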

Figure 10 presents a comprehensive statistical comparison between the SEAIQHRD+ARIMA and SEAIQHRD+GPR hybrid models using five evaluation tests—Diebold–Mariano (DM), Clark–West (CW), Giacomini–White (GW), Wilcoxon signed-rank, and Friedman tests. The figure summarizes model performance across four datasets: daily confirmed cases, daily death cases, cumulative confirmed cases, and cumulative death cases. By examining test statistics and p-values on a logarithmic scale, the figure highlights the relative forecasting accuracy of the two hybrid models and clearly illustrates the consistent superiority of the GPR-enhanced approach.

Figure 10(a) presents the statistical test results for daily confirmed COVID-19 cases, comparing the SEAIQHRD+ARIMA and SEAIQHRD+GPR hybrid models. The DM, CW, and GW statistics for ARIMA (5.20, 6.40, and 3.85) are considerably higher than those for GPR (2.10, 2.10, and 1.40), indicating that the GPR-based model yields substantially lower forecast errors. The Wilcoxon signed-rank test produces a smaller p-value for GPR than for ARIMA, and the Friedman test similarly favors GPR with a lower p-value. These results collectively indicate that the SEAIQHRD+GPR hybrid provides statistically superior predictions for daily confirmed cases.

Figure 10(b) shows the comparative test outcomes for daily death cases. The DM, CW, and GW statistics for ARIMA (4.75, 4.10, and 3.45) remain higher than the corresponding GPR values (1.85, 0.90, and 0.70), demonstrating that the GPR-enhanced model reduces forecast error more effectively. The Wilcoxon signed-rank test again reports a lower p-value for GPR than for ARIMA. Likewise, the Friedman test favors GPR with a smaller p-value. These outcomes indicate that the SEAIQHRD+GPR hybrid delivers significantly more accurate and statistically reliable mortality forecasts.

Figure 10(c) displays the statistical comparison for cumulative confirmed cases. The ARIMA model records very large DM, CW, and GW statistics (6.50, 11.80, and 5.75), whereas the GPR model achieves much lower values (1.10, 2.40, and 0.65), indicating superior cumulative forecast accuracy. The Wilcoxon signed-rank test returns a smaller p-value for GPR than for ARIMA, and the Friedman test likewise favors GPR with a lower p-value. These results highlight the strong long-term predictive advantage of the SEAIQHRD+GPR hybrid for cumulative confirmed cases.

Figure 10(d) presents the statistical-test outcomes for cumulative COVID-19 deaths. The ARIMA model yields higher DM, CW, and GW statistics (4.90, 7.15, and 5.80) compared with GPR (2.65, 4.05, and 3.15), demonstrating greater cumulative forecast error. The Wilcoxon signed-rank p-value for GPR is lower than that of ARIMA, indicating improved pairwise prediction accuracy. The Friedman test further supports GPR's superiority with a smaller p-value. These statistical results indicate that the GPR-enhanced hybrid provides the most reliable and accurate cumulative mortality forecasts.

Overall, the statistical test results consistently demonstrate that the SEAIQHRD+GPR hybrid model outperforms the SEAIQHRD+ARIMA approach across all four datasets. Lower test statistics and smaller p-values obtained from the DM, CW, GW, Wilcoxon, and Friedman tests confirm the superior predictive accuracy and robustness of the GPR-enhanced model. These findings reinforce that Gaussian Process Regression provides a more reliable correction mechanism for epidemiological modeling, leading to significantly improved forecasting performance.
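For readers unfamiliar with the Diebold–Mariano test, the following is a minimal sketch of the statistic for comparing two forecast series against the same observations. It assumes squared-error loss, a rectangular-window long-run variance with `h - 1` autocovariance lags, and a normal reference distribution; the loss function, horizon, and any small-sample correction used in the study are not stated in this section, so this should not be read as the authors' exact procedure. The numbers in the usage lines are made up for illustration.

```python
import math

def diebold_mariano(actual, forecast_1, forecast_2, h=1):
    """DM statistic and two-sided normal p-value under squared-error loss.

    A positive statistic means forecast_1 has the larger average loss.
    """
    # Loss differential d_t = e1_t^2 - e2_t^2.
    d = [(a - f1) ** 2 - (a - f2) ** 2
         for a, f1, f2 in zip(actual, forecast_1, forecast_2)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # Long-run variance of d_t using h-1 autocovariance lags.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance; test is undefined")

    stat = d_bar / math.sqrt(long_run_var / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|stat|))
    return stat, p_value

# Illustrative series: forecast_b has uniformly smaller errors than forecast_a.
actual = [10, 12, 15, 14, 18, 21, 19, 25]
forecast_a = [a + e for a, e in zip(actual, [3, -4, 5, -3, 4, -5, 3, -4])]
forecast_b = [a + e for a, e in zip(actual, [1, -1, 2, -1, 1, -2, 1, -1])]
dm_stat, dm_p = diebold_mariano(actual, forecast_a, forecast_b)
```

Note that the statistic is defined for a pair of forecasts, so its sign and magnitude depend on which model is taken as the reference.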

Results and discussion

Convergence and stability of parameter estimation

Figure 4 presents the convergence diagnostics of the nonlinear least-squares (NLS) parameter estimation procedure. Figure 4(a) shows a sharply clustered multi-start objective-function distribution, indicating that the optimization repeatedly converges to the same minimum regardless of initial guesses. Figure 4(b) demonstrates a rapid decline of the residual norm within the early iterations followed by stabilization, confirming numerical convergence. Figure 4(c) further reveals smooth and monotonic parameter trajectories converging to β ≈ 0.65, λ ≈ 0.045, θ ≈ 0.015, and κ ≈ 0.31. Bootstrap distributions in Fig. 5(a)–5(d) support these results, showing tight clustering for β and κ and moderate but stable distributions for λ and θ, thereby supporting parameter identifiability and robustness.
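The multi-start diagnostic described above can be sketched on a toy problem: a least-squares objective is minimized from many random initial guesses, and the spread of the final objective values indicates whether the runs agree on one minimum. This is an illustration under stated assumptions only. The two-parameter exponential model, the synthetic data, the bounds, and the derivative-free compass search are stand-ins chosen to keep the example dependency-free; the study fits SEAIQHRD parameters by NLS, and its optimizer is not reproduced here.

```python
import math
import random

# Synthetic data from y = a * exp(-b * t) with true (a, b) = (2.0, 0.3).
T = list(range(20))
Y = [2.0 * math.exp(-0.3 * t) for t in T]

def sse(params):
    """Sum of squared residuals for the toy exponential model."""
    a, b = params
    return sum((a * math.exp(-b * t) - y) ** 2 for t, y in zip(T, Y))

def compass_search(f, x0, step=0.5, tol=1e-7, max_iter=20000):
    """Derivative-free local search: try +/- step on each axis, halve on failure."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        best_x, best_f = x, fx
        for i in range(len(x)):
            for delta in (step, -step):
                cand = list(x)
                cand[i] += delta
                fc = f(cand)
                if fc < best_f:
                    best_x, best_f = cand, fc
        if best_f < fx:
            x, fx = best_x, best_f      # accept the best improving neighbor
        else:
            step *= 0.5                 # no improvement: refine the mesh
    return x, fx

def multi_start(f, bounds, n_starts=20, seed=0):
    """Run the local search from random starts; return runs sorted by objective."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_starts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        runs.append(compass_search(f, x0))
    return sorted(runs, key=lambda run: run[1])

runs = multi_start(sse, bounds=[(0.1, 5.0), (0.01, 1.0)])
best_params, best_value = runs[0]
spread = runs[-1][1] - runs[0][1]       # near zero when all starts agree
```

A small `spread` corresponds to the tightly clustered objective-function distribution described for Fig. 4(a); a wide or multi-modal spread would instead point to local minima or weakly identified parameters.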

Performance for daily confirmed cases

Figure 6 compares the reported daily confirmed COVID-19 cases with simulations and forecasts from the SEAIQHRD model, SEAIQHRD+ARIMA, and SEAIQHRD+GPR hybrid models. The standalone SEAIQHRD model substantially underestimates peak magnitudes and misrepresents inter-wave dynamics, yielding a large cumulative error and a percentage deviation of 75.94%. The SEAIQHRD+ARIMA hybrid improves the fit moderately, lowering the cumulative error and reducing the deviation to 23.96%. The SEAIQHRD+GPR hybrid, however, provides the closest alignment to the reported data, capturing peak heights, wave transitions, and fluctuations more accurately, and reducing the deviation to only 14.72%. This represents more than a five-fold improvement over the standalone model.

Performance for daily deaths

Figure 7 illustrates the comparison for daily COVID-19 deaths. The standalone SEAIQHRD model severely underestimates mortality peaks and fails to capture the steep rise during the second wave, leading to a large cumulative error and a deviation of 86.57%. The ARIMA-based hybrid reduces errors modestly, achieving a deviation of 44.82%. In contrast, the GPR-based hybrid significantly improves accuracy, with the deviation decreasing to 18.94%. This demonstrates that GPR's nonparametric residual learning effectively captures local nonlinear fluctuations in mortality data.

Performance for cumulative confirmed cases

Figure 8 shows the cumulative confirmed case dynamics. Although cumulative curves are inherently smoother, the standalone SEAIQHRD model still deviates from the observed trajectory, yielding a large cumulative error and a deviation of 15.55%. The SEAIQHRD+ARIMA hybrid improves long-term growth tracking, reducing the deviation to 4.32%. The SEAIQHRD+GPR hybrid aligns most closely with the reported data, reducing the deviation to just 1.74%. This is nearly an order-of-magnitude improvement over the baseline SEAIQHRD model.

Performance for cumulative deaths

Figure 9 compares cumulative death curves. The standalone SEAIQHRD model persistently underestimates cumulative mortality, resulting in a large cumulative error and a deviation of 15.79%. The ARIMA-enhanced hybrid lowers the cumulative error and reduces the deviation to 5.87%. The GPR hybrid achieves superior performance, reducing the deviation to only 2.09%. The hybrid forecast also provides a smooth and realistic extension beyond day 600, capturing the deceleration phase of cumulative deaths accurately.

Statistical and error-metric comparison

Across all indicators, Tables 2–5 confirm that the SEAIQHRD+GPR hybrid consistently achieves the lowest errors, highest agreement measures, and best probabilistic performance. For daily confirmed cases, the GPR hybrid reduces sMAPE from 132.05 (baseline) to 28.15 and RMSE from 58,334 to 14,980, and improves Willmott’s Index from 0.8346 to 0.9725. Similar patterns are observed for daily deaths, cumulative cases, and cumulative deaths. The ARIMA hybrid shows improvements over the baseline but consistently underperforms relative to GPR. These metrics collectively demonstrate that GPR delivers more accurate residual correction than linear ARIMA models.

Forecast superiority tests

Forecast superiority tests—including Diebold–Mariano, Clark–West, Giacomini–White, Wilcoxon signed-rank, and Friedman tests—unanimously confirm that the SEAIQHRD+GPR hybrid forecasts are statistically superior (at p < 0.05) to the standalone SEAIQHRD forecasts. The ARIMA hybrid also shows improvements but with weaker statistical significance. These results provide rigorous evidence that incorporating GPR yields structurally and statistically superior forecasts across multiple COVID-19 indicators.

Figure 10 further strengthens these conclusions by visually summarizing the comparative performance of the three models across all evaluation metrics. The figure demonstrates that the SEAIQHRD model consistently exhibits the largest errors, highest percentage deviations, and the lowest agreement indices across all epidemiological indicators. The SEAIQHRD+ARIMA hybrid shows noticeable improvement—particularly in reducing RMSE and MAE for daily cases and deaths—but its performance remains intermediate between the standalone and GPR-based hybrids. In contrast, the SEAIQHRD+GPR hybrid achieves the lowest sMAPE, RMSE, MAE, and CRPS values across all categories, while also attaining the highest Willmott’s Index and Skill Score. The dominance of the GPR-based hybrid is clearly reflected by its consistent superior ranking across all metrics in Fig. 10, visually corroborating the outcomes of the statistical superiority tests. Taken together, Fig. 10 provides strong graphical evidence that the GPR-enhanced hybrid is the most accurate, stable, and reliable forecasting framework among the evaluated models.

Limitations

Despite these strengths, the study has several limitations. The hybrid models depend on reported COVID-19 data, making them susceptible to noise, reporting delays, and inconsistencies in surveillance systems. Only a subset of SEAIQHRD parameters is estimated, while others are fixed from prior literature, which may introduce bias. The mechanistic model alone cannot fully reproduce multi-wave epidemic dynamics, necessitating statistical correction. Although GPR and ARIMA improve short-term forecasting, their residual adjustments lack mechanistic interpretability and cannot be used directly for policy simulation scenarios. Performance is evaluated primarily over a short-term forecasting horizon, leaving long-term stability unexamined. The GPR component may overfit the residual series without further cross-validation, whereas the ARIMA model has limited capacity to capture nonlinear dynamics. Additionally, uncertainty arising from SEAIQHRD parameter estimation is not propagated into the hybrid forecasts, and the assumption of homogeneous population mixing does not reflect spatial or demographic heterogeneity. Finally, robustness to alternative training windows, noise levels, and emerging epidemic waves is not assessed, and forecast superiority tests are used only for comparative evaluation rather than sensitivity analyses.

Future research directions

Future research may apply integrated mathematical–machine learning frameworks to emerging and re-emerging diseases such as dengue, influenza variants, Nipah virus, and antimicrobial-resistant infections. Combining mechanistic transmission models with data-driven predictors—such as mobility data, climate signals, and genomic information—may enable earlier detection of outbreaks and more accurate hotspot forecasting. Developing hybrid decision-support tools that optimize intervention strategies using real-time ML-enhanced predictions could further strengthen public health responses to future epidemics.

Conclusion

This study introduced a new hybrid epidemic forecasting framework that integrates the mechanistic SEAIQHRD model with Gaussian Process Regression. The standalone SEAIQHRD model struggles to capture nonlinear multi-wave COVID-19 dynamics, resulting in substantial fitting and forecasting errors. The hybrid SEAIQHRD+GPR model, in contrast, successfully corrects systematic residual patterns, incorporates uncertainty quantification, and significantly enhances short- and long-term forecasting accuracy across daily and cumulative indicators. ARIMA-based hybridization provides moderate improvement but consistently underperforms compared to GPR. Formal statistical tests further validate the GPR hybrid as the most reliable forecasting methodology among those examined. Overall, the SEAIQHRD+GPR hybrid offers a robust, interpretable, and uncertainty-aware tool for epidemic forecasting and supports data-driven public health decision-making.

Abbreviations

COVID-19: Coronavirus Disease 2019
SARS-CoV-2: Severe Acute Respiratory Syndrome Coronavirus 2
SEAIQHRD: Susceptible–Exposed–Asymptomatic–Symptomatic–Quarantined–Hospitalized–Recovered–Deceased
SIR: Susceptible–Infected–Recovered
SEIR: Susceptible–Exposed–Infected–Recovered
GPR: Gaussian Process Regression
ARIMA: Autoregressive Integrated Moving Average
LSTM: Long Short-Term Memory (neural network)
sMAPE: Symmetric Mean Absolute Percentage Error
RMSE: Root Mean Square Error
MAE: Mean Absolute Error
CRPS: Continuous Ranked Probability Score
WI: Willmott’s Index
PBIAS: Absolute Percentage Bias
EVS: Explained Variance Score
DM test: Diebold–Mariano Test
CW test: Clark–West Test
GW test: Giacomini–White Test
ODE: Ordinary Differential Equation
RBF: Radial Basis Function

Author contributions

MAR: Conceptualization, Methodology, Software, Validation, Formal Analysis, Investigation, Data Curation, Writing – Original Draft, Writing – Review & Editing, Supervision. EKJ: Conceptualization, Methodology, Formal Analysis, Investigation, Resources, Writing – Review & Editing, Visualization, Funding Acquisition. MPD: Methodology, Software, Validation, Formal Analysis, Data Curation, Writing – Original Draft, Visualization. PBD: Conceptualization, Methodology, Software, Validation, Formal Analysis, Investigation, Resources, Data Curation, Writing – Original Draft, Writing – Review & Editing, Visualization. RMN: Conceptualization, Methodology, Software, Validation, Formal Analysis, Investigation, Resources, Data Curation, Writing – Original Draft, Writing – Review & Editing, Visualization, Supervision, Project Administration. MA: Conceptualization, Investigation, Resources, Writing – Review & Editing.

Funding

This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2602).

Data availability

The COVID-19 dataset analysed in this study is openly available in Worldometer at https://www.worldometers.info/coronavirus/country/india/, reference number [63]. This dataset contains no missing values and was used directly for all simulations and forecasting experiments. The MATLAB scripts used for model simulations, hybrid forecasting, statistical analyses, and figure generation are available from the corresponding author upon reasonable request.

Declarations

Ethics approval and consent to participate

Not applicable. This study did not involve human participants, clinical data, or animal experiments.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1.Centers for Disease Control and Prevention. About COVID-19. Retrieved from 2024. https://www.cdc.gov/covid/about/index.html.
2.World Health Organization. Coronavirus disease (COVID-19): health topics. Retrieved from 2025. https://www.who.int/health-topics/coronavirus.
3.Brauer F, Castillo-Chavez C, Feng Z. Mathematical models in epidemiology. Springer; 2019.
4.Ledder G. Mathematical modeling for epidemiology and ecology. In: Lecture notes in mathematics. Vol. 50. Berlin/Heidelberg: Springer; 2023. p. 223–76.
5.Pangemanan GE, Tamalia HI, Suprianto. Mathematical modeling and soft computing in epidemiology. In: Mishra J, Agarwal R, Atangana A, editors. CRC Press. 2024.
6.Tomov L, Chervenkov L, Miteva DG, Batselova H, Velikova T. Applications of time series analysis in epidemiology. World J Clin Cases. 2023;11(29):6974.
7.Bamana AB, Kamalabad MS, Oberski DL. A systematic literature review of time series methods applied to epidemic prediction. Inf Med Unlocked. 2024;50:101571.
8.Dorais S. Time series analysis in preventive intervention research. J Couns Devel. 2024;102(2):239–50.
9.Hussien HH, Genawi KR, Hagabdulla NH. Advances in statistical modelling of epidemics. Asian J Adv Res Rep. 2025;19(8):282–306.
10.Oluwafemi GO, Faith R, Badmus J, Luz H. Hybrid models combining machine learning and traditional epidemiological models. Int J Circumpolar Health. 2024.
11.Amadi M. Hybrid modelling methods for epidemiological studies. 2022.
12.Chibawe G, Nyirenda M. Enhancing disease outbreak forecasting using environmental and policy factors with ML. In: ICT for intelligent systems. Springer; 2026. p. 1511.
13.Kiarie J, Mwalili S, Mbogo R. Forecasting COVID-19 in Kenya using SEIR and ARIMA. Infect Dis Model. 2022;7(2):179–88.
14.Ala’raj M, Majdalawieh M, Nizamuddin N. Hybrid SEIRD–ARIMA forecasting of COVID-19. Infect Dis Model. 2021;6:98–111.
15.Zhao W, Sun Y, Li Y, Guan W. Hybrid modeling approaches for COVID-19. Front Public Health. 2022;10:923978.
16.Jin Y, Wang R, Zhuang X, Wang K, Wang H, Wang C, et al. ARIMA–LSTM hybrid model for COVID-19 prediction. Mathematics. 2022;10(21):4001.
17.Kong L, Guo Y, Lee CW. Hybrid SIRD–eRNN for COVID-19 forecasting. AppliedMath. 2024;4(2):427–41.
18.Cheng C, Aruchunan E, Noor Aziz MH. Hybrid SEIRV–DNNs for COVID-19 forecasting. Sci Rep. 2025;15(1):2043.
19.Heredia Cacha I, Sáinz-Pardo Díaz J, Castrillo M, López García Á. Ensemble ML and classical models for COVID-19 forecasting. Sci Rep. 2023;13(1):6750.
20.Saleem F, Al-Ghamdi ASAM, Alassafi MO, AlGhamdi SA. ML/DL and mathematical models for COVID-19: a systematic review. IJERPH. 2022;19(9):5099.
21.Yang Z-Y, Cao X, Xu R-Z, Hong W-C, Sun S-L. Applications of chaotic quantum adaptive satin bower bird optimizer algorithm in berth-tugboat-quay crane allocation optimization. Expert Syst Appl. 2024;237:121471.
22.Li M-W, Xu R-Z, Geng J, Hong W-C, Li H. A ship motion forecasting approach based on fourier transform, regularized Bi-LSTM and chaotic quantum adaptive WOA. Ocean Eng. 2024;313:119560.
23.Al-Musaylh MS, Gharineiat Z, Al-Daffaie K, Jasim KF, Sharma E, Nahi AA. Predicting near-real-time total water level with an artificial intelligence model based on Australia’s tidal wave energy belt dataset. J Ocean Eng Mar Energy. 2025;1–21.
24.Al-Musaylh MS, Al-Daffaie K, Downs N, Ghimire S, Ali M, Yaseen ZM, et al. Multi-step solar ultraviolet index prediction: integrating convolutional neural networks with long short-term memory for a representative case study in Queensland, Australia. Modeling Earth Syst Environ. 2025;11(1):77.
25.Ghimire S, Al-Musaylh MS, Nguyen-Huy T, Deo RC, Acharya R, Casillas-Perez D, et al. Explainable deeply-fused nets electricity demand prediction model: factoring climate predictors for accuracy and deeper insights with probabilistic confidence interval and point-based forecasts. Appl Energy. 2025;378:124763.
26.Al-Daffaiea K, Al-Musaylh MS, Al-Faisal QRM, Shallal QM. Monthly exchange rate prediction based on artificial intelligence models and Iraqi Dinar against United States dollar. In: AIP Conference Proceedings. (Vol. 3232, No. 1). AIP Publishing LLC; 2024 October, p. 030001.
27.Al-Musaylh MS, Al-Daffaie K, Prasad R. Gas consumption demand forecasting with empirical wavelet transform based machine learning model: a case study. Int J Energy Res. 2021;45(10):15124–38.
28.Mohanad SAM, Ravinesh CD, Yan L. Particle swarm optimized–support vector regression hybrid model for daily horizon electricity demand forecasting using climate dataset. In: E3S Web of Conferences. (Vol. 64). EDP Sciences; 2018, p. 08001.
29.Zhang Z, Hong WC, Li J. Electric load forecasting by hybrid self-recurrent support vector regression model with variational mode decomposition and improved cuckoo search algorithm. IEEE Access. 2020;8:14642–58.
30.Hong WC, Li MW, Geng J, Zhang Y. Novel chaotic bat algorithm for forecasting complex motion of floating platforms. Appl Math Modell. 2019;72:425–43.
31.Zhang Z, Hong WC. Electric load forecasting by complete ensemble empirical mode decomposition adaptive noise and support vector regression with quantum-based dragonfly algorithm. Nonlinear Dyn. 2019;98(2):1107–36.
32.Dong Y, Zhang Z, Hong WC. A hybrid seasonal mechanism with a chaotic cuckoo search algorithm with a support vector regression model for electric load forecasting. Energies. 2018;11(4):1009.
33.Ghimire S, Nguyen-Huy T, Al-Musaylh MS, Deo RC, Casillas-Pérez D, Salcedo-Sanz S. Integrated multi-head self-attention transformer model for electricity demand prediction incorporating local climate variables. Energy AI. 2023;14:100302.
34.Kang S, Zheng R. Distribution of the causes of fever of unknown origin in China, 2013–2022. J Transl Intern Med. 2024;12(3):299–307.
35.Zhang F, Li T, Bai Y, Liu J, Qin J, Wang A, et al. Treatment strategies with combined agency against severe viral pneumonia in patients with advanced cancer. J Transl Intern Med. 2024;12(3):317–20.
36.Ewing AG, Salamon S, Pretorius E, Joffe D, Fox G, Bilodeau S, et al. Review of organ damage from COVID and Long COVID: a disease with a spectrum of pathology. Med Rev. 2025;5(1):66–75.
37.Guo M, Shang S, Li M, Cai G, Li P, Chen X, et al. Understanding autoimmune response after SARS-CoV-2 infection and the pathogenesis/mechanisms of long COVID. Med Rev. 2024;4(5):367–83.
38.Ariyasingha NM, Samoilenko A, Chowdhury MRH, Nantogma S, Oladun C, Birchall JR, et al. Developing hyperpolarized butane gas for ventilation lung imaging. Chem Biomed Imag. 2024;2(10):698–710.
39.Chowdhury MRH, Oladun C, Ariyasingha NM, Samoilenko A, Bawardi T, Burueva DB, et al. Rapid lung ventilation MRI using parahydrogen-induced polarization of propane gas. Analyst. 2024;149(24):5832–42.
40.Ariyasingha NM, Oladun C, Samoilenko A, Chowdhury MRH, Nantogma S, Shi Z, et al. Parahydrogen-hyperpolarized Propane-d₆ gas contrast agent: T₁ relaxation dynamics and pilot millimeter-scale ventilation MRI. J Phys Chem A. 2025;129(19):4275–87.
41.Ankamma Rao M, Jaradat EK, Padma Devi M, Dhandapani PB, Nalule RM, Al-Hmoud M. Modeling COVID-19 pneumonia and COVID-associated pulmonary aspergillosis: sensitivity analysis and optimal control. BMC Infect Dis. 2025;25(1):1–25.
42.Ambalarajan V, Mallela AR, Sivakumar V, Dhandapani PB, Leiva V, Martin-Barreiro C, et al. A six-compartment model for COVID-19 with transmission dynamics and public health strategies. Sci Rep. 2024;14(1):22226.
43.Ambalarajan V, Mallela AR, Dhandapani PB, Sivakumar V, Leiva V, Castro C. Multi-strain COVID-19 dynamics with vaccination strategies: mathematical modeling and case study. Alexandria Eng J. 2025;119:665–84.
44.Gumel AB, Ruan S, Day T, Watmough J, Brauer F, Driessche PVD, et al. Modelling strategies for controlling SARS outbreaks. Proc R Soc Lond B. 2004;271:2223–32.
45.Nadim SK, Ghosh I, Chattopadhyay J. Short-term predictions and prevention strategies for COVID-2019: a model based study. 2020, arXiv:2003.08150.
46.Ferguson NM, et al. Report 9: impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. 2020. 10.25561/77482.
47.Li R, et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science. 2020;368(6490):489–93. 10.1126/science.abb3221.
48.Lauer SA, et al. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application. Ann Intern Med. 2020;172(9):577–82. 10.7326/m20-0504.
49.Ghosh SK, Ghosh S. A mathematical model for COVID-19 considering waning immunity, vaccination and control measures. Sci Rep. 2023;13(1):3610.
50.Mondal J, Khajanchi S. Mathematical modeling and optimal intervention strategies of the COVID-19 outbreak. Nonlinear Dyn. 2022;109(1):177–202.
51.Bandekar SR, Ghosh M. Mathematical modeling of COVID-19 in India and Nepal with optimal control and sensitivity analysis. Eur Phys J Plus. 2021;136(10):1058.
52.Venkatesh A, Rao MA. Mathematical model for COVID-19 pandemic with implementation of intervention strategies and cost-effectiveness analysis. Results Control Optim. 2024;14:100345.
53.Khajanchi S, Sarkar K, Mondal J, Nisar KS, Abdelwahab SF. Mathematical modeling of the COVID-19 pandemic with intervention strategies. Results Phys. 2021;25:104285.
54.Kong D, Chen Y, Li N. Gaussian process regression for tool wear prediction. Mech Syst Signal Process. 2018;104:556–74.
55.Wang J. An intuitive tutorial to Gaussian process regression. Comput Sci Eng. 2023;25(4):4–11.
56.Ho SL, Xie M. The use of ARIMA models for reliability forecasting and analysis. Comput Ind Eng. 1998;35(1–2):213–16.
57.Shumway RH, Stoffer DS. ARIMA models. In: time series analysis and its applications: with R examples. Cham: Springer International Publishing; 2017. p. 75–163.
58.Diebold FX, Mariano RS. Comparing predictive accuracy. J Bus Econ Stat. 2002;20:134–44.
59.Clark TE, West KD. Approximately normal tests for equal predictive accuracy in nested models. J Econom. 2007;138(1):291–311.
60.Taheri SM, Hesamian GR. A generalization of the Wilcoxon signed-rank test and its applications. Stat Papers. 2013;54(2):457–70.
61.Pereira DG, Afonso A, Medeiros FM. Overview of Friedman’s test and post-hoc analysis. Commun Stat Simul Comput. 2015;44(10):2636–53.
62.Giacomini R, White H. Tests of conditional predictive ability. Econometrica. 2006;74(6):1545–78.
63.https://www.worldometers.info/coronavirus/country/india/.
