Abstract
Background
Seasonal influenza poses a significant public health burden, particularly in China, where regional outbreaks vary widely and timely surveillance is constrained by reporting delays. While traditional surveillance systems rely on virological data, population-level internet search behavior has emerged as a complementary signal for early epidemic detection.
Objective
This study aims to develop a hybrid deep learning model, LSTCNet, that integrates online search data from Baidu Index with influenza surveillance records for accurate and timely forecasting of influenza trends in both Southern and Northern China.
Methods
We developed a hybrid deep learning model, LSTCNet, to predict influenza trends in real time by integrating Baidu Index search data and virological surveillance records from Northern and Southern China. LSTCNet combines a Bidirectional Long Short-Term Memory (BiLSTM) network for modeling long-term dependencies and a Temporal Convolutional Network (TCN) for capturing short-term patterns, further enhanced by temporal and channel-wise attention mechanisms. Feature selection was conducted using LightGBM and SHAP to identify the top 10 region-specific Baidu search terms most strongly associated with influenza positivity rates. The model was trained on weekly data from 2011 to 2022 and tested on data from mid-2022 to 2024. Its performance was evaluated against LSTM, BiLSTM, CNN-LSTM, LSTM with Attention (LSTM-Attn.), TCN, and Transformer baselines using RMSE, MAE, MAPE, and R² metrics. Extensive ablation studies were also performed to assess the contribution of each LSTCNet component.
Results
LSTCNet consistently outperformed baseline models across multiple forecasting horizons and both geographic regions. For Northern China, LSTCNet achieved the highest accuracy for 1-week forecasts with an RMSE of 0.0655 and R² of 0.931, while maintaining robust performance for 4-week forecasts (RMSE = 0.1382, R² = 0.695). In Southern China, LSTCNet obtained an RMSE of 0.0822 and R² of 0.925 for 1-week forecasts, with a moderate decline at 4 weeks (RMSE = 0.1990, R² = 0.560). Compared with baseline models such as LSTM and Transformer, LSTCNet demonstrated superior accuracy and stability, particularly in capturing epidemic peaks and trend inflection points. Ablation studies further confirmed the importance of integrating BiLSTM, TCN, and attention mechanisms; removing any component significantly reduced performance. Region-specific feature selection using SHAP-enhanced Baidu Index terms also improved model interpretability and predictive precision.
Conclusion
LSTCNet demonstrates strong potential for timely influenza forecasting by leveraging region-specific online behavioral data and advanced hybrid architectures, offering a scalable and interpretable tool for epidemic monitoring and supporting more responsive public health interventions.
Keywords: Influenza forecasting, Baidu index, Temporal convolutional network (TCN), Attention mechanisms, Feature selection
Introduction
Influenza is an acute respiratory infection caused by influenza viruses, which circulate worldwide [1]. As a global public health challenge, it causes approximately 1 billion infections annually and up to 650,000 deaths attributable to respiratory diseases associated with seasonal influenza [2]. During the coronavirus disease 2019 (COVID-19) pandemic, influenza activity declined globally but has intensified in China since 2021 [3]. Its seasonal epidemics impose substantial medical and economic burdens, particularly among high-risk populations such as children, the elderly, and immunocompromised individuals [4–6]. Therefore, accurate prediction is critical for optimizing resource allocation, guiding vaccination strategies, and enabling timely public health interventions.
In mainland China, the influenza surveillance system operated by the Chinese National Influenza Center (CNIC) relies on laboratory-confirmed case reports from sentinel hospitals [7]. However, the release of official data is typically delayed by one to two weeks due to laboratory testing and administrative procedures [8]. This temporal lag hinders timely epidemic response, highlighting the urgent need for predictive tools that can provide early warning ahead of surveillance updates. In recent years, search engine data have emerged as a valuable population-level proxy for monitoring public health trends. Search queries related to influenza prevention, symptoms and treatment have been shown to exhibit strong temporal correlations with laboratory-confirmed case trends [9, 10], offering potential lead-time advantages over traditional surveillance systems. Various online search engines, including Google, Twitter, Weibo, and Baidu, have been explored in combination with different modeling methods to predict epidemic trends [10–13]. For instance, Olukanmi et al. utilized Google Search Data with deep learning to forecast influenza-like illnesses in South Africa [14], while Fang et al. provided evidence that the Baidu Index serves as a reliable indicator for epidemic forecasting in China [15]. Beyond influenza, recent advances in AI-enabled health analytics further highlight the value of leveraging heterogeneous, real-world data streams for timely risk assessment. For example, deep learning has been applied to computational linguistics–based text emotion analysis during the COVID-19 pandemic [16], and machine learning pipelines have been developed for COVID-19 prognosis analysis and visualization [17]. 
Recent studies have also explored AI-based timely monitoring and decision support in clinical settings [18] and lightweight deep models for medical image-based diagnosis [19], while privacy-preserving paradigms such as federated learning have been investigated to enhance data confidentiality in healthcare prediction tasks [20, 21]. Motivated by these developments, we focus on influenza forecasting and develop a model that integrates Baidu Index signals with influenza positivity rates to predict seasonal influenza trends in both Southern and Northern China.
Traditional influenza prediction models, such as autoregressive integrated moving average (ARIMA) models and machine learning approaches like support vector regression (SVR), have achieved only limited success due to their reliance on linear assumptions and susceptibility to noisy data [22, 23]. Recent advances in deep learning have introduced architectures such as Long Short-Term Memory (LSTM) networks and Temporal Convolutional Networks (TCN), which excel at capturing nonlinear temporal dependencies [24]. Research has increasingly focused on hybrid architectures to leverage multi-source data [25]. For instance, Ma et al. [26] jointly forecasted COVID-19 and ILI trends by optimizing Google Trends queries, demonstrating the value of internet signals. Similarly, Wang et al. [27] proposed a parallel LSTM architecture incorporating spatiotemporal features (SPH) to predict COVID-19 hospitalizations. While these dual-branch and hybrid approaches represent significant progress, they often rely heavily on recurrent units, which may struggle to capture high-frequency local variations, or lack interpretable feature selection mechanisms for region-specific search behaviors. These findings suggest that distinct neural networks possess unique capabilities, and combining complementary advantages by integrating the long-term memory of LSTMs with the local feature extraction of TCNs may further enhance predictive performance.
In this study, we propose a hybrid LSTCNet model that integrates LSTM networks, TCN, and a squeeze-and-excitation (SE) attention mechanism. The Baidu Index was combined with virological surveillance data in a co-training framework, enabling both timely prediction of influenza trends and quantitative assessment of the accuracy improvements afforded by search data. By utilizing both sequential and convolutional features, LSTCNet captures temporal dependencies in influenza data more effectively. We further conducted comparative experiments against six widely used models: LSTM, BiLSTM, CNN-LSTM, LSTM with Attention (LSTM-Attn.), TCN, and Transformer. Through this systematic evaluation, we aim to identify the most effective predictive model for optimizing early warning systems and strengthening public health preparedness for seasonal influenza outbreaks.
Modeling methodology
To predict influenza positivity rates, several models were implemented and compared, including LSTM, BiLSTM, CNN-LSTM, LSTM with Attention (LSTM-Attn.), TCN, and Transformer, with a particular focus on the proposed LSTCNet. LSTCNet is a hybrid, end-to-end neural network architecture engineered to predict influenza positivity rates from multi-source sequential inputs by modeling both short-term fluctuations and long-term trends inherent in the time series (Fig. 1). The model is designed to effectively capture both local and global temporal dependencies by integrating the advantages of recurrent and convolutional structures, further enhanced by temporal and channel-wise attention mechanisms.
Fig. 1.
Overview of the LSTCNet architecture, which combines a BiLSTM branch for long-term temporal modeling and a TCN branch for short-term feature extraction. The model incorporates temporal and channel-wise attention mechanisms to enhance influenza trend prediction
The architecture consists of three main modules: Input Processing, Feature Extraction, and Prediction Output. In the Input Processing Module, raw time series data—comprising Baidu Index values and virological surveillance records—is first passed through a linear projection layer, followed by position encoding to preserve temporal sequence information. The Feature Extraction Module consists of two parallel branches: a Bidirectional LSTM (BiLSTM) path and a Temporal Convolutional Network (TCN) path. The BiLSTM path focuses on capturing long-term dependencies by modeling sequential context in both forward and backward directions, while the TCN path is designed to extract local temporal patterns through dilated causal convolutions.
To enhance the model’s representation capacity, Squeeze-and-Excitation (SE) Attention and Temporal Attention mechanisms are incorporated within the TCN branch. These modules enable adaptive reweighting of feature maps, allowing the model to focus on critical channels and dimensions. After parallel processing, outputs from both branches are fused, normalized, and passed through an additional SE attention layer.
Finally, the Prediction Output Module consists of two fully connected layers that transform the aggregated features into final predictions of influenza positivity rates.
By integrating both long-range temporal memory and short-term convolutional patterns, and by leveraging the expressive power of attention-based feature refinement, LSTCNet demonstrates enhanced accuracy and generalization in modeling influenza surveillance indicators.
Input processing
Let $\mathbf{x}_t \in \mathbb{R}^{d}$ denote the $d$-dimensional multivariate input at week $t$, formed by concatenating the selected Baidu Index features and virological surveillance covariates. Given a lookback window length $T$ and forecasting horizon $K$, we construct training samples $(\mathbf{X}_t, \mathbf{y}_t)$ as

$$\mathbf{X}_t = \left[\mathbf{x}_{t-T+1}, \ldots, \mathbf{x}_t\right]^{\top} \in \mathbb{R}^{T \times d}, \qquad \mathbf{y}_t = \left(y_{t+1}, \ldots, y_{t+K}\right)^{\top} \in \mathbb{R}^{K}. \tag{1}$$
In implementation, a mini-batch is denoted by $\mathbf{X} \in \mathbb{R}^{B \times T \times d}$, where $B$ is the batch size. We adopt a strict chronological split; any normalization parameters are fitted on the training period only and then applied to validation/test to avoid information leakage.
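As a concrete illustration, the windowing in Eq. (1) can be sketched in NumPy (an illustrative sketch, not the authors' code; `make_windows` and the toy dimensions are hypothetical):

```python
import numpy as np

def make_windows(series, T, K):
    """Build (X, y) pairs from a multivariate series of shape (n_weeks, d).

    X[j] holds the T-week lookback window ending at week t, and
    y[j] holds the K target values y_{t+1}, ..., y_{t+K}; here the
    target is assumed to be column 0 (the positivity rate).
    """
    X, y = [], []
    n = len(series)
    for t in range(T - 1, n - K):
        X.append(series[t - T + 1 : t + 1])     # (T, d) lookback window
        y.append(series[t + 1 : t + 1 + K, 0])  # (K,) future targets
    return np.stack(X), np.stack(y)

# Toy data: 20 weeks, 3 covariates; target is the first column
data = np.arange(60, dtype=float).reshape(20, 3)
X, y = make_windows(data, T=4, K=2)
print(X.shape, y.shape)  # (15, 4, 3) (15, 2)
```

A chronological split would then simply cut `X` and `y` at a fixed index, never shuffling across the train/test boundary.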
To enhance representational capacity, the input batch $\mathbf{X}$ is first projected into a higher-dimensional latent space via a linear transformation:

$$\mathbf{E} = \mathbf{X}\mathbf{W}_{e} + \mathbf{b}_{e}, \tag{2}$$

where $\mathbf{W}_{e} \in \mathbb{R}^{d \times d_{\text{model}}}$ and $\mathbf{b}_{e} \in \mathbb{R}^{d_{\text{model}}}$ are trainable parameters, and $d_{\text{model}}$ is the embedding dimension shared by the BiLSTM and TCN branches. This shallow embedding layer (i) maps heterogeneous covariates into a shared latent space and (ii) aligns the input dimensionality with the hidden sizes used in both branches, which simplifies the dual-branch design and empirically stabilizes optimization.
To preserve temporal ordering, we incorporate sinusoidal positional encodings $\mathbf{P} \in \mathbb{R}^{T \times d_{\text{model}}}$ (broadcast along the batch dimension):

$$\mathbf{Z} = \mathbf{E} + \mathbf{P}. \tag{3}$$

The positional encoding is defined as

$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right), \tag{4}$$

where $pos$ is the time index and $i$ indexes the embedding dimension, ensuring that the model captures relative positional information. The resulting $\mathbf{Z}$ is then fed into the BiLSTM and TCN branches for long- and short-range feature extraction.
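The sinusoidal encoding of Eq. (4) can be sketched as follows (illustrative NumPy code; the function name and toy sizes are our own):

```python
import numpy as np

def sinusoidal_pe(T, d_model):
    """Sinusoidal positional encoding of Eq. (4): shape (T, d_model).

    Even dimensions use sine and odd dimensions use cosine, with
    wavelengths growing geometrically in the dimension-pair index.
    Assumes an even d_model for simplicity.
    """
    pos = np.arange(T)[:, None]           # time index, column vector
    i = np.arange(d_model // 2)[None, :]  # dimension-pair index, row vector
    angle = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((T, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

pe = sinusoidal_pe(T=8, d_model=16)
# Adding PE to the embedded batch broadcasts over the batch dimension:
# Z = E + pe[None, :, :]
```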
LSTM module
To clarify the core mechanism underlying the BiLSTM block in Fig. 1, we illustrate in Fig. 2 the internal structure of a standard LSTM cell, which serves as the fundamental computational unit of the BiLSTM architecture. The LSTM unit includes three key gating components—forget gate, input gate, and output gate—that regulate how information is retained, updated, and passed forward across time steps. These gates are controlled by nonlinear activation functions, enabling the model to dynamically manage memory and preserve long-term temporal dependencies.
Fig. 2.
Structure of a single LSTM cell used in the BiLSTM module of the proposed architecture
Given the embedded input sequence $\mathbf{Z} = [\mathbf{z}_1, \ldots, \mathbf{z}_T]$, the BiLSTM processes the sequence bidirectionally to capture contextual information from both past and future within the lookback window. Let $\mathbf{z}_i$ denote the input vector at time step $i$ ($i = 1, \ldots, T$). The forward and backward hidden states are computed as:

$$\overrightarrow{\mathbf{h}}_{i} = \overrightarrow{\mathrm{LSTM}}\!\left(\mathbf{z}_{i}, \overrightarrow{\mathbf{h}}_{i-1}\right), \tag{5}$$

$$\overleftarrow{\mathbf{h}}_{i} = \overleftarrow{\mathrm{LSTM}}\!\left(\mathbf{z}_{i}, \overleftarrow{\mathbf{h}}_{i+1}\right), \tag{6}$$

and concatenated along the feature dimension:

$$\mathbf{h}_{i} = \left[\overrightarrow{\mathbf{h}}_{i} \,;\, \overleftarrow{\mathbf{h}}_{i}\right] \in \mathbb{R}^{2 d_{h}}. \tag{7}$$

Accordingly, the BiLSTM outputs a hidden-state sequence $\mathbf{H} = [\mathbf{h}_1, \ldots, \mathbf{h}_T] \in \mathbb{R}^{T \times 2 d_{h}}$.
We use a stacked BiLSTM with hidden size $d_{h}$ per direction and dropout between stacked layers; the number of layers and the dropout rate are treated as hyperparameters. Following common practice in sequence forecasting, we use the final-step representation $\mathbf{h}_{T}$ as the long-range feature vector for subsequent fusion with the TCN branch.
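The gate computations described above (forget, input, and output gates) can be sketched for a single LSTM step as follows (a minimal NumPy sketch with hypothetical toy dimensions; practical implementations such as PyTorch's fuse the four weight matrices into one):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One step of the LSTM cell in Fig. 2 (illustrative sketch).

    W, U, b list the parameters of the forget (f), input (i),
    candidate (g), and output (o) transforms, in that order.
    """
    W_f, W_i, W_g, W_o = W
    U_f, U_i, U_g, U_o = U
    b_f, b_i, b_g, b_o = b
    f = sigmoid(W_f @ x + U_f @ h_prev + b_f)  # forget gate: what to drop
    i = sigmoid(W_i @ x + U_i @ h_prev + b_i)  # input gate: what to write
    g = np.tanh(W_g @ x + U_g @ h_prev + b_g)  # candidate cell update
    o = sigmoid(W_o @ x + U_o @ h_prev + b_o)  # output gate: what to expose
    c = f * c_prev + i * g                     # updated cell state
    h = o * np.tanh(c)                         # new hidden state
    return h, c

# Toy dimensions: input size 3, hidden size 2
rng = np.random.default_rng(0)
W = [rng.standard_normal((2, 3)) for _ in range(4)]
U = [rng.standard_normal((2, 2)) for _ in range(4)]
b = [np.zeros(2) for _ in range(4)]
h, c = lstm_cell(rng.standard_normal(3), np.zeros(2), np.zeros(2), W, U, b)
```

A BiLSTM simply runs two such recurrences, one over $i = 1, \ldots, T$ and one over $i = T, \ldots, 1$, and concatenates the hidden states as in Eq. (7).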
TCN module
To complement the long-range temporal modeling of the BiLSTM path, we incorporate a Temporal Convolutional Network (TCN) to capture short-term fluctuations and localized temporal patterns in the input sequence. This parallel path is designed to extract features across multiple time scales, enhancing the model’s responsiveness to abrupt changes in epidemic signals.
The input sequence $\mathbf{Z}$ is first transposed from $(B, T, d_{\text{model}})$ to $(B, d_{\text{model}}, T)$ to match the input format expected by one-dimensional convolutional layers. For reproducibility, we use 4 TemporalBlocks with channel sizes [64, 64, 64, 64] and dropout $p_{\text{tcn}} = 0.2$ in each block. The TCN consists of a stack of TemporalBlocks, each containing several key components:

Multi-Scale Dilated Convolutions. Each block applies two parallel 1D convolutional layers with kernel sizes of 3 and 5, and a dilation rate defined as $d_{l} = 2^{\,l-1}$, where $l$ is the layer index. In our setting, the dilation factors across the 4 blocks are {1, 2, 4, 8}. Padding is applied to maintain consistent output length. To avoid using future information in forecasting, we implement 1D convolutions in a causal manner (left padding only), so the output at time $t$ depends only on inputs up to time $t$.
The operations are defined as:

$$\mathbf{F}_{3} = \mathrm{Conv1D}_{k=3,\, d_{l}}(\mathbf{Z}), \tag{8}$$

$$\mathbf{F}_{5} = \mathrm{Conv1D}_{k=5,\, d_{l}}(\mathbf{Z}). \tag{9}$$

Each output undergoes batch normalization and GELU activation, followed by concatenation:

$$\mathbf{F} = \mathrm{Concat}\!\left(\mathrm{GELU}\!\left(\mathrm{BN}(\mathbf{F}_{3})\right),\; \mathrm{GELU}\!\left(\mathrm{BN}(\mathbf{F}_{5})\right)\right). \tag{10}$$
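A minimal illustration of the causal, dilated convolution used inside each TemporalBlock (single-channel NumPy sketch; the kernel and dilation values are hypothetical):

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """Causal dilated 1-D convolution over a single channel.

    Left-pads the sequence with zeros so that output[t] depends only
    on x[t], x[t - dilation], x[t - 2*dilation], ... and never on
    future values; w[0] multiplies the current sample.
    """
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # left padding only
    return np.array([
        sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([1.0, 1.0, 1.0])                # kernel size 3
y = causal_dilated_conv1d(x, w, dilation=2)  # receptive field: t, t-2, t-4
print(y)  # [1. 2. 4. 6. 9.]
```

Stacking blocks with dilations {1, 2, 4, 8} lets the receptive field grow exponentially while output length stays equal to input length.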
Attention mechanisms
To further refine the extracted features, two complementary attention modules are incorporated:
Squeeze-and-Excitation (SE) Attention: This recalibrates the importance of each channel. The recalibration vector $\mathbf{s}$ is generated via global average pooling followed by a two-layer fully connected network with a sigmoid activation. We use an SE reduction ratio $r = 8$ for the two-layer gating network:

$$\mathbf{z} = \frac{1}{T} \sum_{t=1}^{T} \mathbf{F}_{:,\,t}, \tag{11}$$

$$\mathbf{s} = \sigma\!\left(\mathbf{W}_{2}\, \delta\!\left(\mathbf{W}_{1} \mathbf{z}\right)\right), \tag{12}$$

$$\tilde{\mathbf{F}} = \mathbf{s} \odot \mathbf{F}, \tag{13}$$

where $\sigma(\cdot)$ denotes the sigmoid function, $\delta(\cdot)$ the intermediate nonlinearity, and $\odot$ channel-wise multiplication.
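The SE recalibration of Eqs. (11)–(13) can be sketched as follows (illustrative NumPy code; the reduction ratio and toy shapes here are hypothetical, and ReLU stands in for the intermediate nonlinearity):

```python
import numpy as np

def se_attention(F, W1, W2):
    """Squeeze-and-Excitation over channels (illustrative sketch).

    F has shape (C, T). The squeeze step averages over time; the
    excitation step is a two-layer bottleneck (ReLU then sigmoid)
    producing one scale in (0, 1) per channel.
    """
    z = F.mean(axis=1)                                          # squeeze: (C,)
    s = 1.0 / (1.0 + np.exp(-(W2 @ np.maximum(W1 @ z, 0.0))))   # excite: (C,)
    return s[:, None] * F                                       # recalibrate

C, T, r = 8, 10, 4  # toy channel count, sequence length, reduction ratio
rng = np.random.default_rng(1)
F = rng.standard_normal((C, T))
W1 = rng.standard_normal((C // r, C))  # bottleneck down-projection
W2 = rng.standard_normal((C, C // r))  # up-projection back to C channels
F_se = se_attention(F, W1, W2)
```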
Temporal Attention (time-step reweighting): To highlight critical time steps, temporal attention aggregates average- and max-pooled features along the channel dimension and applies a 1D convolution to generate an attention map over the temporal axis. Note that this module operates on a 1D sequence; the term "temporal" indicates reweighting across time steps rather than spatial locations:

$$\mathbf{a} = \sigma\!\left(\mathrm{Conv1D}\!\left(\left[\mathrm{AvgPool}(\tilde{\mathbf{F}})\,;\, \mathrm{MaxPool}(\tilde{\mathbf{F}})\right]\right)\right), \tag{14}$$

$$\hat{\mathbf{F}} = \mathbf{a} \odot \tilde{\mathbf{F}}. \tag{15}$$
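The time-step reweighting of Eqs. (14)–(15) can be sketched as follows (illustrative NumPy code; for brevity the 1D convolution is reduced to a kernel-size-1 combination of the two pooled maps, a simplification of the module described above):

```python
import numpy as np

def temporal_attention(F, w_conv):
    """Time-step reweighting over a (C, T) feature map.

    Average- and max-pool across channels to obtain two (T,)
    profiles, combine them with a kernel-size-1 convolution
    (a weighted sum here), and squash with a sigmoid to get one
    attention weight in (0, 1) per time step.
    """
    avg = F.mean(axis=0)                       # (T,) channel-average profile
    mx = F.max(axis=0)                         # (T,) channel-max profile
    pooled = w_conv[0] * avg + w_conv[1] * mx  # combine the two pooled maps
    a = 1.0 / (1.0 + np.exp(-pooled))          # attention map over time
    return a[None, :] * F                      # reweight every time step

rng = np.random.default_rng(2)
F = rng.standard_normal((8, 10))
F_t = temporal_attention(F, w_conv=np.array([0.5, 0.5]))
```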
Residual connection
To ensure stable gradient flow and efficient training, a residual connection is added. If the input and output channel dimensions differ, a 1 × 1 convolution is used to align them:

$$\mathbf{O} = \hat{\mathbf{F}} + \phi(\mathbf{Z}), \tag{16}$$

where $\phi(\cdot)$ is a 1 × 1 convolution if channel dimensions differ and the identity mapping otherwise.

The TCN output is reshaped back to $(B, T, C)$ (where $C$ is the number of channels in the final TemporalBlock), and the feature vector $\mathbf{o}_{T}$ from the last time step is retained.
Feature fusion
After attention-based refinement, the outputs from both the BiLSTM and TCN branches are fused, normalized, and further processed by an additional SE attention layer to emphasize globally informative representations. The BiLSTM and TCN outputs at the final time step, $\mathbf{h}_{T}$ and $\mathbf{o}_{T}$, are concatenated:

$$\mathbf{u} = \left[\mathbf{h}_{T} \,;\, \mathbf{o}_{T}\right] \in \mathbb{R}^{2 d_{h} + C}. \tag{17}$$
A fusion layer integrates these features:

$$\mathbf{f} = \mathrm{LayerNorm}\!\left(\mathrm{GELU}\!\left(\mathbf{W}_{f} \mathbf{u} + \mathbf{b}_{f}\right)\right), \tag{18}$$

where $\mathbf{W}_{f} \in \mathbb{R}^{d_{f} \times (2 d_{h} + C)}$ and $\mathbf{b}_{f} \in \mathbb{R}^{d_{f}}$ are trainable parameters, and $d_{f}$ is the fusion dimension. Layer normalization is applied over the feature dimension, followed by dropout; the fusion dimension $d_{f}$ and the fusion dropout rate are selected on the validation set. An additional SE attention layer (reduction ratio $r = 8$) further refines $\mathbf{f}$ by prioritizing salient channels.
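The fusion of Eqs. (17)–(18) can be sketched as follows (illustrative NumPy code; tanh stands in for the fusion nonlinearity, and all dimensions are hypothetical):

```python
import numpy as np

def layer_norm(v, eps=1e-5):
    """Layer normalization over the feature dimension."""
    return (v - v.mean()) / np.sqrt(v.var() + eps)

def fuse(h_T, o_T, W_f, b_f):
    """Feature fusion sketch: concatenate the BiLSTM and TCN
    final-step vectors (Eq. 17), apply a linear fusion layer with a
    nonlinearity, then layer-normalize (Eq. 18)."""
    u = np.concatenate([h_T, o_T])  # concat long- and short-range features
    f = np.tanh(W_f @ u + b_f)      # fusion transform (tanh as stand-in)
    return layer_norm(f)

rng = np.random.default_rng(3)
h_T, o_T = rng.standard_normal(4), rng.standard_normal(6)  # toy 2*d_h=4, C=6
d_fusion = 8
W_f = rng.standard_normal((d_fusion, 10))
f = fuse(h_T, o_T, W_f, np.zeros(d_fusion))
```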
Prediction head
Finally, the Prediction Output Module consists of two fully connected layers that transform the aggregated features into final predictions of influenza positivity rates. The fused feature vector $\mathbf{f}$ is processed by a multi-layer perceptron (MLP) to forecast influenza positivity rates over $K$ future time steps:

$$\hat{\mathbf{y}} = \mathbf{W}_{2}\, \mathrm{GELU}\!\left(\mathbf{W}_{1} \mathbf{f} + \mathbf{b}_{1}\right) + \mathbf{b}_{2}, \tag{19}$$

where $\mathbf{W}_{1} \in \mathbb{R}^{d_{\text{mlp}} \times d_{f}}$, $\mathbf{b}_{1} \in \mathbb{R}^{d_{\text{mlp}}}$, $\mathbf{W}_{2} \in \mathbb{R}^{K \times d_{\text{mlp}}}$, and $\mathbf{b}_{2} \in \mathbb{R}^{K}$ are MLP parameters. For reproducibility, we set the MLP hidden dimension to $d_{\text{mlp}} = 128$ and apply dropout with rate $p_{\text{head}} = 0.2$ to the hidden layer during training. The output is a $K$-step vector $\hat{\mathbf{y}} \in \mathbb{R}^{K}$.
Training and optimization
LSTCNet is trained by minimizing the Mean Squared Error (MSE) between the predicted and ground-truth influenza positivity rates for $K$-step forecasting:

$$\mathcal{L}_{\mathrm{MSE}} = \frac{1}{NK} \sum_{n=1}^{N} \sum_{k=1}^{K} \left(\hat{y}_{n,k} - y_{n,k}\right)^{2}, \tag{20}$$

where $N$ is the number of training samples.
We optimize the network using AdamW for a maximum of 200 epochs; the initial learning rate, weight decay, and batch size are chosen on the validation set. Early stopping (patience = 20) is applied based on validation RMSE, and we reduce the learning rate on plateau (factor = 0.5, patience = 5) down to a fixed minimum. Hyperparameters (e.g., the embedding dimension $d_{\text{model}}$, TCN channels/dilations, and dropout rates) are selected on the validation set.
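The early-stopping and reduce-on-plateau logic can be sketched as plain Python (an illustrative sketch; the class and the short demo patience values are our own, while the default thresholds follow the text):

```python
class PlateauTrainerState:
    """Early stopping with reduce-on-plateau, tracked on validation
    RMSE (illustrative; defaults follow the text: early-stop
    patience 20, LR factor 0.5, LR patience 5)."""

    def __init__(self, lr, es_patience=20, lr_patience=5, factor=0.5):
        self.lr = lr
        self.best = float("inf")
        self.es_wait = 0   # epochs since last improvement (early stop)
        self.lr_wait = 0   # epochs since last improvement (LR schedule)
        self.es_patience = es_patience
        self.lr_patience = lr_patience
        self.factor = factor
        self.stop = False

    def step(self, val_rmse):
        if val_rmse < self.best:
            self.best = val_rmse
            self.es_wait = self.lr_wait = 0
        else:
            self.es_wait += 1
            self.lr_wait += 1
            if self.lr_wait > self.lr_patience:
                self.lr *= self.factor  # reduce learning rate on plateau
                self.lr_wait = 0
            if self.es_wait >= self.es_patience:
                self.stop = True        # early stopping triggers

# Demo with short patience values: RMSE improves twice, then plateaus
state = PlateauTrainerState(lr=1e-3, es_patience=4, lr_patience=2)
for rmse in [0.5, 0.4, 0.41, 0.42, 0.43, 0.44]:
    state.step(rmse)
    if state.stop:
        break
```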
We initialize linear layers with Xavier uniform, LSTM weights with orthogonal initialization, and convolutional layers with Kaiming normal initialization. All experiments use a fixed random seed (seed = 42). Experiments are implemented in Python 3.10 and PyTorch 2.2.0 on Ubuntu 22.04 with an NVIDIA GeForce RTX 3090 Ti GPU (24 GB VRAM) using CUDA 12.1.
For clarity and reproducibility, the overall training and inference workflow of LSTCNet under the chronological hold-out protocol is summarized in Algorithm 1, and the corresponding end-to-end formulation is specified by Eqs. (1)–(20), covering sample construction and input embedding (1–4), BiLSTM long-range modeling (5–7), TCN short-range modeling with attention and residual refinement (8–16), feature fusion and K-step prediction (17–19), and the MSE training objective (20).
Algorithm 1. LSTCNet training and inference (chronological hold-out)
Experimental design and evaluation
Dataset
The dataset used in this study consists of influenza surveillance data from CNIC and internet search data from Baidu Index. Due to the vast difference in the influenza epidemic situation among different regions, the CNIC releases the Influenza Positivity Rate for northern and southern mainland China separately. The provinces in northern mainland China include Beijing, Tianjin, Hebei, Shanxi, Shaanxi, Inner Mongolia, Liaoning, Jilin, Heilongjiang, Shandong, Henan, Tibet, Gansu, Qinghai, Ningxia, and Xinjiang, and the provinces in southern mainland China include Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Jiangxi, Hubei, Hunan, Guangdong, Guangxi, Hainan, Chongqing, Sichuan, Guizhou, and Yunnan. To enhance model input relevance, Baidu Index features were selected using SHAP (Shapley Additive Explanations) in combination with LightGBM. For each region, the top 10 most influential search terms associated with influenza prediction were identified and used as input features for the forecasting models. To avoid information leakage, the LightGBM+SHAP feature selection was performed using the training period only, and the selected top-10 terms were then fixed and applied unchanged to the test period. The full dataset spans from January 3, 2011, to April 29, 2024, and was divided into a training set (January 3, 2011–June 19, 2022) and a test set (June 20, 2022–April 29, 2024).
Data preprocessing and target definition
The primary prediction target in this study is the Influenza Positivity Rate. To facilitate model convergence and prevent features with larger magnitudes from dominating the learning process, all input features and target variables were normalized. We applied Min-Max Scaling to transform the data into the range [0,1]. To prevent data leakage, the minimum and maximum values used for scaling were computed exclusively from the training set and then applied to the test set. Similarly, all feature-selection procedures were restricted to the training period, with no test-period information used. The target variable was inversely transformed (denormalized) to its original scale for final evaluation and visualization.
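The leakage-free scaling described above can be sketched as follows (illustrative NumPy code with toy data):

```python
import numpy as np

def fit_minmax(train):
    """Fit Min-Max parameters on the training period only."""
    return train.min(axis=0), train.max(axis=0)

def apply_minmax(x, lo, hi):
    """Scale with training-set statistics; test values may fall
    slightly outside [0, 1] if they exceed the training range."""
    return (x - lo) / (hi - lo)

train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
test = np.array([[5.0, 25.0], [12.0, 35.0]])  # 12 and 35 exceed train max
lo, hi = fit_minmax(train)                    # leakage-free: train only
train_s = apply_minmax(train, lo, hi)
test_s = apply_minmax(test, lo, hi)
# Inverse transform (denormalize) for evaluation on the original scale:
test_back = test_s * (hi - lo) + lo
```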
Forecasting protocol
We adopted a strict chronological hold-out protocol to evaluate the model’s performance in a realistic surveillance scenario. The model is trained once on historical data (up to June 2022) and then evaluated in a purely out-of-sample manner on the subsequent test period (June 2022–April 2024) without any re-training. The forecasting task is defined as predicting the positivity rate at a future horizon t + k utilizing only information available up to time t. This setup ensures zero data leakage and mimics a timely deployment environment where the model generates forecasts for incoming weeks without accessing future ground truth.
Evaluation metrics
The performance of the models was evaluated using several standard metrics: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), R² score, and Mean Absolute Percentage Error (MAPE). These metrics provide insight into the prediction accuracy, error distribution, and explanatory power of the models. For each region (North and South), and for each prediction horizon (ranging from 1 week to the maximum forecast horizon), these metrics were computed to assess the models’ ability to generalize to unseen data.
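The four metrics can be computed directly (illustrative NumPy code; note that MAPE assumes the observed values are nonzero):

```python
import numpy as np

def metrics(y_true, y_pred):
    """RMSE, MAE, MAPE (%) and R^2 on the original scale."""
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true)) * 100  # undefined if y_true == 0
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                  # explained variance fraction
    return rmse, mae, mape, r2

# Toy positivity rates and predictions
y_true = np.array([0.10, 0.20, 0.30, 0.40])
y_pred = np.array([0.12, 0.18, 0.33, 0.38])
rmse, mae, mape, r2 = metrics(y_true, y_pred)
```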
Model selection
The selection of models was based on their ability to handle time-series forecasting tasks, particularly for the prediction of influenza positivity rates. The LSTCNet model was chosen due to its hybrid structure combining the strengths of both LSTM and TCN layers. Comparative models, such as LSTM, BiLSTM, CNN-LSTM, LSTM with Attention (LSTM-Attn.), TCN, and Transformer, were trained and evaluated for their forecasting performance, and the final model choice was based on its overall prediction accuracy across all horizons and regions.
Ablation study
To systematically assess the effectiveness and necessity of each architectural component in LSTCNet, we designed a comprehensive set of ablation variants. The baseline LSTCNet model maintains a full dual-branch structure, incorporating a bidirectional LSTM (BiLSTM) branch, a multi-scale TCN branch, temporal and channel-wise attention mechanisms, and a nonlinear feature fusion module. Structural ablation experiments include LSTM Only (retaining only the LSTM branch), TCN Only (retaining only the TCN branch), and Simplified TCN (reducing TCN channel complexity), aimed at evaluating the independent modeling capacity of each neural network component. To investigate architectural complexity, we constructed No Attention (removing attention mechanisms) and Shallow (reducing overall network depth) variants. Crucially, to verify the necessity of explicit temporal indexing in recurrent architectures, we included a "No Positional Encoding" variant, which removes the sinusoidal injection from the input processing module. The Basic Fusion variant replaces the nonlinear feature fusion with a simple linear weighting scheme. Additionally, LSTM Attention and TCN Attention variants integrate attention mechanisms into a single branch to compare single-branch attention with full dual-branch integration. These experiments collectively evaluate component independence, architectural depth, and fusion strategy.
Results
Feature selection
A comparison of regression model performance revealed that LightGBM consistently outperformed both Random Forest and XGBoost in fitting the relationship between Baidu Index covariates and influenza positivity rates across northern and southern China (Table 1). According to the regression metrics, LightGBM achieved the lowest mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE), along with the highest coefficient of determination (R²) in both regions. In the northern region, it achieved an MSE of 0.0016 and an R² of 0.9050, while in the southern region, it achieved an MSE of 0.0030 and an R² of 0.8401 (Table 1). Importantly, Table 1 corresponds to a one-step regression fitting task used only for SHAP-based feature selection on the training period, rather than the strict chronological multi-step forecasting evaluation reported in Table 2; therefore, the R² values in Tables 1 and 2 are not directly comparable. These results indicate LightGBM’s strong capacity to capture complex, nonlinear associations between online search behaviors and virological surveillance indicators. Therefore, LightGBM was selected as the base model for subsequent SHAP-based feature selection, providing a reliable framework for interpreting variable importance.
Table 1.
One-step regression fitting performance (MSE, RMSE, MAE, R²) of traditional machine learning models on the training set (feature selection stage; not comparable to the multi-step forecasting evaluation in Table 2) for Northern and Southern China
| Model | Region | MSE | RMSE | MAE | R² |
|---|---|---|---|---|---|
| Random Forest | Northern | 0.0018 | 0.0426 | 0.0310 | 0.8931 |
| Random Forest | Southern | 0.0038 | 0.0619 | 0.0406 | 0.7946 |
| XGBoost | Northern | 0.0020 | 0.0452 | 0.0292 | 0.8794 |
| XGBoost | Southern | 0.0036 | 0.0600 | 0.0352 | 0.8071 |
| LightGBM | Northern | 0.0016 | 0.0401 | 0.0282 | 0.9050 |
| LightGBM | Southern | 0.0030 | 0.0546 | 0.0353 | 0.8401 |
Table 2.
Average forecasting performance of each model across North and South regions
| Model | Region | MSE | RMSE | MAE | R² |
|---|---|---|---|---|---|
| LSTM | North | 0.0309 | 0.1697 | 0.1046 | 0.5056 |
| LSTM | South | 0.0459 | 0.2073 | 0.1488 | 0.4891 |
| BiLSTM | North | 0.0140 | 0.1157 | 0.0696 | 0.7762 |
| BiLSTM | South | 0.0229 | 0.1510 | 0.1028 | 0.7454 |
| CNN-LSTM | North | 0.0376 | 0.1937 | 0.1371 | 0.3969 |
| CNN-LSTM | South | 0.0620 | 0.2487 | 0.1953 | 0.3096 |
| LSTCNet | North | **0.0106** | **0.0990** | **0.0546** | **0.8308** |
| LSTCNet | South | **0.0191** | **0.1317** | **0.0695** | **0.7869** |
| LSTM-Attn. | North | 0.0277 | 0.1656 | 0.1415 | 0.5559 |
| LSTM-Attn. | South | 0.0248 | 0.1576 | 0.1321 | 0.6523 |
| Transformer | North | 0.0187 | 0.1336 | 0.0805 | 0.7016 |
| Transformer | South | 0.0221 | 0.1446 | 0.1019 | 0.7797 |
| TCN | North | 0.0375 | 0.1909 | 0.1343 | 0.3990 |
| TCN | South | 0.0395 | 0.1944 | 0.1566 | 0.5608 |
Bold values indicate the best performance among all compared models for each region
We then employed Shapley Additive Explanations (SHAP) to identify the most influential Baidu Index search terms contributing to influenza trend prediction (Figs. 4 and 5). SHAP feature importance analyses for the northern and southern regions are presented in Fig. 4 and Fig. 5, respectively. The SHAP analysis revealed that search queries associated with influenza symptoms (e.g., fever, sore throat), diagnostic actions (e.g., "flu test"), and treatment (e.g., "Tamiflu") had the highest positive impact on model outputs. Compared with Random Forest and XGBoost, the SHAP values from LightGBM exhibited greater consistency and interpretability. Accordingly, the top features with the highest SHAP values were selected separately for the northern and southern regions and used as input variables for downstream deep learning models. This region-specific feature refinement enhanced both the efficiency and interpretability of the influenza forecasting framework.
Fig. 4.

SHAP feature importance beeswarm plots for influenza positivity rate prediction in the Northern Region using ensemble models: (a) Random Forest, (b) XGBoost, and (c) LightGBM. Each dot represents a sample's SHAP value for a feature, with red indicating high feature values and blue indicating low values. Positive SHAP values increase the predicted influenza rate
Fig. 5.
SHAP feature importance beeswarm plots for influenza positivity rate prediction in the Southern Region using ensemble models: (a) Random Forest, (b) XGBoost, and (c) LightGBM. Each dot represents a sample’s SHAP value for a feature, with red indicating high feature values and blue indicating low values. Positive SHAP values increase the predicted influenza rate
Evaluation of model performance
To evaluate the effectiveness of LSTCNet, we compared its forecasting performance with several baseline models, including LSTM, BiLSTM, CNN-LSTM, LSTM with Attention (LSTM-Attn.), TCN, and Transformer. To ensure a fair comparison, all baseline models were rigorously tuned (e.g., grid search for hidden dimensions and dropout rates) to ensure convergence. As shown in Table 2, LSTCNet consistently outperformed all baseline models in both Northern and Southern China. In the Northern region, it achieved an average R² of 0.8308, surpassing LSTM (R² = 0.5056), Transformer (R² = 0.7016), and CNN-LSTM (R² = 0.3969). In the Southern region, LSTCNet reached an average R² of 0.7869, maintaining a significant performance lead over baselines such as LSTM (R² = 0.4891), Transformer (R² = 0.7797), and CNN-LSTM (R² = 0.3096). These results demonstrate LSTCNet’s superior ability to model both short- and long-term dependencies, particularly in complex regional influenza trends. To make the performance gap more interpretable, we further quantify the magnitude of improvements using relative metrics.
Beyond absolute error metrics, we additionally report the relative RMSE reduction and a normalized skill score against the strongest baseline, defined for each region as the model with the lowest RMSE in Table 2 under the same data split and evaluation protocol. In the North region, LSTCNet improves upon BiLSTM (RMSE: 0.1157 → 0.0990), corresponding to a 14.43% RMSE reduction and a skill score of 0.144. In the South region, LSTCNet improves upon Transformer (RMSE: 0.1446 → 0.1317), corresponding to an 8.92% RMSE reduction. Consistent gains are also observed in MAE, with reductions of 21.55% in the North (0.0696 → 0.0546) and 31.80% in the South (0.1019 → 0.0695). Together with the improved R² values in Table 2, these results indicate that LSTCNet's advantage is consistently reflected across complementary error metrics.
All models are evaluated under the same strict chronological hold-out setting to avoid temporal leakage. Given a lookback window of length $T$, we perform multi-step forecasting for horizons of 1 to 4 weeks ahead. Performance is summarized using MSE, RMSE, MAE, and R² on the original (inverse-transformed) positivity-rate scale. All baselines are trained and tested on the identical data split and target definition to ensure fair comparability.
Multi-step prediction experiments (1 to 4 weeks ahead) further demonstrated LSTCNet’s robust performance. In the Northern region, the model performed best at the 1-week horizon (RMSE = 0.0655, R² = 0.931) and maintained reasonable accuracy even at the 4-week horizon (RMSE = 0.1382, R² = 0.695), as illustrated in Fig. 6. In the Southern region, it achieved similarly promising results, with RMSE = 0.0822 and R² = 0.925 at the 1-week horizon. Although accuracy declined modestly at 4 weeks (RMSE = 0.1990, R² = 0.560), the model still effectively captured epidemic dynamics, as shown in Fig. 7. Taken together, these results confirm that LSTCNet delivers reliable forecasting performance across different temporal horizons.
Fig. 6.
LSTCNet multi-step forecasting performance in Northern China (1- to 4-week ahead). The Y-axis shows the Min–Max normalized influenza positivity rate (scaled using training-period statistics)
Fig. 7.
LSTCNet multi-step forecasting performance in Southern China (1- to 4-week ahead). The Y-axis shows the Min–Max normalized influenza positivity rate (scaled using training-period statistics)
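The training-period Min–Max scaling noted in the figure captions can be sketched as follows; the values are synthetic stand-ins, not the study's data. Because the statistics come from the training period only, test-period values may fall outside [0, 1]:

```python
# Sketch (assumed, not the authors' exact preprocessing): Min-Max scaling
# fitted on the training period only, then applied unchanged to the test
# period, so test-set information never leaks into the normalization.
import numpy as np

def fit_minmax(train: np.ndarray):
    """Compute scaling statistics from the training period only."""
    return float(train.min()), float(train.max())

def apply_minmax(x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
    return (x - x_min) / (x_max - x_min)

train = np.array([0.02, 0.15, 0.40, 0.08])  # hypothetical positivity rates
test = np.array([0.10, 0.55])               # a later, unseen period
lo, hi = fit_minmax(train)
# The second test value exceeds the training maximum, so it scales above 1:
print(apply_minmax(test, lo, hi))
```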
To facilitate a direct observation of the prediction trend lines for each comparative model, we visualized the 1-week ahead forecast trajectories in Fig. 8. This detailed comparison reveals that LSTCNet achieves the most accurate alignment with the ground truth across both Northern and Southern regions, effectively capturing the temporal evolution of the epidemic. In contrast, the baseline models exhibit varying degrees of deviation from the observed trends, further validating the robustness of the proposed hybrid architecture in tracking dynamic influenza patterns.
Fig. 8.
Comparison of prediction trend lines for LSTCNet and baseline models. The plots visualize the 1-week ahead forecast trajectories against the observed influenza positivity rates in Northern and Southern China
Ablation study results
As shown in Table 3, the proposed LSTCNet, which integrates LSTM and TCN branches with an attention mechanism and a nonlinear multi-layer fusion strategy, achieves the lowest MSE/RMSE/MAE and the highest R² in both regions, clearly demonstrating its superior forecasting accuracy and robustness. The LSTM Only and TCN Only models, which isolate the recurrent and convolutional branches respectively, perform consistently worse, indicating that neither long-term dependency modeling nor local pattern extraction alone is sufficient to characterize influenza dynamics. The No Attention variant, which preserves both branches but removes the attention module, further reveals that simply combining LSTM and TCN without an explicit mechanism to emphasize informative time steps cannot match the full model. Although introducing attention to a single branch in LSTM Attention and TCN Attention yields improvements over their respective single-branch baselines, their performance still falls short of LSTCNet, highlighting the importance of jointly enhancing both branches. The No Positional Encoding variant shows a moderate performance decline, suggesting that positional information provides useful support for modeling temporal evolution patterns. On the architectural side, Basic Fusion, which employs a simpler fusion scheme, as well as the depth-reduced Shallow model and the structurally simplified Simplified TCN, all lead to further performance degradation. Taken together, the consistent inferiority of these variants confirms the design rationale of LSTCNet: the dual-branch structure, attention mechanism, positional information, and nonlinear multi-layer fusion collectively contribute to achieving the best forecasting performance.
Table 3.
Ablation study results: average forecasting performance of LSTCNet and its variants across North and South regions
| Model | LSTM | TCN | Attention | Simplified TCN | Region | MSE | RMSE | MAE | R² |
|---|---|---|---|---|---|---|---|---|---|
| LSTM Only | ✔ |  |  |  | North | 0.0309 | 0.1697 | 0.1046 | 0.5056 |
|  | ✔ |  |  |  | South | 0.0459 | 0.2073 | 0.1488 | 0.4891 |
| TCN Only |  | ✔ |  |  | North | 0.0375 | 0.1909 | 0.1343 | 0.3990 |
|  |  | ✔ |  |  | South | 0.0395 | 0.1944 | 0.1566 | 0.5608 |
| No Attention | ✔ | ✔ |  |  | North | 0.0189 | 0.1352 | 0.0752 | 0.6982 |
|  | ✔ | ✔ |  |  | South | 0.0220 | 0.1432 | 0.0835 | 0.6231 |
| LSTM Attention | ✔ |  | ✔ |  | North | 0.0137 | 0.1137 | 0.0647 | 0.7814 |
|  | ✔ |  | ✔ |  | South | 0.0216 | 0.1423 | 0.0931 | 0.7598 |
| TCN Attention |  | ✔ | ✔ |  | North | 0.0250 | 0.1545 | 0.1015 | 0.5991 |
|  |  | ✔ | ✔ |  | South | 0.0293 | 0.1669 | 0.1183 | 0.6737 |
| LSTCNet | ✔ | ✔ | ✔ |  | North | **0.0106** | **0.0990** | **0.0546** | **0.8308** |
|  | ✔ | ✔ | ✔ |  | South | **0.0191** | **0.1317** | **0.0695** | **0.7869** |
| No Pos. Encoding | ✔ | ✔ | ✔ |  | North | 0.0135 | 0.1162 | 0.0630 | 0.8015 |
|  | ✔ | ✔ | ✔ |  | South | 0.0238 | 0.1542 | 0.0890 | 0.7550 |
| Basic Fusion | ✔ | ✔ | ✔ |  | North | 0.0204 | 0.1383 | 0.0781 | 0.6736 |
|  | ✔ | ✔ | ✔ |  | South | 0.0281 | 0.1639 | 0.1118 | 0.6878 |
| Shallow | ✔ | ✔ | ✔ |  | North | 0.0163 | 0.1267 | 0.0715 | 0.7315 |
|  | ✔ | ✔ | ✔ |  | South | 0.0219 | 0.1475 | 0.0967 | 0.7565 |
| Simplified TCN | ✔ | ✔ | ✔ | ✔ | North | 0.0163 | 0.1259 | 0.0701 | 0.7317 |
|  | ✔ | ✔ | ✔ | ✔ | South | 0.0272 | 0.1539 | 0.1107 | 0.6922 |
Bold values denote the best performance across all ablation variants for each region
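To make the table's message concrete, the relative RMSE degradation of each ablation variant can be computed directly from the North-region values in Table 3 (a small illustrative script, not part of the original analysis):

```python
# How much worse each ablation variant is than the full LSTCNet,
# expressed as a percentage increase in RMSE (North region, Table 3).
full_rmse = 0.099
variants = {
    "LSTM Only": 0.1697,
    "TCN Only": 0.1909,
    "No Attention": 0.1352,
    "Basic Fusion": 0.1383,
}
for name, rmse in variants.items():
    pct = 100.0 * (rmse - full_rmse) / full_rmse
    print(f"{name}: +{pct:.1f}% RMSE vs. LSTCNet")
```

Removing either branch roughly doubles the error relative to the full model, while removing attention or simplifying the fusion costs a smaller but still substantial margin.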
Discussion
This study presented LSTCNet, a hybrid deep learning model that integrates Baidu Index with virological surveillance records to improve timely influenza trend prediction in Northern and Southern China. By combining bidirectional long short-term memory (BiLSTM) networks, temporal convolutional networks (TCN), and attention mechanisms, LSTCNet outperformed established models including LSTM, BiLSTM, CNN-LSTM, LSTM with Attention (LSTM-Attn.), TCN, and Transformer. It was particularly effective in capturing epidemic peaks and trend inflection points, offering consistent improvements in both short-term (1-week) and mid-term (2–4 week) predictions. This advantage is aligned with the model design: the TCN branch captures localized surges via multi-scale dilated convolutions, while the BiLSTM branch stabilizes seasonal and long-range context, and attention-based feature refinement suppresses spurious fluctuations in noisy behavioral signals.
The core innovation of LSTCNet is rooted in its dual-branch architecture, which captures both long-term and short-term dependencies in influenza time series. The BiLSTM models seasonal trends and long-range dynamics through bidirectional sequence processing, while the TCN uses multi-scale dilated convolutions to identify rapid local fluctuations such as sudden outbreaks or search behavior shifts. Attention mechanisms, including channel-wise (SE) and temporal (time-step) attention, further strengthen the model’s focus on relevant features, thereby improving stability under volatile or noisy conditions. Ablation experiments confirmed the necessity of each component. The LSTM-only and TCN-only variants exhibited pronounced performance degradation; removing attention mechanisms impaired long-horizon accuracy, and replacing the multi-layer fusion strategy with a simple linear fusion led to marked declines in predictive performance. These ablation patterns suggest that attention contributes more strongly at longer horizons where error accumulation becomes more pronounced in multi-step forecasting, and that non-linear fusion better reconciles heterogeneous representations from the two branches.
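The channel-wise (SE) attention idea described above can be sketched in NumPy; this is an illustrative toy with random weights and assumed layer sizes, not the paper's implementation:

```python
# Squeeze-and-excitation (SE) channel attention over a multivariate time
# series: pool each channel over time ("squeeze"), pass through a small
# bottleneck ("excite"), and rescale channels by the resulting gates.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_attention(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """x: (time, channels). Returns x with each channel reweighted in (0, 1)."""
    squeeze = x.mean(axis=0)                 # (channels,) global temporal pool
    hidden = np.maximum(squeeze @ w1, 0.0)   # bottleneck layer + ReLU
    gates = sigmoid(hidden @ w2)             # (channels,) gates in (0, 1)
    return x * gates                         # emphasize informative channels

rng = np.random.default_rng(0)
x = rng.normal(size=(52, 10))   # one year of weekly data, 10 feature channels
w1 = rng.normal(size=(10, 4))   # bottleneck width 4 is an assumed choice
w2 = rng.normal(size=(4, 10))
out = se_attention(x, w1, w2)
print(out.shape)                # shape is preserved; only magnitudes change
```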
Feature selection using LightGBM+SHAP was pivotal for both predictive accuracy and interpretability, consistently highlighting Baidu queries related to influenza symptoms (e.g., “fever,” “sore throat”), diagnosis-seeking behaviors (e.g., “flu test”), and treatment (e.g., “Tamiflu”) as the most informative signals. By selecting the top 10 features separately for Northern and Southern China, LSTCNet better accommodated region-specific search behaviors and achieved slightly higher overall performance in the North (R² = 0.8308) than in the South (R² = 0.7869), while the South showed a sharper degradation at longer horizons. This North–South asymmetry aligns with established epidemiological characteristics in China [28, 29]. Northern China typically exhibits a single, concentrated winter peak, whereas Southern China (including southwestern regions like Chongqing) often manifests more irregular epidemic profiles or semi-annual peaks [30], which reduce multi-week predictability compared with the more concentrated seasonal peaks observed in the North. These characteristics also help explain why model families can exhibit opposite regional patterns: Transformer baselines may remain competitive when long-range dependencies dominate, whereas LSTCNet benefits from explicitly coupling BiLSTM-based long-range context with multi-scale causal TCN extraction of short-term fluctuations, with ablations further indicating that removing the TCN branch or attention mechanisms disproportionately harms performance in the more volatile Southern setting.
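The selection step — ranking query features by mean absolute SHAP attribution and keeping the top k — can be sketched as follows. The attribution matrix and keyword names below are synthetic stand-ins; the real pipeline would obtain SHAP values from the trained LightGBM model:

```python
# Rank candidate Baidu query features by mean |SHAP| attribution and keep
# the top k per region. The SHAP matrix here is synthetic for illustration.
import numpy as np

def top_k_features(shap_values: np.ndarray, names: list, k: int = 10) -> list:
    """shap_values: (samples, features). Returns the k most influential names."""
    importance = np.abs(shap_values).mean(axis=0)   # mean |SHAP| per feature
    order = np.argsort(importance)[::-1]            # most influential first
    return [names[i] for i in order[:k]]

names = [f"query_{i}" for i in range(20)]           # hypothetical keywords
rng = np.random.default_rng(1)
# Synthetic attributions whose scale decays across features:
shap_matrix = rng.normal(size=(600, 20)) * np.linspace(2.0, 0.1, 20)
print(top_k_features(shap_matrix, names, k=5))
```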
Compared with previous studies on internet-based influenza prediction, LSTCNet offers substantial advancements in both architecture and data integration. Traditional statistical models such as ARIMA and SVR are constrained by their linear assumptions [21], while earlier deep learning models often focus on single-source or single-scale inputs [23]. LSTCNet combines multi-scale modeling with multi-source data fusion, improving robustness and lead-time performance. Its attention mechanisms enhance the model’s resilience to noise, especially from behavioral data, surpassing Transformer-based models in longer-term forecasts. Unlike previous systems based on Google search data [14], which lack regional granularity, LSTCNet achieves improved specificity by integrating SHAP-based feature selection and behaviorally driven inputs. The use of timely Baidu Index data also compensates for the 1–2 week delay in official virological reporting, enabling more timely public health responses [11, 15]. Overall, these findings highlight the practical value of jointly modeling short- and long-range temporal patterns when integrating behavioral and surveillance signals.
From a practical standpoint, LSTCNet offers a robust framework for real-world epidemic surveillance and emergency preparedness. By facilitating early vaccine deployment and optimizing healthcare resource allocation, the model serves as a critical decision-support tool. Its modular and scalable architecture further permits adaptation to other infectious diseases, such as hand-foot-and-mouth disease or dengue, requiring minimal input adjustment or retraining. Crucially, computational efficiency underpins its applicability for timely monitoring; with sub-10 ms per-step inference latency on an NVIDIA RTX 3090 Ti GPU in our implementation, LSTCNet enables instantaneous risk assessment, making it viable for deployment even in resource-constrained environments.
Despite these contributions, several limitations warrant discussion. First, internet search signals can be influenced by external shocks such as intensive media coverage, and model robustness may be affected when data are incomplete or inconsistently reported. Second, although the proposed framework performs well at the macro-regional level, its effectiveness at finer spatial resolutions, such as individual cities, remains to be validated. Third, the current model provides deterministic point forecasts and does not explicitly quantify predictive uncertainty. Future work will incorporate probabilistic forecasting approaches, such as quantile regression or Bayesian neural networks, to produce uncertainty intervals and improve decision support [22]. Further improvements may also be achieved by integrating additional data sources, including meteorological, mobility, and social media indicators, and by leveraging transfer learning to enhance generalizability and spatial granularity. In addition, while we adopt a strict chronological hold-out to prevent temporal leakage, practical deployment may benefit from walk-forward retraining or sliding-window updating to adapt to distribution shifts, which we leave for future investigation. Finally, feature selection should be strictly nested within the training data to avoid information leakage, and future studies will implement a fully nested selection-and-training pipeline.
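One route to the uncertainty intervals mentioned above is quantile regression trained with the pinball loss; a minimal sketch of that loss (not implemented in the current model):

```python
# Pinball (quantile) loss: training one output head per quantile q with
# this loss yields lower/upper forecast bounds around the point forecast.
import numpy as np

def pinball_loss(y_true, y_pred, q: float) -> float:
    """Average pinball loss at quantile q in (0, 1)."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

y = np.array([0.2, 0.4, 0.6])
# At a high quantile, under-prediction is penalized more than over-prediction,
# which pushes the q=0.9 head toward an upper bound:
print(pinball_loss(y, y - 0.1, q=0.9) > pinball_loss(y, y + 0.1, q=0.9))  # True
```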
Conclusion
This study proposed LSTCNet, a hybrid deep learning framework that integrates Baidu Index signals with virological surveillance records to forecast influenza positivity rates in Northern and Southern China. The model combines a BiLSTM branch to capture long-range temporal dependencies with a TCN branch to extract short-term fluctuations, and employs attention-based feature refinement and a fusion module to enhance robustness under noisy behavioral signals. Extensive comparisons against representative baselines (LSTM, BiLSTM, CNN-LSTM, LSTM-Attn., TCN, and Transformer) show that LSTCNet achieves the best overall performance under a strict chronological evaluation protocol, with consistent improvements across complementary error metrics. Region-specific SHAP-based feature selection further improves interpretability by highlighting symptom-, diagnosis-, and treatment-related queries that are predictive of influenza activity, and supports modeling regional behavioral differences. From an application perspective, LSTCNet is computationally efficient and suitable for timely monitoring, enabling earlier situational awareness and decision support for epidemic preparedness. Future work will extend the evaluation to walk-forward updating schemes, integrate additional exogenous covariates (e.g., meteorological and mobility data), and develop probabilistic forecasting to provide calibrated uncertainty estimates, with the goal of improving robustness and spatial granularity in real-world deployment.
Acknowledgements
The authors would like to thank the Science and Technology Development Fund of Macau SAR for its financial support.
Abbreviations
- AI
Artificial intelligence
- ARIMA
Auto regressive integrated moving average
- BiLSTM
Bidirectional long short-term memory
- CNIC
Chinese national influenza center
- CNN
Convolutional neural network
- CNN-LSTM
Convolutional neural network–long short-term memory
- COVID-19
Coronavirus disease 2019
- GPU
Graphics processing unit
- GRU
Gated recurrent unit
- LightGBM
Light gradient boosting machine
- LSTCNet
Long short-term memory–temporal convolutional network
- LSTM
Long short-term memory
- LSTM-Attn.
Long short-term memory with attention
- MAE
Mean absolute error
- MAPE
Mean absolute percentage error
- MSE
Mean squared error
- R²
Coefficient of determination
- RMSE
Root mean squared error
- SARS-CoV-2
Severe acute respiratory syndrome coronavirus 2
- SE
Squeeze-and-excitation
- SHAP
Shapley additive explanations
- SVR
Support vector regression
- TCN
Temporal convolutional network
Authors’ contributions
W.H. and X.L. contributed equally to this work. W.H. and X.L. conceived and designed the study, developed the LSTCNet model, performed all experiments, analyzed the results, prepared Figs. 1, 2, 4, 5, 6, 7 and 8, and wrote the main manuscript text. H.H. and Y.L. assisted with data collection, preprocessing, and visualization. N.S. contributed to the interpretation of results and the discussion of epidemiological implications. C.H. supervised the overall project, provided critical feedback, and finalized the manuscript. All authors reviewed and approved the final version of the manuscript.
Funding
This work was supported by the National Key Research and Development Program of China (Grant No. 2024YFE0214800); the Science and Technology Development Fund of the Macao SAR (Grant No. FDCT 0002/2024/RDP); the Science and Technology Development Program of Guangdong Province (Grant No. 2025B1212030002); the Engineering Technology Research (Development) Center of Ordinary Colleges and Universities in Guangdong Province (Grant No. 2024GCZX010); and the Guangdong Engineering Technology Research Center (Grant No. 2024A137).
Data availability
All data used in this study are publicly available. Weekly influenza positivity rates were obtained from the publicly released surveillance reports of the Chinese National Influenza Center (CNIC). Baidu Index search query series were retrieved from the Baidu Index platform by exporting weekly index values for the selected keywords during the study period (2011–2024). No individual-level or proprietary data were used. The data can be accessed directly from the original sources without special permission.
Declarations
Ethics approval and consent to participate
Not applicable. The study was based solely on publicly available, aggregated surveillance and search index data and did not involve human participants or identifiable personal information.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Wei He and Xuanfeng Li contributed equally to this work.
References
- 1.World Health Organization. Influenza (Seasonal). Geneva: World Health Organization; 2023. Available from: https://www.who.int/news-room/fact-sheets/detail/influenza-(seasonal). Cited 2026 Jan 03.
- 2.Iuliano AD, et al. Estimates of global seasonal influenza-associated respiratory mortality: a modelling study. Lancet. 2018;391(10127):1285–300. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Huang W-j, et al. Epidemiological and virological surveillance of influenza viruses in China during 2020–2021. Infect Dis Poverty. 2022;11(1):74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Hanage WP, Schaffner W. Burden of acute respiratory infections caused by influenza virus, respiratory syncytial virus, and SARS-CoV-2 with consideration of older adults: A narrative review. Infect Dis Therapy. 2025;14(Suppl 1):5–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.de Courville C, et al. The economic burden of influenza among adults aged 18 to 64: A systematic literature review. Influenza Other Respir Viruses. 2022;16(3):376–85. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Bhounsule P, et al. Influenza burden among chronic condition and immunocompromised patients in the united States. Curr Med Res Opin. 2025;41(4):607–15. [DOI] [PubMed] [Google Scholar]
- 7.Xue L, Zeng G. A comprehensive evaluation on emergency response in China. Springer; 2019.
- 8.Moore M, et al. Strategies to improve global influenza surveillance: a decision tool for policymakers. BMC Public Health. 2008;8(1):186. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Jabour AM, et al. Examining the correlation of Google influenza trend with hospital data: retrospective study. J Multidiscip Healthc. 2021:3073–81. [DOI] [PMC free article] [PubMed]
- 10.Yang L, et al. Influenza epidemic trend surveillance and prediction based on search engine data: deep learning model study. J Med Internet Res. 2023;25:e45085. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Wei S, et al. The prediction of influenza-like illness using National influenza surveillance data and Baidu query data. BMC Public Health. 2024;24(1):513. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Guo S, et al. Improving Google flu trends for COVID-19 estimates using Weibo posts. Data Sci Manage. 2021;3:13–21. [Google Scholar]
- 13.Amin S, et al. Early detection of seasonal outbreaks from Twitter data using machine learning approaches. Complexity. 2021;2021(1):5520366. [Google Scholar]
- 14.Olukanmi SO, Nelwamondo FV, Nwulu NI. Utilizing Google search data with deep learning, machine learning and time series modeling to forecast influenza-like illnesses in South Africa. IEEE Access. 2021;9:126822–36. [Google Scholar]
- 15.Fang J, et al. Baidu index and COVID-19 epidemic forecast: evidence from China. Front Public Health. 2021;9:685141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Alotaibi Y, et al. Computational linguistics based text emotion analysis using enhanced beetle antenna search with deep learning during COVID-19 pandemic. PeerJ Comput Sci. 2023;9:e1714. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Swaminathan A et al. Analysis and Prognosis of Covid-19 Using Machine Learning Algorithms and Visualization Using Tableau. in 2022 7th International Conference on Communication and Electronics Systems. 2022. IEEE.
- 18.Tamilvizhi T. An AI-based framework for timely patient monitoring and intelligent treatment recommendation in critical care units. In: 2025 2nd International Conference on Computing and Data Science (ICCDS). IEEE; 2025. pp. 1–5.
- 19.Jemina SL, Thanarajan T. An intelligent brain tumor detection model using lightweight hybrid twin attentive pyramid convolutional network. Sci Rep. 2025;15:40177. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Jayalakshmi R, Tamilvizhi T. Privacy preservation in diabetic disease prediction using federated learning based on efficient cross stage recurrent model. Sci Rep. 2025;15(1):37258. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Jayalakshmi R et al. Comprehensive Evaluation of Federated Learning Based Models for Disease Detection in Healthcare. in 2024 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT). 2024. IEEE.
- 22.Annadurai K et al. Early detection and forecasting of influenza epidemics using a hybrid ARIMA-GRU model. Int J Adv Comput Sci Appl, 2025. 16(5).
- 23.Ali NSM, Mohammed FA. The use of ARIMA, ANN and SVR models in time series hybridization with practical application. Int J Nonlinear Anal Appl. 2023;14(3):87–102. [Google Scholar]
- 24.Adugna A, et al. Deep learning architectures for influenza dynamics and treatment optimization: a comprehensive review. Front Artif Intell. 2025;8:1521886. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Zhao S et al. Multi-source domain adaptation in the deep learning era: A systematic survey. arXiv preprint arXiv:2002.12169, 2020.
- 26.Ma S, Ning S, Yang S. Joint COVID-19 and influenza-like illness forecasts in the united States using internet search information. Commun Med. 2023;3:39. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Wang Z et al. Integrating Spatiotemporal features in LSTM for spatially informed COVID-19 hospitalization forecasting. Int J Geogr Inf Sci, 2025. (Online ahead of print).
- 28.Yu H, Alonso WJ, Feng L, et al. Characterization of regional influenza seasonality patterns in China and implications for vaccination strategies: spatio-temporal modeling of surveillance data. PLoS Med. 2013;10(11):e1001552. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Tamerius JD, Shaman J, Alonso WJ, et al. Environmental predictors of seasonal influenza epidemics across temperate and tropical climates. PLoS Pathog. 2013;9(3):e1003194. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Fu X, et al. Epidemic patterns of the different influenza virus types and subtypes/lineages for 10 years in Chongqing, China, 2010–2019. Hum Vaccin Immunother. 2024;20(1):2363076. [DOI] [PMC free article] [PubMed]