Scientific Reports. 2026 Jan 10;16:4490. doi: 10.1038/s41598-025-34592-1

Application of a novel approach for dementia prevalence prediction in Taiwan

Cheng-Hong Yang 1,2,3,4, Po-Hung Chen 2, Cheng-San Yang 5, Ting-Jen Hseuh 2, Stephanie Yang 6
PMCID: PMC12864913  PMID: 41519877

Abstract

Amid the rapidly aging global population, dementia cases are rising at an alarming rate. Dementia has become a major public health challenge, exerting profound impacts on socioeconomic systems and overall human well-being. The condition progressively deteriorates cognitive abilities such as memory, judgment, comprehension, and language, eventually resulting in the loss of independent daily functioning. In addition, patients often experience neuropsychiatric symptoms that severely diminish their quality of life. This study proposes an optimized machine learning model, the Flying Geese Optimization Algorithm Support Vector Regression (FGOASVR), to effectively predict trends in dementia prevalence. The empirical analysis utilizes annual dementia diagnostic data from 1998 to 2023, obtained from Taiwan’s National Health Insurance Research Database (NHIRD). To validate model performance, FGOASVR was compared against three categories of forecasting models: statistical models, namely Autoregressive Integrated Moving Average (ARIMA) and Holt-Winters Exponential Smoothing (HWETS); a deep learning model, Long Short-Term Memory (LSTM); and hybrid models, namely Support Vector Regression (SVR), Particle Swarm Optimization SVR (PSOSVR), Differential Evolution SVR (DESVR), Whale Optimization Algorithm SVR (WOASVR), and Harris Hawks Optimization SVR (HHOSVR). Performance was assessed using the standard forecasting metrics Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE). The FGOASVR model achieved the highest accuracy, with average MAPE values of 3.17% and 3.42% and RMSE values of 0.69 and 0.96 for males and females, respectively. These results confirm that FGOASVR delivers superior precision and stability in forecasting dementia trends in Taiwan, demonstrating its strong potential for advancing data-driven public health prediction and policy development.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-34592-1.

Keywords: Dementia forecasting, Machine learning, Flying geese optimization algorithm, Support vector regression

Subject terms: Computational biology and bioinformatics, Diseases, Health care, Mathematics and computing, Neuroscience

Introduction

Dementia is a chronic and progressive neurological disorder characterized by declines in memory, cognition, and daily functioning1. It most commonly results from Alzheimer’s disease, vascular dementia, Lewy body dementia, and frontotemporal dementia, which are associated with neurodegeneration, vascular pathology, or abnormal protein accumulation2. As the disease advances, individuals experience worsening cognitive and behavioral impairments, eventually losing independence and requiring long-term care3. Beyond its impact on patients, dementia imposes substantial emotional, physical, and financial burdens on caregivers and places significant strain on healthcare systems and social welfare structures, making it a major global public health concern4.

Global Burden of Disease analyses have shown a steady rise in dementia incidence, prevalence, mortality, and disability-adjusted life years since 1990, highlighting the urgency of early screening and intervention, particularly among older populations5. The World Health Organization estimates that more than 55 million people worldwide currently live with dementia, a figure projected to reach 139 million by 2050, with annual global economic costs exceeding USD 1 trillion2. In response, many countries have implemented national strategies, including public awareness programs, community-based care models, and integrated dementia services6–9. In Taiwan, where over 17% of the population is aged 65 or older, more than 310,000 individuals currently have dementia, and this number is expected to exceed 500,000 by 2041, underscoring the growing domestic burden of the disease10.

Time series analysis provides an effective framework for examining long-term dementia trends and supporting evidence-based policy planning. Traditional statistical methods such as Autoregressive Integrated Moving Average (ARIMA) and Holt–Winters Exponential Smoothing (HWETS) have been widely applied but are limited by linear assumptions, restricted seasonal structures, and sensitivity to non-stationarity and heteroscedasticity, which can reduce forecasting accuracy in complex real-world data11,12.

Advances in machine learning and deep learning have introduced more flexible alternatives for time series forecasting. Support Vector Regression (SVR) effectively captures nonlinear relationships through high-dimensional feature mapping and often outperforms classical statistical models13–15. Long Short-Term Memory (LSTM) networks further enhance predictive capability by modeling temporal dependencies in sequential health data and have demonstrated strong performance in clinical risk prediction and care planning16,17. However, deep learning models typically involve high model complexity and extensive parameterization. In contrast, SVR offers a balance between model simplicity and nonlinear approximation power but remains highly sensitive to hyperparameter selection18. To address this limitation, hybrid SVR models incorporating intelligent optimization algorithms such as Particle Swarm Optimization, Differential Evolution, Whale Optimization Algorithm, and Harris Hawks Optimization have been proposed to improve prediction accuracy, robustness, and generalization performance19–21.

In this study, we introduce the Flying Geese Optimization Algorithm (FGOA) as an effective strategy for optimizing Support Vector Regression (SVR) by identifying optimal hyperparameter combinations. Inspired by bio-heuristic optimization principles, FGOA incorporates novel search mechanisms that mitigate premature convergence and enhance exploration, making it particularly suitable for solving complex, high-dimensional optimization problems. The algorithm efficiently explores the search space, extracts informative patterns from large-scale data, and identifies highly discriminative hyperparameter configurations. Based on this framework, an FGOA-optimized SVR model (FGOASVR) is developed to forecast dementia patient numbers, resulting in improved parameter optimization and enhanced predictive stability. Comparative experiments against eight benchmark models demonstrate that FGOASVR achieves the lowest mean absolute percentage error (MAPE) and root mean square error (RMSE).

Literature review

Time series analysis has long been employed for forecasting disease incidence and supporting public health decision-making. Classical statistical approaches, such as Autoregressive Integrated Moving Average (ARIMA) and Holt–Winters Exponential Smoothing (HWETS), have demonstrated reliable performance in modeling linear trends and seasonal patterns. For example, Juang et al. applied an ARIMA model to predict emergency department visit volumes in southern Taiwan, identifying ARIMA (0,0,1) as the best-fitting specification and achieving a MAPE of 8.91%, which proved valuable for healthcare resource allocation and overcrowding mitigation22. Similarly, HWETS has been shown to effectively capture seasonal disease fluctuations. In a study of foodborne disease incidence in Chongqing, the Holt–Winters model outperformed SARIMA and ETS models in terms of MSE, MAE, and RMSE, while maintaining close agreement between predicted and observed trends despite a relatively higher MAPE23.

Although these traditional models provide strong baselines, their predictive performance may be limited when dealing with nonlinear and non-stationary time series commonly observed in epidemiological data. Consequently, machine learning methods have increasingly been adopted to address these challenges. Support Vector Regression (SVR) has been widely used for epidemic forecasting due to its strong generalization ability and robustness to high-dimensional data. An application to COVID-19 data in India demonstrated that SVR achieved over 97% accuracy in forecasting cumulative cases and deaths and outperformed linear and polynomial regression models, although its accuracy for daily new cases was slightly lower due to data irregularity24.

A key limitation of SVR is its sensitivity to hyperparameter selection, particularly the penalty parameter (C), the insensitive loss parameter (ε), and the kernel width (σ). To overcome this issue, researchers have proposed hybrid SVR frameworks that integrate metaheuristic optimization algorithms for automated parameter tuning. Studies have shown that Particle Swarm Optimization–based SVR (PSOSVR) substantially reduces prediction errors compared to standard SVR, with MAPE reductions exceeding 45% in dementia incidence forecasting across both genders20. Further improvements have been achieved using Differential Evolution (DESVR), Whale Optimization Algorithm (WOASVR), and Harris Hawks Optimization (HHOSVR), which have demonstrated superior accuracy, stability, and generalization in applications such as blood glucose prediction, renewable energy forecasting, and hydrological modeling21,25. These findings underscore the critical role of intelligent optimization in enhancing the predictive capability of SVR-based time series models and provide a strong methodological foundation for their application to complex disease forecasting problems.

Methodology

The data for this study were obtained from Taiwan’s National Health Insurance Research Database (NHIRD), managed by the Ministry of Health and Welfare. The dataset includes annual counts of male and female dementia patients aged 60 and above, which were used to develop a predictive model for dementia prevalence. The data were then split into training (80%) and testing (20%) sets, with tenfold cross-validation applied. Time series methods were employed to train the model and optimize hyperparameters, with the goal of minimizing prediction errors and improving forecasting accuracy for both genders. The system architecture for predicting the number of dementia patients is shown in Fig. 1.

Fig. 1. System architecture diagram for forecasting the number of dementia patients.

Statistical method

Autoregressive integrated moving average (ARIMA)

The Autoregressive Integrated Moving Average (ARIMA) model was developed in 1976 by Box and Jenkins and is also known as the Box–Jenkins model26. An ARIMA model combines three components: an autoregressive term AR(p), a moving-average term MA(q), and a differencing degree d27. ARIMA models are a family of time series models designed to capture dependencies between observations based on their temporal lags. They are particularly well suited for stationary time series, in which the mean and variance remain stable over time28. The model is specified by three parameters, denoted as (p, d, q), where p indicates the number of lagged observations included in the model, d represents the degree of non-seasonal differencing applied to achieve stationarity, and q corresponds to the order of the moving average component.
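The (p, d, q) mechanics can be illustrated with a minimal hand-rolled ARIMA(1,1,0): difference the series once, fit the AR(1) coefficient on the differences by least squares, and integrate the forecasts back. This is an illustrative sketch on synthetic data, not the model specification used in the study.

```python
import numpy as np

def arima_110_forecast(y, steps=3):
    """Minimal ARIMA(1,1,0) sketch: difference once (d = 1), fit an AR(1)
    coefficient on the differences by least squares, then integrate the
    forecasts back to the original level."""
    d = np.diff(y)                    # first differencing
    x, t = d[:-1], d[1:]              # lag-1 pairs on the differenced series
    phi = (x @ t) / (x @ x)           # AR(1) coefficient (no intercept)
    preds, last, level = [], d[-1], y[-1]
    for _ in range(steps):
        last = phi * last             # AR(1) step on the differences
        level = level + last          # undo the differencing
        preds.append(level)
    return np.array(preds)

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(1.0, 0.3, size=26))  # 26 trending "annual" values
print(arima_110_forecast(series, steps=3).shape)   # (3,)
```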

Holt–Winters exponential smoothing (HWETS)

The Holt–Winters Exponential Smoothing (HWETS) method, introduced by Holt and Winters, is a classic and widely recognized approach in time series forecasting29. It operates by applying weighted exponential smoothing to the three core components of a time series: level, trend, and seasonality. HWETS is particularly valued for its ability to capture both long-term trends and short-term seasonal fluctuations, making it well-suited for data with pronounced seasonal patterns. By simultaneously modeling trend and seasonal components, the HWETS approach effectively identifies underlying structural changes in the series, enabling accurate and reliable predictions of future values. Consequently, it is extensively used in fields such as epidemic monitoring, sales forecasting, and public health30.
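As a sketch of the exponential-smoothing idea, the following implements Holt’s level-and-trend recursion; the seasonal component of full Holt–Winters is omitted here as a simplifying assumption suited to annual data, and the smoothing constants are illustrative.

```python
import numpy as np

def holt_linear(y, alpha=0.5, beta=0.3, steps=3):
    """Holt's double exponential smoothing: weighted updates of a level and
    a trend component, then linear extrapolation for the forecast horizon."""
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + trend)   # level update
        trend = beta * (level - prev_level) + (1 - beta) * trend  # trend update
    return np.array([level + h * trend for h in range(1, steps + 1)])

y = np.array([1.0, 1.4, 1.9, 2.3, 2.9, 3.4])  # roughly linear toy series
print(holt_linear(y, steps=2))
```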

Machine learning

Support vector regression (SVR)

Support Vector Regression (SVR), first introduced by Vapnik and colleagues in 1997, extends the principles of Support Vector Machines (SVM) to regression problems in supervised learning31. SVR can address both linear and nonlinear regression tasks and is especially effective for prediction problems involving high-dimensional or complex datasets32. The fundamental concept of SVR is to perform linear regression in a high-dimensional feature space generated through kernel functions. By projecting the input data into this feature space, SVR identifies a linear hyperplane that fits the data points accurately while preserving strong generalization and predictive performance33.

To improve the model’s ability to generalize to unseen data, SVR incorporates a crucial control parameter, the regularization coefficient C. This parameter governs the balance between minimizing the training error and maintaining model simplicity, effectively reducing the risk of overfitting34. The ε parameter defines the tolerance margin for the deviation between predicted and actual values; penalties are applied only when the prediction error exceeds ε. This mechanism allows small, insignificant deviations to be disregarded, thereby enhancing model robustness27. For nonlinear problems, SVR employs different types of kernel functions, such as linear, polynomial, and Gaussian radial basis function (RBF) kernels, to implicitly map input data into higher-dimensional spaces and capture complex nonlinear relationships among features33. The mathematical formulation of the SVR model is as follows:

$$f(x)=\omega^{T}\varphi(x)+b \tag{1}$$
$$\min_{\omega,b,\xi,\xi^{*}}\ \frac{1}{2}\lVert\omega\rVert^{2}+C\sum_{i=1}^{n}\left(\xi_{i}+\xi_{i}^{*}\right) \tag{2}$$

Equation (1) yields the predicted result f(x), obtained by taking the inner product of the weight vector ω with the feature vector φ(x) and adding the bias term b. In Equation (2), a regularization term is introduced to control the model’s complexity; C is the regularization constant, and ξ_i and ξ_i* are slack variables that quantify the deviation of each data point from the ε-tube. The regularization term minimizes prediction errors while controlling the complexity of the model to prevent overfitting.
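A minimal SVR fit with an RBF kernel can be sketched with scikit-learn; the values of C, ε, and the kernel width γ below are illustrative placeholders on synthetic data, not the tuned hyperparameters from this study.

```python
# SVR with an RBF kernel: C trades off training error against model
# simplicity, epsilon sets the insensitive tube, gamma the kernel width.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = np.linspace(0, 4, 40).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.05, 40)  # noisy nonlinear target

model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma=0.5).fit(X, y)
pred = model.predict(np.array([[2.0]]))
print(pred.shape)  # (1,)
```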

Particle swarm optimization (PSO)

Particle Swarm Optimization (PSO) is a population-based iterative optimization algorithm inspired by the collective flocking behavior of birds, first introduced by Eberhart and Kennedy35. The optimization process starts with a randomly initialized population of candidate solutions, known as particles, where each particle represents a potential solution within a D-dimensional search space. The fitness of each particle is assessed by a problem-specific function f(x), which evaluates the quality of its current position. By sharing information, particles influence one another’s movement, enabling them to adjust their trajectories toward more promising areas of the search space. After each evaluation, every particle records its own historical best position (pbest_i) as well as the best position found so far by the whole population (gbest). In each iteration, particles are updated and accelerated toward both pbest_i and gbest. This updating mechanism effectively balances exploration and exploitation by guiding particles toward regions with higher fitness values. The process continues iteratively until the predefined termination criterion, typically the maximum number of iterations, is satisfied.
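The pbest/gbest update loop described above can be sketched as follows; the parameter values are common defaults, not taken from the study.

```python
import numpy as np

def pso(f, dim=2, n=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO: each particle tracks its personal best (pbest) and is
    pulled toward both pbest and the swarm-wide best (gbest)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n, dim))
    v = np.zeros((n, dim))
    pbest = x.copy()
    pbest_f = np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_f)].copy()
    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)  # velocity update
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pbest_f                                # update pbest
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        g = pbest[np.argmin(pbest_f)].copy()                   # update gbest
    return g

best = pso(lambda p: np.sum(p ** 2))  # minimize the sphere function
print(np.round(best, 3))
```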

Differential evolution (DE)

The Differential Evolution (DE) algorithm, introduced by Storn and Price in 1997, is a population-based stochastic optimization method that relies on three fundamental operations: mutation, crossover, and selection36. In the mutation step, three distinct individuals are randomly chosen, and their difference vectors are used to create a mutation vector, with the scaling factor F (typically between 0.5 and 1.0) controlling the step size and exploration extent. During crossover, the mutation vector is combined with the target vector to produce a trial vector Ui​, where the crossover probability CR (usually 0.8–1.0) determines which components are inherited from the mutation vector, promoting solution diversity. Finally, in the selection step, the trial and target vectors are compared based on their fitness values, and the individual with the better fitness is retained for the next generation.
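SciPy ships an implementation of this mutation/crossover/selection scheme; in the sketch below, `mutation` plays the role of the scaling factor F and `recombination` the crossover probability CR, with values taken from the ranges quoted above and applied to a toy objective.

```python
# Differential Evolution via SciPy on the 2-D sphere function.
from scipy.optimize import differential_evolution

result = differential_evolution(
    lambda p: sum(v ** 2 for v in p),  # toy objective: sphere function
    bounds=[(-5, 5), (-5, 5)],
    mutation=0.7,                      # scaling factor F, in [0.5, 1.0]
    recombination=0.9,                 # crossover probability CR, in [0.8, 1.0]
    seed=0,
)
print(result.success, [round(v, 3) for v in result.x])
```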

Whale optimization algorithm (WOA)

The Whale Optimization Algorithm (WOA) is a population-based metaheuristic inspired by the bubble-net hunting behavior of humpback whales, developed by Mirjalili and Lewis in 201637. It models two primary hunting mechanisms, encircling prey and spiral bubble-net feeding, to effectively balance exploration and exploitation during the optimization process. In the encircling prey phase, each whale updates its position by moving toward the current best solution (leader whale). The movement is governed by randomly generated coefficients and a linearly decreasing parameter, which together facilitate gradual convergence toward the leader, simulating the coordinated encircling behavior observed in real whales. When transitioning to the bubble-net attacking phase, whales follow a logarithmic spiral trajectory around the leader instead of a linear path. This mechanism enhances global exploration, allowing the population to escape local optima and preventing premature convergence. As iterations progress, the degree of random exploration decreases while cooperative interactions among whales guide the population toward the optimal region. The algorithm terminates when the maximum iteration count is reached or when improvements in the objective function become negligible, yielding the best solution represented by the leader whale.

Harris hawks optimization (HHO)

Harris Hawks Optimization (HHO), proposed by Heidari et al. in 2019, is a population-based metaheuristic inspired by the cooperative hunting behavior of Harris’s hawks, which adapt their attack strategies according to the prey’s movement and energy38. HHO models this behavior to balance exploration and exploitation through two phases. In the exploration phase, each hawk represents a candidate solution and randomly explores the search space using stochastic position updates influenced by other hawks and the current best solution, maintaining diversity and preventing premature convergence. In the exploitation phase, the prey’s escaping energy E controls the hunting strategy: a soft besiege is applied when ∣E∣ ≥ 1 to maintain partial exploration, while a hard besiege is used when ∣E∣ < 1 for intensive local refinement. Lévy flight–based random jumps are also incorporated to enhance global search ability. By adaptively switching between these strategies, HHO achieves robust convergence and effective optimization performance.

Deep learning

Long short-term memory (LSTM)

Long short-term memory (LSTM) is an advanced variant of recurrent neural networks (RNNs) specifically designed to address the vanishing gradient problem that limits the ability of conventional RNNs to learn long-range temporal dependencies. LSTM was first introduced by Hochreiter and Schmidhuber (1997) and incorporates a gated architecture that regulates the flow of information through the network39. This architecture enables LSTM models to selectively retain, update, or discard information over extended sequences, thereby improving their capacity to model complex temporal patterns. In the context of time series analysis, observations are structured as ordered sequences, with each data point representing a discrete time step. LSTM networks are trained in a supervised learning framework, using historical observations as inputs to predict subsequent values in the sequence. By incorporating multiple lagged observations into the input window, LSTM models can effectively capture both short-term dynamics and long-term dependencies present in the data. The superior ability of LSTM to model nonlinear relationships and temporal dependencies has led to its widespread application in time series forecasting across diverse domains. Empirical studies have demonstrated that LSTM models often outperform traditional statistical methods and other machine learning approaches40.
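The sliding-window framing described above, with lagged observations as inputs and the next value as the target, can be sketched independently of any particular network library; this shows only the supervised input framing, not an LSTM itself.

```python
import numpy as np

def make_windows(series, lags=3):
    """Frame a series as supervised learning: each row of X holds `lags`
    consecutive past values, and y is the value that follows them."""
    X = np.array([series[i:i + lags] for i in range(len(series) - lags)])
    y = np.array(series[lags:])
    return X, y

X, y = make_windows([1, 2, 3, 4, 5, 6], lags=3)
print(X.shape, y.shape)  # (3, 3) (3,)
```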

Flying geese optimization algorithm (FGOA)

The Flying Geese Theory, proposed by Japanese economist Akamatsu in 193541, illustrates how cooperation and shared leadership enhance collective performance. It describes how geese flying in a V-formation gain aerodynamic advantages: each bird’s wing flapping creates uplift for those behind, allowing the flock to conserve up to 71% of its energy. The geese at the rear honk to encourage the leaders to maintain their pace, and if a goose becomes ill or injured and falls behind, two others remain with it until it recovers or dies before rejoining the group. When the leading goose becomes fatigued, it returns to the formation while another takes the lead, ensuring that leadership responsibilities are distributed among all members.

In this study, we developed a novel algorithm, the Flying Geese Optimization Algorithm (FGOA), inspired by the key principles of the Flying Geese Theory. Four core mechanisms were mathematically modeled and incorporated into the optimization strategy: team collaboration, in which geese flying in a V-formation conserve significantly more energy than when flying alone42; mutual encouragement, in which geese at the rear honk to motivate those in front to maintain their speed; peer support, in which, if a goose becomes sick or injured and falls behind, two others stay with it until it recovers or can continue flying43; and rotational leadership, in which, when the lead goose tires, it moves back into the formation and another takes its place at the front44. In the FGOA, the algorithm iterates through these behaviors, and upon reaching the maximum number of iterations or satisfying stopping criteria, the individual with the best fitness is selected as the optimal solution. A simplified version of the FGOA pseudocode is presented below, while the complete version is provided in the supplementary materials (Supplementary Algorithm 1).

Team collaboration

This team collaboration mechanism is designed to distribute computational effort between exploration and exploitation via a hierarchical coupling approach, improving global convergence efficiency and reducing sensitivity to local optima. During the FGOA updraft phase, each individual adopts the wingtip upwash effect from a leading neighbor with a 50% probability; otherwise, random environmental updrafts are applied. In the subsequent whiffling exploitation phase, individuals are guided along the direction vector toward the global best solution, while incorporating scaled random perturbations, slight angular deviations, and noise. Boundary clipping ensures solution feasibility, balancing convergence speed with the risk of premature stagnation. The formula is as follows:

[Equation (3): image not available]
[Equation (4): image not available]

The interpretation is that each goose moves toward the global best position X_best with a degree of stochastic variation. To ensure that the updated position remains within the feasible search space, normalization is applied using the clip function. Let f_i denote the fitness of goose i. The vector (X_best − X_i) represents the direction toward X_best, and a random scaling factor controls the magnitude of the perturbation along it. A small stochastic noise component, scaled by γ, is added to prevent strictly deterministic alignment, with γ ranging from 0.5 to 0.8. The angle θ is uniformly sampled from its prescribed interval, and the remaining term is a normalized random unit direction vector in the search space.

Each goose selects a “frontal guiding neighbor” based on the angle and distance between the formation heading and the neighbor’s direction, then applies small stochastic perturbations to its velocity to enhance local guidance; in the alternative branch, the goose randomly spawns 1–3 Gaussian updraft centers (Supplementary Method 1) in the search space and assigns a centripetal acceleration proportional to each individual’s distance from the center, boosting exploration efficiency. The formula is as follows:

[Equation (5): image not available]

where

[Equation (6): image not available]

Here, a small constant is introduced to prevent division by zero. A small random perturbation vector avoids strictly deterministic alignment. The difference vector from the goose’s position to the updraft center n points toward that center, and normalizing it yields the corresponding unit direction vector. After receiving the updraft adjustment, the goose proceeds with the exploration phase (see Supplementary Method 2).

Mutual encouragement

An incentive mechanism is incorporated to adaptively refine the best-performing solution based on the population’s performance distribution. By allowing underperforming individuals to exert influence on the global best update, this mechanism reduces the risk of premature convergence and promotes a more comprehensive exploration of the search space. The incentive threshold is calculated as follows:

[Equation (7): image not available]

where f_i denotes the fitness of goose i, and f_best represents the fitness of the current global best solution.

The update rule for the incentive factor is defined as follows: a goose receives an increased factor (1.2) if its fitness surpasses the current global best, a reduced factor (0.8) if its fitness is substantially worse (i.e., more than 1.5 times the global best), and the default value (1.0) otherwise. The values 0.8, 1.0, and 1.2 act as small multiplicative adjustments (± 20%), providing enough variation to influence selection pressure while maintaining numerical stability. Using 1.0 as the neutral baseline preserves the mean search intensity, whereas the symmetric ± 20% perturbations increase population diversity without biasing the overall search direction. The 1.5 threshold serves as a scale-free criterion to ensure that incentives are only applied when performance differences are meaningful, thereby promoting diversity while maintaining stable and interpretable system dynamics.
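Assuming a minimization setting (lower fitness is better) and positive fitness values, the incentive-factor rule can be sketched as:

```python
def incentive_factor(fitness, f_best):
    """Incentive factor from the mutual-encouragement rule, assuming
    minimization (lower fitness is better) and positive fitness values."""
    if fitness < f_best:           # surpasses the current global best
        return 1.2
    if fitness > 1.5 * f_best:     # substantially worse than the best
        return 0.8
    return 1.0                     # neutral baseline

print([incentive_factor(f, f_best=10.0) for f in (8.0, 12.0, 20.0)])
# -> [1.2, 1.0, 0.8]
```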

Peer support

This peer support mechanism is designed to help underperforming geese remain part of the dynamic population and avoid falling behind during the optimization process. The strategy enhances both exploration and exploitation efficiency while maintaining the stability of the convergence process. The algorithm incorporates a support mechanism for lagging geese, ensuring that all individuals in the population stay competitive. The position of a lagging goose is updated as follows:

[Equation (8): image not available]

where a guidance coefficient controls the degree to which each individual is pulled toward the global best solution, and f_i denotes the objective function value of goose i.

Update geese velocity and position

The update mechanism defined in Eqs. (9)–(10) constitutes an interpretable hybrid dynamical system. In Eq. (9), the updated velocity is composed of three components: inertia, guidance, and colony. The inertia term retains historical momentum to prevent excessive oscillation; the guidance term directs each individual toward its personal best (pbest_i), strengthening local exploitation; and the colony term moves individuals toward the global best (gbest), fostering group consensus and enhancing global convergence. The recommended parameter ranges are as follows: inertia weight w, 0.5 ~ 1.0; guidance coefficient c1, 0.7 ~ 2.5; and colony coefficient c2, 0.7 ~ 2.5. The random coefficients r1 and r2, together with a linearly decreasing weighting schedule, progressively shift the search behavior from strong exploration in the early stages to strong exploitation in later stages. Equation (10) updates the position and incorporates scaled Gaussian noise to expand the sampling radius, improve the ability to traverse narrow basins and saddle landscapes, and reduce the likelihood of becoming trapped in local optima. To further ensure global search stability, boundary clipping is applied to maintain feasible solution ranges and prevent divergence due to cumulative deviations. The corresponding formula is given as follows:

$$v_{i}^{t+1}=w\,v_{i}^{t}+c_{1}r_{1}\left(pbest_{i}-x_{i}^{t}\right)+c_{2}r_{2}\left(gbest-x_{i}^{t}\right) \tag{9}$$
$$x_{i}^{t+1}=\mathrm{clip}\!\left(x_{i}^{t}+v_{i}^{t+1}+\sigma\,N(0,1)\right) \tag{10}$$

where r1 and r2 are random coefficients, and N(0,1) represents Gaussian noise.
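One velocity-and-position update with Gaussian position noise and boundary clipping can be sketched as follows; this is a sketch of the described mechanism with illustrative parameter values, not the authors’ reference implementation.

```python
import numpy as np

def update_goose(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5,
                 noise_scale=0.01, bounds=(-5.0, 5.0), rng=None):
    """One update step: inertia + guidance (pbest) + colony (gbest) terms,
    scaled Gaussian position noise, and boundary clipping."""
    rng = rng or np.random.default_rng(0)
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x_new = x + v_new + noise_scale * rng.standard_normal(x.shape)
    return np.clip(x_new, *bounds), v_new  # clipping keeps solutions feasible

x, v = np.array([4.0, -4.0]), np.zeros(2)
x_new, v_new = update_goose(x, v, pbest=np.array([1.0, 1.0]),
                            gbest=np.array([0.0, 0.0]))
print(x_new.shape)  # (2,)
```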

Rotational leadership

When population stagnation is detected, a whiffling-style local search is activated: in Eq. (11), each individual X_i is assigned a small uniform perturbation in the range [−0.001, 0.001]. If the perturbed position yields a better solution than the previous best, the goose’s position is updated accordingly, which avoids premature convergence by ensuring that each goose not only exploits the current best solution but also continues to explore. The corresponding formula is as follows:

$$X_{i}^{new}=X_{i}+u,\qquad u\sim U(-0.001,\ 0.001) \tag{11}$$

This rotational leadership mechanism dynamically updates the leading position, allowing the strongest geese to take turns at the front to sustain flight speed and avoid leader fatigue. By continuously assigning leadership to the highest-performing individuals, the flock adaptively adjusts its flight trajectory toward more promising areas of the search space. As shown in Eq. (12), the leader is periodically reset to prevent stagnation and maintain search progress, thereby improving convergence efficiency and ensuring a balanced trade-off between exploration and exploitation. The corresponding formula is as follows:

[Equation (12): image not available]
Algorithm 1. The pseudocode of the Flying Geese Optimization Algorithm (FGOA).

Performance criteria

MAPE, defined in Eq. (13), measures the average absolute percentage error, representing the model’s relative accuracy; lower values indicate predictions that are closer to the actual observations. RMSE measures the square root of the average squared errors, reflecting prediction variability, with lower values indicating more precise and stable forecasts. The calculation formula for RMSE is provided in Eq. (14), serving as a standard quantitative basis for evaluating forecasting performance45.

$$\mathrm{MAPE}=\frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_{i}-\hat{y}_{i}}{y_{i}}\right| \tag{13}$$
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}} \tag{14}$$

In Equation (13), for each observation i, the prediction error is computed as (y_i − ŷ_i) and divided by the actual value y_i to obtain the relative error. The absolute values of the relative errors are then averaged over all observations and multiplied by 100% to express the error as a percentage. In Equation (14), the squared prediction error (y_i − ŷ_i)² is calculated for each observation, the squared errors are averaged, and the square root of this mean is taken.
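Both metrics are straightforward to compute directly from their definitions:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, reported in percent."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true, y_pred = [100.0, 200.0, 400.0], [110.0, 190.0, 400.0]
print(round(mape(y_true, y_pred), 2), round(rmse(y_true, y_pred), 2))
# -> 5.0 8.16
```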

Results

Data description

In this study, data on the annual number of dementia patients aged 60 years and above were collected from Taiwan’s Ministry of Health and Welfare Health Insurance Database over a 26-year period (1998–2023), covering six age groups: 60–64, 65–69, 70–74, 75–79, 80–84, and 85 years and above. Tables 1 and 2 present the descriptive statistics of dementia prevalence among males and females in Taiwan, respectively, from 1998 to 2023, stratified by age group. For each age category, the tables report the standard deviation (SD), minimum (Min), maximum (Max), mean prevalence (Mean), coefficient of variation (COV), standard error (SE), and the corresponding 95% confidence intervals (CI). For analysis, the data for each age group were divided into a training set (1998–2015) and a test set (2016–2023) for both male and female patients.

Table 1.

Male Dementia Prevalence in Taiwan, 1998–2023.

Age group SD Min Max Mean COV SE CI (95% lower) CI (95% upper)
60 ~ 64 0.35 1.01 2.12 1.49 0.24 0.07 1.35 1.63
65 ~ 69 2.64 3.50 10.59 6.07 0.44 0.52 5.00 7.14
70 ~ 74 3.04 5.89 17.19 9.34 0.33 0.60 8.12 10.57
75 ~ 79 3.75 5.27 17.74 12.60 0.30 0.74 11.08 14.11
80 ~ 84 6.14 3.71 22.53 14.03 0.44 1.20 11.55 16.51
85 and above 12.57 1.99 35.17 17.80 0.71 2.47 12.73 22.88

SD, standard deviation; COV, coefficient of variation; SE, standard error; CI, confidence interval.

Table 2.

Female Dementia Prevalence in Taiwan, 1998–2023.

Age group SD Min Max Mean COV SE CI (95% lower) CI (95% upper)
60 ~ 64 0.38 0.93 2.34 1.46 0.26 0.08 1.29 1.60
65 ~ 69 3.15 3.08 11.80 6.94 0.45 0.62 5.67 8.21
70 ~ 74 5.28 4.65 22.99 11.68 0.45 1.04 9.54 13.81
75 ~ 79 8.22 4.71 28.44 17.11 0.48 1.61 13.79 20.42
80 ~ 84 12.37 3.90 42.26 20.36 0.61 2.43 15.36 25.35
85 and above 21.32 2.57 68.17 27.10 0.79 4.18 18.49 35.71

SD, standard deviation; COV, coefficient of variation; SE, standard error; CI, confidence interval.

Parameter settings

Support Vector Regression (SVR) has been widely applied to time series forecasting and has achieved excellent results. SVR is a promising method for time series prediction, offering advantages such as few parameters, strong predictive capability, and fast training33. Moreover, compared with other methods, SVR maintains robust predictive performance even when handling high-dimensional, sparse, or noisy data27. In this study, we employed the SVR model with hyperparameter optimization, specifying search ranges for the hyperparameters C, σ, and ε following13. The hyperparameter-optimized training results of the SVR model for male and female dementia patients are presented in Supplementary Tables 2 and 3, respectively.
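The hyperparameter search itself can be sketched as sampling on a log scale, which is common practice for SVR's C, σ, and ε. The bounds below are illustrative assumptions for this sketch, not the study's exact settings:

```python
import math
import random

# Illustrative log-uniform search ranges for the three SVR hyperparameters.
# These bounds are assumptions; the study specifies its own ranges (see ref. 13).
RANGES = {"C": (2.0**-5, 2.0**10), "sigma": (2.0**-8, 2.0**3), "epsilon": (2.0**-10, 1.0)}

def sample_hyperparameters(rng):
    """Draw one (C, sigma, epsilon) candidate, log-uniformly within RANGES."""
    return {name: math.exp(rng.uniform(math.log(low), math.log(high)))
            for name, (low, high) in RANGES.items()}

rng = random.Random(42)
candidates = [sample_hyperparameters(rng) for _ in range(50)]
# A metaheuristic such as FGOA scores each candidate by validation MAPE/RMSE
# and iteratively concentrates the search around the best-performing region.
```

Log-uniform sampling spreads candidates evenly across orders of magnitude, which matters because C and σ typically span several decades.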

Comparison of methods for predicting the number of dementia cases

This study evaluated a range of time series and artificial intelligence models, including traditional approaches such as Autoregressive Integrated Moving Average (ARIMA) and Holt-Winters Exponential Smoothing (HWETS), as well as optimized Support Vector Regression (SVR) algorithms enhanced by Particle Swarm Optimization (PSO), Differential Evolution (DE), Whale Optimization Algorithm (WOA), and Harris Hawks Optimization (HHO). Additionally, a Long Short-Term Memory (LSTM) network and the newly proposed Flying Geese Optimization Algorithm (FGOA) integrated with SVR were analyzed. Model performance in terms of predictive accuracy and stability was assessed using Mean Absolute Percentage Error (MAPE) and Root Mean Squared Error (RMSE).

As shown in Table 3, the forecasting results for male dementia patients indicate that the FGOASVR model substantially outperformed all other models, achieving an average MAPE of 3.17 and RMSE of 0.69. In contrast, traditional statistical models such as ARIMA and HWETS produced higher errors, with MAPEs of 6.90 and 5.24 and RMSEs of 1.13 and 1.02, respectively. A similar pattern was observed for female dementia patients in Table 4, where the FGOASVR model again demonstrated superior and consistent performance, achieving an average MAPE of 3.42 and RMSE of 0.96. In comparison, models such as DESVR and HHOSVR exhibited substantial performance variability, particularly in the 70–74 age group, where their MAPEs reached 37.07 and 13.29, respectively, revealing their limited ability to manage data irregularities across specific age segments.

Table 3.

Prediction of Male Dementia Case Numbers Employing Different Approaches.

Age group Criteria ARIMA HWETS SVR PSOSVR DESVR WOASVR HHOSVR LSTM FGOASVR
60 ~ 64 MAPE (%) 11.67 4.61 10.00 9.52 8.45 8.45 11.12 17.99 3.37
RMSE 0.31 0.09 0.21 0.20 0.17 0.17 0.23 0.08 0.07
65 ~ 69 MAPE (%) 11.10 7.86 5.59 37.36 37.27 4.54 4.64 15.57 3.95
RMSE 1.12 0.90 0.74 3.85 4.09 0.65 0.67 1.84 0.51
70 ~ 74 MAPE (%) 1.93 6.48 5.02 31.71 29.01 36.14 36.14 2.09 1.49
RMSE 0.29 1.00 0.83 5.52 5.48 7.53 7.53 0.29 0.22
75 ~ 79 MAPE (%) 5.91 4.76 9.94 3.40 12.86 14.85 13.48 6.90 3.35
RMSE 1.10 0.85 1.72 0.67 2.56 2.87 2.66 1.46 0.67
80 ~ 84 MAPE (%) 2.77 3.15 7.66 3.89 9.55 11.33 4.68 6.03 2.73
RMSE 0.65 0.69 1.89 0.90 2.41 2.97 1.08 1.31 0.61
85 and above MAPE (%) 8.01 4.57 11.85 7.63 10.65 10.65 10.66 9.79 4.10
RMSE 3.31 2.58 4.06 3.81 3.66 3.66 3.67 3.33 2.03
Average MAPE (%) 6.90 5.24 8.34 15.59 17.97 14.33 13.45 9.73 3.17
RMSE 1.13 1.02 1.58 2.49 3.06 2.98 2.64 1.39 0.69

MAPE, mean absolute percentage error; RMSE, root mean square error; boldface, the optimal values in each row. ARIMA, autoregressive integrated moving average; HWETS, Holt-Winters exponential smoothing; SVR, support vector regression; PSOSVR, particle swarm optimization SVR; DESVR, differential evolution SVR; WOASVR, whale optimization algorithm SVR; HHOSVR, Harris Hawks optimization SVR; LSTM, long short-term memory; FGOASVR, Flying Geese Optimization Algorithm SVR.

Table 4.

Prediction of Female Dementia Case Numbers Employing Different Approaches.

Age group Criteria ARIMA HWETS SVR PSOSVR DESVR WOASVR HHOSVR LSTM FGOASVR
60 ~ 64 MAPE (%) 12.35 5.90 14.13 8.76 9.87 9.86 9.84 11.32 5.30
RMSE 0.24 0.11 0.26 0.16 0.19 0.19 0.19 0.22 0.11
65 ~ 69 MAPE (%) 7.52 5.90 10.74 19.94 6.24 17.12 17.67 18.88 3.44
RMSE 0.94 0.84 1.45 2.45 0.81 2.09 2.19 2.15 0.46
70 ~ 74 MAPE (%) 3.41 6.68 12.73 5.51 37.07 26.38 13.29 5.47 2.94
RMSE 0.72 1.30 3.39 1.11 11.77 7.44 2.76 1.08 0.62
75 ~ 79 MAPE (%) 4.75 4.40 8.92 6.18 3.74 4.52 4.53 13.13 2.85
RMSE 1.64 1.27 2.64 2.17 1.43 1.68 1.69 3.98 0.97
80 ~ 84 MAPE (%) 2.41 6.13 9.48 3.62 16.81 14.61 14.86 3.55 1.94
RMSE 1.08 2.20 3.65 1.40 6.96 6.04 6.14 1.45 0.73
85 and above MAPE (%) 7.36 8.69 9.25 7.01 5.06 5.25 4.46 4.15 4.04
RMSE 4.79 5.01 5.91 4.04 3.42 3.49 3.06 3.04 2.84
Average MAPE (%) 6.30 6.28 10.88 8.50 13.13 12.96 10.78 9.42 3.42
RMSE 1.57 1.79 2.88 1.89 4.10 3.49 2.67 1.99 0.96

MAPE, mean absolute percentage error; RMSE, root mean square error; boldface, the optimal values in each row. ARIMA, autoregressive integrated moving average; HWETS, Holt-Winters exponential smoothing; SVR, support vector regression; PSOSVR, particle swarm optimization SVR; DESVR, differential evolution SVR; WOASVR, whale optimization algorithm SVR; HHOSVR, Harris Hawks optimization SVR; LSTM, long short-term memory; FGOASVR, Flying Geese Optimization Algorithm SVR.

As illustrated in the subplots of Fig. 2, the accuracy and stability of the various models in predicting the number of dementia patients aged 60–64 in Taiwan can be visually compared. Among these, the FGOASVR model demonstrates superior performance in terms of the R2 metric, indicating a strong correlation between predicted and actual values. A slope close to 1 further suggests that the model maintains consistent trends across low, medium, and high value ranges. In contrast, although the ARIMA model benefits from autoregressive properties, its slope is substantially greater than 1 and its R2 value (0.4221) is relatively low, indicating significant overestimation at lower values and poor tracking of upward trends at higher values. The HWETS model (R2 = 0.5827) performs moderately well but remains less responsive to low-frequency fluctuations than FGOASVR, as indicated by its slightly smaller slope and intercept. Other SVR-based models employing evolutionary or swarm intelligence optimization, such as PSOSVR, DESVR, WOASVR, and HHOSVR, generally yield R2 values between 0.01 and 0.3, with slopes either negative or well below 1. These results imply that such methods often become trapped in local minima or fail to maintain a proper balance between exploration and exploitation during hyperparameter optimization, leading to predictions that deviate notably from the actual trend.
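The slope, intercept, and R2 diagnostics discussed above come from an ordinary least-squares fit of predicted against actual values. The snippet below is a self-contained illustration with hypothetical numbers, not the Fig. 2 data:

```python
def regression_diagnostics(actual, predicted):
    """Least-squares fit of predicted = slope * actual + intercept,
    plus the R^2 of that fit, as used in scatter-plot diagnostics."""
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in actual)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - slope * x - intercept) ** 2 for x, y in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in predicted)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2

# Hypothetical values: a forecaster that inflates every observation by 10%
# shows slope > 1 with high R^2 -- systematic overestimation, not noise.
slope, intercept, r2 = regression_diagnostics([1.0, 2.0, 3.0], [1.1, 2.2, 3.3])
```

This separation is the point of the diagnostic: R2 measures how tightly points cluster around the fitted line, while slope and intercept reveal systematic bias even when R2 is high.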

Fig. 2.

Fig. 2

Scatter charts of dementia forecasts (one-step-ahead annual forecasts) from 2016 to 2023, using different methods: (a) ARIMA, autoregressive integrated moving average; (b) HWETS, Holt-Winters exponential smoothing; (c) SVR, support vector regression; (d) PSOSVR, particle swarm optimization SVR; (e) DESVR, differential evolution SVR; (f) WOASVR, whale optimization algorithm SVR; (g) HHOSVR, Harris Hawks optimization SVR; (h) LSTM, long short-term memory; (i) FGOASVR, Flying Geese Optimization Algorithm SVR.

Analysis of age group data

This study examined dementia prevalence trends in Taiwan from 1998 to 2023 across six older age cohorts (60–64, 65–69, 70–74, 75–79, 80–84, and 85+). Figure 3 illustrates the projected growth in dementia cases by age group for (a) males and (b) females, while Figs. 4 and 5 compare forecasts generated by six prediction models for both genders. Across all age groups, female patients consistently outnumbered males, with noticeable accelerations occurring at different times: after 2005 for ages 60–64 and 65–69, after 2015 for ages 70–74, around 2021 for ages 75–79, from 2016 for ages 80–84, and after 2016 for those aged 85 and above. Using data from 2016 to 2023 as the test set, the proposed FGOASVR model consistently produced forecasts that closely matched observed values, accurately capturing growth accelerations, peaks, and declines. In contrast, ARIMA, HWETS, and other SVR-based variants often exhibited bias or delayed responses during periods of rapid change. Overall, FGOASVR achieved the highest accuracy and robustness for both short- and long-term forecasting across all age groups, with slightly superior performance for females. Detailed results for each age group are summarized below.

Fig. 3.

Fig. 3

Projected increase in the number of dementia patients in Taiwan, by age group (1998–2023): (a) Male; (b) Female.

Fig. 4.

Fig. 4

Six methods are employed to predict the outcomes of dementia in males.

Fig. 5.

Fig. 5

Six methods are employed to predict the outcomes of dementia in females.

For individuals aged 60–64, dementia prevalence in males increased from 1998, declined briefly after 2000, and then rose steadily after 2005, while females showed an early increase, a dip around 2001, followed by a sharper and more sustained rise. These patterns indicate the onset of increasing prevalence in this younger elderly group, with higher rates among females. Forecast comparisons reveal that FGOASVR most effectively tracked actual trends, whereas ARIMA and HWETS struggled during growth and decline phases, and other optimized SVR models showed moderate improvements but lower overall accuracy.

In the 65–69 age group, male cases declined slightly until 2000, increased steadily after 2005, and surged after 2014, while female cases fell modestly early on before rising rapidly after 2015. FGOASVR accurately captured sharp growth, temporary declines, and subsequent rebounds, outperforming traditional statistical models and other SVR variants, which either misestimated turning points or lagged during rapid transitions.

For ages 70–74, male prevalence declined slightly in the early years before rising steadily and accelerating after 2019, while female prevalence rebounded quickly after 2001 and surpassed male growth rates after 2005. FGOASVR closely followed observed trends and accurately identified key accelerations, whereas ARIMA and HWETS showed systematic bias and other SVR-based models lagged during rapid changes.

Among individuals aged 75–79, both males and females exhibited overall upward trends with short-term fluctuations. FGOASVR achieved near-perfect alignment with actual values, accurately capturing rapid growth, peak periods, and subsequent declines, while ARIMA and HWETS overestimated during growth phases and underestimated downturns. Other optimized SVR models reduced errors but still lagged during sharp fluctuations.

For the 80–84 age group, dementia cases increased steadily over time, with a brief decline among males and accelerated growth among females after 2016. FGOASVR provided highly accurate forecasts and successfully captured the post-2016 acceleration, outperforming ARIMA, HWETS, and other SVR variants that failed to identify key turning points or lagged during rapid changes.

Finally, in the 85 + cohort, dementia prevalence rose steadily until 2014 and then increased sharply after 2016, with females exhibiting faster growth than males. Forecast comparisons show that FGOASVR most accurately captured this pronounced acceleration, while ARIMA and HWETS performed poorly under highly nonlinear conditions and other optimized SVR models showed delayed responses. Overall, FGOASVR proved to be the most reliable and robust model for forecasting long-term dementia trends across all older age groups. The complete version is provided in the supplementary result (Supplementary Result 1).

Discussion

In this study, we analyzed annual data on dementia patients from 1998 to 2023, extracted from the National Health Insurance Database provided by Taiwan’s Ministry of Health and Welfare. To assess the effectiveness of the proposed model, its forecasting performance was compared with that of statistical models (ARIMA, HWETS), a deep learning model (LSTM), and several hybrid models (SVR, PSOSVR, DESVR, WOASVR, HHOSVR, FGOASVR). Based on evaluation metrics such as MAPE and RMSE, the results indicate that the FGOASVR model achieves superior predictive accuracy compared to the other models. The following section presents a detailed discussion of the statistical, deep learning, and hybrid modeling approaches.

Statistical models: comparison of ARIMA and HWETS

The statistical models selected for comparison in this study are ARIMA and HWETS. Both ARIMA and exponential smoothing are classic approaches for time series forecasting20; however, they exhibit notable limitations when applied to complex, nonlinear, or structurally variable data. The ARIMA model’s primary drawback lies in its strong reliance on the assumption of stationarity. When forecasting the number of dementia patients, the data often display unstable, abrupt, or biphasic fluctuations, making it difficult to achieve stationarity through simple differencing. The HWETS model, which applies weighted moving averages to smooth data, performs well for series with clear trends and seasonal patterns. Nevertheless, its prediction mechanism is based on linear extrapolation, which prevents it from effectively capturing nonlinear variations and intricate interaction structures within the data. As shown in Tables 3 and 4, both ARIMA and HWETS operate within linear frameworks and therefore struggle to model nonlinear relationships and long-term dependencies in time series data. In the case of dementia prevalence, the data are influenced by multiple interacting factors, such as policy shifts, population aging rates, and changes in healthcare-seeking behavior. These dynamic and nonlinear interactions exceed the explanatory and predictive capacities of traditional linear models such as ARIMA and HWETS.
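The linear-extrapolation limitation can be made concrete. The sketch below implements first differencing (ARIMA's basic device for inducing stationarity) and Holt's linear-trend recursion (the non-seasonal core of HWETS) in plain Python; it illustrates the mechanisms, not the study's fitted models:

```python
def difference(series):
    """First difference, ARIMA's basic tool for removing a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]

def holt_forecast(series, alpha=0.5, beta=0.5, horizon=1):
    """Holt's linear-trend smoothing (HWETS without the seasonal component).

    Level and trend are updated recursively; the forecast extrapolates
    linearly, which is exactly why sharp nonlinear turns are missed."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend

# On an exactly linear series the recursion is exact:
print(holt_forecast([1.0, 2.0, 3.0, 4.0, 5.0]))  # prints 6.0
```

On the dementia series, an abrupt post-2016 acceleration breaks this linearity assumption: the recursion keeps projecting the pre-break trend and lags the turn, matching the delayed responses observed for HWETS in Figs. 4 and 5.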

Deep learning models: comparison of ARIMA, HWETS and LSTM

The Long Short-Term Memory (LSTM) network incorporates specialized gating mechanisms that effectively mitigate the vanishing gradient problem often encountered in deep neural networks. Its memory cell architecture enables enhanced long-term dependency modeling and improved predictive capability. In this study, dementia population forecasts were stratified by age group to evaluate model adaptability. As shown in Tables 3 and 4, each model’s ability to represent temporal and nonlinear dynamics was assessed. Although LSTM theoretically surpasses traditional models in capturing sequential dependencies, the empirical findings reveal mixed results: its performance was not consistently superior across all age groups and displayed notable variability. For the 85-and-above group, however, LSTM demonstrated more stable predictive accuracy, achieving a MAPE of 4.15, lower than ARIMA (7.36), HWETS (8.69), and SVR (9.25). This suggests that LSTM may better capture complex nonlinear fluctuations and long-term dependencies in dementia prevalence among older populations. In scenarios characterized by high data variability or heterogeneity, deep learning architectures can provide a distinct representational advantage. Overall, LSTM exhibited selective superiority, performing best in the oldest age cohort, where disease progression patterns are more nonlinear and temporally dependent. Conversely, in younger and middle-aged groups, its predictive performance was less robust, showing higher MAPE values and greater fluctuation, likely due to sensitivity to sample size, model architecture, and hyperparameter configuration.

Hybrid models: comparison of SVR, PSO, DE, WOA, HHO and FGOA

In this study, Support Vector Regression (SVR) was adopted as the baseline model for time series prediction. However, the experimental results revealed that without careful tuning of its key hyperparameters, namely the penalty parameter C, the kernel scale σ, and the ε-insensitive loss margin, the model tends to produce higher prediction errors than traditional statistical approaches. This finding underscores the crucial role of hyperparameter optimization in enhancing SVR performance. Accordingly, this study applied metaheuristic optimization strategies to identify the optimal hyperparameter combinations necessary for improving predictive accuracy20.

The predictive performance of the SVR model integrated with the Flying Geese Optimization Algorithm (FGOA) was assessed and compared with SVR models optimized using Particle Swarm Optimization (PSO), Differential Evolution (DE), the Whale Optimization Algorithm (WOA), and Harris Hawks Optimization (HHO). Based on the comprehensive evaluation results summarized in Tables 3 and 4, the FGOASVR model consistently achieved superior overall performance relative to the other four hybrid SVR-based optimization methods. Specifically, as shown in Table 3, FGOASVR delivered the best predictive accuracy for male subjects, achieving an average MAPE of 3.17 and an RMSE of 0.69, significantly outperforming the next best model, HHOSVR, which recorded a MAPE of 13.45 and an RMSE of 2.64. Similarly, in Table 4, FGOASVR achieved the most accurate predictions for female subjects, with an average MAPE of 3.42 and an RMSE of 0.96, outperforming the second-best model, PSOSVR, which obtained a MAPE of 8.50 and an RMSE of 1.89.

The coefficient of determination, R2, indicates that the PSO, DE, WOA, and HHO algorithms performed poorly (see Fig. 2). This suggests that the input–output mapping they learned did not effectively capture the main explanatory variation in the data, leading to significant bias or directional shifts. Regression diagnostics also reveal notable deviation of the slope from 1 and a large intercept, resulting in systematic underestimation or overestimation across the overall scale. From an algorithmic perspective, the choice of hyperparameters plays a crucial role in balancing exploration and exploitation. When the objective function is minimized solely on the training set, without regularization or generalization constraints, premature convergence can occur; the resulting combination of global structure mis-fitting and local overfitting ultimately manifests as a low R2 during the validation and testing stages. In terms of overall error reduction, precision enhancement, and prediction stability, FGOASVR consistently demonstrated a pronounced and reliable advantage. FGOA leverages the collective intelligence observed in flying goose formations, with the leader goose guiding the direction and the trailing geese providing support, enabling it to explore the parameter space more effectively and avoid becoming trapped in local optima46. Consequently, FGOA exhibits robust global search capability.
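The leader–follower dynamic described above can be caricatured in a few lines: the current best position plays the leader goose, each follower tries a position near it, and a follower that finds a better position takes over the lead. This is an illustrative toy on a simple test function, assuming a drastically simplified update rule rather than the authors' full FGOA (which adds formation scheduling, updraft collaboration, and fatigue detection with reset):

```python
import random

def sphere(x):
    """Toy objective standing in for validation error over SVR hyperparameters."""
    return sum(v * v for v in x)

def fgoa_like_search(dim=3, flock=20, iters=100, seed=1):
    """Caricature of a leader-guided flock search (not the authors' FGOA).

    Each iteration, every goose in the flock probes a position near the
    leader; any goose that improves on the leader becomes the new leader."""
    rng = random.Random(seed)
    leader = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
    for _ in range(iters):
        for _ in range(flock):
            follower = [p + rng.gauss(0.0, 0.3) for p in leader]
            if sphere(follower) < sphere(leader):
                leader = follower  # the successful follower takes the lead
    return leader

best = fgoa_like_search()
# the flock converges close to the toy objective's optimum at the origin
```

Even this caricature shows why leader guidance helps: the greedy handover keeps the flock's best-so-far monotonically improving while the Gaussian probes preserve exploration around it.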

Contribution of this paper

This study conducted a comprehensive analysis of dementia patient data from 1998 to 2023 and evaluated the predictive performance of nine forecasting models. The key academic contributions are summarized as follows: First, by integrating time series statistical methods and signal decomposition techniques, the study effectively captured the long-term dependency patterns in patient numbers, providing a robust foundation for model development. Second, the study introduced a prediction framework based on the Flying Geese Optimization Algorithm (FGOA), which leverages biologically inspired mechanisms, such as leader–flanker formation scheduling, updraft collaboration, and fatigue detection with reset, to enable global search and adaptive parameter tuning within high-dimensional feature spaces. Empirical findings confirm that the FGOA-based model outperforms traditional statistical benchmarks, accurately capturing both long-term trends and extreme fluctuations in dementia prevalence, thereby demonstrating its strong potential for application in public health forecasting and policy planning. Our study demonstrates methodological improvements, and the findings may have potential applications in public health and policy, such as guiding healthcare planning or risk assessment; however, these implications require further validation before being considered established outcomes.

Limitations

The findings of this study are based on secondary data from the Taiwan National Health Insurance Database covering the years 1998 to 2023. The primary outcome variable was the annual number of dementia patients aged 60 years or older, stratified by gender and age group. While health insurance data offer the benefits of nationwide coverage under a single-payer system, effectively reflecting the population-level medical treatment landscape in Taiwan, there are significant differences in population structure, diagnostic criteria, coding practices, medical behaviors, and accessibility among countries. These variations may limit the external validity and cross-border applicability of the results. For medical systems that differ substantially from Taiwan’s, the relevance of these findings should be evaluated and confirmed through empirical evidence. Additionally, the data used in this study are aggregated annually rather than available as a higher-frequency time series. The absence of finer time resolution, such as quarterly or monthly health insurance data, makes it challenging to identify short-term fluctuations or seasonal patterns, restricting detailed monitoring and precise prediction. Finally, confidence intervals for the prediction error metrics (MAPE and RMSE) were not reported, which may constrain the interpretation of model uncertainty.

Conclusions

This study introduces a novel swarm-based optimization algorithm, the Flying Geese Optimization Algorithm (FGOA), inspired by the core principles of the Flying Geese Theory. Building on this, an optimized machine learning model FGOA-Support Vector Regression (FGOASVR) is proposed to effectively predict trends in dementia prevalence. The FGOASVR model demonstrated superior predictive performance compared to ARIMA, HWETS, SVR, PSOSVR, DESVR, WOASVR, and HHOSVR, achieving average MAPE values of 3.17 and 3.42, and RMSE values of 0.69 and 0.96 for males and females, respectively. As populations in Taiwan and worldwide continue to age, dementia prevalence is projected to increase substantially, intensifying the strain on healthcare and social support systems. This underscores the urgent need for improved prevention, early diagnosis, treatment, and long-term care. The high predictive accuracy of the FGOASVR model offers valuable insights for health and social care planning, promoting public awareness, reducing stigma, and supporting the development of dementia-friendly communities. The FGOA also exhibited strong competitiveness among metaheuristic algorithms and outperformed other methods proposed in this study. Future research may extend its application to broader areas of public health planning, prevention, risk management, and other optimization-related domains.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (40.1KB, docx)

Author contributions

C.-H. Y.: Writing – review & editing, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation. P.-H. C. and C.-S. Y.: Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation. T.-J. H.: Writing – review & editing, Writing – original draft, Resources, Investigation, Formal analysis, Data curation, Validation. S. Y.: Writing – review & editing, Writing – original draft, Resources, Investigation, Formal analysis, Data curation, Validation.

Funding

This work was supported by the National Science and Technology Council, Taiwan (under Grant no. 111-2221-E-165-001-MY3).

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Declarations

Competing interests

The authors declare no competing interests.

Consent for Publication

All authors have approved the manuscript and agree with its submission to Scientific Reports.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Po-Hung Chen, Cheng-San Yang, Ting-Jen Hseuh and Stephanie Yang contributed equally to this work.

Contributor Information

Cheng-Hong Yang, Email: chyang@nkust.edu.tw.

Stephanie Yang, Email: yangsf@gs.ncku.edu.tw.

References

  • 1.Livingston, G. et al. Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. The Lancet 396(10248), 413–446. 10.1016/S0140-6736(20)30367-6 (2020).
  • 2.Patterson, C. World Alzheimer Report 2018 (2018).
  • 3.Cerejeira, J., Lagarto, L. & Mukaetova-Ladinska, E. B. Behavioral and psychological symptoms of dementia. Front. Neurol. 3, 73. 10.3389/fneur.2012.00073 (2012).
  • 4.Allen, A. P. et al. A systematic review of the psychobiological burden of informal caregiving for patients with dementia: Focus on cognitive and biological markers of chronic stress. Neurosci. Biobehav. Rev. 73, 123–164. 10.1016/j.neubiorev.2016.12.006 (2017).
  • 5.Wang, F. et al. Global, regional, and national burden of Alzheimer’s disease and other dementias (ADODs) and their risk factors, 1990–2021: A systematic analysis for the Global Burden of Disease Study 2021. Alzheimer’s Dement. Diagn. Assess. Dis. Monit. 17(2), e70126. 10.1002/dad2.70126 (2025).
  • 6.Buckner, S. et al. Dementia friendly communities in England: A scoping study. Int. J. Geriatr. Psychiatry 34(8), 1235–1243. 10.1002/gps.5123 (2019).
  • 7.Tsutsui, T. Implementation process and challenges for the community-based integrated care system in Japan. Int. J. Integr. Care 14, e002. 10.5334/ijic.988 (2014).
  • 8.Hurd, M. D., Martorell, P., Delavande, A., Mullen, K. J. & Langa, K. M. Monetary costs of dementia in the United States. N. Engl. J. Med. 368(14), 1326–1334. 10.1056/nejmsa1204629 (2013).
  • 9.Kim, Y.-E. et al. Trends and patterns of burden of disease and injuries in Korea using disability-adjusted life years. J. Korean Med. Sci. 34(Suppl 1), e75. 10.3346/jkms.2019.34.e75 (2019).
  • 10.Hyndman, R. J., Koehler, A. B., Snyder, R. D. & Grose, S. A state space framework for automatic forecasting using exponential smoothing methods. Int. J. Forecast. 18(3), 439–454. 10.1016/S0169-2070(01)00110-8 (2002).
  • 11.Zhang, M., Li, X. & Wang, L. An adaptive outlier detection and processing approach towards time series sensor data. IEEE Access 7, 175192–175212. 10.1109/ACCESS.2019.2957602 (2019).
  • 12.Koehler, A. B., Snyder, R. D. & Ord, J. K. Forecasting models and prediction intervals for the multiplicative Holt-Winters method. Int. J. Forecast. 17(2), 269–286. 10.1016/S0169-2070(01)00081-4 (2001).
  • 13.Yang, C.-H., Chen, B.-H., Wu, C.-H., Chen, K.-C. & Chuang, L.-Y. Deep learning for forecasting electricity demand in Taiwan. Mathematics 10(14), 2547. 10.3390/math10142547 (2022).
  • 14.Yang, C.-H., Chen, P.-H., Wu, C.-H., Yang, C.-S. & Chuang, L.-Y. Deep learning-based air pollution analysis on carbon monoxide in Taiwan. Ecol. Inform. 80, 102477. 10.1016/j.ecoinf.2024.102477 (2024).
  • 15.Yang, C.-H., Chen, P.-H., Yang, C.-S. & Chuang, L.-Y. Analysis and forecasting of air pollution on nitrogen dioxide and sulfur dioxide using deep learning. IEEE Access. 10.1109/ACCESS.2024.3494263 (2024).
  • 16.Ashfaq, A., Sant'Anna, A., Lingman, M. & Nowaczyk, S. Readmission prediction using deep learning on electronic health records. J. Biomed. Inform. 97, 103256. 10.1016/j.jbi.2019.103256 (2019).
  • 17.Wang, L. et al. Development and validation of a deep learning algorithm for mortality prediction in selecting patients with dementia for earlier palliative care interventions. JAMA Netw. Open. 10.1001/jamanetworkopen.2019.6972 (2019).
  • 18.Fan, W. et al. Support vector regression model for flight demand forecasting. Int. J. Eng. Bus. Manag. 15, 18479790231174320. 10.1177/18479790231174318 (2023).
  • 19.Du, X., Xu, H. & Zhu, F. Understanding the effect of hyperparameter optimization on machine learning models for structure design problems. Comput. Aided Des. 135, 103013. 10.1016/j.cad.2021.103013 (2021).
  • 20.Yang, S., Chen, H.-C., Wu, C.-H., Wu, M.-N. & Yang, C.-H. Forecasting of the prevalence of dementia using the LSTM neural network in Taiwan. Mathematics 9(5), 488. 10.3390/math9050488 (2021).
  • 21.Hamdi, T. et al. Accurate prediction of continuous blood glucose based on support vector regression and differential evolution algorithm. Biocybern. Biomed. Eng. 38(2), 362–372. 10.1016/j.bbe.2018.02.005 (2018).
  • 22.Juang, W.-C., Huang, S.-J., Huang, F.-D., Cheng, P.-W. & Wann, S.-R. Application of time series analysis in modelling and forecasting emergency department visits in a medical centre in Southern Taiwan. BMJ Open 7(11), e018628. 10.1136/bmjopen-2017-018628 (2017).
  • 23.Xian, X. et al. Comparison of SARIMA model, Holt-Winters model and ETS model in predicting the incidence of foodborne disease. BMC Infect. Dis. 23(1), 803. 10.1186/s12879-023-08799-4 (2023).
  • 24.Parbat, D. & Chakraborty, M. A Python based support vector regression model for prediction of COVID19 cases in India. Chaos Solitons Fractals 138, 109942. 10.1016/j.chaos.2020.109942 (2020).
  • 25.Malik, A., Tikhamarine, Y., Souag-Gamane, D., Kisi, O. & Pham, Q. B. Support vector regression optimized by meta-heuristic algorithms for daily streamflow prediction. Stoch. Environ. Res. Risk Assess. 34, 1755–1773. 10.1007/s00477-020-01874-1 (2020).
  • 26.Box, G. E. P. & Jenkins, G. M. Time Series Analysis: Forecasting and Control (San Francisco, 1976). 10.1057/9781137291264_6.
  • 27.Zhou, C. et al. A new model transfer strategy among spectrometers based on SVR parameter calibrating. IEEE Trans. Instrum. Meas. 70, 1–13. 10.1109/TIM.2021.3119129 (2021).
  • 28.Chaudhuri, S. & Dutta, D. Mann-Kendall trend of pollutants, temperature and humidity over an urban station of India with forecast verification using different ARIMA models. Environ. Monit. Assess. 186(8), 4719–4742. 10.1007/s10661-014-3733-6 (2014).
  • 29.Winters, P. R. Forecasting sales by exponentially weighted moving averages. Manage. Sci. 6(3), 324–342. 10.1007/978-3-642-51565-1_116 (1960).
  • 30.Nolan, D. & Speed, T. Teaching statistics theory through applications. Am. Stat. 53(4), 370–375. 10.1080/00031305.1999.10474492 (1999).
  • 31.Vapnik, V., Golowich, S. & Smola, A. Support vector method for function approximation, regression estimation and signal processing. Adv. Neural Inf. Process. Syst. 9 (1996).
  • 32.Nieto, P. G., Combarro, E. F., del Coz, D. J. & Montañés, E. A SVM-based regression model to study the air quality at local scale in Oviedo urban area (Northern Spain): A case study. Appl. Math. Comput. 219(17), 8923–8937. 10.1016/j.amc.2013.03.018 (2013).
  • 33.Castelli, M., Clemente, F. M., Popovič, A., Silva, S. & Vanneschi, L. A machine learning approach to predict air quality in California. Complexity 2020(1), 8049504. 10.1155/2020/8049504 (2020).
  • 34.Yang, C.-H., Lee, C.-F. & Chang, P.-Y. Export- and import-based economic models for predicting global trade using deep learning. Expert Syst. Appl. 218, 119590. 10.1016/j.eswa.2023.119590 (2023).
  • 35.Kennedy, J. & Eberhart, R. Particle swarm optimization. In Proceedings of ICNN'95 - International Conference on Neural Networks 1942–1948 (IEEE, 1995).
  • 36.Resende, L. & Takahashi, R. H. Contributions to dynamic analysis of differential evolution algorithms. Evol. Comput. 31(3), 201–232. 10.1162/evco_a_00318 (2023).
  • 37.Mirjalili, S. & Lewis, A. The whale optimization algorithm. Adv. Eng. Softw.95, 51–67. 10.1016/j.advengsoft.2016.01.008 (2016). [Google Scholar]
  • 38.Heidari, A. A. et al. Harris hawks optimization: Algorithm and applications. Futur. Gener. Comput. Syst.97, 849–872. 10.1016/j.future.2019.02.028 (2019). [Google Scholar]
  • 39.Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput.9(8), 1735–1780. 10.1162/neco.1997.9.8.1735 (1997). [DOI] [PubMed] [Google Scholar]
  • 40.Krichen, M. & Mihoub, A. Long short-term memory networks: A comprehensive survey. AI.6(9), 215. 10.3390/ai6090215 (2025). [Google Scholar]
  • 41.Kojima, K. The, “flying geese” model of Asian economic development: origin, theoretical extensions, and regional policy implications. J. Asian Econ.11(4), 375–401. 10.1016/S1049-0078(00)00067-1 (2000). [Google Scholar]
  • 42.Hamad, R. K. & Rashid, T. A. GOOSE algorithm: a powerful optimization tool for real-world engineering challenges and beyond. Evol. Syst.15(4), 1249–1274. 10.1007/s12530-023-09553-6 (2024). [Google Scholar]
  • 43.El-Kenawy, E.-S.M. et al. Greylag goose optimization: nature-inspired optimization algorithm. Expert Syst. Appl.10.1016/j.eswa.2023.122147 (2024). [Google Scholar]
  • 44.Akamatsu, K. A historical pattern of economic growth in developing countries. Dev. Econ.1, 3–25. 10.1111/j.1746-1049.1962.tb01020.x (1962). [Google Scholar]
  • 45.Liu, X., Lin, Z. & Feng, Z. Short-term offshore wind speed forecast by seasonal ARIMA-A comparison against GRU and LSTM. Energy227, 120492. 10.1016/j.energy.2021.120492 (2021). [Google Scholar]
  • 46.Bian, H. et al. Improved snow geese algorithm for engineering applications and clustering optimization. Sci. Rep.15(1), 4506. 10.1038/s41598-025-88080-7 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data


Supplementary Materials

Supplementary Material 1 (40.1KB, docx)

Data Availability Statement

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group