Development of blood demand prediction model using artificial intelligence based on national public big data

Hi Jeong Kwon; Sholhui Park; Young Hoon Park; Seung Min Baik; Dong Jin Park

doi:10.1177/20552076231224245

2024 Jan 17;10:20552076231224245. doi: 10.1177/20552076231224245

PMCID: PMC10798124 PMID: 38250146

Abstract

Objective

Modern healthcare systems face challenges in maintaining a stable and sufficient blood supply because of shortages. This study aimed to predict the monthly blood transfusion requirements in medical institutions using an artificial intelligence model based on national open big data related to transfusion.

Methods

Data regarding blood types and components in Korea from January 2010 to December 2021 were obtained from the Health Insurance Review and Assessment Service and Statistics Korea. Transfusion data were collected from a single medical institution. Using the obtained information, predictive models were developed, including eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and category boosting (CatBoost). An ensemble model was created using these three models.

Results

The prediction performance of XGBoost, LGBM, and CatBoost demonstrated a mean absolute error ranging from 14.6657 for AB+ red blood cells (RBCs) to 84.0433 for A+ platelet concentrate (PC) and a root mean squared error ranging from 18.5374 for AB+ RBCs to 118.6245 for B+ PC. The error range was further improved by creating ensemble models, wherein the department requesting blood was the most influential parameter affecting transfusion prediction performance for different blood products and types. Except for the department, the features that affected the prediction performance varied for each product and blood type, including the number of RBC antibody screens, crossmatch, nationwide blood donations, and surgeries.

Conclusion

Based on blood-related open big data, the developed blood-demand prediction algorithm can efficiently provide medical facilities with an appropriate volume of blood ahead of time.

Keywords: Transfusion, big data, prediction model, artificial intelligence, boosting model

Introduction

Ensuring a stable and sufficient blood supply is a critical concern within the healthcare system. However, this presents challenges due to the reliance on blood donations, which have a limited shelf life ranging from just a few days to a few weeks. Furthermore, as the population ages and chronic diseases become more prevalent, the demand for blood continues to rise while blood donations decrease.¹ In particular, the coronavirus disease 2019 (COVID-19) pandemic has further exacerbated the blood supply situation.² Statistics from the American Red Cross reveal that overall blood donations dropped by 10% since March 2020, when the COVID-19 pandemic was declared, compared to 2019.^3,4 Similarly, in Korea, blood donations for 2020 fell by 6.4% compared to 2019, leading to a 5.9% reduction in available blood for transfusions.⁵ Notably, the decline in blood supply in Korea predates the COVID-19 pandemic. In 2019, before the pandemic spread, total blood donations dropped by 3.2% compared to 2018, with the total blood supply decreasing by 0.1%.⁶ Even in the most recent data for 2021, which show that supplies increased by 2.6% and donations declined by 0.3% compared to 2020, there remained a shortage of blood.⁶

Since each allogeneic transfusion might result in negative side effects, a transfusion should be conducted only after weighing its benefits and risks.⁷ Under the patient blood management guidelines adopted in many countries, transfusions are guided by evidence-based medical and surgical concepts rather than by simply decreasing the number of transfusions.^8,9 These adjustments are intended to deal with the decreasing blood supply while enhancing therapeutic results for the patient. Insufficient blood supply can be addressed by promoting blood donation, but only approximately 3% of the entire population participates in blood donation.⁴ As a result, improving the accuracy of predicting the amount of blood used in medical institutions through data analysis can contribute to more effective blood supply management, complementing existing procedures for demand forecasting.

In recent years, the use of artificial intelligence (AI) in the medical field has grown markedly, and medical AI research is being actively pursued. There are some reports of AI studies related to blood transfusion prediction, but most of them focus on certain diseases or patient populations.^10–15 While numerous studies have been conducted in recent years to estimate the blood requirements of medical facilities, there are still regions and specific institutions where predictive accuracy remains challenging owing to various factors, emphasizing the need for continued research in this area. Globally, blood shortages have been exacerbated by a significant drop in donations. While it is crucial to address this decline, our study focused on leveraging AI to predict blood transfusion demand and assist institutions with efficient resource allocation.

The National Health Insurance System manages all medical activities in South Korea. Medical service information, except for non-covered services, is stored in the Health Insurance Review and Assessment Service (HIRA) data.¹⁶ Moreover, blood supply is managed nationally through voluntary donation. Consequently, detailed information on blood supply and usage is accessible.

The research objectives of this study are as follows: first, the main goal was to create predictive models capable of forecasting the demand for different blood components (red blood cells (RBCs), fresh frozen plasma (FFP), and platelet concentrate (PC)) based on specific blood types. We intend for these models to provide healthcare institutions with accurate estimates of their future blood supply requirements. Second, this study used open public data, such as national blood supply data and COVID-19 statistics, in combination with data from a single medical institution. The objective was to harness these diverse data sources to enhance prediction accuracy. Third, by achieving accurate blood demand predictions, this study aims to contribute to better blood supply management in healthcare systems. This objective is significant in addressing blood shortages and ensuring a stable and sufficient blood supply.

Methods

The study period spanned 12 years, from January 2010 to December 2021. This study primarily focuses on data from Korea. The data for blood transfusions were collected from a single healthcare institution, specifically the Yeouido St. Mary's Hospital.

Data collection and processing

We developed datasets by gathering open public data from the Healthcare Big Data Hub of HIRA¹⁷ and Statistics Korea¹⁸ and data from a single medical institution from January 2010 to December 2021. Monthly data obtained from the Healthcare Big Data Hub of the HIRA and Statistics Korea were as follows: blood inventory by blood components (RBCs, FFP, and PC), transfusion demand, blood supply, national blood donation, national blood donation by blood type (A+, B+, O+, and AB+), and the number of COVID-19 cases from February 2020 to December 2021. Six PC units were equivalent to one apheresis platelet unit. RhD-negative blood types are uncommon in South Korea, with a prevalence of 0.1% for each blood type. In this study, we primarily focused on the commonly encountered blood types because of the availability and comprehensiveness of the data. Notably, RhD-negative blood types were not included in this dataset. This limitation may affect the direct applicability of our prediction models to regions or countries where RhD-negative blood types have a higher clinical significance. Future adaptations of this model should consider incorporating RhD-negative data for a more holistic prediction.

With regard to actual blood usage, the number of transfusions for each formulation (by blood type and blood component) was obtained from 247,537 transfusion-related electronic medical records of a single medical institution.

To develop a blood-demand prediction model for a single institution, the following monthly information was obtained from the relevant medical institution: total number of beds, number of surgeries, surgery type (major, moderate, or minor), number of RBC antibody screens, ABO/Rh tests, crossmatch, number of transfusions by department, blood components, and blood type.

The data for analysis were meticulously sourced from open-source repositories. Before inclusion, each dataset underwent a rigorous quality check involving screening for inconsistencies, missing values, and outliers. To ensure that the results derived from these datasets were feasible and generalizable, we combined training and validation using a stratified k-fold cross-validation technique based on the distribution of the target values. To improve the general applicability of the model, this strategy was further enhanced by ensuring that the datasets included a wide range of transfusion-related medical information representing different scenarios. Additionally, to ensure the integrity of our results, we implemented a stringent data-cleaning regimen. This process involved handling missing values, correcting outliers, standardizing scales, and removing duplicates. This ensured that our data were robust, consistent, and ready for model training and analysis.

Three datasets were meticulously integrated for our analysis. The foundational dataset comprised transfusion-related records from a single medical institution from January 2010 to December 2021. The second dataset, the national open public data, was incorporated based on overlapping identifiers, although its exact period was not specified; it offered insights into transfusion demand, blood supply, and national donation trends, specifically emphasizing the three primary blood components: RBCs, FFP, and PC. Data on COVID-19 cases from February 2020 to December 2021 were overlaid to examine the potential effects of the pandemic on transfusion practices. Merging was conducted using shared variables such as patient number, date, and blood type to ensure a cohesive and comprehensive dataset for our analysis.

Development of prediction model

We created a blood demand prediction model for each combination of the four blood types and three blood components (12 models in total). Because the 12 output variables for predicting blood requirements are continuous numerical variables, we chose the following three boosting models as regression models and evaluated their performance: Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and Category Boosting (CatBoost). The boosting algorithm is a machine-learning ensemble technique that improves prediction and classification performance by sequentially combining several weak learners. It is a supervised learning approach that uses a set of rules to classify and regress data. We used boosting models because our hospital data were structured, and boosting models perform very well on such data. XGBoost is a type of gradient boosting (GB) model that compensates for slowness and overfitting, which are risks of GB. The LGBM is a leaf-wise tree model that extends a specific branch of a tree, with the tree size set through num_leaves. When growing a tree, the LGBM compares both sides of a split based on the cost function and extends the more advantageous side, which allows it to learn more deeply about the better node. However, the drawback is that many data points may be discarded. CatBoost can work to its advantage when a single column contains very complex information, and it learns using a balanced (level-wise) tree method, as GB does. XGBoost and LGBM require careful parameter tuning, which makes it challenging to establish suitable parameters, whereas CatBoost can obtain results by employing an internal method without requiring complex parameter tuning. However, when the data are simple, it is challenging for CatBoost to develop a model that considers the interaction of columns; therefore, it is generally not appropriate for simple data processing.
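The idea of sequentially combining weak learners can be illustrated with a minimal sketch. This is plain gradient boosting with one-split trees (stumps) on a single feature under squared-error loss; it is not XGBoost, LGBM, or CatBoost, and it is not the authors' implementation. The function names (`fit_stump`, `fit_boosting`, `predict`) and the toy data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so that both sides of the split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Fit a sequence of stumps, each one trained on the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction: the mean of the target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    total = model["base"]
    for threshold, left_mean, right_mean in model["stumps"]:
        step = left_mean if x <= threshold else right_mean
        total += model["learning_rate"] * step
    return total


# Hypothetical toy data: one feature and a step-shaped target.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [10.0, 10.0, 10.0, 10.0, 30.0, 30.0, 30.0, 30.0]
model = fit_boosting(xs, ys)
```

Each round corrects only a fraction (the learning rate) of the remaining error, which is why many weak learners are combined. The production libraries add regularization, multi-feature splits, and different tree-growth strategies on top of this basic loop.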

In the data wrangling phase, we first extracted the variables into raw units provided by the data source. For example, the numbers of surgeries, RBC antibody tests, and blood transfusions were displayed as raw numbers. To harmonize the data, we merged hospital transfusion-related data with data from open public databases based on a temporal timeline of years and months. Subsequently, we calculated and partitioned the blood products based on time intervals.
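The year-and-month merge described above can be sketched as follows. This is an illustration only, not the authors' pipeline: the function `merge_monthly`, the field names (`year`, `month`, `n_surgeries`, `national_donations`), and the sample values are all hypothetical.

```python
def merge_monthly(hospital_rows, public_rows):
    """Inner-join two lists of monthly records on the (year, month) key."""
    # Index the public records by their (year, month) key for fast lookup.
    public_by_month = {(r["year"], r["month"]): r for r in public_rows}
    merged = []
    for row in hospital_rows:
        match = public_by_month.get((row["year"], row["month"]))
        if match is None:
            continue  # keep only months present in both sources
        combined = dict(match)
        combined.update(row)  # hospital fields win if a name clashes
        merged.append(combined)
    return merged


# Hypothetical sample values, for illustration only.
hospital = [
    {"year": 2020, "month": 1, "n_surgeries": 410},
    {"year": 2020, "month": 2, "n_surgeries": 385},
]
public = [
    {"year": 2020, "month": 2, "national_donations": 200000},
    {"year": 2020, "month": 3, "national_donations": 210000},
]
merged = merge_monthly(hospital, public)
```

An inner join is assumed here, so months missing from either source are dropped. Whether the study kept or imputed such months is not stated in the text.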

Prediction performance was evaluated using R-squared (R²), mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE).
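For reference, the four metrics can be computed as in the sketch below, which uses their standard definitions. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study data.

```python
import math


def regression_metrics(actual, predicted):
    """Compute R², MAE, MSE, and RMSE for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    # Total sum of squares; this is zero (and R² undefined) if all
    # actual values are identical.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = mse * n
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "mse": mse, "rmse": rmse}


# Hypothetical monthly usage values, for illustration only.
actual = [100, 120, 90, 110]
predicted = [105, 115, 95, 105]
metrics = regression_metrics(actual, predicted)
```

Because RMSE is the square root of MSE, the two columns of a results table can be checked against each other. For example, the XGBoost A+ RBC row of Table 1 reports MSE = 750.6225 and RMSE = 27.3975, and the square root of 750.6225 is approximately 27.3975.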

Development of ensemble model

There are reports that ensemble models using several AI models can improve prediction performance.^19,20 Hence, by combining the three boosting models, a total of 12 ensemble models were created, one for each combination of blood component and blood type. These ensemble models were created using soft-weighted voting in which the analysis results of XGBoost, LGBM, and CatBoost were equally weighted.
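For a regression target, equally weighted soft voting amounts to averaging the three models' predictions for each month. The following is a minimal sketch under that assumption; the function name `ensemble_average`, the dictionary keys, and the prediction values are hypothetical.

```python
def ensemble_average(predictions_by_model, weights=None):
    """Combine per-model prediction lists into one weighted average list."""
    names = list(predictions_by_model)
    if weights is None:
        # Equal weighting: each model contributes 1/len(names).
        weights = {name: 1.0 / len(names) for name in names}
    n_points = len(predictions_by_model[names[0]])
    return [
        sum(weights[name] * predictions_by_model[name][i] for name in names)
        for i in range(n_points)
    ]


# Hypothetical predictions for two months, for illustration only.
preds = {
    "xgboost": [100.0, 120.0],
    "lgbm": [110.0, 126.0],
    "catboost": [90.0, 114.0],
}
ensemble = ensemble_average(preds)
```

Passing an explicit `weights` dictionary would allow unequal weighting, although the study states that the three models were weighted equally.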

Stratified k-fold cross-validation

In this study, monthly usage is the output value. Accordingly, there are insufficient data to set aside a specific period as separate test and validation sets, and it is difficult to ensure that any selected test and validation sets are representative. Therefore, we divided the 12 years of monthly data 9:1 through K-fold (n_split = 10). This technique allowed us to use all the data, from the first 10% to the last 10%, for both training and validation. By employing this method, we ensured that the model was trained and validated on the entirety of the dataset, enhancing its overall robustness. In essence, we utilized an out-of-fold technique that maximized the utilization of the entire dataset without any loss.
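One way to stratify a continuous target is to bin it into quantile groups and then spread each group evenly across the folds. The sketch below assumes 10 bins and 10 folds over 144 monthly values. It is not the authors' code: the helper names (`quantile_bins`, `stratified_folds`) and the synthetic demand series are hypothetical, and random shuffling within bins is omitted for brevity.

```python
def quantile_bins(values, n_bins=10):
    """Label each value 0..n_bins-1 by rank, giving near-equal-sized bins."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def stratified_folds(bins, n_splits=10):
    """Assign a fold id to every sample so each fold draws from every bin."""
    by_bin = {}
    for idx, b in enumerate(bins):
        by_bin.setdefault(b, []).append(idx)
    folds = [0] * len(bins)
    counter = 0
    for b in sorted(by_bin):
        # Deal the members of each bin out to the folds in round-robin order.
        for idx in by_bin[b]:
            folds[idx] = counter % n_splits
            counter += 1
    return folds


# Synthetic monthly demand for 12 years (144 months), for illustration only.
demand = [200 + (i * 37) % 101 for i in range(144)]
bins = quantile_bins(demand, n_bins=10)
folds = stratified_folds(bins, n_splits=10)

# Out-of-fold loop: every month is used for validation exactly once.
oof_indices = []
for k in range(10):
    train_idx = [i for i, f in enumerate(folds) if f != k]
    valid_idx = [i for i, f in enumerate(folds) if f == k]
    # A model would be fitted on train_idx and evaluated on valid_idx here.
    oof_indices.extend(valid_idx)
```

With this scheme, each fold holds roughly 10% of the months and covers the full range of demand values, so no single fold is dominated by unusually high or low months.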

Analysis of parameters contributing to prediction performance by “feature importance” technique

The feature importance (FI) technique was employed to determine the most important parameters in the blood demand prediction model. FI refers to a method that assigns a score to input features based on their usefulness in predicting a target variable.²¹ The FI method uses ranking to visually assess a parameter's impact on a prediction.
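The text does not state which importance measure was used. As one illustration of scoring features by their usefulness, the sketch below uses permutation importance, a model-agnostic alternative to the importance scores built into tree-based libraries. The function names, the stand-in model, and the data are hypothetical.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of a prediction function over a set of rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, seed=0):
    """Score each feature by how much shuffling it increases the error."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    scores = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled_rows = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        scores.append(mse(model, shuffled_rows, targets) - baseline)
    return scores


# Hypothetical stand-in model: it uses feature 0 only and ignores feature 1.
def toy_model(row):
    return 2.0 * row[0]


rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
targets = [2.0 * r[0] for r in rows]
scores = permutation_importance(toy_model, rows, targets, n_features=2)
```

A feature the model ignores receives a score of zero, whereas a feature the model depends on receives a positive score. Ranking the scores gives the kind of ordered list reported in the Results.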

Results

Comparison between actual demand and predicted demand of four models (XGBoost, LGBM, CatBoost, and Ensemble model)

Supplemental Materials 1 and 2 show the monthly and annual average blood consumption of a single institution from January 2010 to December 2021. Figures 1–3 display the actual monthly blood demand over the data collection period, 2010 to 2021, and the blood demand predicted by the four models. The predicted values of the ensemble model shown in Figures 1–3 are the averages of the XGBoost, LGBM, and CatBoost predictions. In the ensemble model, the predicted curves for O+ RBC and O+ PC were comparable to the actual demand. For FFP, the predicted values and actual requirements differed marginally for all four blood types.

Figure 1. Comparison between actual demand and predicted demand for RBC of XGBoost, LGBM, CatBoost, and ensemble model.

RBC: red blood cell; XGBoost: extreme gradient boosting; LGBM: light gradient boosting machine; CatBoost: category boosting.

Figure 2. Comparison between actual demand and predicted demand for fresh frozen plasma of XGBoost, LGBM, CatBoost, and ensemble model.

XGBoost: extreme gradient boosting; LGBM: light gradient boosting machine; CatBoost: category boosting.

Figure 3. Comparison between actual demand and predicted demand for platelet concentrate of XGBoost, LGBM, CatBoost, and ensemble model.

XGBoost: extreme gradient boosting; LGBM: light gradient boosting machine; CatBoost: category boosting.

Performance of prediction models

In this study, predictive models were developed using three algorithms: XGBoost, LGBM, and CatBoost. These models were applied to predict the blood transfusion requirements for each combination of blood component (RBC, FFP, and PC) and blood type (A+, B+, O+, and AB+). The results summarized in Table 1 highlight the best-performing models for each scenario. For instance, the prediction of RBC transfusions for blood type A+ was most accurate with the XGBoost algorithm (R² = 0.5360, MAE = 23.0618, MSE = 750.6225, and RMSE = 27.3975), whereas blood type O+ showed superior results with CatBoost (R² = 0.6364, MAE = 19.5797, MSE = 556.2468, and RMSE = 23.5849). For FFP transfusion predictions, XGBoost performed best for blood type A+ (R² = 0.4592, MAE = 34.7690, MSE = 2109.9710, and RMSE = 45.9344), whereas LGBM performed best for blood type B+ (R² = 0.3656, MAE = 27.9301, MSE = 1341.7582, and RMSE = 36.6300). CatBoost was the preferred choice for the blood type AB+ FFP (R² = 0.4917, MAE = 34.2320, MSE = 1983.0699, and RMSE = 44.5317). In terms of PC transfusion predictions, LGBM achieved the highest accuracy for blood type A+ (R² = 0.6981, MAE = 82.6260, MSE = 12,076.3967, and RMSE = 109.8927), whereas for blood type B+, XGBoost performed better than LGBM (R² = 0.7350, MAE = 81.8227, MSE = 13,933.6621, and RMSE = 118.0409). Blood type O+ achieved the highest accuracy with the LGBM (R² = 0.8399, MAE = 73.1341, MSE = 10,162.0240, and RMSE = 100.8069), and blood type AB+ also showed the optimal results with the LGBM (R² = 0.5426, MAE = 53.0461, MSE = 4739.3300, and RMSE = 68.8428). These findings highlight the effectiveness of specific algorithms for different blood components and types in predicting transfusion needs.

Table 1.

Prediction performance for each model by blood components and types.

Variables	R²	MAE	MSE	RMSE
XGBoost
RBC
A⁺	0.5360*	23.0618	750.6225	27.3975
B⁺	0.4288	22.8674	826.9041	28.7559
O⁺	0.6002	19.7778	611.6880	24.7323
AB⁺	0.3307	14.8101	345.8632	18.5974
FFP
A⁺	0.4592*	34.7690	2109.9710	45.9344
B⁺	0.3610	27.4514	1351.4052	36.7615
O⁺	0.4451*	25.8957	956.5291	30.9278
AB⁺	0.4591	34.7690	2109.9710	45.9344
PC
A⁺	0.6879	84.0433	12,482.7581	111.7263
B⁺	0.7350*	81.8227	13,933.6621	118.0409
O⁺	0.8374	69.4440	10,321.3135	101.5939
AB⁺	0.4758	54.0728	5431.1866	73.6966
LGBM
RBC
A⁺	0.5155	23.8749	790.5339	28.1164
B⁺	0.4441*	22.8849	804.7080	28.3674
O⁺	0.5779	20.5805	645.8559	25.4137
AB⁺	0.3350*	14.6657	343.6337	18.5374
FFP
A⁺	0.4360	36.4550	2200.4620	45.9344
B⁺	0.3656*	27.9301	1341.7582	36.6300
O⁺	0.4442	24.5970	957.9558	30.9509
AB⁺	0.4359	37.0228	2201.0255	46.9151
PC
A⁺	0.6981*	82.6260	12,076.3967	109.8927
B⁺	0.7324	81.4982	14,071.7654	118.6245
O⁺	0.8399*	73.1341	10,162.0240	100.8069
AB⁺	0.5426*	53.0461	4739.3300	68.8428
CatBoost
RBC
A⁺	0.5172	23.9773	787.7774	28.0674
B⁺	0.4114	23.5824	852.0790	29.1904
O⁺	0.6364*	19.5797	556.2468	23.5849
AB⁺	0.3171	14.9000	352.9159	18.7861
FFP
A⁺	0.4482	35.1327	2152.7697	46.3979
B⁺	0.3353	27.7660	1405.9616	37.4962
O⁺	0.4451*	24.0549	956.5148	30.9276
AB⁺	0.4917*	34.2320	1983.0699	44.5317
PC
A⁺	0.6957	81.9228	12,171.2840	110.3235
B⁺	0.7350*	83.5574	13,933.1966	118.0390
O⁺	0.8362	71.4978	10,401.4455	101.9875
AB⁺	0.5118	53.4796	5058.6207	71.1240

R²: R-squared; MAE: mean absolute error; MSE: mean squared error; RMSE: root mean squared error; XGBoost: extreme gradient boosting; LGBM: light gradient boosting machine; CatBoost: category boosting; RBC: red blood cell; FFP: fresh frozen plasma; PC: platelet concentrate.

*Best performance among three models.

Ensemble model performance

An ensemble model was created using the three prediction models (XGBoost, LGBM, and CatBoost) (Table 2). The R² values for RBC by blood type were 0.5418 for A+, 0.4615 for B+, 0.6288 for O+, and 0.3564 for AB+. The R² values for FFP were 0.4683 for A+, 0.3786 for B+, 0.4676 for O+, and 0.4876 for AB+. The R² values for PC were 0.7098 for A+, 0.7498 for B+, 0.8497 for O+, and 0.5291 for AB+. Except for O+ in the RBCs and AB+ in the FFP, the prediction performance was enhanced.

Table 2.

Prediction performance for ensemble model using blood components and types.

Variables	R²	MAE	MSE	RMSE
Ensemble model
RBC
A⁺	0.5418	23.4566	747.5972	27.3422
B⁺	0.4615	22.4865	779.4861	27.9193
O⁺	0.6288	19.5733	567.8626	23.8299
AB⁺	0.3564	14.4631	332.5969	18.2372
FFP
A⁺	0.4683	34.7742	2074.6106	45.5479
B⁺	0.3786	27.3093	1314.3403	36.2538
O⁺	0.4676	23.9352	917.7138	30.2938
AB⁺	0.4876	34.4725	1999.1331	44.7117
PC
A⁺	0.7098	80.6534	11,608.2959	107.7418
B⁺	0.7498	79.3645	13,159.3359	114.7141
O⁺	0.8497	66.9101	9544.2003	97.6944
AB⁺	0.5291	52.5927	4878.5954	69.8469

R²: R-squared; MAE: mean absolute error; MSE: mean squared error; RMSE: root mean squared error; RBC: red blood cell; FFP: fresh frozen plasma; PC: platelet concentrate.

Parameters contributing to prediction performance by “feature importance” technique

For the ensemble model, we examined the parameters that affected the prediction performance for each blood component and blood type using the FI (Figures 4–6). A+ RBCs revealed a feature impact in the order of department, number of crossmatch tests, national blood donation for type A, number of RBC antibody screening tests, and actual demand for RBC nationwide. In B+ RBCs, the order was department, number of crossmatch tests, number of RBC antibody screening tests, number of hospital beds, and total number of surgeries. For O+ RBCs, the order was number of crossmatch tests, department, number of RBC antibody screening tests, RBC reserves, and number of hospital beds.

Figure 4. Feature impact on red blood cell (RBC) demand forecasting performance in ensemble models. The small boxes of each result are the department's feature impact.

Figure 5. Feature impact on fresh frozen plasma demand forecasting performance in ensemble models. The small boxes of each result are the department's feature impact.

Figure 6. Feature impact on platelet concentrate demand forecasting performance in ensemble models. The small boxes of each result are the department's feature impact.

For A+ FFP, the order was as follows: department, year, number of moderate surgeries, total number of surgeries, and number of major surgeries. In B+ FFP, the order was department, number of major surgeries, number of RBC antibody screening tests, year, and number of crossmatch tests. For O+ FFP, the order was department, number of major surgeries, year, O+ blood donation in the capital city, and number of RBC antibody screening tests. For AB+ FFP, the order was department, year, FFP reserves, total number of surgeries, and number of major surgeries (Figures 4–6).

In A+ PC, department, year, number of RBC antibody screening tests, A+ blood type national blood donation performance, and number of moderate surgeries were ranked in that order. In B+ PC, the department, number of RBC antibody screening tests, number of hospital beds, year, and number of COVID-19 confirmed cases were ranked in that order. In O+ PC, the department, number of RBC antibody screening tests, year, number of COVID-19 confirmed cases, and O+ blood type national blood donation performance were ranked in that order. In AB+ PC, the department, number of RBC antibody screening tests, number of ABO/Rh tests, and AB+ blood type blood donation in the capital city were ranked in that order (Figures 4–6).

Regarding the feature impact results of the FI, the importance of the department was high overall. Consequently, we examined the feature impact of the department in detail (Figures 4–6). For RBCs, the A+ blood type followed the order of neurosurgery, hematology–oncology, pediatrics, rheumatology, and pulmonology. For the B+ blood type, the order was neurosurgery, infectious medicine, nephrology, gastroenterology, and surgery. The O+ blood type followed the order of neurology, neurosurgery, pediatrics, rheumatology, and plastic surgery (PS). The AB+ blood type followed the order of hematology–oncology, emergency department, gastroenterology, surgery, and pulmonology. For FFP, the A+ blood type followed the order of gastroenterology, thoracic surgery, emergency department, hematology–oncology, and surgery. For the B+ blood type, the order was neurosurgery, neurology, PS, infectious medicine, and hematology–oncology. The O+ blood type followed the order of surgery, emergency department, gastroenterology, pulmonology, and orthopedics. For the AB+ blood type, the order was emergency department, surgery, neurology, orthopedics, and cardiology. For PC, the A+ blood type followed the order of hematology–oncology, urology, pulmonology, cardiology, and orthopedics. For the B+ blood type, the order was hematology–oncology, pulmonology, orthopedics, urology, and otorhinolaryngology. The O+ blood type followed the order of hematology–oncology, otorhinolaryngology, rheumatology, nephrology, and infectious medicine. For the AB+ blood type, the order was hematology–oncology, nephrology, pulmonology, gastroenterology, and gynecology.

Discussion

We constructed an AI model using open public data related to the national blood supply and information related to blood transfusion in a medical institution, without the need for patient clinical information. This study is novel in that a blood-demand prediction model was created by subdividing the demand by blood components and blood type. The development of models that predict blood usage according to blood type and components can help hospitals become more organized when requesting and receiving blood, which can lead to more efficient care at the hospital level. According to Supplemental Materials 1 and 2, monthly and yearly usage were not consistent, and the range of standard deviations was quite wide. Given this variability, the MAE and RMSE values of our boosting models were within a reasonable range, indicating that the models handled this regression problem adequately.

While many previous studies have extensively discussed the impact of reduced blood donations and a lack of blood supply,^1–4 our study aims to address this problem from a demand-forecasting perspective. Recognizing that efficient demand forecasting can alleviate supply problems, we used AI to develop an optimal model for this purpose. There have been limited attempts to use AI to predict blood demand. For example, Lin et al.²² utilized a linear regression model to predict blood needs, whereas Shokouhifar and Ranjbarimesan²³ employed a traditional time-series analysis. By comparison, our study utilized advanced machine learning techniques that provide a more robust and dynamic approach to addressing the complex nature of blood demand. By accurately predicting the amount of blood needed, medical institutions can develop more effective strategies, optimize blood donation campaigns, and allocate resources. Our findings complement the existing literature and pave the way for a paradigm shift in blood supply management approaches. This study bridges the gap between observed blood donation reductions and the practical steps that can be taken to manage and mitigate the resulting shortages.

In this study, K-fold validation was used to prevent data loss, and stratified K-fold validation was applied when training the monthly blood demand prediction models. Typically, stratification can be applied only to nominal (categorical) targets. However, the value that we wanted to predict was a numerical quantity. Therefore, the range of target values was divided into intervals so that stratification could be applied, and each fold included data from across these intervals, which was necessary to obtain optimal outcomes during training. Specifically, using the q_10 option, the target values were divided into 10 equal parts, so that the distribution of target values was the same in each fold. Linear regression is frequently employed in regression problems, and boosting models can be utilized for both classification and regression problems.²⁴ Our data structure includes numerical variables as well as categorical variables such as department. Consequently, we developed AI models for blood-demand prediction using multiple tree-based boosting models, applying three boosting models instead of a general linear model to address this regression problem.

By combining the three previously developed boosting models, we created an ensemble model that increased prediction accuracy (Table 2). An ensemble model improves the generalization performance (i.e. reduces variance) by combining different models learned individually. Furthermore, there is an effect of decreasing the overfitting of individual models, such that the overall performance (R², MAE, and RMSE) is enhanced.

By examining the features that affected predictive performance, the parameters of open public data were found to be as crucial as transfusion-related hospital data. A detailed analysis of the features using the FI technique is as follows: except for O+ RBC, the department had the highest feature impact. Neurosurgery has a significant impact on the prediction of RBC transfusions. Although many guidelines emphasize that RBC transfusion should follow a restrictive threshold of 7–8 g/dL,²⁵ there is a report that, concerning brain function, setting the transfusion target hemoglobin level slightly higher can improve clinical results.²⁶

Regarding the prediction of FFP transfusion, the impact of emergency medicine and surgery was high. When active bleeding occurs due to clotting factor insufficiency, FFP is used to ensure hemostasis. FFP is also administered before planned surgery or invasive procedures, for warfarin reversal when vitamin K is inadequate to reverse the effects of warfarin, and for thrombotic thrombocytopenic purpura.²⁷ Nonetheless, FFP is frequently transfused empirically in cases of prolonged international normalized ratio (INR). Such transfusions should be avoided, since prolonged INR can occur in a variety of circumstances.²⁸ One study reported that FFP transfusions were inappropriately conducted in 60% of patients in the emergency department.²⁶ Moreover, 90% of patients who did not have active bleeding or who received preventive transfusions received excessive FFP infusions.²⁹

For PC, hematology–oncology had a high influence on all blood types. This may also be because the proportion of patients with thrombocytopenia is high in hematology–oncology, and thrombocytopenia requires prompt platelet administration due to the high bleeding risk. Moreover, since the storage period of PC is limited to 5–7 days,³⁰ an immediate request from the blood inventory should be made when a blood transfusion is needed. As a result, effective preparation and supply of PC are expected by examining the hematology–oncology status of the medical institution and utilizing our model.

Excluding the department, important features in transfusion prediction differed according to the blood component and blood type. For RBCs, the feature impact was high for the number of crossmatch tests. The number of RBC antibody screening tests also significantly affected PC. These two tests are required for blood transfusion, so their influence is to be expected. However, for FFP, the number of surgeries (particularly, major operations) had a significant impact. Recently, the limited use of FFP has been emphasized.^31,32 The use of FFP as a volume expander is discouraged, and it is recommended that the cause of prolonged INR be corrected rather than the INR itself.³³ However, our results showed that FFP was significantly associated with the number of operations. Although FFP transfusion is unavoidable in some cases, determining whether the latest transfusion guidelines are being followed is important. Necessary blood transfusions must be promptly administered; however, excessive or inappropriate transfusions should be prevented. This is because of the potential consequences of blood transfusion, such as acute lung injury and circulatory overload.³⁴

The hospital blood demand prediction model that we developed is expected to be beneficial. However, given that the decline in global blood donation remains the root issue, our study addresses only part of a broader problem that urgently needs direct intervention. In South Korea, hospitals address transfusion shortages in various ways, such as targeted donations, wherein the donated blood is specifically provided to designated patients. This approach is particularly employed for inpatients requiring surgery and is effective at the individual hospital level. Moreover, although not yet fully practical, artificial blood could be a potential solution to future blood shortages.

In evaluating the practical applications of our AI model, its integration holds significant promise for both clinics and blood centers. Clinics can harness this model to refine blood order estimates, minimize waste, and meet patient needs. Blood centers entrusted with the responsibility of meeting the blood demands of multiple health facilities can utilize predictive insights to optimize their distribution strategies. This synergy, facilitated by our AI prediction model, not only promotes efficient resource allocation but also enhances patient care standards.

Although this study highlights the potential of AI in predicting blood demand, it has several noteworthy limitations. Because we relied on hospital transfusion data and domestic open data, our dependence on open data repositories may introduce biases or errors despite meticulous data cleaning. Variability in data periods could lead to inconsistency in capturing long-term trends. Although we employed three boosting machine learning algorithms for their anticipated efficacy, we recognize that unexplored models may offer additional insights. Because the model is based on 12 years of data from a single medical institution and domestic open data, external validation at other institutions is needed for robust generalization. Future research using multicenter transfusion data could develop more generalized and accurate models. Models fine-tuned to specific datasets may not retain their precision when extrapolated to regions or countries with distinct medical dynamics. Because it relies predominantly on historical data, our model may only partially capture the complexities of blood demand, especially during unforeseen health emergencies or socio-political shifts. These limitations highlight areas for potential improvement and underscore the intricacies of the issue and the expansive scope of future research in this domain.
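
As a sketch of what external validation could involve, the example below scores forecasts against demand observed at another institution using the mean absolute error and root mean squared error. The demand figures and model forecasts are hypothetical, and combining the three models by an unweighted mean is an assumption made for illustration, not necessarily the ensemble rule used in this study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def average_ensemble(*model_predictions):
    """Unweighted mean of several models' forecasts (assumed rule)."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


# Hypothetical monthly demand (units) at an external hospital,
# with hypothetical forecasts from three separately trained models.
actual = [120, 135, 128, 150]
xgboost_pred = [110, 140, 130, 145]
lgbm_pred = [125, 130, 120, 160]
catboost_pred = [119, 138, 131, 148]

ensemble_pred = average_ensemble(xgboost_pred, lgbm_pred, catboost_pred)

for name, pred in [
    ("XGBoost", xgboost_pred),
    ("LGBM", lgbm_pred),
    ("CatBoost", catboost_pred),
    ("Ensemble", ensemble_pred),
]:
    print(f"{name}: MAE={mae(actual, pred):.2f}, RMSE={rmse(actual, pred):.2f}")
```

Reporting both metrics on data the model has never seen gives a more direct indication of generalization than cross-validation within the original institution.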

Conclusion

A sufficient blood supply is needed to meet blood demand, but supply has consistently fallen short worldwide, and an adequate supply is difficult to achieve for several reasons. The model we created is based on open public data related to the national blood supply and will enable medical institutions to predict the required amount of blood in advance. The study found that the department requesting the blood was a significant parameter. Future research could investigate other parameters that might influence transfusion prediction performance. This study used data from January 2010 to December 2021. A longitudinal study conducted over a longer period could provide more insights into the trends and patterns in blood transfusion usage.

Contributorship

HJK, SMB, and DJP conceived and designed the study; SMB and DJP developed the methodology; SMB and DJP acquired the data; HJK, SP, YHP, SMB, and DJP analyzed and interpreted the data and wrote, reviewed, and revised the manuscript.

Supplemental Material

sj-docx-1-dhj-10.1177_20552076231224245 - Supplemental material for Development of blood demand prediction model using artificial intelligence based on national public big data

Supplemental material, sj-docx-1-dhj-10.1177_20552076231224245 for Development of blood demand prediction model using artificial intelligence based on national public big data by Hi Jeong Kwon, Sholhui Park, Young Hoon Park, Seung Min Baik and Dong Jin Park in DIGITAL HEALTH

sj-docx-2-dhj-10.1177_20552076231224245 - Supplemental material for Development of blood demand prediction model using artificial intelligence based on national public big data

Supplemental material, sj-docx-2-dhj-10.1177_20552076231224245 for Development of blood demand prediction model using artificial intelligence based on national public big data by Hi Jeong Kwon, Sholhui Park, Young Hoon Park, Seung Min Baik and Dong Jin Park in DIGITAL HEALTH

Footnotes

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval: This study was approved by the Institutional Review Board of the Catholic University of Korea at Yeouido St. Mary's Hospital (approval number: SC22RISI0001). This research was conducted ethically, and all study procedures were performed in accordance with the requirements of the Declaration of Helsinki of the World Medical Association. The requirement for informed consent was waived due to the retrospective nature of this study.

Funding: The authors received no financial support for the research, authorship, and/or publication of this article.

Guarantor: SMB and DJP.

ORCID iDs: Seung Min Baik https://orcid.org/0000-0003-1051-6775

Dong Jin Park https://orcid.org/0000-0002-2412-5292

Supplemental material: Supplemental material for this article is available online.

References

1. Mowla SJ, Sapiano MRP, Jones JM, et al. Supplemental findings of the 2019 National Blood Collection and Utilization Survey. Transfusion 2021; 61: S11–S35.
2. Veseli B, Sandner S, Studte S, et al. The impact of COVID-19 on blood donations. PLoS One 2022; 17: e0265171.
3. Community Blood Center, https://givingblood.org/donate-blood/why-give-blood.aspx (2022, accessed 15 January 2022).
4. American Red Cross, https://www.redcross.org/about-us/news-and-events/press-release/2022/blood-donors-needed-now-as-omicron-intensifies.html (2022, accessed 15 January 2022).
5. Korean Statistical Information Service, https://kosis.kr/index/index.do (2020, accessed 15 January 2022).
6. Korean Red Cross, https://www.redcross.or.kr/main/main.do (2019, accessed 15 January 2022).
7. Sahu S, Hemlata, Verma A. Adverse events related to blood transfusion. Indian J Anaesth 2014; 58: 543–551.
8. Ellingson KD, Sapiano MRP, Haass KA, et al. Continued decline in blood collection and transfusion in the United States-2015. Transfusion 2017; 57: 1588–1598.
9. Shander A, Isbister J, Gombotz H. Patient blood management: The global view. Transfusion 2016; 56: S94–S102.
10. Wang M, Cheng J, Li X, et al. Development and validation of a machine learning algorithm for prediction of platelet transfusion efficiency in patients with hematological diseases. Blood 2019; 134: 2454.
11. Yao Y, Cifuentes J, Zheng B, et al. Computer algorithm can match physicians’ decisions about blood transfusions. J Transl Med 2019; 17: 340.
12. Durand WM, DePasse JM, Daniels AH. Predictive modeling for blood transfusion after adult spinal deformity surgery: A tree-based machine learning approach. Spine (Phila Pa 1976) 2018; 43: 1058–1066.
13. Doshi KA, Shastry S, Pai VB. Transfusion requirement prediction score for patients undergoing cardiac surgery: An experience from a tertiary care set-up from south India. Transfus Med 2021; 31: 243–249.
14. Akaraborworn O, Chaiwat O, Chatmongkolchart S, et al. Prediction of massive transfusion in trauma patients in the surgical intensive care units (THAI-SICU study). Chin J Traumatol 2019; 22: 219–222.
15. Cantle PM, Cotton BA. Prediction of massive transfusion in trauma. Crit Care Clin 2017; 33: 71–84.
16. Kim JA, Yoon S, Kim LY, et al. Towards actualizing the value potential of Korea health insurance review and assessment (HIRA) data as a resource for health research: Strengths, limitations, applications, and strategies for optimal use of HIRA data. J Korean Med Sci 2017; 32: 718–728.
17. Healthcare Bigdata Hub of Health Insurance Review and Assessment Service, https://opendata.hira.or.kr/home.do (2022, accessed 15 January 2022).
18. Statistics Korea, https://kostat.go.kr/portal/korea/index.action (2022, accessed 15 January 2022).
19. Baik SM, Lee M, Hong KS, et al. Development of machine-learning model to predict COVID-19 mortality: Application of ensemble model and regarding feature impacts. Diagnostics (Basel) 2022; 12: 1464.
20. Park DJ, Park MW, Lee H, et al. Development of machine learning model for diagnostic disease prediction based on laboratory tests. Sci Rep 2021; 11: 7567.
21. Vimalachandran A, Jayachandran T, Tkachenko AY, et al. Gas turbine design analysis and optimization with novel hybrid model using classical physics and machine learning. In: 2021 international scientific and technical engine conference (EC), Samara, Russian Federation, 23–25 June 2021, pp.1–9. IEEE.
22. Lin F, He X, Zhang H, et al. Forecasting blood supply in Chinese major cities by fractional grey prediction model and linear regression model. medRxiv 2023. doi: 10.1101/2023.04.25.23287469.
23. Shokouhifar M, Ranjbarimesan M. Multivariate time-series blood donation/demand forecasting for resilient supply chain management during COVID-19 pandemic. Clean Logist Supply Chain 2022; 5: 100078.
24. Emmert-Streib F, Yli-Harja O, Dehmer M. Artificial intelligence: A clarification of misconceptions, myths and desired status. Front Artif Intell 2020; 3: 524339.
25. Carson JL, Guyatt G, Heddle NM, et al. Clinical practice guidelines from the AABB: Red blood cell transfusion thresholds and storage. JAMA 2016; 316: 2025–2035.
26. Griesdale DE, Sekhon MS, Menon DK, et al. Hemoglobin area and time index above 90 g/L are associated with improved 6-month functional outcomes in patients with severe traumatic brain injury. Neurocrit Care 2015; 23: 78–84.
27. Khawar H, Kelley W, Stevens JB, et al. Fresh frozen plasma (FFP). Treasure Island (FL): StatPearls Publishing, 2022.
28. Dellinger RP, Carlet JM, Masur H, et al. Surviving sepsis campaign guidelines for management of severe sepsis and septic shock. Crit Care Med 2004; 32: 858–873.
29. Emektar E, Dagar S, Corbacioglu SK, et al. The evaluation of the audit of fresh-frozen plasma (FFP) usage in emergency department. Turk J Emerg Med 2016; 16: 137–140.
30. Aubron C, Flint AWJ, Ozier Y, et al. Platelet storage duration and its clinical and transfusion outcomes: A systematic review. Crit Care 2018; 22: 185.
31. Green L, Cardigan R, Beattie C, et al. Addendum to the British Committee for Standards in Haematology (BCSH): Guidelines for the use of fresh-frozen plasma, cryoprecipitate and cryosupernatant, 2004 (Br J Haematol 2004, 126, 11–28). Br J Haematol 2017; 178: 646–647.
32. Stanworth SJ, Brunskill SJ, Hyde CJ, et al. Is fresh frozen plasma clinically effective? A systematic review of randomized controlled trials. Br J Haematol 2004; 126: 139–152.
33. Biu E, Beraj S, Vyshka G, et al. Transfusion of fresh frozen plasma in critically ill patients: Effective or useless? Open Access Maced J Med Sci 2018; 6: 820–823.
34. Semple JW, Rebetz J, Kapur R. Transfusion-associated circulatory overload and transfusion-related acute lung injury. Blood 2019; 133: 1840–1853.
