Abstract
Objective
Air pollution poses a potential risk to asthma patients and may prolong their length of stay. However, the impact of air pollution on the excessive length of stay (ELoS) of heterogeneous asthma patients is unclear. In this study, we proposed a K-Nearest Neighbor (KNN) embedded approach that incorporates patient status to analyze the impact of short-term air pollution on the ELoS of asthma patients.
Methods
The KNN embedded approach includes two stages. Firstly, the KNN algorithm was employed to search for the most similar patient community and approximate kernel proxy of each index patient by Euclidean distance. Then, we built the differential fixed-effect linear model to estimate the risk of air pollution to the ELoS.
Results
We analyzed the medical insurance records of 6563 asthma patients in a large city of China from January to December 2014. We found that when the duration of exposure to air pollution (i.e., PM2.5, PM10, SO2, NO2, and CO) reaches around 4–5 days, the risk of increasing the ELoS is the largest. Only O3 shows the opposite effect. Moreover, CO is the dominant risk factor for the ELoS. With a 1 mg/m3 increment in the 5-day average CO concentration, the ELoS goes up by 0.8157 days (95%CI: 0.72, 0.9114). Based on the kernel proxy in the top 1% similar patient community, the additional financial burden on each patient due to the ELoS increases by RMB 488.6002 (95%CI: 430.1962, 547.0043).
Conclusions
The KNN embedded approach is an innovative method that takes into account the heterogeneous patient status, and effectively estimates the impact of air pollution on the ELoS. It is concluded that air pollution poses adverse effects and additional financial burdens on asthma patients. Heterogeneous patients should adopt different strategies in health management to reduce the risk of increasing the ELoS due to air pollution, and improve the efficiency of medical resource utilization.
Supplementary Information
The online version contains supplementary material available at 10.1007/s40201-020-00584-8.
Keywords: Excessive length of stay, Air pollution, Heterogeneous patient, K-nearest neighbor algorithm, Financial burden
Introduction
Asthma is one of the most common chronic respiratory diseases. Moreover, asthma patients are susceptible to air pollution. It causes excessive service burdens and economic costs every year [1, 2]. Extensive studies have shown that air pollution has a potential risk to the health outcomes of asthma (e.g., hospital emergency [3], hospital admission [4], mortality [5], etc.). Although some studies [4, 6] have analyzed the relationship between air pollution and length of stay (LoS), few studies analyzed the effect of air pollution on the excessive length of stay for asthma. The impact of air pollution on the excessive length of stay for asthma is unclear.
It is of great significance to study the influence of air pollution on the excessive length of stay of asthma patients. The length of stay is one of the important indicators of operational performance in hospital [7], which is widely concerned by hospital managers and relevant researchers [6]. Excessive length of stay indicates low bed turnover. At the same time, the number of patients served will be limited. At present, there are few studies to analyze the marginal impact of air pollution on the excessive length of stay of asthmatic patients from the perspective of the composition of length of stay. On one hand, understanding the marginal impact of air pollution on excessive length of stay will be helpful for hospital managers to improve the rate of bed turnover. On the other hand, avoiding the excessive length of stay means reducing the additional expenditures for patients. Therefore, it motivates us to analyze the relationship of excessive length of stay and air pollution.
It has been confirmed that air pollution poses a potential risk to the length of stay and hospitalization cost for asthma [1, 2]. Carpenter et al. [8] analyzed the effect of air pollution on rates of hospitalization, length of stay, and associated costs. They showed that the length of stay for pollutant levels 2 (i.e., low SO2 and medium particulates) and 3 (i.e., high or median SO2 and low particulates) was significantly higher than that for level 1 (i.e., low SO2 and low particulates). Baek et al. [9] reported that PM2.5 was positively associated with the LoS among children aged 5–11 years in the age-stratified models. Luo et al. [10] verified that air pollution affects the LoS for patients with asthma, with significant effects of PM2.5 and NO2 on the LoS in the male group. Roy et al. [11] examined the association between sub-chronic exposure to six outdoor air pollutants (i.e., PM2.5, PM10, O3, NO2, SO2, and CO) and pediatric asthma hospitalization length of stay, charges, and costs. They found that sub-chronic PM2.5 exposure was associated with increased costs for pediatric asthma hospitalizations. A 1-unit (ug/m3) increase in monthly PM2.5 led to a $123 (95%CI: $40–249) increase in charges and a $47 (95%CI: $15–93) increase in costs. The studies reviewed above are mainly concerned with the relationship between air pollution and the total length of stay in hospital. However, even without an air pollution attack, asthma patients will have a certain length of stay because of their own status (e.g., age, severity of illness, etc.). In contrast to previous studies, this study focuses on heterogeneous patient status and the excessive length of hospital stay caused by air pollution.
Patient heterogeneity results in diverse lengths of stay, which interferes with judging the excessive length of stay attributable to air pollution. Patient heterogeneity assessment is generally defined as investigating the similarity of patients’ data in terms of their demographics (e.g., age, gender, race, etc.) and the complexity of disease. For example, Carpenter et al. [8] found that the hospitalization rate was correlated with differences in age, gender, and race distributions. Respiratory-related secondary diagnoses, age, and gender of the patient were important predictors of asthma-related LoS [12]. Variability in patient age is one of the reasons for the heterogeneity of length of stay [13, 14]. Regarding the complexity of the patient’s disease [7], Roe et al. [15] emphasized the importance of comorbidities for prediction of length of stay. They found an improved prediction of length of stay in Australian National Diagnosis Related Groups (AN-DRGs) version 1.0 and in AN-DRG version 3.1, by 27.2% and 17.5%, respectively. Variability in severity of illness also contributes to the heterogeneity of length of stay [13, 14]. Chang et al. [16] found that the severity of disease was an important factor influencing the length of stay. Shanley et al. [17] performed bivariate and generalized-estimating-equation logistic regression to examine the sociodemographic, temporal, and health-status factors associated with the LoS for pediatric asthma hospitalizations. They found that obstructive sleep apnea, older age, obesity, complex chronic conditions, and female gender were associated with longer LoS after adjustment for severity of illness. Thus, patient heterogeneity is a very important factor in the analysis of length of stay. It inspires us to design status variables to depict the heterogeneity of patients, and to use machine learning models to judge the similarity between patients.
In recent research, nearest neighbor-based algorithms have become promising tools for assessing patient similarity [18]. In such approaches, a group of patients similar to an index patient is retrieved, and a prediction is produced by a model trained on the similar patients’ data. They also have the potential to analyze the similarity between an index patient and the trial population in studies, which helps to choose the most appropriate clinical trial [19]. Sharafoddini et al. [18] reviewed and summarized the published studies describing computer-based approaches for predicting patients’ health status based on healthcare data and patient similarity, identified gaps, and provided a starting point for related future research. For example, David et al. [20] employed the Euclidean distance on weighted predictors to select neighbors for an index patient. Panahiazar et al. [21] developed a multidimensional patient similarity assessment technique that leverages electronic health records to identify patients who are similar to each other for disease diagnosis and prognosis. Their results suggested that it was feasible to harness population-based information from electronic health records for an individual patient-specific assessment. Therefore, in this study, it is feasible to introduce the patient status variables into the k-nearest neighbor algorithm to find the homogeneous patients of each index patient.
The objective of this study is to assess the impact of air pollution on the excessive length of stay in heterogeneous patients. Moreover, we evaluated the financial burden due to air pollution. In this study, we not only leverage the demographic indicators and disease complexity commonly used in the existing literature [13, 15], but also add the quality of medical services and medication history to depict the patient status. We used the KNN algorithm based on Euclidean distance to find the homogeneous patient community of each index patient. Based on the approximate kernel proxy of each patient community, we determined the excessive length of stay of each asthma patient. Finally, we built a differential fixed-effect linear model to estimate the impact of air pollution on the excessive length of stay.
We summarize this study as follows. (1) We proposed a KNN embedded approach that incorporates patient status to evaluate the impact of air pollution on the excessive length of stay. In the first stage, we took the kernel proxy of the patient community as the standard of homogeneous patients, and determined the excessive length of stay as the difference between each index patient and the approximate kernel proxy. (2) In the second stage, we found that exposure to five air pollutants (i.e., PM2.5, PM10, SO2, NO2, and CO) for around 4–5 days would prolong the length of stay for asthma and bring additional financial burdens to patients. In addition, CO is the dominant risk for the excessive length of stay of asthma patients. (3) Finally, we reported the potential difficulty and opportunity resulting from air pollution in bed scheduling optimization. Combining air pollution prediction with a multi-stage stochastic programming model for demand prediction and hospital bed management is promising.
The remainder of this article is organized as follows. § 2 consists of four parts. In Sect. 2.1, we describe our research data in detail. In Sect. 2.2, we define patient status to represent the heterogeneity of patients and assess patient similarity. In Sect. 2.3, we introduce the KNN-based approximate kernel method and the differential fixed-effect linear model. In Sect. 2.4, we present two performance indicators (i.e., average excessive length of stay and average excessive expenditure) for evaluating health outcomes. In § 3, we summarize the distribution of data in Sect. 3.1, show the impact of air pollution on excessive length of stay in Sect. 3.2, and assess the excessive length of stay and excessive expenditure in Sect. 3.3. In § 4, we discuss our findings from five aspects through literature comparison and results analysis. In § 5, we conclude with the contribution of this study.
Materials and methods
Medical insurance and exposure data
We collected medical insurance data of 6701 asthma patients in a large city of China from January 1 to December 31, 2014. This data set includes age, gender, insurance type, admission date, discharge date, hospital level, disease diagnoses, medication history, and total expenditure for each patient when he/she was admitted to hospital. The disease diagnoses include the primary disease resulting in this hospital admission and other comorbidities. The medication history records whether he/she has taken the medium-acting hormone (M1), antibiotic (M2), theophylline (M3), cholinergic receptor antagonists (M4), β2 receptor agonist (M5), or leukotriene receptor antagonist (M6) before hospital admission. According to the International Classification of Diseases (ICD-10), we identified the patients with a primary diagnosis of J45 as asthma patients. To ensure that the studied population was exposed to the air pollution of the studied area, we selected the patients whose medical insurance location is our studied city as research samples. In our dataset, no missing values or coding errors were found. In preprocessing, we selected patients based on the criteria of LoS ≥ 1 and log(LoS) ∈ [mean(log(LoS)) − 3 × sd(log(LoS)), mean(log(LoS)) + 3 × sd(log(LoS))], where sd denotes the standard deviation. After this step, we obtained a total of 6563 patients for analysis. Because de-identification was performed in advance, readmission within 30 days was not considered in our study. Due to data confidentiality and patient privacy protection, we signed a data confidentiality agreement with the relevant department. Therefore, the data of this study cannot be used by external personnel.
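The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `filter_by_log_los` is hypothetical, the mean and standard deviation are computed after dropping stays shorter than one day, and the sample standard deviation is used (the text does not say which form was applied).

```python
import math
import statistics


def filter_by_log_los(los_values, n_sd=3.0):
    """Keep stays with LoS >= 1 whose log(LoS) lies within mean +/- n_sd * sd."""
    # Step 1: drop stays shorter than one day (also avoids log(0)).
    valid = [los for los in los_values if los >= 1]
    logs = [math.log(los) for los in valid]
    mean_log = statistics.mean(logs)
    # Assumption: sample standard deviation of log(LoS).
    sd_log = statistics.stdev(logs)
    lower = mean_log - n_sd * sd_log
    upper = mean_log + n_sd * sd_log
    # Step 2: keep only stays inside the three-sigma band on the log scale.
    return [los for los, lg in zip(valid, logs) if lower <= lg <= upper]
```

Working on the log scale makes the rule less sensitive to the long right tail of the length-of-stay distribution than a three-sigma rule on raw days would be.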
Daily concentrations of six air pollutants (i.e., PM10, PM2.5, SO2, NO2 (24 h, ug/m3), CO (24 h, mg/m3), and O3 (8 h, ug/m3)) in the studied area in 2014 were collected from the China National Environment Monitoring Centre [22]. No missing values were found. Considering the lag effect of air pollution, we calculated the short-term moving average concentrations of PM10, PM2.5, SO2, NO2, CO, and O3. The short-term window runs from the admission day back to the 6 preceding days (i.e., lag00, lag01, lag02, lag03, lag04, lag05, and lag06). For example, the concentration of PM10 with lag00 is the daily concentration of PM10 on the admission day, and the concentration of PM10 with lag01 is the average concentration of PM10 over the admission day and the day before.
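As a minimal sketch of the lag definition above, the following Python function computes the moving averages lag00 to lag06 for one admission day. The function name `lagged_moving_average` and the choice to return `None` when too little history is available are assumptions made for illustration only.

```python
def lagged_moving_average(daily, admission_index, max_lag=6):
    """Return {"lag00": ..., ..., "lag06": ...} for one admission day.

    lag0k is the mean concentration over the admission day and the k
    preceding days; `daily` is a list of daily concentrations in date order.
    """
    result = {}
    for k in range(max_lag + 1):
        start = admission_index - k
        key = f"lag{k:02d}"
        if start < 0:
            # Assumption: not enough history before the admission day.
            result[key] = None
            continue
        window = daily[start:admission_index + 1]
        result[key] = sum(window) / len(window)
    return result
```

For instance, with daily values 10, 20, ..., 80 and admission on the last day, lag00 is 80, lag01 is 75, and lag06 is 50.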
Definition of heterogenous patient status
Patient heterogeneity causes the diverse distribution of length of stay [7]. The objective of defining patient status is not only to represent patient heterogeneity, but also to measure the similarity between any two patients. We define X as the whole independent feature vector of a patient, which is related only to that patient (e.g., demographics [3], disease complexity [15], medical quality [7], and treatment [18]). X determines the normal length of stay. In other words, homogeneous patients with the same status X have the same normal length of stay. Although X is usually unknown, we can partially observe it in reality. In this study, we defined the patient status S (a subset of X) for asthma patients from four aspects: physiological factors (i.e., age and gender), disease complexity (i.e., the number of comorbidities), medical quality (i.e., insurance type and hospital level), and medication history (i.e., whether a patient had taken each of the six medications before he/she was admitted to hospital). Thus, the patient status for patient i can be represented as Si = (age, gender, the number of comorbidities, insurance type, hospital level, M1, M2, M3, M4, M5, M6). We hypothesize that the elements of the patient status do not change in the short term.
Methods
The differential fixed-effect linear model
We built the differential fixed-effect linear model to estimate the coefficient of air pollution. Before modeling, we first clarify that the length of stay of each patient consists of a main part caused by the whole patient status X and an additional part due to air pollution.
When there is no air pollution attack, the length of stay for homogeneous patients is determined by patient status X. It can be formulated as:
$$LoS = \alpha_0 + \gamma X + \varepsilon_0 \tag{1}$$
where $LoS$ is the length of stay determined by patient status. $\alpha_0$ is an intercept. $X$ is a vector of patient status. $\gamma$ is a row vector of the coefficients of $X$. $\varepsilon_0$ is a random error that is independently and identically distributed (i.e., i.i.d.) as $N(0, \sigma_0^2)$.
In fact, when air pollution attacks, the length of stay for homogeneous patients is influenced by both X and air pollution. Therefore, based on formula (1), the real length of stay can be represented as:
$$LoS^{real} = \alpha_1 + \gamma X + \beta Z + \varepsilon_1 \tag{2}$$
where $LoS^{real}$ is the real length of stay determined by both patient status X and air pollution. $\alpha_1$ is an intercept. $Z$ is one of the six air pollutants. $\varepsilon_1$ is a random error that is independently and identically distributed as $N(0, \sigma_1^2)$. $\gamma$ and $\beta$ are the coefficients of $X$ and $Z$, respectively.
Subtracting formula (1) from formula (2) gives the standard excessive length of stay caused by air pollution. In this way, the differential model not only makes it easier to observe the excessive length of hospital stay due to air pollution, but also reduces the interference caused by other exogenous confounders. If the residuals of the model follow a normal distribution with a mean value of zero, the model structure is reasonable. Otherwise, the model needs to introduce other confounding factors. Thus, the excessive length of stay can be written as formula (3):
$$\Delta LoS = LoS^{real} - LoS = (\alpha_1 - \alpha_0) + \beta Z + (\varepsilon_1 - \varepsilon_0) \tag{3}$$
When fitting the model of excessive length of stay (i.e., ΔLoS) and air pollution (i.e., Z) on our data, we face the challenge that the ideal standard of LoS in formula (1) can hardly be found in reality. In other words, it is almost impossible to find two patients who have the same status X when admitted to hospital. Nor is it guaranteed that one of them has been attacked by air pollution while the other has not. To deal with this challenge, we searched for the most similar k-patient community for patient i (i ∈ {1, 2, …, 6563}) in our data. We replaced the ideal kernel with an approximate kernel proxy (AP), that is, the patient nearest to the ideal kernel of the k-patient community.
The KNN-embedded approximate kernel proxy algorithm
The K-Nearest Neighbor algorithm has been widely employed to measure the similarity between an index patient and a group of other patients through distance-based similarity metrics [18]. In our study, we standardize each feature s (s ∈ S) by $s' = (s - \bar{s}) / sd(s)$, where $\bar{s}$ and $sd(s)$ are the average and standard deviation of feature s, respectively, and $s'$ is the standardized s. In the process of searching for the most similar k patients for patient i, we utilized the Euclidean distance to measure the similarity between any two patients with status S. Then, the status vector of the ideal kernel in the most similar (k + 1)-patient community of patient i can be determined by $S_{kernel} = \frac{1}{k+1} \sum_{i=1}^{k+1} S_i$, where $S_{kernel}$ is the status vector of the ideal kernel, $S_i$ is the status vector of the ith patient in the (k + 1)-patient community, and (k + 1) is the total number of patients in this community. The approximate kernel proxy (AP) of the patient community, which is the patient nearest to the kernel, is then represented as:
$$AP = \arg\min_{i} \lVert S_i - S_{kernel} \rVert_2 \tag{4}$$
where Si is a status vector of patient i, i ∈ {1, 2, …, k} in the k-patient community. Skernel is a status vector of the ideal kernel.
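The search described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: the function names are hypothetical, the population standard deviation is assumed for standardization, the community is taken to be the index patient plus its k nearest neighbours, and the proxy is chosen among the k neighbours only (letting the index patient be its own proxy would be an alternative reading).

```python
import math


def standardize(rows):
    """Column-wise z-score: s' = (s - mean(s)) / sd(s)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Assumption: population standard deviation; constant columns map to 0.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    return [[(v - m) / sd if sd > 0 else 0.0 for v, m, sd in zip(r, means, sds)]
            for r in rows]


def euclidean(a, b):
    """Euclidean distance between two status vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def approximate_kernel_proxy(status, index, k):
    """Return the row index of the approximate kernel proxy of patient `index`."""
    others = [j for j in range(len(status)) if j != index]
    # The k patients most similar to the index patient.
    neighbours = sorted(others, key=lambda j: euclidean(status[index], status[j]))[:k]
    community = [index] + neighbours  # the (k + 1)-patient community
    dim = len(status[index])
    # Ideal kernel: mean status vector of the community.
    kernel = [sum(status[j][d] for j in community) / len(community) for d in range(dim)]
    # Proxy: the neighbour whose status is closest to the ideal kernel.
    return min(neighbours, key=lambda j: euclidean(status[j], kernel))
```

In practice the status matrix would be passed through `standardize` first, so that features on large scales such as age do not dominate the distance.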
Then, the ideal length of stay for the homogeneous patient with status X can be replaced by that of the approximate kernel proxy. Thus, formula (1) can be replaced by:
$$LoS^{AP} = \alpha_0 + \gamma (X + \Delta X) + \beta (Z - \Delta Z^{(AP)}) + \varepsilon_0 \tag{5}$$
where $LoS^{AP}$ is the length of stay of the approximate kernel proxy. $\Delta X$ is a vector of the deviation between the approximate kernel proxy and the ideal kernel. $\Delta Z^{(AP)}$ is the concentration difference between the air pollutant attacking the index patient and that attacking his/her approximate kernel proxy.
Subtracting formula (5) from formula (2), where $LoS^{real}$ denotes the real length of stay of the index patient, the excessive length of stay can be rewritten as formula (6):
$$\Delta LoS = LoS^{real} - LoS^{AP} = (\alpha_1 - \alpha_0) - \gamma \Delta X + \beta \Delta Z^{(AP)} + (\varepsilon_1 - \varepsilon_0) \tag{6}$$
When $\Delta X \to 0$ (in other words, the approximate kernel proxy gradually approaches the ideal kernel), the term $\gamma \Delta X$ vanishes and $\Delta LoS \to (\alpha_1 - \alpha_0) + \beta \Delta Z^{(AP)} + (\varepsilon_1 - \varepsilon_0)$. In this study, based on formula (6), we estimated $\hat{\beta}$ (with 95%CI) to represent the effect of air pollution on the excessive length of stay. We also checked the distribution of the residuals. The residuals shown in S.A. Fig. 1 are independently and identically normally distributed with a mean value of zero, which means that the model structure is reasonable and the interference caused by other confounders can be ignored. We examined the robustness of the coefficient $\hat{\beta}$ when the number of nearest patients (i.e., k) was changed across four scenarios: the top 1%, top 5%, top 10%, and top 15% of the total number of asthma patients.
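To make the estimation step concrete, the sketch below fits a slope for the excessive length of stay on the pollutant concentration difference by ordinary least squares and returns a 95% confidence interval together with the residuals for checking. It is a simplified stand-in under stated assumptions: a single regressor with an intercept, a large-sample normal interval (multiplier 1.96), and a hypothetical function name `fit_beta`; the model actually fitted in the study may include further terms.

```python
import math


def fit_beta(delta_z, delta_los):
    """Ordinary least squares slope of delta_los on delta_z with a 95% CI."""
    n = len(delta_z)
    mean_z = sum(delta_z) / n
    mean_y = sum(delta_los) / n
    sxx = sum((z - mean_z) ** 2 for z in delta_z)
    sxy = sum((z - mean_z) * (y - mean_y) for z, y in zip(delta_z, delta_los))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_z
    # Residuals are returned so that their distribution can be inspected.
    residuals = [y - (intercept + beta * z) for z, y in zip(delta_z, delta_los)]
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
    se = math.sqrt(sigma2 / sxx)
    # Assumption: normal approximation for the interval.
    return beta, (beta - 1.96 * se, beta + 1.96 * se), residuals
```

Repeating the fit for each lag (lag00 to lag06) and each community size would give the kind of comparison across exposure windows that the robustness analysis describes.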
Fig. 1.
The distribution of length of stay (days)
Outcomes evaluation
The key outcomes of patient admission are length of stay and financial burden. These two indicators largely attract attentions of hospital practitioners and patients. In this study, we defined average excessive length of stay and average excessive expenditure in order to evaluate the impact of air pollutants on length of stay and financial burden.
Average excessive length of stay: This indicator represents the average of the excessive lengths of stay of all patients. Based on formula (6), we fitted patient i’s excessive length of stay $\Delta LoS_{ij}$ caused by air pollutant j. Thus, the average excessive length of stay (i.e., $\overline{\Delta LoS_j}$) attributed to air pollutant j is calculated by formula (7), where N is the number of patients:
$$\overline{\Delta LoS_j} = \frac{1}{N} \sum_{i=1}^{N} \Delta LoS_{ij} \tag{7}$$
Average excessive expenditure: This indicator represents the average expenditure over all patients during the excessive length of stay $\Delta LoS_{ij}$ caused by air pollutant j. We calculated the daily expenditure (i.e., $c_i$) during the length of stay of patient i by $c_i = TE_i / LoS_i$, where $LoS_i$ is the whole length of stay of patient i and $TE_i$ is the total expenditure over the whole length of stay of patient i. Then, the average excessive expenditure (i.e., $\overline{\Delta E_j}$) attributed to air pollutant j is calculated by formula (8):
$$\overline{\Delta E_j} = \frac{1}{N} \sum_{i=1}^{N} c_i \, \Delta LoS_{ij} \tag{8}$$
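The two indicators can be computed as in the following sketch. It assumes that the excessive expenditure of a patient is the daily expenditure ci = TEi/LoSi multiplied by that patient's fitted excessive length of stay, which is one reading of the definitions above; the function name `average_excess_outcomes` is hypothetical.

```python
def average_excess_outcomes(delta_los, total_expenditure, los):
    """Return (average excessive LoS, average excessive expenditure).

    delta_los: fitted excessive length of stay per patient for one pollutant
    total_expenditure: total expenditure TE_i per patient
    los: whole length of stay LoS_i per patient
    """
    n = len(delta_los)
    avg_excess_los = sum(delta_los) / n
    # Daily expenditure c_i = TE_i / LoS_i.
    daily_cost = [te / stay for te, stay in zip(total_expenditure, los)]
    # Assumption: excessive expenditure of patient i is c_i times delta_los_i.
    avg_excess_exp = sum(c * d for c, d in zip(daily_cost, delta_los)) / n
    return avg_excess_los, avg_excess_exp
```

For two patients with excessive stays of 1.0 and 0.5 days and a daily expenditure of 100 each, the averages are 0.75 days and 75 currency units.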
Results
Summary of data description
In our study, 6563 medical insurance records of asthma patients from January 1 to December 31, 2014 were analyzed. Fig. 1 shows the distribution of length of stay. In Fig. 1(a), the minimum and average length of stay of patients admitted to hospital fluctuate slightly around 3 and 9 days, respectively. By comparison, the maximum length of stay fluctuates more than the other two. Furthermore, we utilized Spearman’s rank correlation to test whether the maximum, minimum, and average length of stay were correlated with time. The estimated correlation coefficients of the maximum, minimum, and average length of stay with time were −0.0072 (P = 0.896), −0.0348 (P = 0.526), and −0.0190 (P = 0.475), respectively. The absence of a statistically significant correlation indicated that the length of stay did not have a time trend. In Fig. 1(b), the probability density of the total length of stay shows a right-skewed distribution with a long tail.
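The trend test used above can be illustrated with a small standard-library implementation of Spearman's rank correlation. This sketch computes only the coefficient (tied values receive average ranks) and not the P value; the function names are hypothetical, and a statistical package would normally be used for the significance test.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos + 1 .. end + 1
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)
```

Here `x` would be the day index within the year and `y` the daily maximum, minimum, or average length of stay; a coefficient near zero corresponds to the absence of a time trend.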
Fig. 2 shows the distribution of six air pollutants (i.e., PM10, PM2.5, SO2, NO2, CO, and O3) over time. The fluctuations in the concentrations of PM10 and PM2.5 are larger than those of the others. The daily concentrations of PM10, PM2.5, SO2, NO2, and CO show a similar U-shaped seasonal trend. In general, the daily concentrations of these five air pollutants in spring and winter are higher than those in summer and autumn. However, the trend of the daily concentration of O3 is the opposite.
Fig. 2.
The distributions of the daily concentrations of six air pollutants (PM10, PM2.5, SO2, NO2 (24 h, ug/m3), CO (24 h, mg/m3), and O3 (8 h, ug/m3)) in the studied city in 2014
Fig. 3 visualizes the directed network of each index patient and his/her approximate kernel proxy. In Fig. 3(c), it can be seen that the majority of the studied patients have considerable similarity, and there are four patients whose APs are themselves. Based on the edge betweenness algorithm [23], we detected the modules of patients and their APs in Fig. 3(c). In Fig. 3(d), the vertices of the same color belong to the same module. We found a total of 1795 APs in Fig. 3(d).
Fig. 3.
Directed network of the patient community under the top 1% similarity criterion. Note that each vertex is a patient. The head and tail of each edge are the index patient i and his/her AP
The impact of air pollutants on excessive length of stay
Fig. 4 shows the impact of air pollution on excessive length of stay in the most similar (i.e., top 1%) patient community. From lag00 to lag06, the risk effects of PM2.5, PM10, SO2, NO2, and CO on the excessive length of stay gradually increased. Generally, when the exposure to air pollution lasted 5 days (i.e., lag05), the risk of excessive length of stay of asthma patients reached its highest point. The biggest-risk pollutant is CO with a coefficient of 0.9910 (95%CI: 0.6491, 1.3330), followed by SO2 with a coefficient of 0.0182 (95%CI: 0.0048, 0.0315), NO2 with a coefficient of 0.0123 (95%CI: 0.0038, 0.0208), PM2.5 with a coefficient of 0.0053 (95%CI: 0.0030, 0.0076), and PM10 with a coefficient of 0.0029 (95%CI: 0.0012, 0.0046). Note that the CO coefficient is per mg/m3, whereas the other coefficients are per ug/m3. In contrast, no statistically significant relationship between O3 and the excessive length of stay was found.
Fig. 4.
The estimated betas (95%CIs) vary with exposure time from lag00 to lag06 in the top 1% similar patient community
Table 1 summarizes the biggest-risk exposure time of air pollutants in different patient communities. In the robustness analysis of beta, we successively determined the number of patients in the community based on the similarity criteria of top 1%, top 5%, top 10%, and top 15%. We found that the largest-risk exposure time of air pollutants ranged from 4 days (i.e., lag04) to 5 days (i.e., lag05). In the top 1% similar community, the biggest-risk exposure time of all five air pollutants is lag05.
Table 1.
The biggest-risk exposure time of five air pollutants in different communities based on the top 1%, top 5%, top 10%, and top 15% criteria of similarity
Community | PM2.5 | PM10 | SO2 | NO2 | CO |
---|---|---|---|---|---|
Top1% | Lag05 | Lag05 | Lag05 | Lag05 | Lag05 |
Top5% | Lag04 | Lag04 | Lag04 | Lag05 | Lag04 |
Top10% | Lag04 | Lag04 | Lag04 | Lag05 | Lag04 |
Top15% | Lag05 | Lag04 | Lag04 | Lag05 | Lag05 |
Outcomes evaluation
Table 2 summarizes the average excessive length of stay attributed to air pollution in different patient communities. We found that the average excessive length of stay attributed to air pollution increased with the expanding size (i.e., from top 1% to top 15%) of the patient community. The excessive length of stay attributed to CO climbed from 0.8157 days (95%CI: 0.72, 0.9114) to 1.2928 days (95%CI: 1.2034, 1.3822). Similar results were found for PM2.5, PM10, NO2, and SO2. When the criteria of the patient community are the top 5% and top 10%, the average excessive length of stay remains stable. For every community size, CO was the largest-risk air pollutant for prolonging the excessive length of stay. By comparison, in the top 1% community, the average excessive length of stay resulting from SO2 was the smallest.
Table 2.
The average excessive length of stay (days, 95%CI) attributed to air pollution in different patient communities
Community | PM2.5 | PM10 | SO2 | NO2 | CO |
---|---|---|---|---|---|
Top1% | 0.7457 (0.6617,0.8298) | 0.693 (0.6029,0.7831) | 0.6587 (0.5651,0.7523) | 0.6792 (0.5821,0.7763) | 0.8157 (0.72,0.9114) |
Top5% | 0.8807 (0.7986,0.9628) | 0.8207 (0.7331,0.9082) | 0.8504 (0.759,0.9419) | 0.8763 (0.7794,0.9731) | 1.04 (0.9454,1.1345) |
Top10% | 0.8807 (0.7986,0.9628) | 0.8207 (0.7331,0.9082) | 0.8504 (0.759,0.9419) | 0.8763 (0.7794,0.9731) | 1.04 (0.9454,1.1345) |
Top15% | 1.1373 (1.0528,1.2218) | 1.0629 (0.9764,1.1495) | 1.1815 (1.0954,1.2676) | 1.0251 (0.9325,1.1177) | 1.2928 (1.2034,1.3822) |
Table 3 shows the average excessive expenditure attributed to air pollution. The exposure time of air pollutants in Table 3 is in accordance with Table 2. We found that the average excessive expenditure attributed to air pollutants increased with the expanding size (i.e., from top 1% to top 15%) of the patient community. The excessive expenditure attributed to CO climbed from RMB 488.6002 (95%CI: 430.1962, 547.0043) to RMB 802.1244 (95%CI: 745.7239, 858.5249). Similar results were found for PM2.5, PM10, NO2, and SO2. When the criteria of the patient community are the top 5% and top 10%, the average excessive expenditure remains stable. For every community size, CO posed the largest risk for the excessive expenditure. By comparison, in the top 1% community, the average excessive expenditure attributed to SO2 was the smallest.
Table 3.
The average excessive expenditure (RMB, 95%CI) attributed to air pollution in different patient communities
Community | PM2.5 | PM10 | SO2 | NO2 | CO |
---|---|---|---|---|---|
Top1% | 441.8072 (391.503,492.1114) | 410.0773 (356.2174,463.9372) | 392.9208 (336.1661,449.6755) | 404.9288 (347.2387,462.6189) | 488.6002 (430.1962,547.0043) |
Top5% | 535.0343 (484.6916,585.377) | 493.5452 (439.9839,547.1065) | 521.9131 (464.6986,579.1276) | 535.7901 (475.2309,596.3494) | 634.3436 (575.5374,693.1498) |
Top10% | 535.0343 (484.6916,585.377) | 493.5452 (439.9839,547.1065) | 521.9131 (464.6986,579.1276) | 535.7901 (475.2309,596.3494) | 634.3436 (575.5374,693.1498) |
Top15% | 700.8527 (648.2135,753.4919) | 657.971 (603.8032,712.1388) | 747.3186 (691.4942,803.1429) | 641.3504 (581.8374,700.8634) | 802.1244 (745.7239,858.5249) |
Discussion
In this study, we analyzed the impact of air pollution on the excessive length of stay based on 6563 medical insurance records of asthma patients. We found that air pollution has an adverse effect on the excessive length of stay, which is consistent with previous findings [1, 2]. Through literature review and analysis, we discuss our findings from five aspects: (1) contribution of the KNN-embedded estimation approach, (2) the risk of air pollution to the excessive length of hospital stay, (3) implications for the excessive length of stay and financial burden, (4) implications for bed scheduling optimization, and (5) limitations.
Contribution of the KNN-embedded estimation approach
In our work, we embedded the KNN algorithm into the differential fixed-effect linear model to estimate the impact of air pollution on the excessive length of stay. From the perspective of research method, parametric survival analysis [3, 24], multivariate linear regression [15, 16], logistic regression [4], Poisson regression [10, 11], etc. were common methods to model the length of stay with confounding factors (e.g., air pollution, patient’s age and gender, comorbidity, etc.). Although these methods can estimate the coefficients of variables, the goodness of fit and residual checks were usually ignored, which can cause lack of fit and insufficiently precise coefficients. Distinct from previous studies, we focused on the excessive length of stay rather than the total length of stay. In Sect. 2.3, we constructed formula (3) to avoid the lack of fit caused by insufficient explanatory variables. Thus, our model can sufficiently interpret the excessive length of stay attributed to air pollution. In the residual check (S.A. Fig. 1), the i.i.d. residuals support the reasonableness of the model.
Besides, we faced the challenge of how to determine the excessive length of stay among homogeneous patients. In previous studies [3, 13, 17], patient heterogeneity was taken into consideration in two ways. On the one hand, patient features (e.g., age, gender, race, etc.) were incorporated into the regression models as confounding factors to adjust the estimated parameters. On the other hand, these features were treated as stratification factors in the model after discretization. In this study, we determined the excessive length of stay in a distinct way. We defined the patient status S from four aspects (i.e., physiological factors, disease complexity, medical quality, and medication history) and employed the KNN algorithm to search for the most similar community for the index patient. The similarity between any two patients was measured by the Euclidean distance. Then, the excessive length of stay was the difference in length of stay between the index patient and his/her approximate kernel proxy. We described the KNN embedded approximate kernel proxy algorithm in Sect. 2.3.2. In future research, our approach can be further developed and transferred to other diseases (e.g., chronic obstructive pulmonary disease, stroke, etc.) for risk analysis.
The risk of air pollution to the excessive length of hospital stay
Air pollution is a risk factor for the excessive length of stay of asthma patients, which is consistent with [3]. That study showed that the length of stay at pollutant levels 2 (i.e., low SO2 and medium particulates) and 3 (i.e., high or median SO2 and low particulates) was significantly higher than at level 1 (i.e., low SO2 and low particulates). Luo et al. [10] also confirmed that the effects of air pollution on the LoS differed by gender, with significant effects of PM2.5 and NO2 on the LoS in the male group. Roy et al. [11] found that sub-chronic PM2.5 exposure was associated with increased costs for pediatric asthma hospitalizations: a 1-unit (μg/m3) increase in monthly PM2.5 led to a $123 increase in charges (95%CI: $40–249) and a $47 increase in costs (95%CI: $15–93). In our results, the risks of excessive length of stay attributed to PM2.5, PM10, SO2, NO2, and CO peak when the exposure lasts around 4–5 days. Among these five air pollutants, CO poses the greatest risk (0.9910, 95%CI: 0.6491, 1.3330) to the excessive length of stay, followed by SO2 (0.0182, 95%CI: 0.0048, 0.0315), NO2 (0.0123, 95%CI: 0.0038, 0.0208), PM2.5 (0.0053, 95%CI: 0.0030, 0.0076), and PM10 (0.0029, 95%CI: 0.0012, 0.0046).
However, our finding of a negative association (S.A. Fig. 2 to S.A. Fig. 4) between O3 and the excessive length of stay contrasts with that of a previous study [9] in South Texas, U.S., which showed that increased ozone levels were significantly associated with prolonged LoS. One explanation for the discrepancy might be the different health outcomes: we focused on the effect of air pollution on the excessive length of stay, whereas [9] considered the total length of stay. In addition, the associations in [9] were observed only for children, whereas we analyzed asthma patients aged from 0.53 to 96.35 years. The discrepancy may therefore also reflect differences in the study populations.
Implication on the excessive length of stay and financial burden
From the perspective of hospital operations management, the adverse effect of air pollution on the excessive length of stay can serve as one basis for improving the management strategy of medical institutions. It also provides evidence for improving the healthcare management of asthma patients. For example, Luo et al. [10] measured the effects of air pollutants on the LoS of asthma patients and offered policy makers quantitative evidence to support policies for healthcare resource management and ambient air pollutant control. In our results, with a 1 mg/m3 increase in the average concentration of CO over a five-day exposure, the average excessive length of stay of asthma patients increases by 0.8157 days (95%CI: 0.72, 0.9114). Given the average length of stay of asthma patients (approximately 9 days), eliminating this ELoS for about 11 asthma patients would free enough bed-days to admit one more patient. For hospital practitioners, the ELoS seriously reduces the turnover rate of scarce bed resources and the throughput of asthma patients, and therefore the efficiency of hospital service operations.
In addition, reducing the excessive length of stay due to air pollution will help decrease the financial burden on patients. A previous study [1] suggested that the United States could save $15 million annually in healthcare costs from hospitalizations of children with bronchiolitis living in urban areas if the levels of fine particulate matter were reduced to 7% below the annual standard. In our study, when the average concentration of PM2.5 in the 5 days before admission increases by 1 μg/m3, the average excessive hospitalization expenditure of asthma patients increases by RMB 488.6002 (95%CI: 430.1962, 547.0043). Although air pollution places a serious burden on hospitals and patients, its adverse effects can be controlled and prevented. For example, asthma patients can limit their exposure to polluted air through reasonable care measures outside the hospital, which reduces the adverse consequences of air pollution. Roy et al. [11] also concluded that policy changes to reduce outdoor sub-chronic pollutant exposure may lead to improved asthma outcomes as well as substantial savings in healthcare spending.
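The figure of about 11 patients follows from dividing the average stay by the per-patient ELoS. The short calculation below reproduces it from the two numbers quoted in the text; the variable and function names are illustrative only.

```python
# Figures quoted in the text.
mean_los_days = 9.0        # approximate average length of stay of asthma patients
elos_per_patient = 0.8157  # extra days per patient per 1 mg/m3 of 5-day average CO

def patients_per_freed_stay(mean_los, elos):
    """Number of patients whose avoided ELoS adds up to one average stay."""
    return mean_los / elos

ratio = patients_per_freed_stay(mean_los_days, elos_per_patient)
print(f"Avoiding the ELoS of {ratio:.2f} patients frees one average stay")
```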
Implication on bed scheduling optimization
Air pollution increases the uncertainty in the length of stay of asthma patients and complicates admission decision-making and bed scheduling optimization. In demand forecasting and scheduling problems for hospital beds, demand uncertainty is one of the common sources of uncertainty. Because air pollution causes the excessive length of stay to vary, the number of beds released per day is also uncertain, so bed scheduling decisions face uncertainty in both supply and demand. In general, optimization models handle the uncertainty of length of stay by using the mean or expected value, or by random sampling from an empirical or hypothetical distribution of length of stay [25] (e.g., exponential, lognormal, Weibull, Poisson, and negative binomial distributions) in a given sample [26]. But such methods describe the length of stay insufficiently because its distribution is highly skewed and varies in shape [27]. Consequently, it seems reasonable to think that the distribution of length of stay is generated by the composition of multiple stationary distributions.
As far as we know, few studies deal with this uncertainty from the perspective of the composition of length of stay. For asthma patients, the total length of stay is mainly composed of the normal length of stay determined by the patient’s status [7] (e.g., physiological factors, disease complexity, medical quality, medication history) and the excessive length of stay attributed to air pollution. Patients with more similar status will have more similar normal lengths of stay. Under exposure to different levels of air pollution, the excessive length of stay increases the uncertainty in the total length of stay. For specific diseases related to air pollution (e.g., asthma [10], bronchiolitis [1], type 2 diabetes [6], cardiovascular diseases [28]), it is possible to improve the prediction accuracy of patients’ length of stay by simulating different air pollution exposure scenarios. Furthermore, it is useful to establish air pollution scenarios and multi-stage stochastic programming models for the scheduling optimization of hospital bed resources. Therefore, combining air pollution prediction and stochastic programming models in demand prediction and hospital bed management is promising.
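The decomposition above (a status-driven normal length of stay plus a pollution-driven excess) lends itself to scenario simulation. The sketch below is purely illustrative: the lognormal shape and its parameters for the normal length of stay, the CO scenario levels, and the purely additive, deterministic excess term are assumptions; only the coefficient 0.8157 days per 1 mg/m3 of CO is taken from the text.

```python
import numpy as np

CO_COEFFICIENT = 0.8157  # days of ELoS per 1 mg/m3 of 5-day average CO (from the text)

def simulate_total_los(co_levels, n_patients=10_000, seed=0):
    """Monte Carlo mean and 90th percentile of total LoS for each CO scenario."""
    rng = np.random.default_rng(seed)
    # Hypothetical normal LoS: lognormal with a median of 9 days (illustrative only).
    normal_los = rng.lognormal(mean=np.log(9.0), sigma=0.4, size=n_patients)
    summary = {}
    for name, co in co_levels.items():
        # Total LoS = status-driven part + pollution-driven part.
        total = normal_los + CO_COEFFICIENT * co
        summary[name] = (total.mean(), np.percentile(total, 90))
    return summary

# Hypothetical CO scenarios in mg/m3.
scenarios = {"low": 0.5, "medium": 1.0, "high": 2.0}
for name, (mean_los, p90) in simulate_total_los(scenarios).items():
    print(f"{name}: mean {mean_los:.2f} days, 90th percentile {p90:.2f} days")
```

In a stochastic programming setting, each scenario would additionally carry a probability, for instance from an air pollution forecast, and the sampled lengths of stay would feed the bed-release constraints of the scheduling model.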
Limitations
Our research has some limitations. Firstly, our research data are limited: neither the sample size nor the study area is very large. Moreover, we used the daily concentrations of air pollutants to represent individual exposure to air pollution, which introduces exposure error. Secondly, we replaced the ideal kernel with the approximate kernel proxy, which introduces estimation error in the coefficients. Finally, although different air pollutants are correlated, we did not consider their interactive effects. These interactions can be taken into consideration in future work.
Conclusion
In summary, we proposed a KNN embedded approach that takes into account the heterogeneous patient status and effectively estimates the impact of air pollution on the ELoS. To deal with the challenge of determining the excessive length of stay among homogeneous patients, we defined the patient status and measured the similarity between patients by the Euclidean distance, replacing the ideal kernel with the approximate kernel proxy in the homogeneous patient community.
In the risk analysis and outcome evaluation, we found that exposure to five air pollutants (i.e., PM2.5, PM10, SO2, NO2, and CO) for around 4–5 days increases the excessive length of hospital stay of asthma patients and brings an additional financial burden to patients. CO is the dominant risk factor for the excessive length of stay.
For future research, air pollution brings both difficulties and opportunities for asthma patients’ admission decision-making and bed scheduling optimization. From the perspective of hospital operations management, controlling the adverse effect of air pollution on the excessive length of stay can become a basis for improving the management strategy of hospital beds. Combining air pollution prediction and stochastic programming models in demand prediction and hospital bed management is promising.
Supplementary Information
(PDF 371 kb)
Acknowledgements
The authors would like to thank the cooperating institutes for providing the research data, and the editors and anonymous reviewers for their insightful and constructive comments.
Funding
This research was supported in part by the National Natural Science Foundation of China (Grant No. 71532007, Grant No. 71131006, Grant No. 71172197, and Grant No. 72042007) and the Scientific Research Project of the Health Commission of Sichuan Province (Grant No. 19PJ248).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Sheffield P, Roy A, Wong K, Trasande L. Fine particulate matter pollution linked to respiratory illness in infants and increased hospital costs. Health Aff. 2011;30(5):871–878. doi: 10.1377/hlthaff.2010.1279.
2. Pascal M, Corso M, et al. Assessing the public health impacts of urban air pollution in 25 European cities: results of the Aphekom project. Sci Total Environ. 2015;449:390–400. doi: 10.1016/j.scitotenv.2013.01.077.
3. Pérez-Hoyos S, Ballester F, Tenías JM, Rivera AML. Length of stay in a hospital emergency room due to asthma and chronic obstructive pulmonary disease: implications for air pollution studies. Eur J Epidemiol. 2000;16(5):455–463. doi: 10.1023/A:1007631609827.
4. Nhung NTT, Schindler C, Dien TM, Probst-Hensch N, Kuenzli N. Association of ambient air pollution with lengths of hospital stay for Hanoi children with acute lower-respiratory infection, 2007–2016. Environ Pollut. 2019;247(APR.):752–762. doi: 10.1016/j.envpol.2019.01.115.
5. Cakmak S, Burnett RT, Krewski D. Methods for detecting and estimating population threshold concentrations for air pollution-related mortality with exposure measurement error. Risk Anal. 1999;9(3):487–96.
6. Xiang L, Kai T, Xu-Rui J, Ying X, Jing X, Li-Li Y, et al. Short-term air pollution exposure is associated with hospital length of stay and hospitalization costs among inpatients with type 2 diabetes: a hospital-based study. J Toxicol Environ Health Part A. 2018;81(17):819–29.
7. Harrison GW, Escobar GJ. Length of stay and imminent discharge probability distributions from multistage models: variation by diagnosis, severity of illness, and hospital. Health Care Manag Sci. 2010;13(3):268–279. doi: 10.1007/s10729-010-9128-5.
8. Carpenter BH, Chromy JR, Bach WD, LeSourd DA, Gillette DG. Health costs of air pollution: a study of hospitalization costs. Am J Publ Health. 1979;69(12):1232–1241. doi: 10.2105/AJPH.69.12.1232.
9. Baek J, Kash BA, Xu X, Benden M, Carrillo G. Association between ambient air pollution and hospital length of stay among children with asthma in South Texas. Int J Environ Res Publ Health. 2020;17(11):3812. doi: 10.3390/ijerph17113812.
10. Luo L, Ren J, Zhang F, Zhang W, Li C, Qiu Z, et al. The effects of air pollution on length of hospital stay for adult patients with asthma. Int J Health Plan Manag. 2018;33(3):e751–67.
11. Roy A, Sheffield P, Wong K, Trasande L. The effects of outdoor air pollutants on the costs of pediatric asthma hospitalizations in the United States, 1999 to 2007. Med Care. 2011;49(9):810–817. doi: 10.1097/MLR.0b013e31820fbd9b.
12. Soyiri IN, Reidpath DD, Sarran C. Asthma length of stay in hospitals in London 2001–2006: demographic, diagnostic and temporal factors. PLoS ONE. 2011;6:e27184. doi: 10.1371/journal.pone.0027184.
13. Orchard C. Comparing healthcare outcomes. BMJ. 1994;308:1493. doi: 10.1136/bmj.308.6942.1493.
14. Editorial. Hospital admission through the emergency department. JAMA. 1991;266:2274.
15. Roe CJ, Dodich N, Kulinskaya E, Adam WR. Comorbidities and prediction of length of hospital stay. Aust N Z J Med. 1998;28(6):811–815. doi: 10.1111/j.1445-5994.1998.tb01559.x.
16. Chang KC, Tseng MC, Weng HH, Lin YH, Liou CW, Tan TY. Prediction of length of stay of first-ever ischemic stroke. Stroke. 2002;33(11):2670–2674. doi: 10.1161/01.STR.0000034396.68980.39.
17. Shanley LA, Lin H, Flores G. Factors associated with length of stay for pediatric asthma hospitalizations. J Asthma. 2015;52(5):471–7.
18. Sharafoddini A, Dubin JA, Lee J. Patient similarity in prediction models based on health data: a scoping review. JMIR Med Inform. 2017;5(1):e7. doi: 10.2196/medinform.6730.
19. Boening A, Burger H. Visual assessment of the similarity between a patient and trial population. Appl Clin Inform. 2016;07(02):477–488. doi: 10.4338/ACI-2015-12-RA-0178.
20. David G, Bernstein L, Coifman RR. Generating evidence based interpretation of hematology screens via anomaly characterization. Open Clin Chem J. 2011;411(1):10–16. doi: 10.2174/1874241601104010010.
21. Panahiazar M, Taslimitehrani V, Pereira NL, Pathak J. Using EHRs for heart failure therapy recommendation using multidimensional patient similarity analytics. Stud Health Technol Inform. 2015;210:369–373.
22. China National Environment Monitoring Centre. Daily average concentration of air pollutants in cities across the country. http://www.cnemc.cn/. Accessed 25 June 2020.
23. Newman M, Girvan M. Finding and evaluating community structure in networks. Phys Rev E. 2004;69:026113. doi: 10.1103/PhysRevE.69.026113.
24. Chen H, Burnett RT, Copes R, Kwong JC, Villeneuve PJ, Goldberg MS, et al. Ambient fine particulate matter and mortality among survivors of myocardial infarction: population-based cohort study. Environ Health Perspect. 2016;124(9):1421. doi: 10.1289/EHP185.
25. Marazzi A, Paccaud F, Beguin RC. Fitting the distributions of length of stay by parametric models. Med Care. 1998;36(6):915–927. doi: 10.1097/00005650-199806000-00014.
26. Shwartz M, Ash A. Measuring model performance when the outcome is continuous. In: Li I, editor. Risk adjustment for measuring health care outcomes. 3. Chicago: Health Administration Press; 2003. pp. 235–248.
27. Littig SJ, Isken MW. Short term hospital occupancy prediction. Health Care Manage Sci. 2007;10(1):47–66. doi: 10.1007/s10729-006-9000-9.
28. Devos S, Cox B, Dhondt S, Nawrot T, Putman K. Cost saving potential in cardiovascular hospital costs due to reduction in air pollution. Sci Total Environ. 2015;527–528:413–419. doi: 10.1016/j.scitotenv.2015.04.104.