Abstract
Background
Patients with prolonged hospitalizations account for 14% of all hospital days in US hospitals. Predicting which medical patients are at risk for prolonged hospitalizations would allow early proactive management to reduce their length of stay.
Methods
Using the National Inpatient Sample, we examined risk factors for prolonged hospitalizations among adults hospitalized on the medicine service in 2014. We defined prolonged hospitalizations as those lasting 21 days or longer. We divided the sample into derivation and validation sets, and used logistic regression to identify significant risk factors in the derivation set, which were validated in the validation set. We used the estimates from the model to derive a risk score for prolonged hospitalizations.
Results
Our sample included 2,997,249 hospitalizations (median age of 66 years, 53.5% female). 1.2% of hospitalizations were 21 days or longer. Patients with prolonged hospitalizations were younger, and had a greater number of chronic diseases. A prolonged hospitalization risk score, derived from the many significant predictors in our model, performed well in discriminating between prolonged and non-prolonged hospitalizations, with c-statistics of 0.80 in both the derivation and validation sets.
Conclusion
Our predictive model using readily available administrative data was able to discriminate between prolonged and non-prolonged hospitalizations in a national sample of medical patients, and performed well on internal validation. If prospectively validated, such a tool could be of use to hospitals and researchers interested in targeting development, testing, and/or deployment of programs to reduce length of stay.
Keywords: hospital medicine, utilization, health services research
Introduction
Prolonged hospitalizations are increasing nationwide. In 2012, patients with hospitalizations over 21 days represented only 2% of hospitalizations, but accounted for approximately 14% of hospital days and cost over $20 billion annually.1 Prolonged hospitalizations strain hospital capacity, particularly as beds per capita continue to decrease.2 In addition, with market consolidation, prolonged hospitalizations disproportionately affect urban academic centers.1
Prolonged hospitalizations have been associated with both clinical and socioeconomic factors, as well as markers of increased severity of illness, such as palliative care consultation, mechanical ventilation, and ICU admission or surgery.3 However, Medicaid insurance and discharge to facilities are also associated with increased length of stay, likely reflecting insurance challenges to finding an adequate discharge plan.4 The multiple factors contributing to prolonged hospitalizations pose a challenge to hospitals trying to control length of stay.
Often, safely discharging patients with prolonged hospitalizations requires a dedicated effort to arrange appropriate discharge disposition and services. One study of patients with prolonged hospitalizations, defined as over 30 days, found that the most common cause of non-clinical delay for patients was coordinating the site of discharge, which affected 56% of delayed patients.5 Discharge site coordination, which included finding an appropriate location for discharge based on insurance and family/patient agreement, caused on average 8.5 days of delay. This suggests that nonclinical causes likely contribute to significant delays. A prediction tool to identify patients at risk for prolonged hospitalization, using administrative data available at the time of admission, could help direct scarce hospital resources to this group of patients early in their stay, to potentially reduce their length of stay and improve hospital capacity. We aimed to identify independent predictors of prolonged hospitalizations in a national sample of medical patients, and to use this information to derive a risk score to predict prolonged hospitalizations.
Methods
We performed a retrospective study using data from the 2014 National Inpatient Sample (NIS). The NIS is the largest all-payer inpatient care database currently available in the United States.8 The NIS samples from the State Inpatient Databases participating in the Healthcare Cost and Utilization Project (HCUP), a family of healthcare databases sponsored by the Agency for Healthcare Research and Quality (AHRQ). The NIS approximates a 20% stratified sample of discharges from U.S. non-Federal, short-term hospitals, including public hospitals, community hospitals, and academic medical centers. The NIS includes data on primary and secondary discharge diagnoses according to the International Classification of Diseases, 9th revision, Clinical Modification (ICD-9-CM).9 The Institutional Review Board at our institution did not consider the study to be human subjects research, based on the de-identified nature of these publicly available data.
Study Sample
All adult (age ≥18 years), medical hospitalizations, identified using the “service line” variable in the NIS dataset, were eligible for inclusion. We opted to develop the model only in medical patients, since we anticipated that the conditions prompting hospitalization, and, accordingly, the reasons for the prolonged hospitalization would likely differ between surgical and medical patients. We excluded hospitalizations with a primary diagnosis of rehabilitation care (defined using the HCUP Clinical Classifications Software [CCS] categorization scheme, ‘DXCCS1’=254: Rehabilitation care; fitting of prostheses; adjustment of devices) since we felt that such hospitalizations would not be representative of a typical medical hospitalization. We also excluded hospitalizations with missing length of hospitalization, and hospitalizations with other missing predictor variables (sex, transfer status, admission month, and elective/non-elective status). Patients who died during hospitalization were included in the sample.
Study variables
Our primary outcome variable was prolonged hospitalization, defined as any hospitalization with a length of stay of 21 days or more, consistent with prior analyses.1,4 Candidate predictor variables were chosen a priori, based on possible association with prolonged stay and availability in the NIS. They included age (<80 versus ≥80), gender, primary payer, the ten most common primary discharge diagnoses in the sample (classified by the HCUP CCS), season of admission (winter versus not winter), weekend admission, direct admission (versus through the emergency department), transfer from another facility, number of chronic diseases (<4 versus ≥4), and chronic comorbidities (classified by the HCUP Comorbidity Software, with diabetes with and without chronic complications grouped as a single variable). We chose not to include race in our models because we felt that this sociodemographic variable should not be used to inform decisions around utilization and resource allocation stemming from our model.
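The dichotomized predictors and the outcome definition described above can be illustrated with a short sketch. The `Admission` record and its field names are hypothetical and are not the NIS variable names; the winter months (December through February) follow the definition given in Table 3, and the 21-day threshold is exposed as a parameter so that the 14-day and 30-day definitions used in the sensitivity analyses can be substituted.

```python
from dataclasses import dataclass

# Winter is December, January, February (per Table 3).
WINTER_MONTHS = {12, 1, 2}


@dataclass
class Admission:
    # Hypothetical field names chosen for illustration only.
    age: int
    n_chronic: int       # number of chronic diseases
    admit_month: int     # 1-12
    weekend: bool        # admitted on a Saturday or Sunday
    via_ed: bool         # admitted through the emergency department
    transfer_in: bool    # transferred in from another facility
    los_days: int        # length of stay in days


def encode(adm: Admission, threshold_days: int = 21) -> dict:
    """Return the binary predictors and the prolonged-stay outcome."""
    return {
        "age_lt_80": adm.age < 80,
        "chronic_ge_4": adm.n_chronic >= 4,
        "winter": adm.admit_month in WINTER_MONTHS,
        "weekend": adm.weekend,
        "direct_admission": not adm.via_ed,
        "transfer_in": adm.transfer_in,
        # Prolonged means a stay of threshold_days or more.
        "prolonged": adm.los_days >= threshold_days,
    }


example = encode(Admission(age=72, n_chronic=5, admit_month=1,
                           weekend=False, via_ed=True,
                           transfer_in=False, los_days=21))
```

Calling `encode(..., threshold_days=30)` on the same record would classify the 21-day stay as non-prolonged.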
Statistical Analysis
Patient and hospital characteristics are reported as medians with 25th-75th percentiles for continuous variables and proportions for binary or categorical variables. To internally validate our predictive model, consistent with prior analyses,10,11 we randomly divided the sample into derivation (80%) and validation (20%) subsets, and derived the predictive model in the derivation subset. We estimated a logistic regression model for long-stay hospitalization, including all candidate predictor variables as independent variables in the model. We used the area under the receiver operating characteristic (ROC) curve (c-statistic) to assess performance of our final model in the derivation and validation sets.
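As an illustration of the validation approach, the sketch below shows a random 80/20 partition and a direct computation of the c-statistic as the proportion of event/non-event pairs in which the event has the higher predicted risk (ties counted as one half). The analysis itself was run in SAS; the function names here are hypothetical, and the pairwise loop is written for clarity rather than for use on millions of records.

```python
import random


def split_derivation_validation(n_rows, derivation_fraction=0.8, seed=0):
    """Randomly partition row indices 0..n_rows-1 into two disjoint lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_rows * derivation_fraction))
    return indices[:cut], indices[cut:]


def c_statistic(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    scores:   predicted risks (any monotone score works)
    outcomes: 1 for prolonged hospitalization, 0 otherwise
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


derivation_idx, validation_idx = split_derivation_validation(10)
auc = c_statistic([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

In the toy example, three of the four event/non-event pairs are concordant, so `auc` is 0.75.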
We then developed a risk score using our predictive model. We based our point assignment on the model-derived odds ratios, which we standardized by subtracting one, dividing by the smallest value, setting the minimum point value to zero (to avoid negative point values), dividing by 10 (to achieve a narrower range), and rounding to the nearest integer.12 We did not apply discharge weights in any of our analyses, since the sampling strategy employed after the 2012 NIS redesign does not require application of weights when estimating means/medians and rates, as reported in our analysis. A 2-sided type-I error of < 0.05 was used to indicate statistical significance for all comparisons. Data were analyzed with SAS 9.4 (SAS Institute, Inc., Cary, NC).
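The point-assignment steps can be sketched as follows. This is one possible reading of the procedure, not the original SAS code: in particular, taking "the smallest value" to mean the smallest nonzero magnitude of (odds ratio − 1) among the inputs, and applying the shift to zero before the division by 10, are assumptions. The function name is hypothetical; the example odds ratios are taken from Table 2.

```python
def assign_points(odds_ratios):
    """Map {risk factor: odds ratio} to integer points.

    Assumed reading of the steps: subtract one, divide by the smallest
    nonzero magnitude, shift so the minimum is zero, divide by 10, round.
    """
    # Step 1: subtract one, so an odds ratio of 1 maps to 0.
    excess = {name: value - 1.0 for name, value in odds_ratios.items()}

    # Step 2: divide by the smallest value (ASSUMPTION: smallest nonzero
    # absolute value, so the weakest association becomes +/-1).
    magnitudes = [abs(x) for x in excess.values() if x != 0]
    if not magnitudes:
        raise ValueError("all odds ratios equal 1; nothing to scale")
    unit = min(magnitudes)
    scaled = {name: x / unit for name, x in excess.items()}

    # Step 3: shift so the minimum is zero (avoids negative points).
    floor = min(scaled.values())
    shifted = {name: x - floor for name, x in scaled.items()}

    # Steps 4 and 5: divide by 10 to narrow the range, then round.
    return {name: int(round(x / 10.0)) for name, x in shifted.items()}


# Odds ratios from Table 2 (a subset, for illustration).
points = assign_points({
    "Weight loss": 3.39,
    "Fluid and electrolyte disturbances": 2.31,
    "Male": 1.10,
    "Peripheral vascular disease": 0.95,
    "Arrhythmia": 0.40,
})
```

With these five inputs the sketch yields 6, 4, 1, 1, and 0 points, respectively. Because the published odds ratios are rounded and the full set of inputs affects both the divisor and the shift, agreement with every entry in Table 3 should not be expected from this sketch.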
Sensitivity Analyses
Because medical complexity itself can result in prolonged hospitalizations that may be less responsive to system-focused improvements, we reran our primary model in the derivation sample after excluding patients with discharge diagnosis codes for complications of care (‘DXCCS1’ = 237: Device Complications; ‘DXCCS1’ = 238: surgical/medical complications) as well as those who died during their hospital stay, to assure the model estimates were similar in this less medically complex subset of patients. In addition, since the primary diagnosis may not yet be clear at the start of a hospitalization, we also reran our main model in the derivation sample after removing the variables representing primary discharge diagnosis. Along those same lines, we did not include discharge disposition in our main model since this is not likely to be known at the time of admission; however, since facility disposition was much more common among prolonged stay hospitalizations, we reran our main model after including a term for facility disposition (versus non-facility) to obtain the adjusted association between this discharge status and prolonged hospitalization. Finally, to assess sensitivity of our results to our chosen definition of prolonged hospitalization, we reran our main model using a length of stay of 14 and 30 days to define prolonged hospitalization.
Results
There were 7,071,762 hospitalizations in the 2014 NIS dataset. Of these, 5,950,391 were age ≥ 18, and 3,108,712 of these were medical admissions. We excluded 76,653 hospitalizations with a primary diagnosis of rehabilitation care, 142 with missing length of hospitalization, and 34,668 with other missing predictor variables (471 with missing sex, 20,482 with missing transfer status, 4,450 with missing admission month, and 9,265 with missing elective/non-elective status), resulting in a final analytic sample of 2,997,249. This sample was divided into derivation (80%, n=2,397,800) and validation (20%, n=599,449) sets.
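The cohort flow above can be checked arithmetically. The short sketch below uses only the counts reported in this paragraph; the variable names are illustrative.

```python
# Counts as reported in the Results text.
adult_medical = 3_108_712

exclusions = {
    "rehabilitation_primary_dx": 76_653,
    "missing_length_of_stay": 142,
    "missing_other_predictors": 34_668,
}

# Breakdown of the "other missing predictors" exclusion.
missing_breakdown = {
    "sex": 471,
    "transfer_status": 20_482,
    "admission_month": 4_450,
    "elective_status": 9_265,
}

analytic_sample = adult_medical - sum(exclusions.values())

derivation_n = 2_397_800
validation_n = 599_449
derivation_pct = 100 * derivation_n / analytic_sample
```

The exclusions sum to 111,463, leaving the reported analytic sample of 2,997,249, and the derivation and validation counts add up to that total.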
Prolonged Hospitalizations
Overall, 37,062 (1.2%) hospitalizations were characterized as prolonged. The median length of hospitalization in this group was 26.0 days (25th-75th percentile 23–33), compared to 3.0 (2–5) in the non-prolonged group. The derivation and validation sets were similar in all demographic, clinical, and hospitalization characteristics (Appendix Table 1). Table 1 demonstrates characteristics of the derivation set stratified by length of stay. Patients with prolonged hospitalizations were younger, more likely to have Medicaid insurance, and more likely to have four or more chronic diseases. The most common diagnosis for prolonged and non-prolonged hospitalizations was septicemia, but this diagnosis represented a larger percentage of prolonged hospitalizations. Patients with prolonged hospitalizations were less likely to be admitted from the ED, and more likely to be transferred from another facility. They were also more likely to be discharged to a facility or die during their hospitalization.
Table 1.
Characteristics of derivation cohort overall and stratified by prolonged and non-prolonged hospitalizations.
| Characteristics | Derivation (n = 2,397,800) | LOS < 21 days (n = 2,368,140) | LOS ≥ 21 days (n = 29,660) | p-value* |
|---|---|---|---|---|
| Age – % | | | | <0.0001 |
| <50 | 20.5 | 20.5 | 20.0 | |
| 50–59 | 16.9 | 16.9 | 19.8 | |
| 60–69 | 19.6 | 19.5 | 23.3 | |
| 70–79 | 19.7 | 19.6 | 20.3 | |
| 80+ | 23.4 | 23.4 | 16.5 | |
| Female – % | 53.5 | 53.6 | 48.9 | <0.0001 |
| Race – % | | | | <0.0001 |
| White | 66.2 | 66.3 | 58.0 | |
| Black | 15.5 | 15.4 | 21.8 | |
| Hispanic | 8.8 | 8.8 | 9.3 | |
| Asian | 2.1 | 2.0 | 2.9 | |
| Other/missing | 7.5 | 7.5 | 8.0 | |
| Primary payer – % | | | | <0.0001 |
| Medicare | 59.1 | 59.1 | 54.1 | |
| Medicaid | 13.4 | 13.3 | 19.9 | |
| Private | 20.3 | 20.3 | 18.9 | |
| Self-pay | 4.4 | 4.4 | 4.0 | |
| No charge, other, missing | 2.9 | 2.9 | 3.1 | |
| Most common primary diagnosis grouped by AHRQ CCS** - condition (%) | | | | |
| Septicemia | 8.6 | 8.4 | 20.5 | |
| CHF*** | 5.6 | 5.6 | 4.7 | |
| Pneumonia**** | 5.1 | 5.1 | 4.3 | |
| COPD and bronchiectasis | 3.8 | 3.8 | | |
| Cardiac dysrhythmias | 3.5 | 3.6 | | |
| Acute cerebrovascular disease | 3.5 | 3.5 | 3.7 | |
| Acute and unspecified renal failure | 3.2 | 3.2 | 3.2 | |
| Skin and subcutaneous tissue infections | 3.2 | 3.2 | | |
| Urinary tract infections | 3.1 | 3.1 | | |
| Diabetes mellitus with complications | 2.6 | 2.6 | | |
| Respiratory failure | | | 5.5 | |
| Leukemias | | | 4.1 | |
| Complication of device; implant or graft | | | 3.3 | |
| Maintenance chemotherapy or radiotherapy | | | 3.1 | |
| Complications of surgical procedures or medical care | | | 2.0 | |
| Number of chronic diseases – % | | | | <0.0001 |
| <4 | 21.4 | 21.6 | 7.9 | |
| ≥4 | 78.6 | 78.5 | 92.1 | |
| Comorbidities – % | | | | |
| AIDS | 0.2 | 0.2 | 0.5 | <0.0001 |
| Alcohol abuse | 5.2 | 5.1 | 7.8 | <0.0001 |
| Deficiency anemias | 22.2 | 22.0 | 37.3 | <0.0001 |
| Rheumatoid arthritis/collagen vascular diseases | 3.5 | 3.5 | 3.4 | 0.37 |
| Chronic blood loss anemia | 1.1 | 1.1 | 1.7 | <0.0001 |
| Congestive heart failure | 13.5 | 13.4 | 23.8 | <0.0001 |
| Chronic pulmonary disease | 23.7 | 23.7 | 26.3 | <0.0001 |
| Coagulopathy | 6.3 | 6.1 | 19.4 | <0.0001 |
| Depression | 13.3 | 13.2 | 13.4 | 0.53 |
| Diabetes, uncomplicated | 24.3 | 24.3 | 22.6 | <0.0001 |
| Diabetes with chronic complications | 6.6 | 6.6 | 9.5 | <0.0001 |
| Drug abuse | 4.6 | 4.5 | 6.6 | <0.0001 |
| Hypertension | 59.9 | 60.0 | 54.5 | <0.0001 |
| Hypothyroidism | 14.3 | 14.3 | 12.5 | <0.0001 |
| Liver disease | 4.4 | 4.4 | 8.2 | <0.0001 |
| Lymphoma | 1.2 | 1.2 | 2.3 | <0.0001 |
| Fluid and electrolyte disorders | 34.9 | 34.5 | 63.0 | <0.0001 |
| Metastatic cancer | 2.8 | 2.8 | 3.9 | <0.0001 |
| Other neurological disorders | 9.6 | 9.5 | 13.6 | <0.0001 |
| Obesity | 14.3 | 14.3 | 16.2 | <0.0001 |
| Paralysis | 3.0 | 3.0 | 7.6 | <0.0001 |
| Peripheral vascular disorders | 7.3 | 7.3 | 8.0 | <0.0001 |
| Psychoses | 5.8 | 5.7 | 9.3 | <0.0001 |
| Pulmonary circulation disorders | 3.4 | 3.3 | 8.2 | <0.0001 |
| Renal failure | 18.1 | 18.0 | 25.5 | <0.0001 |
| Solid tumor without metastasis | 2.9 | 2.9 | 3.2 | <0.0001 |
| Peptic ulcer disease excluding bleeding | 0.04 | 0.04 | 0.1 | 0.007 |
| Valvular disease | 5.0 | 5.0 | 6.2 | <0.0001 |
| Weight loss | 6.7 | 6.5 | 28.2 | <0.0001 |
| Season of admission – % | | | | <0.0001 |
| Winter | 25.8 | 25.7 | 27.5 | |
| Spring | 25.5 | 25.4 | 24.7 | |
| Summer | 24.5 | 24.6 | 23.9 | |
| Fall | 24.3 | 24.3 | 24.0 | |
| Weekend admission – % | 24.3 | 24.3 | 21.5 | <0.0001 |
| Came through ED – % | 78.2 | 78.3 | 67.6 | <0.0001 |
| Transferred in from another facility – % | 7.8 | 7.6 | 18.7 | <0.0001 |
| Utilization/Outcomes | | | | |
| Length of hospitalization – median days (25th-75th percentile) | 3.0 (2–5) | 3.0 (2–5) | 26.0 (23–33) | <0.0001 |
| Disposition – % | | | | <0.0001 |
| Home | 73.9 | 74.3 | 42.1 | |
| Facility | 18.2 | 17.9 | 42.2 | |
| Left against medical advice | 1.7 | 1.7 | 0.8 | |
| Dead | 3.3 | 3.2 | 11.2 | |
| Other | 2.9 | 2.9 | 3.8 |
* P-value for comparison of prolonged and non-prolonged length of stay, using Wilcoxon and Fisher’s Exact tests for continuous and categorical data, respectively
** Agency for Healthcare Research and Quality’s Clinical Classifications Software1
*** Non-hypertensive
**** Except that caused by tuberculosis or sexually transmitted disease
Adjusted Associations and Predictive Model
Many potential risk factors had strong independent associations with prolonged hospitalization (Table 2), the strongest of which were having four or more chronic diseases, and chronic comorbidities of electrolyte disturbance, weight loss, coagulation disorders and paralysis. The c-statistic for this model was 0.80 in both the derivation and validation sets (see Appendix Figure 1 for the corresponding ROC curves).
Table 2.
Adjusted associations between candidate risk factors and long-stay hospitalizations in the derivation cohort (n=2,397,800).
| Candidate Risk Factor | Estimate | Odds Ratio | 95% CI | |
|---|---|---|---|---|
| Demographics | | | | |
| Age <80 | 0.44 | 1.55 | 1.50, 1.61 | |
| Male | 0.09 | 1.10 | 1.07, 1.12 | |
| ≥4 chronic diseases | 0.43 | 2.35 | 2.25, 2.47 | |
| Primary Payer | | | | |
| Medicare | Reference | Reference | Reference | |
| Medicaid | 0.26 | 1.72 | 1.67, 1.78 | |
| Private | −0.08 | 1.22 | 1.18, 1.26 | |
| Self-pay | 0.05 | 1.39 | 1.31, 1.48 | |
| Other/no charge/missing | 0.05 | 1.39 | 1.30, 1.49 | |
| Primary diagnosis grouped by AHRQ CCS | | | | |
| Sepsis | 0.38 | 1.47 | 1.42, 1.52 | |
| Congestive Heart Failure | 0.06 | 1.06 | 1.00, 1.13 | |
| Pneumonia | −0.13 | 0.88 | 0.83, 0.93 | |
| Chronic obstructive pulmonary disease | −0.75 | 0.47 | 0.43, 0.52 | |
| Arrhythmia | −0.93 | 0.40 | 0.35, 0.45 | |
| Stroke | 0.37 | 1.45 | 1.36, 1.54 | |
| Acute renal failure | −0.52 | 0.59 | 0.55, 0.64 | |
| Skin and soft tissue infections | −0.60 | 0.55 | 0.50, 0.61 | |
| Urinary tract infection | −0.87 | 0.42 | 0.38, 0.47 | |
| Diabetes mellitus | −0.87 | 0.42 | 0.38, 0.46 | |
| Respiratory failure | 0.46 | 1.58 | 1.50, 1.67 | |
| Comorbidities | | | | |
| AIDS | 0.22 | 1.25 | 1.05, 1.49 | |
| Alcohol abuse | −0.10 | 0.91 | 0.87, 0.95 | |
| Deficiency anemias | 0.30 | 1.34 | 1.31, 1.38 | |
| Arthritis | −0.12 | 0.89 | 0.83, 0.95 | |
| Blood loss anemia | 0.25 | 1.28 | 1.17, 1.40 | |
| Congestive heart failure | 0.48 | 1.62 | 1.57, 1.67 | |
| Chronic lung disease | −0.13 | 0.88 | 0.86, 0.91 | |
| Coagulation disorders | 0.73 | 2.07 | 2.01, 2.14 | |
| Depression | −0.07 | 0.93 | 0.90, 0.97 | |
| Diabetes mellitus | −0.10 | 0.91 | 0.88, 0.93 | |
| Drug abuse | 0.12 | 1.13 | 1.07, 1.18 | |
| Chronic hypertension | −0.29 | 0.75 | 0.73, 0.77 | |
| Hypothyroidism | −0.17 | 0.85 | 0.82, 0.88 | |
| Chronic liver disease | 0.04 | 1.04 | 0.99, 1.09 | |
| Lymphoma | 0.34 | 1.41 | 1.30, 1.52 | |
| Fluid and Electrolyte disturbances | 0.84 | 2.31 | 2.26, 2.37 | |
| Metastatic cancer | −0.15 | 0.86 | 0.81, 0.92 | |
| Other neurologic disorders | 0.17 | 1.19 | 1.14, 1.23 | |
| Obesity | 0.08 | 1.09 | 1.05, 1.12 | |
| Paralysis | 0.63 | 1.89 | 1.80, 1.97 | |
| Peripheral vascular disease | −0.05 | 0.95 | 0.91, 0.99 | |
| Psychosis | 0.35 | 1.42 | 1.36, 1.48 | |
| Pulmonary circulatory disorder | 0.48 | 1.61 | 1.54, 1.69 | |
| Chronic renal failure | 0.21 | 1.24 | 1.20, 1.27 | |
| Solid tumor without metastases | −0.15 | 0.86 | 0.81, 0.92 | |
| Peptic ulcer disease without bleeding | 0.37 | 1.45 | 0.94, 2.24 | |
| Valvular disease | −0.12 | 0.89 | 0.84, 0.93 | |
| Weight loss | 1.22 | 3.39 | 3.30, 3.49 | |
| Hospitalization characteristics | | | | |
| Winter | 0.09 | 1.10 | 1.07, 1.13 | |
| Weekday admission | 0.15 | 1.16 | 1.12, 1.19 | |
| Direct admission | 0.43 | 1.54 | 1.50, 1.58 | |
| Admitted as transfer | 0.60 | 1.82 | 1.76, 1.89 | |
Table 3 demonstrates the risk score derived from the significant variables in the model. Risk of prolonged hospitalization increased with increasing risk score in both the derivation and validation sets, with less than 1% of hospitalizations with a risk score of 16 or less having a prolonged hospitalization, and almost 20% of hospitalizations with a risk score of 36 or more having a prolonged hospitalization (Figure 1). Table 4 shows the proportion of hospitalizations falling into each risk group, and the corresponding rates of prolonged hospitalizations at varying risk thresholds.
Table 3.
Clinical Risk Scoring System for Long-Stay Hospitalizations In Medical Patients
| Risk Factor | Points^a |
|---|---|
| Age <80 | 2 |
| Male | 1 |
| ≥4 chronic diseases | 4 |
| Primary payer | |
| Medicare | 0 |
| Medicaid | 3 |
| Private | 2 |
| Self-Pay | 2 |
| Other/no charge/missing | 2 |
| Primary diagnosis^b: | |
| Sepsis | 2 |
| Congestive heart failure | 1 |
| Pneumonia | 1 |
| Stroke | 2 |
| Respiratory failure | 2 |
| Comorbidities^c: | |
| Acquired immune deficiency syndrome | 2 |
| Alcohol abuse | 1 |
| Deficiency anemia | 2 |
| Rheumatoid arthritis/collagen vascular diseases | 1 |
| Chronic blood loss anemia | 2 |
| Congestive heart failure | 2 |
| Chronic pulmonary disease | 1 |
| Coagulopathy | 3 |
| Depression | 1 |
| Diabetes (with or without complications) | 1 |
| Drug abuse | 1 |
| Hypertension | 1 |
| Hypothyroidism | 1 |
| Lymphoma | 2 |
| Fluid and electrolyte disorders | 4 |
| Metastatic cancer | 1 |
| Other neurological disorders | 1 |
| Obesity | 1 |
| Paralysis | 3 |
| Peripheral vascular disorders | 1 |
| Psychoses | 2 |
| Pulmonary circulation disorders | 2 |
| Renal failure | |
| Solid tumor without metastasis | 1 |
| Valvular disease | 1 |
| Weight loss | 6 |
| Hospitalization characteristics | |
| Winter season (December, January, February) | 1 |
| Weekday admission (Monday to Friday) | 1 |
| Not admitted through the Emergency Department | 2 |
| Admitted as a transfer from another healthcare facility | 3 |
a. An individual patient’s Clinical Risk Score is derived by summing the points for each risk factor present on admission.
b. Defined by primary ICD-9-CM discharge diagnosis groupings, based on the Agency for Healthcare Research and Quality’s (AHRQ) Clinical Classifications Software.
c. Defined by the HCUP comorbidity software, which identifies coexisting medical conditions that are not directly related to the principal diagnosis, or the main reason for admission, and are likely to have originated prior to the hospital stay. Comorbidities are identified using ICD-9-CM diagnoses and the Diagnosis Related Group (DRG) in effect on the discharge date.
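Applying the score to an individual hospitalization amounts to summing the points for each risk factor present, as sketched below. The point values are copied from Table 3, but only a subset of factors is included to keep the example short, and the dictionary keys and function name are hypothetical.

```python
# Subset of Table 3 point values (keys are illustrative labels).
POINTS = {
    "age_lt_80": 2,
    "male": 1,
    "chronic_ge_4": 4,
    "payer_medicaid": 3,
    "dx_sepsis": 2,
    "coagulopathy": 3,
    "fluid_electrolyte_disorders": 4,
    "paralysis": 3,
    "weight_loss": 6,
    "winter": 1,
    "weekday_admission": 1,
    "not_via_ed": 2,
    "transfer_in": 3,
}


def risk_score(factors_present):
    """Sum the points for every risk factor present on admission."""
    factors = set(factors_present)
    unknown = factors - POINTS.keys()
    if unknown:
        # Fail loudly rather than silently scoring an unrecognized factor as 0.
        raise KeyError(f"no point value for: {sorted(unknown)}")
    return sum(POINTS[f] for f in factors)


score = risk_score([
    "age_lt_80", "chronic_ge_4", "payer_medicaid", "dx_sepsis",
    "fluid_electrolyte_disorders", "weekday_admission", "transfer_in",
])
```

For this hypothetical patient the score is 2 + 4 + 3 + 2 + 4 + 1 + 3 = 19.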
Figure 1.
Rates of prolonged hospitalization according to number of points in the derivation and validation sets.
Table 4.
Long-Stay Hospitalizations According To Risk Group In The Overall Cohort (n = 2,997,249)
| Risk Score Threshold | Hospitalizations (% of total) | Long-Stay | Percent With Long-Stay |
|---|---|---|---|
| ≥18 | 834,649 (27.85) | 25,032 | 3.00 |
| ≥20 | 518,002 (17.28) | 20,177 | 3.90 |
| ≥22 | 303,124 (10.11) | 15,221 | 5.02 |
| ≥24 | 166,461 (5.55) | 10,562 | 6.35 |
| ≥26 | 85,795 (2.86) | 6,797 | 7.92 |
| ≥28 | 41,167 (1.37) | 4,041 | 9.82 |
| ≥30 | 17,861 (0.60) | 2,118 | 11.86 |
| ≥32 | 7,096 (0.24) | 993 | 13.99 |
| ≥34 | 2,606 (0.09) | 448 | 17.19 |
| ≥36 | 880 (0.03) | 173 | 19.66 |
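The trade-off across thresholds in Table 4 can be made explicit by computing, for each threshold, the share of hospitalizations flagged, the percent of flagged hospitalizations that were prolonged (the positive predictive value), and the share of all prolonged hospitalizations captured (sensitivity). The sketch below uses the counts in Table 4 and the overall total of 37,062 prolonged hospitalizations reported in the Results; the sensitivity figures are derived here and are not reported in the table.

```python
TOTAL_HOSPITALIZATIONS = 2_997_249
TOTAL_PROLONGED = 37_062

# threshold -> (hospitalizations at or above threshold, prolonged among them)
TABLE_4 = {
    18: (834_649, 25_032),
    20: (518_002, 20_177),
    22: (303_124, 15_221),
    24: (166_461, 10_562),
    26: (85_795, 6_797),
    28: (41_167, 4_041),
    30: (17_861, 2_118),
    32: (7_096, 993),
    34: (2_606, 448),
    36: (880, 173),
}


def threshold_metrics(threshold):
    """Return percentages describing one risk-score threshold."""
    flagged, prolonged = TABLE_4[threshold]
    return {
        "percent_flagged": 100 * flagged / TOTAL_HOSPITALIZATIONS,
        "ppv": 100 * prolonged / flagged,
        "sensitivity": 100 * prolonged / TOTAL_PROLONGED,
    }


low = threshold_metrics(18)
high = threshold_metrics(36)
```

At a threshold of 18, about 28% of hospitalizations are flagged and roughly two thirds of prolonged hospitalizations are captured, whereas a threshold of 36 flags very few hospitalizations but nearly one in five of them is prolonged.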
Sensitivity Analyses
As seen in Appendix Tables 2 and 3, the model estimates and c-statistics were almost identical after excluding patients of higher medical complexity (c-statistic 0.80), and after removing variables representing discharge diagnosis from the model (c-statistic 0.79). Disposition to a facility was significantly associated with prolonged hospitalization (odds ratio 2.81, 95% confidence interval 2.74–2.89), and marginally increased the model c-statistic to 0.81 (Appendix Table 4). Finally, using 14 day and 30 day cutoffs to define prolonged hospitalization resulted in models with largely similar point estimates and similar c-statistics (0.78 and 0.81, respectively; Appendix Tables 5 and 6).
Discussion
In this large, national sample of adult medical hospitalizations, we identified many characteristics significantly associated with prolonged hospitalizations, including multiple chronic diseases, weight loss, electrolyte disturbances, coagulation disorders, and paralysis, among others. The risk prediction score we developed discriminated well between prolonged and non-prolonged hospitalizations, with a c-statistic of 0.80 in the validation set, and a greater than 20-fold difference in risk for prolonged hospitalization between patients at lowest and highest risk based on our score. This is the first prediction score for patients at risk for prolonged hospitalizations derived using nationally representative administrative data.
Previous single-center studies also found that younger age, public insurance, and discharge to a post-acute care facility were associated with increased risk of prolonged hospitalizations. In addition, sepsis remained the most common diagnosis among prolonged hospitalizations in our paper, as in prior papers.4,13 Our study adds further to this literature in several ways. First, by identifying these factors using purely administrative data sources, our model should be easier to apply across U.S. hospitals compared to models based on collection of clinical data. Second, using nationally representative data to derive our model should increase generalizability over prior single center analyses. Our study also adds to prior studies with the incorporation of individual comorbidities, including electrolyte disturbances and weight loss, as well as burden of chronic disease at baseline. These findings are consistent with prior research demonstrating that poor nutritional status and increased comorbidity burden are associated with increased length of stay.14
We found that patients accepted in transfer from other hospitals are also at risk of prolonged hospitalization, independent of the other clinical factors in our model. This is consistent with prior studies that have demonstrated increased length of stay among transfer patients, as well as fewer discharges to home, which may also contribute to prolonged length of stay.15–17 Finally, we also found that patients with prolonged hospitalizations were more likely to die during the hospitalization, consistent with prior studies.
All of these markers of increased clinical complexity shed light on the multifactorial drivers that contribute to prolonged length of stay, which extend beyond the socioeconomic factors that have been previously emphasized. Our sensitivity analysis excluding patients who died or experienced complications of care demonstrated nearly identical model performance among the less medically complex population, who may be more likely to experience unplanned and/or preventable delays in hospital discharge.
Prior studies have suggested that patients with prolonged hospitalization suffer from delays related to discharge coordination and placement.5 This is corroborated by our finding on sensitivity analysis that facility discharge was significantly associated with prolonged hospitalization. In addition, unnecessary delays may harm patients' psychological well-being and may increase the risk of hospital-acquired complications. Early identification with a screening tool may eventually allow for deployment of programs to reduce length of stay. The specific interventions deployed would depend on subsequent case review to identify the specific reason(s) for the delay. Potential interventions include an ethics consult, review by a multidisciplinary panel,6 or deployment of a specialized social work or case management team.7 Given the heterogeneity of characteristics associated with prolonged length of stay, as suggested in our research and prior research, different responses are necessary based on the particular risk for delay.
It is important to note, however, that despite the greater than 20-fold increase in risk of prolonged hospitalization between the lowest and highest risk groups, the positive predictive value of our model is low due to the infrequency of prolonged hospitalization. For example, based on our model, only 15 out of 100 patients in the high-risk group had a prolonged hospitalization. Because of this, the greatest utility of our model may lie in efficiently targeting groups for further study, to better understand the more granular contributors to prolonged hospitalization and use this information to identify appropriate intervention targets.
Although our model performed well when applied retrospectively, the performance of the model when applied prospectively will need to be assessed in future studies. Specifically, our model relied on discharge diagnosis codes to define clinical conditions, some of which may not have been apparent at the time of admission. However, even after removing primary discharge diagnosis from our model, the effect estimates for the remaining variables and model performance were essentially unchanged (c-statistic 0.79 versus 0.80). Along the same lines, we did not include discharge disposition in our main predictive model despite a strong association between disposition and prolonged hospitalization, since our goal was to develop a model that could be applied at the time of admission, when disposition status may not be known. Additionally, because the NIS data are de-identified, we could not include diagnosis codes from prior episodes of care in our model. Future studies should assess the performance of our model when applied in real time at hospital admission, using all available prior diagnosis codes.
Nonetheless, this study is a first step toward developing a new tool for hospital use which may be able to identify patients at risk for a prolonged hospitalization. At present, hospitals have experience with risk assessment tools to identify patients at risk for a 30-day readmission,19 but do not currently have a tool to prospectively identify patients at risk for prolonged hospitalization. If prospectively validated, the tool could be used to identify the groups of patients at highest risk for a prolonged hospitalization by incorporating administrative data available early in the course of hospitalization. Hospital administrators or researchers could vary their chosen threshold of risk depending on the cost or availability of whatever risk modification strategies are to be applied to that subset of patients. This will be increasingly useful as additional evidence-based interventions to reduce length of stay for patients with prolonged hospitalizations become available.20
In addition to the limitations described above related to use of discharge diagnoses, there are other limitations to our analysis. Although the administrative nature of the data could increase ease of operationalization at individual hospitals, there are limitations to administrative data as well; specifically, the possibility of diagnostic misclassification and the lack of individualized clinical data that would permit understanding of the details of the hospital stay, including possibly unnecessary hospital days. The absence of individual patient identification in the data set also prevented us from identifying repeated hospitalizations of the same patient in our data set. In addition, the complexity of our model, which arises from the large number of statistically significant variables, limits the application of the model to operationalization through electronic health records (EHRs) and other automated methods, rather than individual clinicians or administrators. Given the current penetration of EHRs, this should not significantly limit the applicability of the tool. Since the sample excluded surgical patients, the model should not be applied to this patient population.
Conclusion
Many factors are associated with prolonged hospitalizations. An administrative score derived from those factors can be used to identify patients at risk for prolonged hospitalizations. If prospectively validated, our score could allow scarce resources to be directed toward facilitating discharge once patients are medically ready.
Supplementary Material
Acknowledgments
Funding: Dr. Herzig was funded by grant number K23AG042459 from the National Institute on Aging. The funding organization had no involvement in any aspect of the study, including design, conduct, and reporting of the study.
Footnotes
Conflicts of interest: There are no conflicts of interest to report.