Abstract
Background:
Using electronic health records (EHRs) for population risk stratification has gained attention in recent years. Compared to insurance claims, EHRs offer novel data types (e.g., vital signs) that can potentially improve population-based predictive models of cost and utilization.
Objective:
To evaluate if EHR-extracted body mass index (BMI) improves the performance of diagnosis-based models to predict concurrent and prospective healthcare costs and utilization.
Methods:
We used claims and EHR data over a 2-year period from a cohort of continuously insured patients (aged 20–64 years) within an integrated health system. We examined the addition of BMI to three diagnosis-based models of increasing comprehensiveness (i.e., demographics, Charlson, and Dx-PM model of the ACG system) to predict concurrent and prospective costs and utilization, and compared the performance of models with and without BMI.
Results:
The study population included 59,849 patients, 57% female, with BMI class-I, II and III comprising 19%, 9% and 6% of the population. Among demographic models, R2 improvement from adding BMI ranged from 61% (i.e., R2 increased from .56 to .90) for prospective pharmacy cost to 29% (1.24 to 1.60) for concurrent medical cost. Adding BMI to demographic models improved the prediction of all binary service-linked outcomes (i.e., hospitalization, emergency department admission, and being in top 5% total costs) with AUC increasing 2% (.602 to .617) to 7% (.516 to .554). Adding BMI to Charlson models only improved total and medical cost predictions prospectively (13% and 15%; 4.23 to 4.79 and 3.30 to 3.79), and also improved predicting all prospective outcomes with AUC increasing 3% (.649 to .668) to 4% (.639 to .665; and, .556 to .576). No improvements in prediction were seen in the most comprehensive model (i.e., Dx-PM).
Discussion:
EHR-extracted BMI levels can be used to enhance predictive models of utilization especially if comprehensive diagnostic data are missing.
Keywords: Body Mass Index (BMI), Risk Stratification, Predictive Modeling, Electronic Health Records, Administrative Claims, The Johns Hopkins ACG System
Introduction
Obesity and its related conditions account for approximately 10% of all medical spending in the United States, an amount estimated at $147 billion annually in 2008 1. Medicare and Medicaid finance a substantial fraction of state-level obesity costs, and this fraction varies by state (range, 25%–64%) 2. Increasing body mass index (BMI) has been associated with greater inpatient, outpatient and emergency department (ED) utilization 3.
Obesity-related conditions increase healthcare costs at both the person and population levels 4. Most recent estimates suggest that nearly 40% of U.S. adults are obese 5, and excess body weight increases the risk of death from cardiovascular disease, diabetes mellitus, kidney disease, and certain cancers 6. Obesity makes some conditions, like hypertension and osteoarthritis, costlier than if these conditions were present without obesity 4. This evidence suggests that BMI may help predict healthcare costs and utilization.
To date, predictive models have typically used diagnosis information derived from health insurance claims to forecast costs and utilization. Health plans and government agencies rely heavily on these models for profiling providers, adjusting payments, underwriting, and prioritizing patients for care management 7, 8. With the widespread implementation of electronic health records (EHR), a new data source is becoming widely accessible that contains not only diagnosis information, but also other measures such as vital signs and laboratory values that are not available in claims 9, 10. In a previous study, we found that adding laboratory markers to diagnosis-based models significantly improved the prediction of costs and inpatient admissions for basic models that account for age, gender, and multiple comorbid conditions 11. The predictive ability of comprehensive diagnosis-based risk models was only marginally improved at the population level, which we suspect reflects the interrelationship between laboratory results and conditions already captured in the comprehensive diagnostic information.
Several studies using EHR data have found that patients who meet BMI criteria for overweight or obesity often do not have a diagnosis code for these conditions 12, 13, 14. Since weight status is not usually captured in diagnosis data, claims-based predictive models might benefit from the addition of BMI to improve population-based forecasts of costs and inpatient hospitalization. Although prior studies have evaluated the value of BMI in predicting cost 15, studies have not incorporated BMI into a diagnosis-based predictive model for healthcare utilization outcomes.
In this study, our objective was to determine whether the addition of BMI markers improves the performance of diagnosis-based predictive models among individuals enrolled in a health plan and actively receiving ambulatory care from an integrated delivery system. We hypothesized that these BMI risk markers would improve predictive models of costs and inpatient admissions, relative to diagnosis-based risk markers obtained from claims and structured EHR data.
Methods
Study Design and Setting
This study is a two-year retrospective cohort study (2012–2013) of data provided by HealthPartners, which is a Minnesota-based health insurer with 1.5 million beneficiaries and an integrated delivery system that provides healthcare at 7 hospitals, 47 primary care clinics, and 22 urgent care centers 16.
Data Source
The database contained (1) structured EHR data that included encounter diagnoses, height, weight and BMI values obtained during outpatient visits within the network; (2) administrative data that included insurance enrollment, benefit eligibility files, and demographics; and, (3) medical and pharmacy claims data that included ICD diagnoses and CPT procedure codes from inpatient and outpatient settings, prescriptions as National Drug Code (NDC), date of services, paid amount by plan, and out-of-pocket amount by individuals. Our predictors, including BMI, were derived from 2012 data, while our cost and utilization outcomes were constructed from data in both 2012 (concurrent) and 2013 (prospective).
Study Population
Our denominator population included 141,087 patients. We excluded 37,245 patients who were not aged 20–64 years, were not continuously enrolled in 2012 and 2013, or did not have at least one visit to one of HealthPartners’ outpatient clinics in 2012 (Figure 1). We excluded individuals who were identified as having bariatric surgery (n=223) or limb amputation (n=48) in 2012, or having cancer (n=5401), pregnancy (n=5646) or congestive heart failure (n=444) in either year, as these conditions or procedures can substantially alter weight. In total, 11,158 individuals were excluded due to these conditions; this is fewer than the sum of the individual counts because some patients had more than one condition. We also excluded people without a valid BMI value (n=32,835; processes described below). Our final analytic sample included 59,849 patients who had at least one outpatient visit in 2012 with a valid BMI record (Figure 1; see online supplemental digital content # 1 for additional information about the excluded patients).
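The cohort attrition described above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts reported in the text; the variable names are illustrative and are not taken from the study's analysis code.

```python
# Cohort attrition counts as reported in the text (variable names are illustrative).
denominator = 141_087
excluded_age_enrollment_visit = 37_245  # age, continuous enrollment, or no 2012 outpatient visit
excluded_conditions = 11_158            # unique patients with any weight-altering condition/procedure
excluded_no_valid_bmi = 32_835

final_sample = (denominator
                - excluded_age_enrollment_visit
                - excluded_conditions
                - excluded_no_valid_bmi)

# Per-condition counts sum to more than the unique total because of overlap.
condition_counts = {"bariatric surgery": 223, "limb amputation": 48,
                    "cancer": 5401, "pregnancy": 5646,
                    "congestive heart failure": 444}
extra_condition_flags = sum(condition_counts.values()) - excluded_conditions

print(final_sample)           # 59849
print(extra_condition_flags)  # 604
```

The second figure counts condition flags beyond one per excluded patient; it is not the number of patients with multiple conditions, which the text does not report.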
Creation of BMI Risk Markers
The creation of BMI risk markers required several steps (Figure 1). First, we identified invalid height and weight values, which can enter the EHR through data entry errors in clinical settings. We defined valid weight as between 25 and 635 kg and valid height as between 0.67 and 2.72 meters, based on reported minimum and maximum weights 17, 18 and heights 19, 20. Second, we determined the BMI value for each visit. We calculated BMI for each visit that had valid weight and height values. For visits that only had a valid weight available, we used the EHR-recorded BMI value, if available, under the assumption that the EHR calculated the BMI using a prior height measure, which is generally stable in adults. For visits that only had a valid height available, we did not use the EHR-recorded BMI value, even if available, because the EHR is likely carrying over a prior weight and weight can change over time. We excluded visits that had neither valid weight nor valid height. Third, we designated a BMI value for every individual. The value needed to be within a defined valid BMI range (9–129 kg/m2) 5 and occur in 2012, leaving 141,305 visits for inclusion (Figure 1). For individuals with multiple BMI values in 2012, we selected the last BMI recorded that year.
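The three steps above can be sketched in code. This is an illustrative sketch only: the `Visit` record, the function names, and the treatment of range bounds as inclusive are assumptions, not details taken from the study's data pipeline.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# Valid ranges as stated in the text; inclusive bounds are an assumption.
WEIGHT_KG: Tuple[float, float] = (25.0, 635.0)
HEIGHT_M: Tuple[float, float] = (0.67, 2.72)
BMI_RANGE: Tuple[float, float] = (9.0, 129.0)


@dataclass
class Visit:  # hypothetical record layout
    visit_date: date
    weight_kg: Optional[float] = None
    height_m: Optional[float] = None
    ehr_bmi: Optional[float] = None  # BMI value stored by the EHR, if any


def _in_range(value: Optional[float], bounds: Tuple[float, float]) -> bool:
    return value is not None and bounds[0] <= value <= bounds[1]


def visit_bmi(v: Visit) -> Optional[float]:
    """Return a usable BMI for one visit, or None if the visit is excluded."""
    has_weight = _in_range(v.weight_kg, WEIGHT_KG)
    has_height = _in_range(v.height_m, HEIGHT_M)
    if has_weight and has_height:
        bmi = v.weight_kg / v.height_m ** 2  # kg/m^2
    elif has_weight:
        bmi = v.ehr_bmi  # trust the EHR value: height is stable in adults
    else:
        return None      # height-only or neither: do not use the EHR value
    return bmi if _in_range(bmi, BMI_RANGE) else None


def patient_bmi(visits: List[Visit], year: int = 2012) -> Optional[float]:
    """Pick the last valid BMI recorded in the index year."""
    valid = [(v.visit_date, visit_bmi(v)) for v in visits
             if v.visit_date.year == year]
    valid = [(d, b) for d, b in valid if b is not None]
    if not valid:
        return None
    return max(valid, key=lambda pair: pair[0])[1]
```

For example, a patient with a March 2012 visit (80 kg, 1.80 m) and a September 2012 visit with weight only and an EHR-recorded BMI of 29.3 would be assigned 29.3, the last valid value of the year.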
We created BMI markers based on CDC’s standard categories 21 (i.e., underweight BMI<18.5 kg/m2; normal 18.5≤BMI<25 kg/m2; overweight 25≤BMI<30 kg/m2; class I obesity 30≤BMI<35 kg/m2; class II obesity 35≤BMI<40 kg/m2; and, class III obesity BMI≥40 kg/m2). After reviewing distribution of groups and model performance, we grouped together underweight, normal and overweight into a single category to have a more parsimonious model. Our final BMI markers used a four-level BMI category (i.e., under/normal/overweight BMI<30 kg/m2; class I obesity 30≤BMI<35 kg/m2; class II obesity 35≤BMI<40 kg/m2; and, class III obesity BMI≥40 kg/m2).
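The final four-level marker can be expressed as a simple threshold function. The function name and the label strings below are illustrative; the cut points are those stated above.

```python
def bmi_marker(bmi: float) -> str:
    """Map a BMI value (kg/m^2) to the four-level category used in the models."""
    if bmi < 30:
        return "under/normal/overweight"
    if bmi < 35:
        return "class I obesity"
    if bmi < 40:
        return "class II obesity"
    return "class III obesity"


print(bmi_marker(24.7))  # under/normal/overweight
print(bmi_marker(35.0))  # class II obesity
```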
Diagnosis-based Population Risk Assessment Models
Similar to the strategy that we used in our prior research to assess the value of EHR data in predicting utilization 9, 11, 22, we created two different diagnosis-based population risk assessment models using 2012 data. We also created a basic model that only included gender and age groups (aged 20–34; 35–44; 45–54; 55–64). The first diagnosis-based model used the Charlson comorbidity index in addition to age and gender 23, 24. The Charlson index has been adopted widely and has been shown to be predictive of healthcare costs 25. The second diagnosis-based model used the Dx-PM score, derived from the Adjusted Clinical Groups® (ACG) System 26, 27, in addition to age and gender. The ACG system provides various measures of an individual’s morbidity using diagnosis and/or pharmacy claims. The Dx-PM score is a comprehensive diagnosis-based predicted score constructed from various ACG morbidity metrics, including age groups, sex, a pregnancy without delivery indicator, hospital dominant markers (factors associated with 50%+ of hospital admissions in the next year), a medically frail indicator, and selected chronic disease markers. The Dx-PM score has been demonstrated to be a valid measure of morbidity 9, 10, 22, 28.
Outcome Measures
Our primary outcomes were the individual’s annual costs and utilization of healthcare services in 2012 (concurrent) and 2013 (prospective). For costs, we calculated three types of annual costs – total, medical, and pharmacy costs. Annual total cost was the sum of paid and out-of-pocket amount derived from both medical and pharmacy claims during a calendar year, while annual medical and pharmacy costs were derived from their respective claims. All costs were truncated at top 0.5%, as cost is highly skewed 9, 22, 28.
We also identified patients in the top 5% total costs among all beneficiaries. For utilization, we identified beneficiaries as having any hospitalization and having any emergency department (ED) visit in each period.
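The cost truncation and the top 5% indicator can be sketched as follows. The text does not state whether truncation caps extreme values or drops them; this sketch assumes capping at the largest value below the top 0.5%, and the function names are hypothetical.

```python
def truncate_top(costs, top_fraction=0.005):
    """Cap the highest `top_fraction` of costs (assumption: cap, not drop)."""
    ordered = sorted(costs)
    n_top = round(len(ordered) * top_fraction)
    if n_top == 0 or n_top >= len(ordered):
        return list(costs)
    cap = ordered[len(ordered) - n_top - 1]  # largest value outside the top group
    return [min(c, cap) for c in costs]


def top_cost_flags(total_costs, top_fraction=0.05):
    """Flag beneficiaries whose total cost falls in the top `top_fraction`."""
    ordered = sorted(total_costs)
    n_top = round(len(ordered) * top_fraction)
    if n_top == 0:
        return [False] * len(ordered)
    threshold = ordered[len(ordered) - n_top]  # smallest value inside the top group
    return [c >= threshold for c in total_costs]
```

With tied costs at the threshold, slightly more than 5% of beneficiaries can be flagged.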
Statistical Analyses
We determined the descriptive characteristics of study subjects by the BMI groups. To test the added value of BMI markers, we added them to each of the three base models (i.e., demographic, Charlson, Dx-PM) to create BMI-enhanced models. For the cost outcomes, we used a linear regression model, which is the standard model adopted in risk adjustment studies and has been shown to produce similar performance relative to more advanced statistical methods 9, 22, 28. We compared changes in R2 and Mean Absolute Prediction Error (MAPE) 29 between the base and BMI-enhanced models.
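The two performance measures for the cost models can be computed from observed and predicted costs as sketched below. The R2 formula is the standard coefficient of determination. The text does not spell out the MAPE formula, so the optional normalization by mean observed cost is an assumption, and the function names are illustrative.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_abs_prediction_error(observed, predicted, as_percent_of_mean=False):
    """Mean absolute difference between observed and predicted cost.

    Assumption: when `as_percent_of_mean` is True, the error is expressed
    as a percentage of the mean observed cost.
    """
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
    if as_percent_of_mean:
        return 100.0 * mae / (sum(observed) / len(observed))
    return mae


observed = [100.0, 200.0, 300.0, 400.0]
predicted = [150.0, 150.0, 350.0, 350.0]
print(round(r_squared(observed, predicted), 3))                # 0.8
print(mean_abs_prediction_error(observed, predicted))          # 50.0
print(mean_abs_prediction_error(observed, predicted, True))    # 20.0
```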
For the utilization outcomes and top 5% total costs analyses, we applied logistic regression and compared the area under the curve (AUC) between the base and BMI-enhanced models. We performed a bootstrap analysis of 300 runs to provide point estimate and 95% confidence intervals for all performance measures (R2, MAPE and AUC). BMI was considered statistically significant in improving model performance if the 95% CIs of the BMI-enhanced model did not contain the point estimate (i.e., R2, MAPE or AUC) of the base model and vice versa.
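The significance criterion above can be written as a small decision rule, shown here with a generic bootstrap helper. The percentile method for the interval, the seed, and the function names are assumptions; the text states only that 300 bootstrap runs were used.

```python
import random


def bootstrap_ci(data, statistic, runs=300, seed=0):
    """Point estimate and percentile 95% CI from `runs` bootstrap resamples."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(runs)
    )
    low = estimates[int(0.025 * (runs - 1))]
    high = estimates[int(0.975 * (runs - 1))]
    return statistic(data), low, high


def differs_significantly(base, enhanced):
    """Each argument is (point_estimate, ci_low, ci_high).

    True only if the enhanced CI excludes the base point estimate
    and the base CI excludes the enhanced point estimate.
    """
    base_pt, base_lo, base_hi = base
    enh_pt, enh_lo, enh_hi = enhanced
    base_outside = not (enh_lo <= base_pt <= enh_hi)
    enhanced_outside = not (base_lo <= enh_pt <= base_hi)
    return base_outside and enhanced_outside


# Charlson model, prospective total cost R2 (values from Table 2)
print(differs_significantly((4.23, 3.85, 4.68), (4.79, 4.39, 5.24)))  # True
# Charlson model, prospective pharmacy cost R2 (values from Table 2)
print(differs_significantly((2.42, 2.11, 2.77), (2.64, 2.34, 2.96)))  # False
```

The rule tests only whether the two models differ; the direction of the change is read from the point estimates.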
Results
Characteristics of Study Population
The study population included 59,849 patients, 57% female, with the 45–54 and 55–64 age groups having the highest shares of patients (29% and 28%, respectively; Table 1). The majority of patients were in the ‘under/normal/overweight’ group (66%), followed by class-I (19%), class-II (9%) and class-III (6%) obesity. Higher classes of BMI had a higher percentage of patients in the older age ranges (e.g., age 45–54 constituted 28% of the ‘under/normal/overweight’ group versus 32% of the class-III obesity group). Few patients who had BMI values consistent with obesity had an encoded diagnosis of obesity in the EHR and/or claims (8.15% for class I, 18.14% for class II, and 35.19% for class III; Table 1).
Table 1:
BMI Level (a) | Under/Normal/Overweight (b) | Class I | Class II | Class III | Total Sample |
---|---|---|---|---|---|
N | 39,621 | 11,597 | 5,193 | 3,438 | 59,849 |
% of Sample | 66.20 | 19.38 | 8.68 | 5.74 | 100.00 |
Demographics | | | | | |
% Female | 59.34 | 48.66 | 57.42 | 66.67 | 57.53 |
Age in 2012 | 44.44 | 47.55 | 47.72 | 47.42 | 45.50 |
% Age 20–34 | 24.73 | 15.05 | 14.54 | 14.89 | 21.41 |
% Age 35–44 | 21.06 | 20.77 | 20.99 | 21.32 | 21.01 |
% Age 45–54 | 27.79 | 31.84 | 31.25 | 32.34 | 29.13 |
% Age 55–64 | 26.42 | 32.34 | 33.22 | 31.44 | 28.45 |
Comorbidity in 2012 (c) | | | | | |
Charlson Index | 0.13 | 0.19 | 0.22 | 0.28 | 0.16 |
# Chronic Conditions | 1.63 | 2.27 | 2.61 | 3.14 | 1.92 |
# Rx Ingredients | 3.11 | 3.82 | 4.40 | 4.92 | 3.46 |
Dx-PM Score | 1.11 | 1.31 | 1.45 | 1.62 | 1.21 |
% Type-1 Diabetes | 1.05 | 1.07 | 1.23 | 0.70 | 1.05 |
% Type-2 Diabetes | 2.76 | 9.18 | 14.08 | 20.19 | 5.99 |
% Hypertension | 11.68 | 26.33 | 34.07 | 40.55 | 18.12 |
% Ischemic Heart | 1.33 | 2.95 | 2.79 | 2.39 | 1.83 |
% Acute Myocardial Infarction | 0.19 | 0.33 | 0.33 | 0.15 | 0.23 |
% Obesity (d) | 0.50 | 8.15 | 18.14 | 35.19 | 5.51 |
Utilization | | | | | |
# IP in 2012 | 0.04 | 0.06 | 0.06 | 0.07 | 0.05 |
# ED in 2012 | 0.13 | 0.17 | 0.17 | 0.21 | 0.14 |
% Any IP in 2012 | 3.21 | 4.65 | 4.81 | 5.67 | 3.77 |
% Any ED in 2012 | 10.06 | 12.51 | 13.15 | 15.68 | 11.13 |
% Top 5% cost in 2012 | 4.30 | 5.85 | 6.64 | 7.82 | 5.01 |
# IP in 2013 | 0.03 | 0.05 | 0.06 | 0.09 | 0.04 |
# ED in 2013 | 0.11 | 0.14 | 0.16 | 0.20 | 0.12 |
% Any IP in 2013 | 2.54 | 3.96 | 4.56 | 7.04 | 3.25 |
% Any ED in 2013 | 8.60 | 10.55 | 12.75 | 14.51 | 9.68 |
% Top 5% cost in 2013 | 4.00 | 5.87 | 7.36 | 10.15 | 5.01 |
Cost in $ (e) | | | | | |
Total Cost in 2012 | 4618 | 5625 | 6247 | 7158 | 5100 |
Medical Cost in 2012 | 3880 | 4731 | 5114 | 5901 | 4268 |
Pharmacy Cost in 2012 | 695 | 870 | 1096 | 1223 | 794 |
Total Cost in 2013 | 4333 | 5572 | 6336 | 7957 | 4955 |
Medical Cost in 2013 | 3551 | 4613 | 5126 | 6516 | 4064 |
Pharmacy Cost in 2013 | 741 | 940 | 1176 | 1363 | 853 |
(a) statistical significance of utilization differences among different BMI levels was not calculated as the p-value is not very informative in large populations (i.e., all p-values were significant at the .001 level);
(b) under 587 (0.98%), normal 19,162 (32.02%), and overweight 19,872 (33.20%);
(c) generated by the ACG system using diagnosis codes of both claims and EHR data;
(d) diagnosis of obesity captured as a diagnostic code in the EHR and/or claims; and,
(e) cost is truncated at top 0.5%.
ED: Emergency Department; IP: Inpatient; Rx: Medication
Comorbidity also increased across the BMI levels. The overall population had a Charlson score of 0.16 and Dx-PM score of 1.21, while these scores were 0.22 and 1.45 for class-II obesity, and 0.28 and 1.62 for class-III obesity. Higher BMI levels had a considerably higher rate of type 2 diabetes, hypertension, and ischemic heart disease compared to the lower groups. For example, patients with class III obesity had 7-, 3- and almost 2-times higher rates of type 2 diabetes, hypertension, and ischemic heart disease respectively when compared to the ‘under/normal/overweight’ group (Table 1).
Utilization rates were higher among higher BMI levels. Compared to the ‘under/normal/overweight’ group, patients with class-III obesity had 3-times higher rates of inpatient admissions and 1.8-times higher rates of ED admissions in 2013. A higher percentage of patients with class III obesity were among the top 5% total costs in 2013 when compared to the other groups (10.2% versus 4.0%, 5.9% and 7.4%; Table 1). The same trend was evident for continuous costs (Table 1).
Impact of Adding BMI Levels on Predicting Cost
The added predictive value of BMI markers was measured by comparing model performance before and after adding them to three underlying “base” models: demographics, Charlson, and Dx-PM (Table 2). The models were used to predict total, pharmacy, and medical cost concurrently (2012) and prospectively (2013). Adding the 4 categories of BMI levels improved the performance of the demographic model across all costs (i.e., the 95% CI of the enhanced model did not contain the R2 of the base model and vice versa). The relative R2 improvement of the demographic model ranged from 61% (i.e., absolute R2 increased from .56 to .90) for prospective pharmacy cost to 29% (from 1.24 to 1.60) for concurrent medical cost. Adding BMI to the Charlson model only improved prospective total and medical cost predictions (13% and 15%, respectively; from 4.23 to 4.79 and from 3.30 to 3.79). Although the point estimate of R2 increased when BMI levels were added to the Dx-PM models (i.e., advanced diagnosis-based models), the gain was not stable across the bootstrap runs and was therefore not statistically significant (i.e., the 95% CI of the base model contains the R2 of the BMI-enhanced model; Table 2).
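The relative improvements quoted above follow directly from the absolute R2 values. A minimal sketch of the arithmetic (the function name is illustrative):

```python
def relative_improvement(base, enhanced):
    """Percent change of the enhanced model's metric relative to the base model."""
    return 100.0 * (enhanced - base) / base


print(round(relative_improvement(0.56, 0.90)))  # 61  (demographic, prospective pharmacy)
print(round(relative_improvement(1.24, 1.60)))  # 29  (demographic, concurrent medical)
print(round(relative_improvement(4.23, 4.79)))  # 13  (Charlson, prospective total)
print(round(relative_improvement(3.30, 3.79)))  # 15  (Charlson, prospective medical)
```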
Table 2:
Period | Concurrent | Concurrent | Concurrent | Prospective | Prospective | Prospective |
---|---|---|---|---|---|---|
Type of Cost | Total Cost | Pharmacy Cost | Medical Cost | Total Cost | Pharmacy Cost | Medical Cost |
Base Model 1: Age, Sex | 1.49 (1.31-1.68) | 0.61 (0.49-0.76) | 1.24 (1.07-1.44) | 1.62 (1.44-1.84) | 0.56 (0.44-0.70) | 1.38 (1.21-1.57) |
Base Model 1 + Four BMI Levels | 1.97 (1.74-2.20) | 0.96 (0.79-1.13) | 1.60 (1.39-1.83) | 2.43 (2.20-2.74) | 0.90 (0.75-1.07) | 2.07 (1.86-2.32) |
Base Model 2: Age, Sex, Charlson Index | 6.22 (5.66-6.85) | 2.82 (2.47-3.18) | 5.18 (4.61-5.76) | 4.23 (3.85-4.68) | 2.42 (2.11-2.77) | 3.30 (2.92-3.71) |
Base Model 2 + Four BMI Levels | 6.47 (5.85-7.07) | 3.02 (2.66-3.40) | 5.36 (4.80-5.98) | 4.79 (4.39-5.24) | 2.64 (2.34-2.96) | 3.79 (3.42-4.21) |
Base Model 3: Age, Sex, ACG Risk Score | 43.34 (41.96-45.04) | 26.93 (24.83-28.72) | 32.06 (30.45-33.66) | 24.21 (22.86-25.61) | 23.18 (21.88-25.04) | 15.09 (13.73-16.46) |
Base Model 3 + Four BMI Levels | 43.36 (41.97-45.04) | 26.94 (24.84-28.74) | 32.08 (30.46-33.66) | 24.43 (23.12-25.83) | 23.19 (21.19-25.06) | 15.33 (14.01-16.65) |
(a) cost is truncated at top 0.5%; and, (b) 95% confidence interval is generated using 300 runs of different random splitting
Using the MAPE performance measure, which can be used to compare model performance across different outcomes, revealed no value in adding BMI levels to any of the underlying base predictive models (Table 3).
Table 3:
Period | Concurrent | Concurrent | Concurrent | Prospective | Prospective | Prospective |
---|---|---|---|---|---|---|
Type of Cost | Total Cost | Pharmacy Cost | Medical Cost | Total Cost | Pharmacy Cost | Medical Cost |
Base Model 1: Age, Sex | 100.49 (99.80-101.19) | 136.54 (135.79-137.26) | 103.40 (102.65-104.15) | 105.16 (104.43-105.82) | 138.98 (138.35-139.65) | 107.86 (107.09-108.51) |
Base Model 1 + Four BMI Levels | 99.98 (99.28-100.69) | 135.91 (135.16-136.66) | 103.02 (102.26-103.76) | 104.32 (103.60-104.99) | 138.38 (137.75-139.09) | 107.20 (106.43-107.91) |
Base Model 2: Age, Sex, Charlson Index | 96.76 (96.11-97.45) | 132.18 (131.42-133.04) | 100.31 (99.61-101.01) | 103.12 (102.45-103.76) | 135.05 (134.34-135.89) | 106.47 (105.66-107.13) |
Base Model 2 + Four BMI Levels | 96.48 (95.77-97.14) | 131.86 (131.08-132.71) | 100.13 (99.42-100.88) | 102.53 (101.85-103.14) | 134.68 (133.97-135.52) | 105.99 (105.21-106.66) |
Base Model 3: Age, Sex, ACG Risk Score | 67.71 (66.92-68.58) | 103.63 (102.69-104.90) | 79.28 (78.50-80.06) | 86.26 (85.42-86.99) | 109.17 (108.15-110.46) | 96.86 (96.00-97.52) |
Base Model 3 + Four BMI Levels | 67.72 (66.92-68.56) | 103.78 (102.81-105.04) | 79.27 (78.50-80.04) | 86.17 (85.36-86.88) | 109.33 (108.28-110.58) | 96.73 (95.85-97.39) |
(a) MAPE: mean absolute prediction error; (b) cost is truncated at top 0.5%; and, (c) 95% confidence interval is generated using 300 runs of different random splitting
Impact of Adding BMI Levels on Binary Utilization Outcomes
The value of BMI markers was also assessed in predicting binary utilization outcomes such as having an inpatient or ED admission, and being in the top 5% total costs (Table 4). Adding BMI to the demographic model improved the prediction of all binary outcomes (i.e., relative AUC increases ranged between 2% and 7%; absolute AUC increased from .602 to .617 and from .516 to .554, respectively). Adding BMI levels to the Charlson model also improved the prediction of all prospective outcomes, ranging from a 4% improvement in AUC for predicting hospitalization and ED admission (from .639 to .665 and from .556 to .576) to a 3% improvement for predicting being in the top 5% of total costs (from .649 to .668). Among the concurrent outcomes, adding BMI to the Charlson model improved only the prediction of ED admission. The performance of the Dx-PM model in predicting binary utilization outcomes did not improve after adding BMI levels.
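For readers less familiar with the AUC, it can be computed as the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. The sketch below uses this pairwise definition on toy data; it is illustrative and is not the study's implementation.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of event (1) and non-event (0) scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Each event/non-event pair contributes 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]           # toy outcome, e.g., any hospitalization
scores = [0.1, 0.4, 0.35, 0.8]  # toy predicted probabilities
print(auc(labels, scores))      # 0.75
```

The pairwise form is quadratic in the sample size, so it suits illustration rather than cohorts of this size.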
Sensitivity Analysis
Analyzing the excluded population that missed a valid BMI showed a consistent trend (i.e., lower mean age, slightly lower burden of comorbidities and lower utilization rates); however, the observed trend was expected due to study’s exclusion criteria (see online supplemental digital content # 1).
Within the study population, an individual on average had 2.33 eligible BMI values in 2012. Utilizing the average BMI of 2012, instead of last recorded BMI in 2012, did not change the modeling results.
Discussion
Adopting new digital data sources to improve utilization prediction has become a priority topic within the population health management field 30. Due to the increased adoption of EHRs among primary care providers 31, 32, using EHR data to improve risk stratification has gained attention in recent years 9, 11, 22, 33. In this study, we assessed the value of BMI derived from EHR data when added to data extracted from medical insurance claims in predicting cost, hospitalization and ED admission. The results showed that adding BMI to demographic and Charlson models provides a statistically significant, though limited, value in predicting various cost and utilization outcomes. However, these improvements disappear when more comprehensive models are used (e.g., Dx-PM model of the ACG system).
BMI and Utilization
A recent systematic review found that total annual healthcare costs were 36% higher for individuals with obesity as compared to normal weight, and costs were higher for medications (68% higher for obese), hospitalizations (34% higher) and outpatient care (26% higher) 15. Many of these epidemiologic studies have adjusted their analyses for confounders such as basic demographic characteristics, socioeconomic status, and/or health behaviors, but have avoided including characteristics hypothesized to be on the causal pathway between BMI and healthcare costs (e.g., type 2 diabetes mellitus or coronary heart disease). While this approach is appropriate for epidemiologic studies, these analyses are unable to disentangle the costs attributable directly to obesity from indirect costs due to conditions associated with obesity.
Health plans and government agencies currently employ diagnosis-based predictive models to forecast costs and utilization, which is then used to profile providers, adjust payments, and prioritize patients for care management interventions 7, 8. Therefore, understanding how BMI may enhance these diagnosis-based predictive models is critical, as these systems are unlikely to be able to direct resources to all patients with obesity but rather need to focus on identifying patients at high cost risk. To our knowledge, our study is the first to examine how BMI may impact diagnosis-based models to predict healthcare costs and utilization.
Role of BMI in Dx-based Models
Adding BMI to predictive models of utilization that incorporate diagnostic codes (Dx) showed either limited improvements (i.e., Charlson model) or no change (i.e., Dx-PM model; Tables 2, 3 and 4). The limited role of BMI in Dx-based models could be partly explained by the existing coding of obesity as a diagnosis in the study population, especially within the higher classes of BMI (e.g., 35% of patients with class-III obesity had a diagnostic code for obesity). Hence, the added value of BMI categories was partially absorbed, and consequently neutralized, by the captured diagnostic codes of obesity. However, the diagnostic coding of obesity was considerably lower in class-I and class-II obesity (8.15% and 18.14%), indicating the need to further investigate the added value of BMI levels in improving utilization prediction among the obese but under-coded subpopulations.
Table 4:
Period | Concurrent | Concurrent | Concurrent | Prospective | Prospective | Prospective |
---|---|---|---|---|---|---|
Type of Event | Any IP | Any ED | Top 5% Total Costs | Any IP | Any ED | Top 5% Total Costs |
Base Model 1: Age, Sex | 0.602 (0.590-0.614) | 0.524 (0.517-0.530) | 0.595 (0.584-0.604) | 0.602 (0.592-0.615) | 0.516 (0.509-0.524) | 0.603 (0.595-0.615) |
Base Model 1 + Four BMI Levels | 0.617 (0.605-0.629) | 0.552 (0.545-0.559) | 0.611 (0.600-0.620) | 0.637 (0.624-0.649) | 0.554 (0.546-0.562) | 0.634 (0.624-0.644) |
Base Model 2: Age, Sex, Charlson Index | 0.664 (0.649-0.676) | 0.583 (0.576-0.590) | 0.661 (0.651-0.671) | 0.639 (0.627-0.651) | 0.556 (0.547-0.564) | 0.649 (0.639-0.660) |
Base Model 2 + Four BMI Levels | 0.671 (0.658-0.683) | 0.593 (0.586-0.600) | 0.668 (0.659-0.679) | 0.665 (0.652-0.676) | 0.576 (0.568-0.584) | 0.668 (0.658-0.678) |
Base Model 3: Age, Sex, ACG Risk Score | 0.851 (0.841-0.859) | 0.688 (0.681-0.694) | 0.917 (0.910-0.922) | 0.719 (0.706-0.730) | 0.624 (0.617-0.634) | 0.819 (0.811-0.828) |
Base Model 3 + Four BMI Levels | 0.849 (0.839-0.859) | 0.684 (0.677-0.690) | 0.916 (0.910-0.922) | 0.728 (0.716-0.738) | 0.626 (0.618-0.634) | 0.822 (0.814-0.831) |
(a) AUC: area under the curve, also known as c-statistic; and, (b) 95% CI is generated using 300 runs of different random splitting
Obesity-related conditions (e.g., type 2 diabetes) are highly correlated with obesity and these diseases in turn are predictive of utilization (Table 1). Therefore, the existence of obesity-related diagnostic codes – even when obesity codes are missing – could likely diminish the independent effect of BMI in improving Dx-based models. This effect is perhaps more evident in higher classes of BMI as patients in this subgroup have a higher rate of diagnostic codes for morbidities such as type 2 diabetes, hypertension, and ischemic heart disease (Table 1). These findings reinforce that these comorbid conditions are on the causal pathway between BMI and healthcare costs, as often identified in epidemiologic studies. Therefore, predictive models targeted at subpopulations of obese patients (i.e., classes I to III) who have yet to develop obesity-related conditions may benefit more from adding their BMI-levels in predictive models of utilization, which should be examined in future studies.
The usefulness of temporal patterns of BMI as a risk marker for utilization should also be assessed. In this study we used the last recorded BMI in the training year (i.e., 2012). Although the absolute BMI value can help predict utilization by filling in for missing diagnostic codes of obesity, it does not show the trajectory of BMI (e.g., a patient gaining or losing weight), which could be predictive for certain outcomes of interest 13. In addition, our study only included two years of data, in which the first year’s BMI was used to predict the next year’s outcomes. However, additional years of data may be needed to show the true long-term effect of BMI levels, and their change over time, on utilization, and hence to improve the predictive models. Future research should investigate the effect of BMI (and its change) over an extended time period (e.g., 5 to 10 years), although such longitudinal EHR data, coupled with claims data, for large populations are currently not commonly accessible for research.
Potential Use Cases
Value-based care has extended population health management efforts, including population analytics, from insurance to provider organizations 34. Healthcare providers, however, often do not have access to the full spectrum of diagnostic codes accumulated in insurance claims that represent patient diagnoses documented by all covered entities/providers 35, 36. Indeed, providers often use their local EHR data for risk stratification, which is often limited to diagnostic codes collected within their network 9. In this study we did not limit the diagnostic codes to EHR data only; however, EHR diagnostic codes have been shown to perform worse than diagnostic data extracted from claims in predicting utilization 9, 10. Hence, adding BMI categories might represent a useful approach for providers to modestly improve their utilization predictions. In addition, small practices may not have access to more comprehensive risk stratification tools/models and may thus benefit from integrating BMI levels into more accessible predictive models (e.g., Charlson; Tables 2 and 4).
BMI has long been collected and analyzed within and outside of clinical settings for purposes such as public/population health research and weight-management interventions. Multiple federal and statewide efforts collect obesity data, including BMI, from large populations 37; however, diagnostic information on all clinical conditions is often not collected 38. In such a context, using BMI in addition to the demographic information (i.e., age and sex) can enable the sponsoring entities/programs to better stratify their underlying target populations for various utilization outcomes. Similarly, online and app-based weight-management interventions often do not collect detailed diagnostic information from their users, thus leaving application developers with an opportunity to use BMI in risk stratifying their end users. Future research should investigate the potential of repeatedly-measured temporal BMI data (e.g., weekly BMI records) in risk stratifying users of weight-management applications.
Limitations
Our research has several limitations: First, we used a single population of working-aged insured patients; therefore, our results need to be examined in other settings and populations (e.g., assessing the value of BMI for risk stratification among the pediatric, older adult, Medicaid, or uninsured populations). Second, we selected the last BMI value in the index year; however, the change in values over time (potentially in multiple years) might contain important risk information that we did not include in this analysis. Third, we limited our outcomes to costs and utilization, and did not test the relationship of BMI markers with other outcomes such as mortality or functional status. Fourth, we did not assess the added value of BMI to models that include medication information assuming that the added-value of BMI will diminish in more comprehensive models (i.e., having medication data in addition to diagnostic information). Fifth, we combined the under, normal, and overweight populations as one category due to the distribution of cost across the underlying study population; however, future research should further disentangle the value of BMI in these subcategories. Sixth, although R2 showed a considerable improvement in performance when adding BMI to non-comprehensive models, MAPE measures did not reveal a significant improvement. As MAPE is often used to compare performance of models predicting different outcomes, further investigation is needed to compare BMI-enhanced models predicting cost with models predicting other utilization outcomes. Finally, our findings should be treated as preliminary and potentially limited to the data sources, patient denominator, and analytical methodologies used in this study. 
Most importantly, the generalizability of our results is limited by the quality of the BMI information captured in the EHRs used in this study (e.g., clinical settings may vary in how well they capture high-quality data 39 such as BMI and/or assign obesity-related diagnostic codes for all of their patients).
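To make the second and fifth limitations concrete, the following is a minimal sketch of how a single BMI marker per patient could be derived: take the last BMI recorded in the index year and map it to a category in which the underweight, normal-weight, and overweight ranges are combined. The cut points (30, 35, and 40 kg/m2) are assumed from the standard CDC adult obesity classes, and the function names, category labels, and sample records are hypothetical; this is an illustration, not the code used in the study.

```python
from datetime import date

def bmi_category(bmi: float) -> str:
    """Map a BMI value (kg/m^2) to a four-level grouping.

    Cut points are an assumption based on the CDC adult obesity classes;
    everything below 30 is collapsed into one reference category.
    """
    if bmi < 30.0:
        return "under/normal/overweight"
    if bmi < 35.0:
        return "obesity class I"
    if bmi < 40.0:
        return "obesity class II"
    return "obesity class III"

def last_bmi_in_year(measurements, year):
    """Return the most recent BMI recorded in `year`, or None if none exists.

    `measurements` is an iterable of (date, bmi) pairs (hypothetical layout).
    """
    in_year = [(d, v) for d, v in measurements if d.year == year]
    if not in_year:
        return None
    # max() on the date picks the latest measurement within the year.
    return max(in_year, key=lambda pair: pair[0])[1]

# Hypothetical records for one patient; dates and values are made up.
records = [
    (date(2012, 1, 15), 29.4),
    (date(2012, 11, 3), 31.2),
    (date(2013, 2, 1), 36.0),
]

index_bmi = last_bmi_in_year(records, 2012)
print(index_bmi, bmi_category(index_bmi))
```

Because only the last value in the year is kept, the earlier measurement (29.4) and the later change in the following year are discarded, which is the kind of temporal information the second limitation refers to.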
Supplementary Material
Acknowledgement
We acknowledge the support of HealthPartners, Inc. (Bloomington, MN) in sharing the underlying data and providing the research team with technical support throughout the research. We also acknowledge the technical and management support provided by other Johns Hopkins team members (Tom Richards, Fardad Gharghabi, Elyse Lasser, and Klaus Lemke).
Funding
Not applicable.
Additional Disclosures
“This manuscript has been prepared by faculty and staff at The Johns Hopkins University (JHU). The manuscript also references the Adjusted Clinical Groups (ACG) system. JHU holds the copyright to the ACG System and receives royalties from the global distribution of the ACG system. The authors are members of a group of researchers who develop and maintain the ACG System with support from JHU.”
“KAG was supported by a career development award from the National Heart, Lung, and Blood Institute (K23HL116601).”
Ethics Approval
This study was approved by the institutional review board of the Johns Hopkins University (IRB# 00005784).
Consent for Publication
Not applicable.
Availability of Data
The data that support the findings of this study are available from HealthPartners, but restrictions apply to the availability of these data, which were used under a special agreement for the current study and are not publicly available. Despite these restrictions, data are available from the authors upon reasonable request and with permission of HealthPartners.
Footnotes
Declarations
The following declarations are removed from the blinded copy: funding; conflict of interest; disclosures; ethics approval; availability of data; authors’ contribution; and, acknowledgement.
Conflict of Interest
The authors have no conflicts of interest to report.
Online Supplemental Digital Content List
Online Supplemental Digital Content # 1
Bibliography
- 1. Finkelstein EA, Trogdon JG, Cohen JW, Dietz W. Annual medical spending attributable to obesity: payer-and service-specific estimates. Health Aff. 2009;28(5):w822–831.
- 2. Trogdon JG, Finkelstein EA, Feagan CW, Cohen JW. State- and payer-specific estimates of annual medical expenditures attributable to obesity. Obesity (Silver Spring). 2012;20:214–220.
- 3. Suehs BT, Kamble P, Huang J, et al. Association of obesity with healthcare utilization and costs in a Medicare population. Curr Med Res Opin. 2017;33(12):2173–2180.
- 4. Li Q, Blume SW, Huang JC, Hammer M, Ganz ML. Prevalence and healthcare costs of obesity-related comorbidities: evidence from an electronic medical records system in the United States. J Med Econ. 2015;18(12):1020–1028.
- 5. Hales CM, Fryar CD, Carroll MD, Freedman DS, Ogden CL. Trends in Obesity and Severe Obesity Prevalence in US Youth and Adults by Sex and Age, 2007–2008 to 2015–2016. J Am Med Assoc. 2018;319(16):1723–1725.
- 6. Flegal KM, Graubard BI, Williamson DF, Gail MH. Cause-specific excess deaths associated with underweight, overweight, and obesity. J Am Med Assoc. 2007;298(17):2028–2037.
- 7. Hu G, Lesneski E. The differences between claim-based health risk adjustment models and cost prediction models. Dis Manag. 2004;7(2):153–158.
- 8. Forrest CB, Lemke KW, Bodycombe DP, Weiner JP. Medication, diagnostic, and cost information as predictors of high-risk patients in need of care management. Am J Manag Care. 2009;15:41–48.
- 9. Kharrazi H, Chi W, Chang HY, et al. Comparing population-based risk-stratification model performance using demographic, diagnosis and medication data extracted from outpatient electronic health records versus administrative claims. Med Care. 2017;55(8):789–796.
- 10. Kharrazi H, Weiner JP. A practical comparison between the predictive power of population-based risk stratification models using data from electronic health records versus administrative claims: setting a baseline for future EHR-derived risk stratification models. Med Care. 2018;56(2):202–203.
- 11. Lemke KW, Gudzune KA, Kharrazi H, Weiner JP. Assessing markers from ambulatory laboratory tests for predicting high-risk patients. Am J Manag Care. 2018;24(6):e190–e195.
- 12. Mattar A, Carlston D, Sariol G, et al. The prevalence of obesity documentation in primary care electronic medical records. Are we acknowledging the problem? Appl Clin Inform. 2017;8(1):67–79.
- 13. Pantalone KM, Hobbs TM, Chagin KM, et al. Prevalence and recognition of obesity and its associated comorbidities: cross-sectional analysis of electronic health record data from a large US integrated health system. BMJ Open. 2017;7(11):e017583.
- 14. Mocarski M, Tian Y, Smolarz GB, McAna J, Crawford A. Use of international classification of diseases, ninth revision codes for obesity: trends in the United States from an electronic health record-derived database. Popul Health Manag. 2017;21(3):222–230.
- 15. Kent S, Fusco F, Gray A, Jebb SA, Cairns BJ, Mihaylova B. Body mass index and healthcare costs: a systematic literature review of individual participant data studies. Obes Rev. 2017;18(8):869–879.
- 16. HealthPartners. Quick facts about HealthPartners. 2016. Available at: https://www.healthpartners.com/hp/about/quick-facts/index.html. Accessed September 3, 2016.
- 17. Guinness World Records. Heaviest man ever. Available at: http://www.guinnessworldrecords.com/world-records/heaviest-man. Accessed Jan 10, 2018.
- 18. DailyMail. Valeria Levitin - 4st anorexic. Dec 19, 2012. Available at: http://www.dailymail.co.uk/health/article-2250422/Frightening-words-4st-anorexic-Valeria-Levitin-gets-FAN-MAIL-shes-thin.html. Accessed Jan 10, 2018.
- 19. Guinness World Records. Tallest man ever. Available at: http://www.guinnessworldrecords.com/world-records/tallest-man-ever. Accessed Jan 10, 2018.
- 20. Guinness World Records. Shortest man - living. Available at: http://www.guinnessworldrecords.com/world-records/shortest-man-living-(mobile: ). Accessed Jan 10, 2018.
- 21. Centers for Disease Control and Prevention. About Adult BMI. Available at: https://www.cdc.gov/healthyweight/assessing/bmi/adult_bmi/index.html. Accessed Jan 12, 2018.
- 22. Chang HY, Richards TM, Kenneth SM, et al. Evaluating the impact of prescription fill rates on risk stratification model performance. Med Care. 2017;55(12):1052–1060.
- 23. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–383.
- 24. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130–1139.
- 25. Charlson M, Wells MT, Ullman R, King F, Shmukler C. The Charlson comorbidity index can be used prospectively to identify patients who will incur high future costs. PLoS One. 2014;9(12):e112479.
- 26. Weiner JP, Starfield BH, Steinwachs DM, Mumford LM. Development and application of a population-oriented measure of ambulatory care case-mix. Med Care. 1991;29(5):452–472.
- 27. Health Services Research & Development Center at the Johns Hopkins University Bloomberg School of Public Health. The Johns Hopkins ACG Case-Mix System Reference Manual Version 11.0. Baltimore: The Johns Hopkins University; Bloomberg School of Public Health; 2014. Technical Reference Guide.
- 28. Chang HY, Lee WC, Weiner JP. Comparison of alternative risk adjustment measures for predictive modeling: High risk patient case finding using Taiwan’s national health insurance claims. BMC Health Serv Res. December 2010;10(343).
- 29. Society of Actuaries. A comparative analysis of claims-based tools for health risk assessment. Available at: https://www.soa.org/research-reports/2007/hlth-risk-assement/. Accessed Jan 8, 2018.
- 30. Kharrazi H, Lasser EC, Yasnoff WA, et al. A proposed national research and development agenda for population health informatics: summary recommendations from a national expert workshop. J Am Med Inform Assoc. 2016;24(1):2–12.
- 31. Hsiao CJ, Hing E. Use and Characteristics of Electronic Health Record Systems Among Office-based Physician Practices: United States, 2001–2013. Centers for Disease Control and Prevention. January 2014. http://www.cdc.gov/nchs/data/databriefs/db143.pdf. Accessed Dec 20, 2016.
- 32. The Office of the National Coordinator for Health Information Technology (ONC). Office-based Physician Electronic Health Record Adoption. Health IT Quick-Stat #50. December 2016. Available at: https://dashboard.healthit.gov/quickstats/pages/physician-ehr-adoption-trends.php. Accessed March 1, 2017.
- 33. Kan HJ, Kharrazi H, Leff B, et al. Defining and assessing geriatric risk factors and associated health care utilization among older adults using claims and electronic health records. Med Care. 2018;56(3):233–239.
- 34. Adler-Milstein J, Embi P, Middleton B, Sarkar I, Smith J. Crossing the health IT chasm: considerations and policy recommendations to overcome current challenges and enable value-based care. J Am Med Inform Assoc. 2017;24(5):1036–1043.
- 35. Ferver K, Burton B, Jesilow P. The use of claims data in healthcare research. Open Public Health J. 2009;2:11–24.
- 36. Optum. The benefit of using both claims data and electronic medical record data in health care analysis. February 2012. Available at: https://www.optum.com/content/dam/optum/resources/whitePapers/Benefits-of-using-both-claims-and-EMR-data-in-HC-analysis-WhitePaper-ACS.pdf. Accessed October 29, 2016.
- 37. Bennett WL, Wilson RF, Zhang A, et al. Methods for evaluating natural experiments in obesity: a systematic review. Ann Intern Med. 2018;168(11):791–800.
- 38. Bennett WL, Cheskin LJ, Wilson RF, et al. Methods for evaluating natural experiments in obesity: systematic evidence review. Vol Comparative Effectiveness Reviews, No. 204. Rockville, MD: Agency for Healthcare Research and Quality; 2017.
- 39. Kharrazi H, Wang C, Scharfstein D. Prospective EHR-based clinical trials: The challenge of missing data. J Gen Intern Med. 2014;29(7):976–978.