Health Care Financing Review. 1987 Winter;9(2):15–23.

Adjusting capitation using chronic disease risk factors: A preliminary study

Jonathan Howland, Joseph Stokes III, Stephen C Crane, Albert J Belanger
PMCID: PMC4192869  PMID: 10312390

Abstract

Researchers have sought ways to modify Medicare's capitation formula, the adjusted average per capita cost (AAPCC), by including measures of individual health status. The present study assesses the value of risk factors for disease as predictors of hospitalization for Framingham Heart Study participants (1,210 males and 1,496 females) 60-65 years of age. Regression models including several common physiologic measures and prior hospitalizations yielded adjusted R2s of 9.69 percent for males and 3.61 percent for females. The contributions of the risk factors and prior hospitalization were about equal and independent. These results confirm the potential utility of disease risk factors for adjusting the AAPCC.

Introduction

Beginning with the Nixon Administration, both the White House and Congress have considered the enrollment of Medicare beneficiaries in health maintenance organizations (HMO's) to be one means of containing the rate of growth in Federal health expenditures. Because of burdensome regulations, early legislation aimed at establishing a risk-sharing approach to Medicare reimbursement failed to stimulate interest among prepaid providers (Iglehart, 1985). Subsequently, the Tax Equity and Fiscal Responsibility Act (TEFRA) of 1982 included provisions to ease administrative requirements and provide greater financial incentives and protection to providers. HMO's appear to be more interested in Medicare participation on a prepaid capitated basis as a consequence of these changes. But the success of capitation depends on the willingness of HMO's to assume the financial risks involved, and the extent of cost savings to Medicare relative to its traditional payment system.

A critical part of risk-sharing reimbursement is thus the capitation rate that is set for HMO's. The current method by which Medicare establishes payments is based on the average expenditure for all Medicare beneficiaries in a county, adjusted for age, sex, welfare, and institutional status. Even after these adjustments, a great deal of variation in health care costs remains. The financial risk to either HMO's or Medicare depends on whether either high or low users of health services systematically enroll in prepaid plans, thus creating selection bias. Selection bias can result in one of two possible outcomes, depending on the direction of the bias. If high-cost patients tend to join, a capitation rate based on the mean expenditures will be insufficient to cover the costs of delivering services; HMO's will therefore be discouraged from further participation in Medicare. If low-cost patients tend to join HMO's, Medicare pays excessive premiums, while retaining high-cost beneficiaries under the fee-for-service system; this results in a net increase in Medicare expenditures under capitation.

A second component of risk under capitation is random error. Even without selection bias, a provider could, by chance, enroll a population unbalanced in terms of its cost distribution. The problem of random error diminishes as the number of enrollees increases. Nevertheless, because there is so much variation in Medicare expenditures, large numbers would be required to bring the risk due to chance within reasonable limits (Gruenberg, 1984).

The method by which Medicare presently determines the capitation schedule, the adjusted average per capita cost (AAPCC), is intended to structure payments in a way that limits risk from both selection bias and random error to acceptable levels. It has been suggested, however, that a major weakness of the AAPCC is that it does not account for variation in health status (Gruenberg, 1982; Thomas et al., 1983). Recognizing this and other problems in the reimbursement methodology, Congress included provisions in the TEFRA legislation for identifying additional predictors for use in developing capitation rates. Unfortunately, while there is general agreement that better predictor variables are needed, identifying these variables remains problematic.

Research to date has focused on several kinds of variables for improving the actuarial power of the AAPCC. These include measures of prior health care utilization (Anderson, Resnick, and Gertman, 1982; Beebe, Lubitz, and Eggers, 1985; Ash, 1985), functional health status (Thomas and Lichtenstein, 1986), and prior utilization categorized by diagnoses predictive of high and low future costs (Anderson, Resnick, and Gertman, 1982; Ash, 1985). Thus far, measures of prior utilization have been the most promising. Studies utilizing these measures develop much more predictive power than the AAPCC variables alone. Nevertheless, prior utilization poses several problems. First, for persons recently eligible for Medicare, reliable medical histories may be difficult to retrieve. Second, even if prior data are available, their use as actuarial factors may create inappropriate financial incentives and thus invite HMO's to “game” the reimbursement system. For instance, a capitation schedule based on prior utilization could tempt HMO's to trade off the losses from excessive utilization in one year for future profits from increasing enrollees' risk status (and, therefore, the capitation payments) the following year. Third, categorizing diagnoses by future projected costs provides incentives for manipulating diagnoses in the same manner in which the use of DRG's has engendered “DRG creep” (that is, the tendency to use the diagnosis that yields the highest prospective payment when more than one diagnosis is possible). Finally, prior utilization measures may say little about emerging health problems that to date have not caused any use of services but will in the future.

McClure (1984) suggested an alternative approach to improving the AAPCC: adjusting the capitation payment using health status risk factors. The logic to this idea is as follows: Risk factors theoretically predict the future development of disease. At some point, most diseases are likely to generate the need to use health care services. Risk factors thus may prove useful as predictors of health care utilization.

Use of risk factors to adjust the AAPCC offers several advantages over measures of prior utilization. First, risk factors can be measured in the present and do not require knowledge of previous utilization. Second, they can identify previously unrecognized disease not associated with prior utilization. Third, they are difficult to “game” because most can be measured and validated objectively.

The purpose of this study is to assess the value of specific behavioral and physiological risk factors in predicting frequency of hospitalization. To accomplish this, data from the Framingham Heart Study have been used to determine the degree of association between factors that predict the onset of cardiovascular disease and the frequency of reported hospitalizations. Although this study is limited to the prediction of only one dimension of utilization, it presents an approach that may have potential application to other kinds of utilization (e.g., ambulatory care and long-term care).

Methods

The Framingham Heart Study is an ongoing epidemiologic investigation of cardiovascular risk factors that has studied a sample of 5,209 males and females 30-59 years of age since 1948. Risk factors have been assessed every 2 years. At each biennial exam, subjects are asked to report how many hospitalizations they have experienced during the interexam period.

The data set for the present study was derived by culling subjects 60 to 65 years of age from exams conducted over the 24 years from 1956 to 1980 (exams numbered 4 through 16).1 Risk factor measures were selected from the exam closest to the 65th birthday. When data on one or more risk factors were missing, measurement from the next earlier exam was used, provided that it was not obtained before the age of 60. Subjects for whom no information on 2-year postexam hospitalization was recorded were eliminated, unless the subject had died. If the subject had died within the 2-year postexam period, his or her risk factor data were included in the study and the subject's medical records were examined to determine the frequency of hospitalizations prior to death.

The samples derived for this study included 1,210 males and 1,496 females. Ninety percent of the subjects in both samples were either 64 or 65 years of age. Seventy-five percent of the males and 80 percent of the females reported having had no hospitalizations during the 2-year postexam period.

Independent variables used for this analysis were divided into three categories: demographic characteristics of subjects, prior utilization (frequency of hospitalization during the 2-year preexam period), and behavioral and physiological risk factors. The variables are presented in Table 1. Details on how risk factors were operationalized and measured are provided in Dawber's (1980) account of the Framingham Heart Study. Brief descriptions of clinical measures with which the reader may not be familiar are provided in a Technical Note at the end of this article.

Table 1. Means and units of measurement for specified independent variables.

| Variable | Mean, males | Mean, females | Units of measure or coding |
| --- | --- | --- | --- |
| Demographics | | | |
| Age | 64.4 | 64.4 | Years |
| Marital status | 1.9 | 1.6 | 0 = single, divorced, separated, widowed; 1 = married |
| Education | 3.7 | 3.6 | 0 = 0-4 years; 1 = 5-7 years; 2 = 8th grade; 3 = some high school; 4 = high school; 5 = some college; 6 = college graduate; 7 = postgraduate |
| Assessment exam | 10.5 | 10.7 | Number of exam; range: 4-16 |
| Risk factors | | | |
| Smoking: Cigarettes per day | 6.4 | 3.7 | Cigarettes per day |
| Smoking: Current smoking status | 0.49 | 0.25 | 0 = no, 1 = yes |
| Respiratory function: Expiratory volume in 1 second | 246.4 | 180.1 | Centiliters |
| Respiratory function: Vital capacity | 476.5 | 359.9 | Centiliters |
| Weight: Metropolitan relative weight | 120.4 | 123.2 | Weight: mean metropolitan weight for age, sex |
| Weight: Subscapular skinfold | 17.5 | 18.2 | Centimeters |
| Blood sugar: Glucose intolerance | 0.14 | 0.11 | 0 = no, 1 = yes |
| Blood sugar: Blood sugar | 90.3 | 90.9 | mg / 100 mc |
| Cardiac function: Left ventricular hypertrophy | 1.1 | 1.1 | 0 = no, 1 = borderline, 2 = yes |
| Cardiac function: Ventricular rate | 73.7 | 77.1 | Beats per minute |
| Cardiac function: EKG (general impression) | 1.7 | 1.5 | 0 = normal, 1 = borderline, 2 = abnormal |
| Blood pressure: Hypertension | 1.7 | 1.8 | 0 = no, 1 = borderline, 2 = yes |
| Blood pressure: Systolic pressure | 139.9 | 143.4 | Millimeters of mercury |
| Blood pressure: Diastolic pressure | 81.8 | 81.7 | Millimeters of mercury |
| Hematocrit | 47.1 | 43.9 | Percent |
| Serum cholesterol | 227.8 | 254.8 | Milligrams per 100 milliliters |
| Prior utilization | | | |
| Hospitalizations during 2 years preceding assessment exam | .33 | .21 | Number of hospitalizations |

SOURCE: Boston University: Data from the Framingham Heart Study.

The dependent variable for the analysis was the number of hospitalizations reported by the subject to the examining physician for the 2-year followup period after the exam at which risk factors were measured.

Because many of the risk factor variables available through the Framingham Heart Study were considered redundant, and because missing data led to substantial attrition in sample size when stepwise regression was used, the following protocol was developed for selecting risk factor variables:

  • Risk factors were sorted into one of eight domains (Table 1).

  • Intradomain correlation coefficients were examined, when domains had more than one variable (Table 2).

  • If variables within a domain were highly correlated (r > .50), one variable was selected to represent the domain, with preference given to the continuous variable.

  • If two or more intradomain variables were highly correlated, and of the same order, i.e., dichotomous, ordinal, or continuous, clinical judgment was used to select the variable most sensitive to disease status.

  • If two or more intradomain variables were highly correlated, but thought to represent different clinical manifestations, both were included.

  • Variables with low intradomain correlations were retained in the analysis.
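The screening step in this protocol can be illustrated with a short sketch. It is not the authors' procedure or software: the function names, the toy data, and the use of the absolute value of r are assumptions made for illustration (the text states only r > .50), and the clinical-judgment steps are left to the analyst.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation using only pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy)

def screen_domain(variables, threshold=0.50):
    """List variable pairs within one domain whose correlation exceeds the threshold.

    `variables` maps a variable name to its list of values. Using abs(r) is an
    assumption; the protocol in the text is stated as r > .50.
    """
    names = list(variables)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson_r(variables[names[i]], variables[names[j]])
            if abs(r) > threshold:
                flagged.append((names[i], names[j], round(r, 2)))
    return flagged

# Hypothetical toy values, not Framingham data.
smoking = {
    "cigarettes_per_day": [0, 0, 20, 10, 0, 30],
    "current_smoker": [0, 0, 1, 1, 0, 1],
}
cardiac = {
    "left_ventricular_hypertrophy": [0, 1, 0, 2, 0, 1],
    "ventricular_rate": [72, 70, 80, 75, 68, 77],
}
print(screen_domain(smoking))   # highly correlated pair: pick one representative
print(screen_domain(cardiac))   # weakly correlated: retain both variables
```

A flagged pair signals that one variable should represent the domain (or that both are kept on clinical grounds); an empty result corresponds to the last rule, under which weakly correlated variables are all retained.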

Table 2. Intradomain correlations for behavioral and physiological risk factors.

| Domain | Variables | r, males | r, females | Variable(s) selected |
| --- | --- | --- | --- | --- |
| Smoking | Cigarettes per day, by current smoking status | .57 | .82 | Cigarettes per day |
| Respiratory function | Expiratory volume in 1 second, by vital capacity | .72 | .81 | Expiratory volume in 1 second; Vital capacity |
| Weight | Metropolitan relative weight, by subscapular skinfold | .64 | .67 | Subscapular skinfold |
| Blood sugar | Glucose intolerance, by blood sugar | .55 | .55 | Blood sugar |
| Cardiac function | Left ventricular hypertrophy, by ventricular rate | .01 | .03 | Electrocardiogram; Left ventricular hypertrophy; Ventricular rate |
| | Left ventricular hypertrophy, by electrocardiogram | .30 | .33 | |
| | Ventricular rate, by electrocardiogram | .05 | −.02 | |
| Blood pressure | Hypertension, by systolic blood pressure | .84 | .85 | Systolic blood pressure |
| | Hypertension, by diastolic blood pressure | .67 | .70 | |
| | Systolic blood pressure, by diastolic blood pressure | .67 | .76 | |

SOURCE: Boston University: Data from the Framingham Heart Study.

The variables selected for regression analysis were the same for males as for females. These variables are also presented in Table 2. The rationale for the selection of specific variables is as follows:

Smoking

The current number of cigarettes smoked per day was selected over the smoking status variable because the former was a continuous variable and a better indicator of degree of exposure.

Respiratory function

Despite the fact that they were highly correlated, both respiratory function variables were selected because clinical studies have indicated that they measure different functions. Vital capacity tends to be a measure of restrictive lung disease (scarring of lung tissue or neuromuscular disorder); whereas forced expiratory volume tends to be a measure of airflow obstruction (emphysema, chronic bronchitis, asthma).

Weight

Subscapular skinfold was selected over relative weight because skinfold is a more accurate measure of body fat, particularly for females.

Blood sugar

Blood sugar was selected over glucose intolerance because the former was a continuous variable.

Cardiac function

All three cardiac function variables (ventricular rate, EKG [electrocardiogram] general impression, and left ventricular hypertrophy) were selected because they were only weakly correlated.

Blood pressure

Systolic blood pressure was selected over hypertension because the former was continuous; the selection of systolic, rather than diastolic, blood pressure was arbitrary.

Least-squares regression analysis was performed using various groupings of demographics, prior utilization, and risk factor variables for individuals as potential predictors of their number of hospitalizations. Males and females were analyzed separately. The following sequence of steps was used for the analyses:

  • The number of hospitalizations was regressed on demographic variables only (Model 1).

  • The number of hospitalizations was regressed on prior utilization only (Model 2).

  • The number of hospitalizations was regressed on behavioral and physiological risk factors only (Model 3).

  • The number of hospitalizations was regressed on demographics, prior utilization and behavioral and physiological risk factor variables (Model 4).

  • On the basis of the preceding steps, the most parsimonious model, i.e., the model using the fewest variables without significant reductions in R2, was identified and run (Model 5).
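The sequence above amounts to fitting several least-squares models and comparing their adjusted R2 values. The following is a minimal sketch of that comparison metric, not the software used in the study: `fit_ols` and `adjusted_r2` are illustrative names, the data are invented, and the adjusted R2 formula is the standard one, which the article does not spell out.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n cases and p predictors (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def fit_ols(rows, y):
    """Least-squares fit with an intercept.

    Returns (coefficients, r2, adjusted r2); coefficients[0] is the intercept.
    """
    n, p = len(rows), len(rows[0])
    X = [[1.0] + list(row) for row in rows]      # prepend the intercept column
    k = p + 1
    # Augmented normal equations: (X'X | X'y)
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    coefs = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(c * x for c, x in zip(coefs, xi)) for xi in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return coefs, r2, adjusted_r2(r2, n, p)

# Hypothetical subjects: (prior hospitalizations, EKG code 0-2) -> later hospitalizations.
predictors = [(0, 0), (1, 0), (0, 1), (2, 2), (0, 2), (1, 1), (0, 0), (3, 2)]
hospitalizations = [0, 1, 0, 2, 1, 0, 0, 3]
coefs, r2, adj = fit_ols(predictors, hospitalizations)
print(f"R2 = {r2:.3f}, adjusted R2 = {adj:.3f}")
```

Because adjusted R2 penalizes each added predictor, it is a natural yardstick for the parsimony criterion in the last step: a variable is worth keeping only if dropping it lowers adjusted R2 appreciably.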

None of the variables used in this study had many missing data points, but most variables had some missing data. Thus, depending on the number of variables included in a model, some cases were excluded from each regression. To assess potential for bias as a result of missing data, t-tests were used to compare means for independent variables for those subjects included in regressions to means for subjects not included in the regressions.

For both males and females, the distribution of the dependent variable was very skewed. Most subjects had no hospitalization (75 percent and 80 percent for males and females, respectively) and another 18 percent of males and 16 percent of females had only one hospitalization. This skewness in frequency of hospitalization for the Framingham cohort of persons 64-65 years of age reflects the distribution for total Medicare reimbursements (Gruenberg, 1982). Skewness in the dependent variable poses several threats to validity. First, a small number of cases with a large number of hospitalizations could have a disproportionate effect on the slope of the regression line. Second, even without outliers, least-squares regression assumes a normally distributed dependent variable.

We took several measures to address the skewness in the distribution of the hospitalization variable. First, we plotted residuals for the Model 5 regressions and subsequently reran these models eliminating outliers. Second, we performed log (base 10) transformations on the Model 5 dependent variables to better normalize the distribution. Third, we developed and analyzed logistic regression models with frequency of hospitalization dichotomized as “no” hospitalization and “one or more” hospitalizations.
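The two recodings of the dependent variable described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the article does not say how zero counts were handled in the log transformation, so the offset of 1 is an assumption (log10 of 0 is undefined), and the function names are hypothetical.

```python
from math import log10

def log_transform(counts, offset=1):
    """Base-10 log of hospitalization counts.

    The offset is an assumption: most subjects have zero hospitalizations,
    and log10(0) is undefined, so some shift is needed before taking logs.
    """
    return [log10(c + offset) for c in counts]

def dichotomize(counts):
    """Recode counts as 0 ("no" hospitalization) or 1 ("one or more")."""
    return [1 if c >= 1 else 0 for c in counts]

# Hypothetical counts for eight subjects, mostly zero as in the study samples.
counts = [0, 0, 0, 1, 0, 2, 0, 5]
print(log_transform(counts))
print(dichotomize(counts))
```

The log transformation compresses the long right tail while keeping the ordering of counts; the dichotomy discards the count altogether, which is what makes it suitable as the outcome of a logistic regression.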

To assess the goodness of fit of the Model 5 regressions, we examined polynomial terms in the equations.

Results

Model 1: Regression with demographic variables

The independent variables for this model were age, years of education, marital status, and the time period (number of the exam) from which the individual subject was drawn. This last variable was intended to control for the independent effect of secular trends in hospitalization (e.g., changes in practice and the introduction of Medicare). For males (N = 1,173), the overall regression was significant (F = 3.173, p = .0123). The adjusted R2 for the regression was .74 percent. For females (N = 1,474), the regression was not significant (F = 1.523, p = .1931). The adjusted R2 was .14 percent.

Model 2: Regression with prior utilization

This model had one independent variable, prior utilization. For males (N = 1,035), the regression was highly significant (F = 55.177, p = .0001). The R2 for the regression was 4.97 percent. For females (N = 1,284), the results were also significant (F = 26.374, p = .0001), but the regression explained less variation (R2 = 1.94 percent).

Model 3: Stepwise regression

To derive this model, stepwise regression (forward and backward) was used with the 11 selected behavioral and physiological risk factor variables. For males (N = 780), the model was significant (F = 9.39, p = .0001), explaining 4.61 percent of the variation. Four variables were retained using an entry/exit p-value of .15. These were EKG (p = .0001), expiratory volume (p = .0121), subscapular skinfold (p = .0359), and blood sugar (p = .0512). Because of the sample size attrition due to missing values, an additional regression was run using only significant (p < .15) variables from the stepwise procedure. The results were not significantly different and therefore are not reported.

For females (N = 959), four variables were retained using the stepwise procedures. These were EKG, expiratory volume, hematocrit, and blood sugar. Another regression (Model 3a) using only these variables was run to increase sample size (N = 1,261). This model was significant (F = 7.084, p = .0001). The adjusted R2 was 1.88 percent. Two variables were significant at the p = .01 level: EKG (p = .0195) and expiratory volume (p = .003). Hematocrit (p = .0623) and blood sugar (p = .0903) were significant at the p = .10 level.

Model 4: Regression with additional variables

In this model, the demographic variables and the measure of prior health services utilization were added to the behavioral and physiological risk factors that were retained by the stepwise procedure above. For males (N = 663), the regression was significant (F = 9.339, p = .0001) and explained (adjusted R2) 10.17 percent of the variation in number of hospitalizations. Three variables were significant at the p = .05 level: prior utilization (p = .0001), EKG (p = .0004), and subscapular skinfold (p = .0372). Expiratory volume (p = .0778) was significant at the p = .10 level. For females (N = 1,120), the regression was significant (F = 5.765, p = .0001). The adjusted R2 was 3.23 percent. Two variables were significant at the p = .01 level: prior utilization (p = .001) and EKG (p = .0039). Expiratory volume (p = .0745) and blood sugar (p = .0696) were significant at the p = .10 level.

Model 5: Most parsimonious model

Based on the results of the preceding steps, regressions were run on the number of hospitalizations using only the variables identified as statistically significant at the p = .10 level. For males (N = 734) the model F-ratio (20.682) was significant (p = .0001). The adjusted R2 was 9.96 percent. Three variables were significant: prior utilization (p = .0001), EKG (p = .0003), and expiratory volume (p = .0328). Subscapular skinfold was not significant (p = .1299) nor was exam number (p = .15). For females (N = 1,134), the model F-ratio (11.608) was significant (p = .0001). The adjusted R2 was 3.61 percent. Two variables were significant at the p = .01 level: prior utilization (p = .0001) and EKG (p = .0030). Expiratory volume (p = .0802) and blood sugar (p = .0595) were significant at the p = .10 level.

To assess the extent to which bias was created as a result of sample attrition due to missing data, the means of all variables included in Model 5 were compared with the means of these same variables for subjects not included in the regression. Although significant in some cases, differences in means were not large and the two groups were judged clinically comparable (Table 3).

Table 3. Group means for subjects included and excluded from regression on Model 5, by sex and variable. Standard deviations are shown in parentheses.

| Variable | Males: included | Males: excluded | Males: P | Females: included | Females: excluded | Females: P |
| --- | --- | --- | --- | --- | --- | --- |
| Prior utilization | .31 (.75) | .33 (.81) | .59 | .21 (.6) | .13 (.4) | .02 |
| Electrocardiogram | 1.8 (.86) | 1.6 (.87) | .00 | 1.5 (.8) | 1.6 (.8) | .13 |
| Expiratory volume | 251 (65) | 238 (59) | .00 | 182 (41) | 171 (46) | .00 |
| Subscapular skinfold | 18 (7) | 14 (6) | .00 | | | |
| Blood sugar | | | | 92 (31) | 88 (27) | .05 |

SOURCE: Boston University: Data from the Framingham Heart Study.

Examination of the Model 5 residuals for the regressions for males and females showed several outliers. A check of the variable values for these subjects suggested that the outliers were not artifacts of data measurement or recording error. Regressions were performed eliminating outliers. The outcome was a slight decrease in R2 for males and a slight increase in R2 for females but little change in the significance levels or regression coefficients for individual variables.

Log (base 10) transformations were performed for Model 5 dependent variables. For both males and females the results were similar: R2s decreased slightly, relative to the equations in which the dependent variable was untransformed. Again, there was little change in the significance levels of individual variables.

Logistic regression was performed using Model 5 independent variables and the dependent variable dichotomized as “no” hospitalization and “one or more” hospitalizations. For both males and females, the results of the stepwise logistic procedures and the stepwise least-squares regression procedures were essentially the same: for males, prior utilization, expiratory volume, and EKG were significant; for females, prior utilization, EKG, and blood sugar were significant.

Theoretically, the relationship between some risk factors and utilization might be curvilinear; extreme values could be associated with low levels of utilization because those at one tail of the distribution represent excellent health, while those at the opposite tail represent imminent death. In either case, extreme values might be associated with lower levels of utilization than values representing intermediate health status. To explore this possibility, the variables for expiratory volume, subscapular skinfold, and blood sugar were squared and included in the Model 5 equations. The results were unremarkable (polynomial terms were not significant) suggesting that the linear model could not be improved upon through use of this method.

A summary of the results of the regression analyses is presented in Tables 4 (males) and 5 (females).

Table 4. Summary of regression on hospitalizations for males, by model number and variable. Entries for variables are standard estimates.

| Item | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
| --- | --- | --- | --- | --- | --- |
| R2 percent | .74 | 4.97 | 4.61 | 10.17 | 9.69 |
| N | 1,173 | 1,035 | 780 | 663 | 734 |
| Age | **−.063 | | | −.033 | |
| Education | −.005 | | | −.012 | |
| Marital status | −.007 | | | .012 | |
| Exam number | ***.075 | | | .059 | .056 |
| Prior utilization | | ***.225 | | ***.259 | ***.253 |
| Electrocardiogram | | | ***.168 | ***.135 | ***.130 |
| Expiratory volume | | | ***−.082 | *−.068 | **−.076 |
| Skinfold | | | **−.075 | **−.055 | −.054 |
| Blood sugar | | | **.069 | .007 | |
| Vital capacity | | | | | |
| Left ventricular hypertrophy | | | | | |
| Systolic blood pressure | | | | | |
| Hematocrit | | | | | |
| Cholesterol | | | | | |
| Cigarettes per day | | | | | |

* Significant at the .10 level.
** Significant at the .05 level.
*** Significant at the .01 level.

SOURCE: Boston University: Data from the Framingham Heart Study.

Discussion

Demographic variables failed to prove useful as predictors of the frequency of hospitalization. Adjusted R2s were less than 1 percent for both males and females. It should be noted, however, that the demographic variables available for this study had little relation to those used in the AAPCC since age, sex, institutional and welfare status, and region of residency were either controlled through limited value ranges (age), separate analysis (sex), or were not relevant to the Framingham data set (institutional and welfare status, and residency). Our results are, however, consistent with those of Thomas and Lichtenstein (1986), who found that income, education, and marital status were poor predictors of Medicare reimbursements.

Prior utilization was an important predictor of 2-year postexam hospitalizations. These results are consistent with those of Anderson, Resnick, and Gertman (1982), Beebe, Lubitz, and Eggers (1985), Thomas and Lichtenstein (1986), and Ash (1985). In the present study, this association was stronger for males (adjusted R2 = 4.97 percent) than for females (adjusted R2 = 1.94 percent). This difference in the performance of the prior utilization measure for males and females is curious and suggests that the kinds of health problems for which males and females age 64-65 years are hospitalized are different. It is possible that males may be more apt to be hospitalized for chronic conditions, whereas females may be more apt to be hospitalized for “incident” conditions not related to prior admissions. This point is discussed further in this article.

Several risk factors for chronic diseases were found to be important predictors of hospitalization. This was true for both males and females, although again the risk factors were more powerful predictors for males (adjusted R2 = 4.61 percent) than for females (adjusted R2 = 1.88 percent). As with prior utilization, the reason for the differential between males and females is unclear. One possible explanation relates to the nature of the risk factors used for the present study. Framingham risk factors focus on cardiovascular disease. Expiratory volume, which proved useful for prediction, is a measure of obstructive airway disease (emphysema and chronic bronchitis). Both cardiovascular disease and airway disease develop earlier in males than in females. Given that the samples for both sexes were of comparable age and were for relatively young people, in light of the natural histories of these diseases, it is probable that cardiovascular and airway diseases were, on average, more advanced among the sample of males than among the sample of females. Therefore, a stronger association between these risk factors and hospitalization might be expected for males than for females.

Obesity, as measured by skinfold, was a predictor of hospitalization for males, but not for females. It is interesting that the coefficient for skinfold for males was consistently negative. As suggested above, this may reflect more advanced disease among males; weight loss often accompanies advanced stages of disease. The finding that blood sugar was a stronger predictor of hospitalization for females than for males is consistent with reports that the cardiovascular sequelae of diabetes are more pronounced for females than for males; females with high blood sugar are more apt to develop cardiovascular problems than males with high blood sugar (Kannel and Stokes, 1985).

It is important to consider those variables that were included in this analysis but that failed to attain importance. For example, because smoking is a risk factor for several major diseases (cardiovascular, lung cancer, chronic obstructive airway disease) one might expect it to have been included among the significant variables. Nevertheless, given the low incidence of lung cancer, and given that regressions in effect controlled for cardiovascular disease (EKG) and chronic airway disease (FEV1), it is not surprising that smoking was eliminated from the stepwise processes for both males and females.

Similarly, other variables that are risk factors for cardiovascular disease (e.g., elevated systolic blood pressure, ventricular rate, and cholesterol) were probably eliminated by the presence of the variable for EKG.

For both males and females, the combination of prior utilization and physiological risk factors appeared to be additive. The regressions including both sets of variables resulted in an adjusted R2 of 9.96 percent for males, roughly equivalent to the sum of the R2 for prior utilization (4.97 percent) alone, and the R2 for risk factors (4.61 percent) alone. For females, the regression combining prior utilization and risk factors yielded an adjusted R2 of 3.61 percent, also roughly equivalent to the sum of the R2 for prior utilization (1.94 percent) alone, and the R2 for risk factors (1.88 percent) alone. This seeming independence of prior utilization and risk factors is also demonstrated by the consistency of the standardized beta weights across the various regression models (see Tables 4 and 5). When prior utilization was regressed for males as a single variable, its standardized beta value was .225; when combined with risk factors, it was .253. Similarly, for females, the standardized beta weights for prior utilization were .1419 and .1429, when this variable was used singly and in combination with physiological risk factors.
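The additivity argument can be checked with simple arithmetic on the figures quoted in the paragraph above. The worked comparison below is an illustration only; the differences shown are computed here and are not reported in the article.

```latex
\begin{aligned}
\text{Males:}\quad   & 4.97 + 4.61 = 9.58, \qquad 9.96 - 9.58 = +0.38 \text{ percentage points} \\
\text{Females:}\quad & 1.94 + 1.88 = 3.82, \qquad 3.61 - 3.82 = -0.21 \text{ percentage points}
\end{aligned}
```

If two sets of predictors are uncorrelated with each other, their unadjusted R2 values sum exactly to that of the combined model. Gaps of a few tenths of a percentage point are therefore consistent with the two sets explaining largely separate portions of the variance.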

Table 5. Summary of regression on hospitalizations for females, by model number and variable. Entries for variables are standard estimates.

| Item | Model 1 | Model 2 | Model 3a | Model 4 | Model 5 |
| --- | --- | --- | --- | --- | --- |
| R2 percent | .14 | 1.94 | 1.88 | 3.23 | 3.61 |
| N | 1,474 | 1,284 | 1,261 | 1,120 | 1,134 |
| Age | *−.0475 | | | .008 | |
| Education | −.0347 | | | −.004 | |
| Marital status | −.0206 | | | −.002 | |
| Exam number | −.0193 | | | .017 | .016 |
| Prior utilization | | ***.1419 | | ***.142 | ***.142 |
| Electrocardiogram | | | ***.067 | ***.087 | ***.088 |
| Expiratory volume | | | ***−.10 | *−.55 | *−.051 |
| Skinfold | | | | | |
| Blood sugar | | | *.049 | *.054 | *.055 |
| Vital capacity | | | | | |
| Left ventricular hypertrophy | | | | | |
| Systolic blood pressure | | | | | |
| Hematocrit | | | *−.05 | −.017 | |
| Cholesterol | | | | | |
| Cigarettes per day | | | | | |

* Significant at the .10 level.
*** Significant at the .01 level.

SOURCE: Boston University: Data from the Framingham Heart Study.

The fact that the importance of prior utilization is relatively unchanged by the addition of risk factors suggests that the two sets of variables are explaining different portions of the variance in frequency of hospitalization. Several interpretations are possible. First, prior utilization may be associated with reasons for hospitalization different from those associated with risk factors. Alternatively, prior utilization and risk factors could be measuring different stages of the same disease, e.g., present during the preexam period but requiring hospitalization only during the postexam period.

Limitations

The results of this investigation must be qualified in several ways. First, there is reason to believe that recall bias may lead to an underreporting of hospitalizations among the Framingham cohort. A cross-sectional sample from the 1977-78 Current Medicare Survey, based on reimbursement data, shows that among beneficiaries 66-68 years of age, 68 percent of males and 74 percent of females had no hospitalizations. This contrasts with 75 percent (males) and 80 percent (females) of the Framingham subjects who reported no hospitalization. This difference suggests a possible rate of underreporting of around 25 percent for both sexes. A study of recall error for hospitalizations conducted by the University of Michigan Survey Research Center (on behalf of the National Health Survey) found a 19 percent rate of underreporting (for 12 months) among males and females 65-74 years of age (Cannell, Fisher, and Bakker, 1961). Because Framingham subjects were asked to recall hospitalization over a 2-year period, an underreporting rate of 25 percent is not necessarily unexpected. The nature of the bias related to underreporting is suggested by the same Michigan study, which found that underreporting is a function of the number of acute and chronic conditions reported by respondents: the fewer conditions reported, the greater the underreporting.
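The article does not state how the figure of around 25 percent was obtained. One plausible derivation, sketched below as an assumption, compares the proportion of subjects with at least one hospitalization in each source and expresses the Framingham shortfall relative to the Medicare proportion. The function name is hypothetical.

```python
def underreporting_rate(pct_none_reference, pct_none_reported):
    """Relative shortfall in the share of persons with any hospitalization.

    pct_none_reference: percent with no hospitalization in the reference
        source (here, Current Medicare Survey reimbursement data).
    pct_none_reported: percent reporting no hospitalization (Framingham).
    """
    hospitalized_reference = 100.0 - pct_none_reference
    hospitalized_reported = 100.0 - pct_none_reported
    return (hospitalized_reference - hospitalized_reported) / hospitalized_reference


males = underreporting_rate(68, 75)    # 32 percent vs. 25 percent hospitalized
females = underreporting_rate(74, 80)  # 26 percent vs. 20 percent hospitalized
print(f"males: {males:.1%}, females: {females:.1%}")
```

Under this reading the shortfall is a little over 20 percent for each sex, in line with the approximate figure cited. It measures the share of persons with any hospitalization, not the number of hospital episodes, so it is only a rough proxy for episode-level underreporting.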

Second, observer error might also bias results if, for example, examining physicians had been more assiduous in recording hospitalizations related to heart disease than hospitalizations for other conditions. We have no way of assessing the probability of this bias.

Accordingly, reported hospitalizations may have been biased toward serious conditions, and heart disease is a serious condition. Thus, a recall bias in favor of hospital episodes for serious health problems, an observation bias by recording physicians, or both could have inflated the association between cardiovascular risk factors and frequency of hospitalization.

Third, deaths posed a methodological dilemma. Death is often preceded by hospitalization. Thus, excluding subjects who died eliminates an important source of variation in the dependent variable. On the other hand, including deaths in the sample results in differentials in the probability of utilization among the subjects: subjects who live have a 2-year exposure to possible hospitalizations, whereas subjects who die have a probability proportional to their time alive. Moreover, the fact that medical records were used to count hospitalizations for persons who died could also contribute to the bias toward serious conditions because underreporting would be less likely than with recall. Because of the importance of utilization among those who died during the study period, we chose to include these subjects in the study.
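The exposure differential can be made concrete by expressing hospitalizations per year of observation. The sketch below is illustrative only: the analysis reported here used the count of hospitalizations, not a rate, and the function name and example values are hypothetical.

```python
def hospitalizations_per_person_year(n_hospitalizations, months_observed):
    """Hospitalization rate per year of exposure for one subject."""
    if months_observed <= 0:
        raise ValueError("months_observed must be positive")
    return n_hospitalizations / (months_observed / 12.0)


# Hypothetical subjects, each with one hospitalization in the interval.
survivor = hospitalizations_per_person_year(1, 24)  # observed the full 2 years
decedent = hospitalizations_per_person_year(1, 6)   # died after 6 months
print(survivor, decedent)
```

The two subjects contribute the same count to the dependent variable, yet the subject who died had one-quarter of the exposure time, which is the differential described above.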

Fourth, sampling bias represents a potential threat to the validity of this study's results. This can occur in two ways. First, the original Framingham sample was not representative of the national Medicare population. Second, the sample used in any particular regression could be biased due to missing data. To mitigate the second problem, the procedures followed attempted to maximize sample size at each stage of analysis. In addition, means were compared for the samples included in, and excluded from, the regression models. Furthermore, the overlap in variables between the separately derived models for males and females suggests that the results were valid, although perhaps idiosyncratic to the Framingham population.

Fifth, the purpose of this study was to evaluate the association between risk factors and hospitalization with the intent of drawing inferences about the potential predictive value of risk factors for health care costs. How well behavioral and physiological risk factors will predict other measures of utilization (e.g., hospital days, ambulatory care visits, Medicare expenditures) remains to be seen. Even if risk factors were not good predictors of nonhospital utilization, they might remain useful predictors of total health care costs because of the relative expense of hospital days. Nevertheless, interpretation of these results must be constrained by the limitations of the dependent variable. Frequency of hospitalization will be imperfectly correlated with reimbursement for hospitalization. The dependent variable contained no information on ambulatory care or other reimbursable health care costs.

Sixth, although the risk factors for many conditions causing hospitalization are unknown, some other risk factors are known and deserve future study within the context of predicting health services utilization. Alcohol consumption is an example. The original Framingham data set does contain some information on alcohol consumption. This information was collected irregularly over the full time period of the study, however, and thus was not available for use in this analysis without substantial loss in sample size.

In addition, health system characteristics can have a potential effect on health care utilization. Wennberg and Gittlesohn (1982) have demonstrated differentials in population-based rates for certain surgical procedures. Their studies suggest that variation tends to occur between, rather than within, hospital service areas. Because Framingham represents a relatively cohesive service area (there is only one community hospital), the influence of individual physicians on hospitalization rates should not be great. Nevertheless, it is possible that the probability of hospitalization varies, to some degree, as a function of the individual primary care physician, but this was not analyzed in this study. This consideration warrants further attention in research on the AAPCC.

Finally, further research is required to determine the extent to which the results of this study are generalizable to populations other than Framingham.

Conclusion

In conclusion, the low R2s resulting from the regression models developed for this study suggest that risk factors may be of limited practical value for predicting health care utilization for individuals. Moreover, there are many health problems for which no known risk factors exist. Thus, at present, it is probable that other approaches (e.g., reinsurance) will be required to address the immediate problems of the Medicare capitation program.

Nevertheless, this study has demonstrated that risk factors can and do make a contribution to the prediction of hospitalization as one measure of utilization. It has also been shown that this contribution is independent of prior utilization. Accordingly, the results of this investigation add to knowledge of the epidemiology of health services utilization. Risk factors other than those available for the present study are known to be important causes of morbidity and mortality. Alcohol consumption, as already noted, is one example. Moreover, new risk factors are constantly being identified and should be evaluated with respect to their contributions to predicting hospital utilization.

The association between risk factors and health care use has implications that go beyond the AAPCC. To the extent that risk factors can be modified through behavior change and/or early intervention, this association has the potential for providing the connection between the ideal of disease prevention and the policy of cost containment. Quantifying the risks for health services utilization provides the opportunity to assess the economic value of risk reduction. The emergence of competition in the health care market provides the mechanism by which estimates of long-term health costs can become near-term economic incentives. Given the pressures for cost containment, and the breadth of the community of interest (employers, government, individuals), it is likely that new methods for assessing risk for health care expenditures will find additional applications.

Acknowledgments

The authors would like to acknowledge the advice and assistance of Timothy Heeren in providing statistical consultation and the support by the Pew Memorial Trust for the research reported in this article. The comments of the three anonymous reviewers were also helpful.

Technical note

Definitions for certain risk factor variables

Subscapular skinfold

A measure of body fat obtained by measuring the skinfold under the shoulder blade.

Forced expiratory volume (FEV1)

Volume expelled (with effort) from the lungs in 1 second: a measure of lung tissue elasticity or bronchial obstruction.

Vital capacity (FVC)

Total volume expelled (with effort) from the lungs: a measure of lung tissue scarring or neuromuscular restriction of breathing.

Glucose intolerance

Test to determine the body's ability to tolerate ingested sugar: indicates diabetes mellitus.

Blood sugar

Blood sugar level also diagnoses diabetes mellitus.

EKG (General impression)

Examining physician's overall impression of electrocardiogram tracing.

Left ventricular hypertrophy

Overdevelopment of the left ventricular muscle: a measure of coronary occlusion.

Ventricular rate

Ventricular rate is equivalent to pulse rate.

Hematocrit

Proportion of blood cells to fluid volume after separation by centrifuge: indicates anemia, etc.

Footnotes

1

To limit the potentially confounding influence of age, we focused on a narrow age cohort. The specific age range of 64-65 years was selected because it offered the largest number of Medicare-eligible subjects and it represented an age at which new beneficiaries were likely to make decisions regarding enrollment in capitated programs.

References

  1. Anderson JJ, Resnick AL, Gertman PM. Prediction of Subsequent Year Reimbursement Using the Medicare History Files. Boston University Medical Center, University Health Policy Consortium; Boston: 1982. Unpublished paper.
  2. Ash A. Predicting Costs for Non-randomly Selected Medicare Populations: Doing Better with Prior Utilization Data. Boston University Medical School, Health Care Research Unit; Boston: 1985. Unpublished paper, draft of March 20, 1985.
  3. Beebe J, Lubitz J, Eggers P. Using prior utilization to determine payments for Medicare enrollees in health maintenance organizations. Health Care Financing Review. Vol. 6, No. 3. HCFA Pub. No. 03198. Office of Research and Demonstrations, Health Care Financing Administration. Washington: U.S. Government Printing Office; Spring 1985.
  4. Cannell C, Fisher G, Bakker T. Reporting of hospitalization in the Health Interview Survey. Health Statistics Series D, No. 4. USDHEW Pub. No. (PHS) 584-D4. Washington: U.S. Government Printing Office; 1961.
  5. Dawber TR. The Framingham Study. Cambridge, Mass.: Harvard University Press; 1980.
  6. Gruenberg L. The AAPCC: A Preliminary Examination of the Issues. Brandeis University, Center for Health Policy Research and Analysis, Florence Heller School; Waltham, Mass.: 1982. Unpublished paper.
  7. Gruenberg L, Thompkins C. An Analysis of Risk-Sharing and Reinsurance in Medicare HMO's. Brandeis University, Center for Health Policy Research and Analysis, Florence Heller School; Waltham, Mass.: 1984. Unpublished paper.
  8. Iglehart JK. Medicare turns to HMO's. New England Journal of Medicine. 1985;312(2):132–136. doi: 10.1056/NEJM198501103120237.
  9. Kannel WB, Stokes J. The epidemiology of coronary artery disease. In: Cohn PF, editor. Diagnosis and Therapy of Coronary Artery Disease. New York: Martinus Nijhoff Publishing; 1985.
  10. McClure W. On the research status of risk-adjusted capitation rates. Inquiry. 1984 Fall;21(3):205–213.
  11. Thomas JW, Lichtenstein R, Wyszewianski L, Berki SE. Increasing Medicare enrollment in HMO's: The need for capitation rates adjusted for health status. Inquiry. 1983 Fall;20(3):227–239.
  12. Thomas JS, Lichtenstein R. Functional health measure for adjusting health maintenance organizations capitation rates. Health Care Financing Review. Vol. 7, No. 3. HCFA Pub. No. 03222. Office of Research and Demonstrations, Health Care Financing Administration. Washington: U.S. Government Printing Office; Spring 1986.
  13. Wennberg J, Gittlesohn A. Variations in medical care among small areas. Scientific American. 1982;246(4):120–134. doi: 10.1038/scientificamerican0482-120.

Articles from Health Care Financing Review are provided here courtesy of Centers for Medicare and Medicaid Services
