Abstract
OBJECTIVE
To examine the responsiveness of the 36-Item Short Form Health Survey (SF-36) to clinical changes in three surgical groups and to study how health-related quality of life (HRQL) changes with time among patients who undergo total hip arthroplasty, thoracic surgery for treatment of non-small-cell lung cancer, or abdominal aortic aneurysm (AAA) repair.
DESIGN
Prospective cohort study with serial evaluations of HRQL preoperatively and at 1, 6, and 12 months after surgery.
SETTING
University tertiary care hospital.
PATIENTS
Of 528 patients, more than 50 years of age, who were admitted for these elective procedures, 454 (86%) provided preoperative health status data and are members of the study cohort. At 12 months after surgery, 439 (93%) of the cohort was successfully contacted and 390 (90%) provided follow-up interviews.
MEASUREMENTS AND MAIN RESULTS
The Medical Outcomes Study SF-36, the Specific Activity Scale, five validated health transition questions, and a 0 to 100 scale measure of global health were used to assess changes in health status at 1, 6, and 12 months after surgery. Change in health status as measured by the SF-36 demonstrated that physical function and role limitations due to physical health problems were worse 1 month after these three surgeries. However, by 6 months after surgery, most patients experienced significant gains in the majority of the dimensions of health, and these gains were sustained at 12 months after surgery. Longitudinal changes in the SF-36 were positively associated with responses to the five health transition questions, to changes on the Specific Activity Scale and global health rating question, and to clinical parameters for persons who had AAA repair. These findings indicate that the SF-36 has evidence of validity and is responsive to expected changes in HRQL after elective surgery for these procedures.
CONCLUSIONS
For the total hip arthroplasty patients, responsiveness was greatest for the SF-36 scales that measure physical constructs. However, for the two other procedures and at various points of recovery, significant changes were observed for all eight subscales, suggesting that responsiveness was dependent on the type of surgery and the timing of follow-up, and that multidimensional measures are needed to fully capture changes in HRQL after surgery.
Keywords: health-related quality of life, elective surgery
In recent years, pressures to prioritize health care spending have amplified the need to assess and compare the outcomes of medical interventions.1,4 To this end, investigators have developed questionnaires that broaden the focus beyond traditional clinical outcomes to include the patient's perception of health-related quality of life (HRQL).5,9 This investigation assesses the responsiveness of the Medical Outcomes Study (MOS) 36-Item Short Form Health Survey (SF-36 or RAND 36-Item Health Survey 1.0) when it is used to monitor changes in HRQL after major elective surgery and also examines how HRQL changes with time after three specific surgical procedures.10,13 The elective surgical population is of particular interest because of the high costs and uncertain benefits associated with many procedures.
The SF-36 is an attractive measure because of its brevity, patient acceptability, and rigorous psychometric development.9,18 In addition, published population-based data greatly enhance the interpretation of SF-36 scores.17 The SF-36 has well-established reliability and validity in cross-sectional studies of medical and psychiatric patients,11,18,21 and in longitudinal studies of selected medical conditions such as diabetes mellitus,15,22 end-stage renal disease,16 HIV infection,23,24 asthma,25 coronary disease,26 and psychiatric diseases.27,29
Except for arthroplasty surgery,30,35 however, only a few surgery-specific studies of highly select patients have provided data to support the SF-36's responsiveness to the effects of elective surgery over time.36,39 This is the first investigation to examine simultaneously the responsiveness of the SF-36 across multiple procedures at various postoperative time intervals. These analyses describe preoperative HRQL and observed changes in health status after three major elective surgical procedures: total hip arthroplasty (THA); thoracic surgery for the treatment of non-small-cell lung cancer; and resection of abdominal aortic aneurysm (AAA).
Subjects
Eligible subjects were patients older than 50 years of age who were admitted to Brigham and Women's Hospital, a 720-bed urban teaching hospital, for unilateral THA for the treatment of hip dysfunction caused by osteoarthritis, thoracic surgery (lobectomy or pneumonectomy) for the treatment of non-small-cell lung cancer, or repair of AAA, between November 1, 1990, and May 30, 1993. Eligibility criteria included the ability to speak English and adequate hearing and cognitive function to complete a preoperative self-administered health status questionnaire and postoperative telephone interviews. During the study period, 528 (95%) of the eligible patients who were approached by research staff gave informed consent and were enrolled in the study. Of these, 454 patients (86%) completed the preoperative health status questionnaire and constituted the cohort for this report. The distributions of gender, age, and race among the persons who did not return the preoperative questionnaire were similar to those among persons who did. However, a greater proportion of those who did not return the survey had three or more coexistent chronic medical conditions (p < .05 for linear trend).
Of the 454 participating patients, 5 died before the 1-month follow-up interview, another died before the 6-month follow-up, and 17 more deaths occurred before the 12-month interview (Fig. 1). Of these 23 deaths, 14 were among thoracic surgery patients, 6 among those who had AAA repair, and 3 among the THA patients. Because changes in HRQL are relevant only for persons who survive, subjects were deleted from the analyses from the time of their death forward.
Data Collection
All subjects were evaluated a median of 1 day before surgery. Data were prospectively collected on medical history, including 26 underlying chronic medical conditions, physical examination, and mental status. The MOS SF-36 was included preoperatively and at 1, 6, and 12 months after surgery. The SF-36 includes questions that evaluate eight health dimensions: health perception, physical function, role limitations due to physical health problems and role limitations due to emotional problems, mental health, social function, vitality, and bodily pain.9 Subjects also completed the Specific Activity Scale (SAS), an ordinally scaled, four-class measure of cardiovascular physical functioning (1 = best to 4 = worst) based on the metabolic expenditures of various personal care, housework, occupational, and recreational activities before and at 6 months after surgery.40,41
To determine whether participants considered their health to be unchanged, better, or worse when compared with their preoperative health, the 6-month follow-up interview included five dimension-specific health transition questions (Appendix A).42 Finally, during all four interviews patients were asked the following question as a measure of global health:
If you were to rate your current health on a scale from 0 to 100 with 100 being perfect health and 0 being death, what number would you rate yourself today?
Responses from the non-SF-36 questions were used to evaluate the validity of the observed changes in the SF-36 after each of the elective procedures. Patients were interviewed at 1, 6, and 12 months after surgery to determine whether the SF-36 would be sensitive to short-term disability from elective surgery, and to the long-term benefits of surgery or declines in health due to progression of the underlying condition.
Rationale for the Selection of Surgical Procedures and Expected Changes in Quality of Life
We used the framework provided by the physical, mental, and combined constructs represented in the SF-36,11 to classify the expected surgery-specific changes in SF-36 scores over time (Table 1). A range of anticipated changes in HRQL were derived from published condition-specific descriptions of outcomes after THA, thoracic surgery for lung cancer, and repair of AAA. We have used these to evaluate the responsiveness of the SF-36 relative to expected longitudinal changes in clinical status over time.
Table 1.
Total Hip Arthroplasty
Previous studies of outcomes after THA have demonstrated dramatic improvements in HRQL, particularly for the dimensions of pain and physical functioning.36,39 From published findings, we anticipated that after transient procedure-related declines in physical health, THA patients would report large improvements on the SF-36 subscales that share variance with the physical construct and moderate improvements on the combined construct of the SF-36.11 Because chronic pain can influence the severity of symptoms related to emotional well-being, we also expected subscales that share variance with the mental construct to improve moderately after surgery (Table 1).
Thoracic Surgery for Lung Cancer
As described by Dales et al.,43 we expected patients to report substantial declines in physical functioning during the immediate postoperative period followed by improvements with possible return to preoperative levels by 12 months after surgery for those who remained disease-free. If the SF-36 is a useful tool for the longitudinal assessment of persons with lung cancer, it should be sensitive to the immediate physical and emotional consequences of lost respiratory capacity due to surgery and postoperative pain, as well as the long-term effect of uncertainty about prognosis (Table 1).
Abdominal Aortic Aneurysm Repair
Repair of AAA is a procedure with known moderate to severe short-term morbidity; therefore, we expected improvement among survivors on the subscales that represent the mental health construct. Because vascular disease of the aorta rarely occurs in isolation, we expected these patients to have more coexistent vascular disease at the time of surgery. Previous investigators have demonstrated that the SF-36 is sensitive to the effect of coexistent medical conditions.44–46 For these reasons we expected the AAA patients to have pronounced short-term declines in SF-36 subscales that represent the physical and combined constructs. For those who survived the procedure, we expected a return to preoperative levels of HRQL, rather than improvements related to the procedure (Table 1).
Statistical Analysis
Health Status Scores
The scoring algorithm for the SF-36 was identical to that described by Ware et al.17 Mean scores were calculated by surgical procedure. The percentages of persons with the highest and lowest possible preoperative SF-36 scores also were recorded.
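The floor and ceiling percentages mentioned above can be tallied in a few lines. The following is a minimal sketch, assuming each subscale has already been scored on a 0 to 100 range; the function name and the example scores are hypothetical and are not study data.

```python
def floor_ceiling_percentages(scores, floor=0.0, ceiling=100.0):
    """Return (% at floor, % at ceiling) for a list of subscale scores.

    Assumes scores are already on a 0-100 scale (an assumption of this
    sketch; the scoring itself follows the published algorithm).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    at_floor = sum(1 for s in scores if s == floor)
    at_ceiling = sum(1 for s in scores if s == ceiling)
    return 100.0 * at_floor / n, 100.0 * at_ceiling / n


# Hypothetical preoperative role-physical scores (illustrative only).
example_scores = [0, 0, 25, 50, 100, 100, 100, 75]
pct_floor, pct_ceiling = floor_ceiling_percentages(example_scores)
print(f"floor: {pct_floor:.1f}%  ceiling: {pct_ceiling:.1f}%")
```

A subscale with a large share of respondents at either extreme can change in only one direction for those respondents, which matters when change scores are interpreted.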
Changes in Health Status over Time
Mean changes in health status scores at 1, 6, and 12 months after surgery relative to preoperative scores were calculated for the three surgical groups. Student's paired t tests were used to assess the significance of within-subject changes in the SF-36 subscales and the 0 to 100 global rating scale. The Wilcoxon signed-rank test was used to test for significant change in the SAS at 6 months after surgery.47
Responsiveness of the SF-36
To assess the responsiveness of each SF-36 subscale, we compared the square of the absolute value of the T statistic ([12-month after-surgery score − preoperative score]/SD of the difference) for each of the subscales relative to the subscale with the largest T statistic. This comparison simultaneously accounts for the magnitude of change over time and the variability in each subscale. The relative efficiency statistic, $(T_1/T_2)^2$, first described by Liang et al.,48 identifies which SF-36 subscale at each point in time is most responsive for the three procedures under investigation.
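As an illustration of the relative efficiency comparison, the sketch below computes a T statistic per subscale and expresses each one relative to the subscale with the largest absolute T. It assumes the conventional paired t form (mean change divided by the standard error of the change); when every subscale has the same number of paired observations, the square-root-of-n factor cancels in the ratio. The function names and data are hypothetical, not the study's code or results.

```python
import math
import statistics


def t_statistic(pre, post):
    """Paired t statistic: mean change / (SD of change / sqrt(n))."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("SD of the differences is zero")
    return statistics.mean(diffs) / (sd / math.sqrt(n))


def relative_efficiency(t_by_subscale):
    """Return (|T| / |T|_max)^2 for each subscale."""
    t_max = max(abs(t) for t in t_by_subscale.values())
    return {name: (abs(t) / t_max) ** 2 for name, t in t_by_subscale.items()}


# Hypothetical preoperative and 12-month scores for four patients.
pre = {"bodily_pain": [30, 40, 35, 45], "physical_function": [40, 50, 45, 55]}
post = {"bodily_pain": [70, 75, 80, 85], "physical_function": [50, 65, 50, 70]}

t_values = {name: t_statistic(pre[name], post[name]) for name in pre}
re_values = relative_efficiency(t_values)
for name, value in re_values.items():
    print(f"{name}: relative efficiency = {value:.2f}")
```

The most responsive subscale scores 1.0 by construction, and every other subscale falls between 0 and 1.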
Validity of the SF-36
To evaluate the validity of the longitudinal changes observed with the SF-36 for the three surgical groups, we conducted three statistical comparisons.49 The first test compared the magnitude of the correlations between changes in the SF-36 and changes in the overall-health-rating question (0–100) and SAS class. We interpreted the clinical significance of these observed correlations by adopting a framework described by McHorney et al.,11 in which correlations that are greater than or equal to 0.70 are classified as “high,” those between 0.3 and 0.7 are considered to be “moderate,” and those less than 0.3 are considered to be “low.” Because response to the 0 to 100 rating question was based on a person's synthesis of overall health, our expectation was that changes in the response to this question would be moderately correlated with the SF-36 subscales that contribute to the combined construct (health perception and vitality) and that the remainder of the subscales would have low correlations (<0.3) with the 0 to 100 rating question.
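The classification thresholds above translate directly into code. This is a minimal sketch: the helper names are hypothetical, the use of the absolute value of r is an assumption (the text does not say how negative correlations are treated), and a value of exactly 0.3 is placed in the moderate band because only values below 0.3 are described as low.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of change scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classify_correlation(r):
    """Label a correlation as high (>= 0.70), moderate (0.30-0.70), or low."""
    magnitude = abs(r)  # assumption: strength is judged regardless of sign
    if magnitude >= 0.70:
        return "high"
    if magnitude >= 0.30:
        return "moderate"
    return "low"


# Hypothetical change scores on two measures (illustrative only).
sf36_change = [5, 10, -5, 20, 0, 15]
global_change = [0, 10, 5, 10, -5, 20]
r = pearson_r(sf36_change, global_change)
print(f"r = {r:.2f} ({classify_correlation(r)})")
```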
The SAS covers some physical activities that are similar to those found in the SF-36 physical functioning scale; therefore we expected changes in this measure to correlate moderately (0.3–0.7) with changes in the SF-36 subscales that measure the physical construct (physical functioning, role limitations due to physical problems, and bodily pain). Because poorer cardiovascular functioning can lead to fatigue, we expected moderate correlations with changes in SAS and the SF-36 vitality scale. We expected low correlations between the SAS and the remaining SF-36 scales.
A person's impression of transition in HRQL is the closest proxy to a “gold standard” for change in HRQL (Appendix A).42 Our second statistical comparison uses data from transition questions to compare the magnitude of the dimension-specific changes in SF-36 scores for participants who reported that they were better, the same, or worse at 6 months after surgery. To provide a reference point for the clinical interpretation of the SF-36 scores, the third test statistic uses Student's t tests for pairwise comparisons of preoperative and postoperative scores to published age- and gender-adjusted population-based scores.17 To adjust the published scores, they were weighted to match the surgery-specific distribution of age and gender in our sample.
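One way to weight published norms to a sample's age and gender mix is direct standardization: each stratum's published mean is weighted by the number of study patients in that stratum. The sketch below assumes that approach; the strata, the norm values, and the counts are hypothetical and are not taken from the published norms or from this cohort.

```python
def weighted_norm(norm_by_stratum, sample_counts):
    """Weight stratum-specific norm means by the sample's stratum counts."""
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("sample_counts must contain at least one patient")
    missing = [s for s in sample_counts if s not in norm_by_stratum]
    if missing:
        raise KeyError(f"no published norm for strata: {missing}")
    weighted_sum = sum(norm_by_stratum[s] * n for s, n in sample_counts.items())
    return weighted_sum / total


# Hypothetical published means for one subscale, keyed by (gender, age band).
norms = {("F", "65-74"): 70.0, ("M", "65-74"): 74.0}
# Hypothetical number of study patients in each stratum.
counts = {("F", "65-74"): 30, ("M", "65-74"): 10}

adjusted = weighted_norm(norms, counts)
print(f"adjusted population mean = {adjusted:.1f}")
```

The adjusted mean is then the reference value against which a surgical group's preoperative or postoperative mean is compared.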
Tests of Clinical Validity: Procedure-Specific Analyses of SF-36 Scores
Within the AAA subgroup, validity of the observed SF-36 subscale changes was also assessed using clinical parameters by comparing mean change at 1 year after surgery among those with three or more coexistent medical conditions relative to those with fewer than three comorbidities. From the work of Greenfield et al.,44–46 we expected that those with more medical comorbidity would have greater declines in HRQL over time.
Reliability of the SF-36
To determine whether the SF-36 was equally reliable across different surgical categories, we calculated Cronbach's coefficient α as a measure of internal consistency for each of the multi-item subscales of the SF-36 in each of the groups.50 Significance was defined as a two-tailed p value ≤.05 for all statistical tests.
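Cronbach's coefficient α for a multi-item subscale can be computed from the item variances and the variance of the summed score: α = k/(k − 1) × (1 − Σ item variances / variance of total), where k is the number of items. The sketch below is a minimal illustration using sample variances; the function name and the item responses are hypothetical.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha is defined only for multi-item scales")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer the same k items")
    # Variance of each item across respondents.
    item_variances = [statistics.variance(col) for col in zip(*responses)]
    # Variance of the summed scale score across respondents.
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from four people to a three-item subscale.
example_responses = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4]]
alpha = cronbach_alpha(example_responses)
print(f"alpha = {alpha:.2f}")
```

Running this separately for each surgical group, one subscale at a time, gives the group-specific estimates of internal consistency described above.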
Population
The 454 patients (54% female; 96% white) had a mean (± SD) age of 67 ± 9 years (Table 2). Common comorbid conditions were hypertension (42%), arthritis (22%), ischemic heart disease (22%), and diabetes mellitus (8%); 238 (52%) of the patients had only one or no comorbid conditions (Table 2). Patients in the three surgical groups varied by demographic and medical characteristics.
Table 2.
The preoperative SF-36 scores that contribute to the physical construct (physical function, role limitations due to physical health problems, and bodily pain) were lowest for patients scheduled for THA, whereas scores on the SF-36 health perception subscale were lower for patients with potentially life-threatening conditions (lung cancer or AAA) (Table 3). More than one third of the preoperative scores were at the ceiling or floor of the role-physical and role-emotional subscales and the social-functioning subscale, indicating that for these three subscales, scores after surgery could only stay the same or change in one direction. Reliability estimates based on Cronbach's coefficient α ranged from 0.86 to 0.92 for the SF-36 subscales.
Table 3.
Responsiveness of the SF-36 to Within-Procedure Change in Health Status over Time
In this section we first compare the observed changes in SF-36 scores to the expected changes described in Table 1. We then report which SF-36 scale is most responsive for each procedure, and finally we compare the preoperative and postoperative SF-36 scores to a population-based sample. As was hypothesized, for patients undergoing any one of the elective surgeries under consideration, SF-36 scores obtained 1 month after surgery demonstrated significant declines in physical functioning and role limitations due to physical problems (Table 4).
Table 4.
Total Hip Arthroplasty
An unexpected early benefit from surgery was large and significant improvement in bodily pain. Improvements in pain became even more pronounced by 6 months and were sustained at 12 months after surgery. The dramatic improvements observed across the mental and combined health constructs were of greater magnitude than anticipated. By 12 months after surgery, the THA patients had substantial and significant (p < .05) improvements across all eight SF-36 dimensions of HRQL.
For patients undergoing THA, bodily pain was the most responsive subscale at all three points in time. However, at 6 and 12 months after surgery, the scales for physical functioning and role limitations due to physical problems had relative efficiency statistics above 0.47, indicating that they were also moderately responsive.
The clinical relevance of the observed changes in SF-36 scores can be appreciated when the THA patients’ scores are compared with those from an age- and gender-adjusted population-based sample (Fig. 2). For seven of the eight subscales, the preoperative scores were significantly lower than scores from the population sample. However, by 6 months after surgery, all eight subscales were the same or better for the THA patients relative to the population-based sample. This improvement was sustained at 12 months after surgery.
Thoracic Surgery for Lung Cancer
As expected, patients who had thoracotomy reported increased pain and significant declines in all dimensions of health that contribute to the physical construct. Unexpectedly, at 1 month after surgery, mental health, role limitations due to mental health, and health perception did not decline. However, by 6 months after surgery, health perception became significantly poorer than preoperative levels and did not improve by 12 months after surgery. Patients continued to report significant declines in physical function, bodily pain, and vitality at 6 and 12 months after surgery. As expected, subscales that contribute to the mental construct improved by 12 months after surgery.
For persons undergoing resection of lung cancer, the bodily pain scale was the most responsive SF-36 scale at 1 month after surgery. However, at later follow-up times, 6 of the 8 subscales were moderately to highly responsive (relative efficiency statistic 0.4–0.9), suggesting that the patients treated with thoracotomy for lung cancer experienced positive and negative changes across many dimensions of health, rather than along a particular construct.
When compared with a population-based sample, preoperative patients with non-small-cell lung cancer had significantly poorer HRQL principally among the scales that share variance with the mental construct (social function, role limitations due to mental health problems, and mental health) and the combined construct (health perception and vitality) (Fig. 2). As only patients with localized disease are considered candidates for surgery, it is not surprising that preoperative HRQL along the physical construct was not significantly worse than that of an age- and gender-adjusted population-based sample. At 1 month after surgery, all SF-36 subscales except for health perception were significantly lower than the scores reported by a comparable population-based sample. Although lower than preoperative levels, by 12 months after surgery, the SF-36 physical functioning, bodily pain, and health perception scales were no longer different from those reported in the population-based sample (Fig. 2).
Abdominal Aortic Aneurysm Repair
As was expected, patients undergoing AAA repair reported significant declines in most of the scales that contribute to the physical and combined constructs at 1 month after surgery, but these scores returned to preoperative levels or better by 6 months after surgery (Table 4). The return to preoperative levels was sustained at 12 months after surgery. Unexpectedly, by 6 months after surgery, patients reported significant improvements in scales that contribute to the mental construct relative to preoperative levels.
Because patients who had AAA repair were still recovering from surgery at 1 month, the most responsive subscales changed in the negative direction; these included role limitations due to physical problems, vitality, physical functioning, and social functioning (relative efficiency statistic > 0.5). By 6 months after surgery, responsiveness among the AAA patients was greatest for the subscales that contribute to the mental construct of the SF-36. As was expected, the 12-month change scores for six of the eight subscales were negative (−7 to −12 points) for AAA patients who had three or more medical comorbidities before surgery. These change scores were significantly lower than those for AAA patients who had fewer comorbidities.
When compared with a population-based sample, preoperative AAA patients had significantly greater role limitations due to physical or emotional problems, and significantly poorer mental health scores. At 1 month after surgery, except for pain and role limitations due to mental health problems, all of the SF-36 scores were significantly lower for AAA patients. However, by 6 months after surgery, all eight subscales were the same or better than those of a comparable population-based sample, and these improvements were sustained at 12 months after surgery (Fig. 2).
The 0 to 100 health-rating scores improved significantly for THA patients at 6 and 12 months after surgery and declined significantly for the patients with lung cancer. The SAS improved only for the THA patients.
Validity of the SF-36
There was evidence of correlational validity when change scores on the SF-36 were compared with the observed changes using other questions designed to capture similar information. The strongest evidence was moderate correlation between changes at 6 months for the SF-36 subscales that contribute to the combined construct (vitality and health perception) and 6-month changes observed with the second measure of global health, the 0 to 100 rating question (r = .4; p = .0001). Of note, changes in the 0 to 100 rating question were also moderately correlated with changes in the SF-36 mental health score (r = .4; p = .0001). Statistically significant but lower correlations were observed for the remainder of the SF-36 subscales (r = .1 to .3). Similarly, changes in the two measures of physical functioning (the SF-36 physical function subscale and the SAS) were also moderately correlated (r = .3; p = .0001). As would be expected, change on the SAS did not correlate with SF-36 subscales designed to measure mental health.
A second test of validity compared the mean change in SF-36 scores for persons who categorized themselves as improved, the same, or worse after surgery. These analyses demonstrated that the mean improvements in SF-36 subscale scores were greatest for those subjects who reported improvement in their health at 6 months after surgery (Table 5). Declines in dimension-specific health were observed for all five subscales for those who reported that these dimensions were worse at 6 months after surgery; however, for four of the five subscales, the declines were not statistically significant (Table 5). For those subjects who reported no change in HRQL at 6 months after surgery, on average, SF-36 scores improved significantly for two of the five dimensions tested, but the magnitude of these SF-36 improvements was significantly lower when compared with subjects who reported HRQL improvements.
Table 5.
DISCUSSION
In this report, we demonstrate that the SF-36 was responsive in the expected direction over time, as judged by what is known clinically about recovery after the three surgeries studied; that no one subscale or construct was responsive for all three procedures at all of the points in time after surgery; and that the observed changes in the SF-36 tended to have a biphasic pattern after these elective surgeries.
Our strongest evidence for the validity of the SF-36 as an outcome tool after major elective surgery lies in the relation between the expected clinical response for each of these surgeries and the observed longitudinal changes in SF-36 scores. These data are consistent with and extend the findings of other investigators who have found the SF-36 to be responsive after knee and hip arthroplastic surgeries,31,35 and surgery for primary hyperparathyroidism,38 and run contrary to the minimal change in SF-36 scores observed after transurethral prostatectomy,37 and cataract surgery.51
Our findings with regard to the relative responsiveness of each of the SF-36 subscales demonstrate that conclusions about the most sensitive subscale are greatly influenced by the type of procedure and the timing of postoperative follow-up. These data support the importance of multidimensional assessments of health after elective surgery and indicate that the specific timing of the postoperative evaluation with the SF-36 may influence the interpretation of benefit derived from a procedure. It is noteworthy that preoperative ceiling effects, such as those observed for the two role-functioning and social-functioning subscales in this cohort, greatly influence responsiveness.
A biphasic pattern of decline in HRQL followed by significant long-term gains was observed for the two procedures for which it would be expected, after THA and after thoracic surgery for patients with lung cancer. The observed changes were most dramatic for patients who underwent THA, a procedure specifically designed to improve physical capacity and to decrease pain. The average decline in social function at 1 month after surgery for the cohort overall was equivalent to a change from being limited in one's ability to visit with friends some of the time to being limited most of the time. These declines in health status encountered 1 month after surgery are unlikely to be captured reliably in routinely collected administrative or medical record data and have implications for the provision of in-home services as economic pressures lead to shorter hospital lengths of stay.
Some dimension-specific changes in health did not conform to a biphasic pattern. For example, patients who had AAA repairs reported significant improvement in health perception despite worsening across many dimensions of health at 1 month after surgery. This positive change early during the postoperative recovery may reflect their relief at surviving a major life-threatening condition. Conversely, for those patients who had thoracic surgery, persistent declines in health perception, physical function, and vitality were observed at 12 months after surgery. The declines in HRQL for this group may have been influenced by the poor or uncertain long-term prognosis for many of these patients.
The validity of the SF-36 as a measure of outcome after elective surgery is supported by many of our results. As mentioned above, the strongest evidence for validity is the direction and magnitude of dimension-specific changes observed in each surgical group relative to what would be expected clinically. Moderate correlational validity is supported by the associations between the SF-36 subscales and the other indicators of physical function and global health perception that were measured in the cohort. Further evidence for construct validity of the SF-36 is provided by the positive association between the responses to the health transition questions, an independent patient impression of change in HRQL, and the observed change in SF-36 scores over time. However, it is important to note that participants who reported no change in dimension-specific health status over time actually had positive changes in their SF-36 scores. It is possible that at 6 months after surgery subjects answering the health transition questions were biased to recall preoperative functioning as better than it actually was at the time of the preoperative interview.
A strength of this study is that we have comparative data for three common surgical procedures collected simultaneously. Data are available that describe changes in health status after elective surgery.30,40,44,47,50 But, for the most part, these investigations have been limited to a small sample size and one procedure. In the aggregate, changes in SF-36 scores may prove to be useful indicators of quality of care. However, before comparisons can be made across different health care settings, reliable and valid methods for case-mix adjustment are needed.52
Our study is limited by a problem that is generalizable to all research evaluating health status outcomes—the lack of a single reference standard for HRQL against which the SF-36 and other instruments can be measured. The sensitivity of instruments for detecting small changes in health status therefore remains uncertain, but the patterns of change and the consistency of results with the different instruments suggest that the differences detected with the SF-36 are valid. Other limitations are the uncertain generalizability of data from a study conducted in a university-based tertiary care population and the 1-year follow-up time. We assessed preoperative health status after the patient had decided to undergo surgery, but before the surgery itself. Therefore the timing of data collection may have biased the preoperative scores and the subsequent differences that were observed.
Future research should examine the outcomes of specific operations in different populations, define the duration of impaired functional status after elective surgery, and when possible, compare patients undergoing surgery with those randomized to alternative treatments. Policy makers who set priorities for allocating health care resources should use standard approaches to compare the outcomes of different medical interventions.
Transition Questions
REFERENCES
1. Klevit HD, Bates AC, Castanares T, Kirk EP, Sipes-Metzler PR, Wopat R. Prioritization of health care services: a progress report by the Oregon Health Services Commission. Arch Intern Med. 1991;151:912–6.
2. Hadorn DC. Setting health care priorities in Oregon: cost-effectiveness meets the rule of rescue. JAMA. 1991;265:2218–25.
3. Daniels N. Is the Oregon plan fair? JAMA. 1991;265:2232–5.
4. Lansky D, Butler JBV, Waller FT. Using health status measures in the hospital setting: from acute care to ‘outcomes management.’ Med Care. 1992;30:57–73. doi: 10.1097/00005650-199205001-00006.
5. Kaplan RM, Bush JW, Berry CC. Health status: types of validity and the Index of Well-Being. Health Serv Res. 1976;11:478–507.
6. Bergner M, Bobbit RA, Carter WB, et al. The Sickness Impact Profile: development and final revision of a health status measure. Med Care. 1981;19:787–805. doi: 10.1097/00005650-198108000-00001.
7. Jette AM, Davies AR, Cleary PD, et al. The functional status questionnaire: reliability and validity when used in primary care. J Gen Intern Med. 1986;1:143–9. doi: 10.1007/BF02602324.
8. Hunt SM, McEwen J, McKenna SP. Measuring health status: a new tool for clinicians and epidemiologists. J R Coll Gen Pract. 1985;35:857–9.
9. Stewart AL, Hays RD, Ware JE. The MOS Short-form General Health Survey: reliability and validity in a patient population. Med Care. 1988;26:724–35. doi: 10.1097/00005650-198807000-00007.
10. Ware JE, Sherbourne CD. The MOS 36-Item Short-Form Health Survey (SF-36), I: conceptual framework and item selection. Med Care. 1992;30:473–83.
11. McHorney CA, Ware JE, Raczek AE. The MOS 36-Item Short Form Health Survey (SF-36), II: psychometric and clinical tests of validity in measuring physical and mental health constructs. Med Care. 1993;31:247–63. doi: 10.1097/00005650-199303000-00006.
12. McHorney CA, Lu JFR, et al. The MOS 36-Item Short-Form Health Survey (SF-36), III: tests of data quality, scaling assumptions, and validity among diverse patient groups. Med Care. 1994;32(11):40–66. doi: 10.1097/00005650-199401000-00004.
13. Hays RD, Sherbourne CD, Mazel RM. The RAND 36-Item Health Survey 1.0. Health Econ. 1993;2:217–27. doi: 10.1002/hec.4730020305.
14. Ware JE, Kosinski M, Bayliss MS, McHorney CA, Rogers WH, Raczek A. Comparison of methods for the scoring and statistical analysis of SF-36 health profile and summary measures: summary of results from the Medical Outcome Study. Med Care. 1995;33:AS264–79.
15. Nerenz DR, Repansky DP, Whitehouse MD, et al. Ongoing assessment of health status in patients with diabetes mellitus using the SF-36 and diabetes TYPE Scale. Med Care. 1992;30:MS112–24. doi: 10.1097/00005650-199205001-00010.
- 16.Kurtin PS, Davies AR, Meyer KB, et al. Patient-based health status measurements in outpatient dialysis: early experiences in developing an outcomes assessment program. Med Care. 1992;30:MS136–49. doi: 10.1097/00005650-199205001-00012. [DOI] [PubMed] [Google Scholar]
- 17.Ware JE, Snow KK, Kosinski M, Gandek B. Boston, Mass. 1993. SF-36 Health Survey: Manual and Interpretation Guide. Nimrod Press. [Google Scholar]
- 18.Stewart AL, Greenfield S, Hays RD, et al. Functional status and well-being of patients with chronic medical conditions, results from the Medical Outcomes Study. JAMA. 1989;262:907–13. [PubMed] [Google Scholar]
- 19.Brazier JE, Harper R, Jones NMB, et al. Validating the SF-36 Health Survey questionnaire: new outcome measure for primary care. BMJ. 1992;305:160–4. doi: 10.1136/bmj.305.6846.160. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Fryback DG, Dasbach EJ, Klein R, et al. The Beaver Dam Health Outcomes Study: initial catalog of health state quality factors. Med Decis Making. 1993;13(11):89–102. doi: 10.1177/0272989X9301300202. [DOI] [PubMed] [Google Scholar]
- 21.Johnson PA, Goldman L, Orav EJ, et al. Comparison of the Medical Outcomes Study Short-Form 36-Item Health Survey in black patients and white patients with acute chest pain. Med Care. 1995;33:145–60. [PubMed] [Google Scholar]
- 22.Weinberger M, Kirkman MS, Samsa GP, et al. The relationship between glycemic control and health-related quality of life in patients with non-insulin-dependent diabetes mellitus. Med Care. 1994;32:1173–81. doi: 10.1097/00005650-199412000-00002. [DOI] [PubMed] [Google Scholar]
- 23.Wu AW, Rubin HR, Mathews WC, et al. A health status questionnaire using 30 items from the Medical Outcomes Study: preliminary validation in persons with early HIV infection. Med Care. 1991;29:786–98. doi: 10.1097/00005650-199108000-00011. [DOI] [PubMed] [Google Scholar]
- 24.Hays RD, Shapiro MF. An overview of generic health-related quality of life measures for HIV research. Qual Life Res. 1992;1:91–7. doi: 10.1007/BF00439716. [DOI] [PubMed] [Google Scholar]
- 25.Bousquet J, Knani J, Dhivert H, et al. Quality of life in asthma, I: internal consistency and validity of the SF-36 questionnaire. Am J Respir Crit Care Med. 1994;149:371–5. doi: 10.1164/ajrccm.149.2.8306032. [DOI] [PubMed] [Google Scholar]
- 26.Krumholz HM, McHorney CA, Clark L, et al. Changes in health after elective percutaneous coronary revascularization. Med Care. 1996;34:754–9. doi: 10.1097/00005650-199608000-00003. [DOI] [PubMed] [Google Scholar]
- 27.Fifer S, Mathias SD, Patrick DL, et al. Untreated anxiety among adult primary care patients in a health maintenance organization. Arch Gen Psychiatry. 1994;51:740–50. doi: 10.1001/archpsyc.1994.03950090072010. [DOI] [PubMed] [Google Scholar]
- 28.Hays RD, Wells KB, Sherbourne CB, et al. Functioning and well-being outcomes of patients with depression compared to chronic medical illness. Arch Gen Psychiatry. 1995;52:11–9. doi: 10.1001/archpsyc.1995.03950130011002. [DOI] [PubMed] [Google Scholar]
- 29.Wells KB, Burnam MA, Rogers W, et al. The course of depression in adult outpatients: results from the Medical Outcomes Study. Arch Gen Psychiatry. 1992;49:788–94. doi: 10.1001/archpsyc.1992.01820100032007. [DOI] [PubMed] [Google Scholar]
- 30.Kantz ME, Harris WJ, Levitsky K, et al. Methods for assessing condition-specific and generic functional status outcomes after total knee replacement. Med Care. 1992;30:MS240–52. doi: 10.1097/00005650-199205001-00024. [DOI] [PubMed] [Google Scholar]
- 31.Ritter MA, Albohm MJ, Keating EM, et al. Comparative outcomes of total joint arthroplasty. J Arthroplasty. 1995;10:737–41. doi: 10.1016/s0883-5403(05)80068-3. [DOI] [PubMed] [Google Scholar]
- 32.Dawson J, Fitzpatrick R, Murray D, Carr A. Comparison of measures to assess outcomes in total hip replacement surgery. Qual Health Care. 1996;5:81–8. doi: 10.1136/qshc.5.2.81. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Anderson JG, Wixson RL, Tsai D, et al. Functional outcomes and patient satisfaction in total knee patients over the age of 75. J Arthroplasty. 1996;11:831–40. doi: 10.1016/s0883-5403(96)80183-5. [DOI] [PubMed] [Google Scholar]
- 34.O'Boyle CA, McGess H, Hickey A, O'Malley K, Joyce CRB. Individual quality of life in patients undergoing hip replacement. Lancet. 1992;339:1088–91. doi: 10.1016/0140-6736(92)90673-q. [DOI] [PubMed] [Google Scholar]
- 35.Cleary PD, Reilly DT, Greenfield S, et al. Using patient reports to assess health-related quality of life after total hip replacement. Qual Life Res. 1993;2:3–11. doi: 10.1007/BF00642884. [DOI] [PubMed] [Google Scholar]
- 36.Phillips RC, Lansky DJ. Outcomes management in heart valve replacement surgery: early experience. J Heart Valve Dis. 1992;1:42–50. [PubMed] [Google Scholar]
- 37.Keoghane SR, Lawrence KC, Jenkinson CP, et al. The Oxford laser prostate trial: sensitivity to change of three measures of outcome. Urology. 1996;47:43–7. doi: 10.1016/s0090-4295(99)80380-x. [DOI] [PubMed] [Google Scholar]
- 38.Burney RE, Jones KR, Coon JW, et al. Assessment of patient outcomes after operation for primary hyperparathyroidism. Surgery. 1996;120:1013–9. doi: 10.1016/s0039-6060(96)80048-1. [DOI] [PubMed] [Google Scholar]
- 39.Currie IC, Wilson YG, Baird RN, Larmont PM. Treatment of intermittent claudication: the impact on quality of life. Eur J Vasc Endovasc Surg. 1995;10:356–61. doi: 10.1016/s1078-5884(05)80057-7. [DOI] [PubMed] [Google Scholar]
- 40.Goldman L, Hashimoto B, Cook EF, Loscalzo A. Comparative reproducibility and validity of systems for assessing cardiovascular functional class: advantages of a new Specific Activity Scale. Circulation. 1981;64:1227–34. doi: 10.1161/01.cir.64.6.1227. [DOI] [PubMed] [Google Scholar]
- 41.Lee TH, Shammash JB, Ribeiro JP, Hartley LH, Sherwood J, Goldman L. Estimation of maximum oxygen uptake from clinical data: performance of the Specific Activity Scale. Am Heart J. 1988;115:203–4. doi: 10.1016/0002-8703(88)90545-5. [DOI] [PubMed] [Google Scholar]
- 42.Bindman AB, Keane D, Lurie N. Measuring health changes among severely ill patients: the floor phenomenon. Med Care. 1990;28:1142–52. doi: 10.1097/00005650-199012000-00003. [DOI] [PubMed] [Google Scholar]
- 43.Dales RE, Belanger R, Shamji FM, et al. Quality-of-life following thoracotomy for lung cancer. J Clin Epidemiol. 1994;47:1443–9. doi: 10.1016/0895-4356(94)90088-4. [DOI] [PubMed] [Google Scholar]
- 44.Greenfield S, Apolone G, McNeil BJ, Cleary PD. The importance of co-existent disease in the occurrence of postoperative complications and one-year recovery in patients undergoing total hip replacement. Comorbidity and outcomes after hip replacement. Med Care. 1993;31:141–54. doi: 10.1097/00005650-199302000-00005. [DOI] [PubMed] [Google Scholar]
- 45.Cleary PD, Greenfield S, Mulley AG, et al. Variations in length of stay and outcomes for six medical and surgical conditions in Massachusetts and California. JAMA. 1991;266:73–9. [PubMed] [Google Scholar]
- 46.Greenfield S, Sullivan L, Dukes KA, Silliman R, D’Agostino R, Kaplan SH. Development and testing of a new measure for case mix for use in office practice. Med Care. 1995;33(11):AS47–55. [PubMed] [Google Scholar]
- 47.Rosner B. Boston, Mass: Duxbury Press. 1986. Fundamentals of Biostatistics. pp. 286–7. [Google Scholar]
- 48.Liang MH, Larson MG, Cullen KE, Schwartz JA. Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis Rheum. 1985;28:542–7. doi: 10.1002/art.1780280513. [DOI] [PubMed] [Google Scholar]
- 49.McDowell I, Newell C. New York, NY: Oxford University Press; 1987. Measuring Health: A Guide to Rating Scales and Questionnaires. p. 14. [Google Scholar]
- 50.Nunally JC. New York, NY: McGraw Hill; Psychometric Theory. pp. 190–255. [Google Scholar]
- 51.Mangione CM, Phillips RS, Lawrence MG, et al. Improved visual function and attenuation of age-related declines in health-related quality of life after cataract extraction. Arch Ophthalmol. 1994;112:1419–25. doi: 10.1001/archopht.1994.01090230033017. [DOI] [PubMed] [Google Scholar]
- 52.Cleary PD, Greenfield S, McNeil BJ. Assessing quality of life after surgery. Control Clin Trials. 1991;12:189–203S. doi: 10.1016/s0197-2456(05)80023-6. [DOI] [PubMed] [Google Scholar]