North American Spine Society Journal. 2024 Nov 9;20:100571. doi: 10.1016/j.xnsj.2024.100571

The Oswestry Disability Index in elective single level lumbar fusion: Is 3 months follow up enough?

Heeren Makanji a,b, Matthew J Solomito a
PMCID: PMC11697366  PMID: 39759222

Abstract

Background

Prospective, longitudinal collection of patient-reported outcomes (PROs) has become an essential metric in orthopedics. Despite the utility of PROs, data collection presents a significant challenge to the healthcare system. There is a need to better understand whether serial data collection over a 1 to 2 year postoperative period is truly warranted. The purpose of this study was to determine whether continued PRO collection after 3 months postoperatively is needed in patients who underwent lumbar fusion.

Methods

This retrospective study included 239 patients who underwent an elective single level lumbar fusion between April 1, 2020 and February 1, 2023. Changes in Oswestry Disability Index (ODI) scores over a 1 year period were assessed for all patients. Patients were placed into 1 of 3 study groups: those who improved by 10 points, those who worsened by 10 points, and those who did not change (score change of less than 10 points in either direction). Movement between study groups, the minimal clinically important difference (MCID), and patient satisfaction were analyzed for each patient.

Results

Improvement between preoperative and the 3 month postoperative evaluations was noted for most patients. There was limited change in scores after 3 months. Patients in the improved group continued to improve through 1 year postoperative. Patients in the worsened group continued to decline by 1 year postoperative. Patients in the no change group demonstrated the highest potential to change groups.

Conclusions

Patients with substantial improvement or worsening at 3 months have a low likelihood of substantial clinical change thereafter. This subset of patients may not need further evaluation with PROs, but those in the worsened group may benefit from other interventions to potentially alter their course.

Keywords: Lumbar spine, Outcomes, Patient reported outcomes, Quality improvement, Satisfaction, Surgery

Introduction

Single level lumbar fusions are one of the most common spine procedures in the United States and increased in prevalence by 62.3% from 2004 through 2015; the rate is expected to continue increasing in proportion with the aging United States population [1,2]. The rise in lumbar fusions and associated costs has prompted recent changes by the Centers for Medicare & Medicaid Services (CMS) to begin transitioning the United States healthcare system from a fee based system to a value based system [3–6]. As a result of this transition, prospective, longitudinal collection of patient-reported outcomes (PROs) has become a priority, and emphasis has been placed on patients attaining the minimal clinically important difference (MCID) by 1 year [7,8]. It has been suggested that longitudinal data collection for at least 1 year is needed to assess patient benefit and the "value" of certain interventions [4,6,8,9].

The most common disease specific patient reported outcomes tool to assess lower back pain and dysfunction is the Oswestry Disability Index (ODI) [10–14]. The ODI was first developed in 1976 and published in 1980; since its inception the ODI has had 4 revisions, but has remained consistent with its initial psychometric properties [11–13]. The ODI's ubiquitous adoption stems from its high degree of specificity and sensitivity along with its adaptability to assess both current function and changes in function following surgical intervention [11–13].

Despite the utility of PROs in healthcare both as a clinical and research tool, the necessary longitudinal data collection can be expensive and difficult [7,9]. Furthermore, missing data points and the potential for selection bias can reduce the utility of PRO data in a healthcare setting [9]. Therefore, recent studies have questioned the need for 1 and 2-year outcomes. One such study suggested that ODI scores at 1 year can adequately predict 2 year outcomes, which calls into question the need for collecting data 2 years postoperatively [7]. However, consensus around earlier time points is lacking; one study indicated that while the population data showed limited change in scores between the 3 month and 1 year time points, there was significant variation within individual patient scores over the same time period [9]. That study, however, looked at a variety of lumbar conditions and procedures through a national registry, in which granular clinical data may be incomplete or unavailable. Hence, there is a need to further evaluate the relationship between the 3-month postoperative time point and the utility of subsequent data collection in a homogeneous population of patients undergoing a similar spine procedure. The purpose of this study was to determine whether additional time points for ODI data collection are necessary after the 3 month postoperative evaluation in patients who undergo elective single level lumbar fusion.

Materials and methods

Patient selection

This retrospective study was approved by our center's Institutional Review Board. This study was performed as a secondary analysis of a larger study designed to evaluate the utility and effectiveness of patient reported outcomes scores in common elective orthopedic procedures. The target population for this study consisted of patients between the ages of 18 and 89 years old who underwent an elective single level lumbar fusion between April 2020 and February 2023. Patients were enrolled from the practices of 7 board certified, fellowship trained spine surgeons and were treated at a free-standing orthopedic specialty hospital. Patients were included in this study if they had a preoperative optimization appointment, remained inpatient for at least 1 day following their surgery, and completed their patient reported outcomes surveys, specifically the ODI, preoperatively and at 3, 6 and 12 months following their index surgical procedure.

Patients were excluded from this study if they had a history of illicit drug use, had a diagnosis of dementia, were undergoing surgery due to a trauma, were actively being treated for cancer, were undergoing a planned staged procedure, had any surgical interventions within 13 months before or after their index surgery, were using worker's compensation, or if the patient had known neurological deficits (eg, postpolio, cerebral palsy, etc.). The 13 month restriction was designed to eliminate potential confounding factors that may influence the data analysis should a preoperative and follow-up assessment overlap.

Data collection and analysis

All PROs were collected via Force (Force Therapeutics, New York, NY), an online patient engagement platform that allows for collection of PROs following a predefined clinical protocol. The ODI scores were collected for all patients and the differences in scores were assessed between each sequential assessment (eg, preoperative to 3 months postoperative, 3 months postoperative to 6 months postoperative, etc.). Although the MCID for the ODI is 12.8 points [9,14], recent studies have indicated that the MCID, a statistically defined threshold value, may not provide significant clinical relevance [15,16]. Therefore, for the purposes of this study a 10 point difference was used to indicate a clinically relevant change in ODI scores, as previously established in the literature [11,17,18].

Patients were placed into 1 of 3 groups based on the score difference between the preoperative and 3 month postoperative assessment. Patients whose scores improved by at least 10 points were placed into the improved group, patients whose scores worsened by at least 10 points were placed into the worsened group, and patients whose change was less than 10 points in either direction were placed into the no change group.
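
The grouping rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis code (the study used STATA); the function name `classify_change` and the sample scores are hypothetical. It assumes the usual ODI convention that a higher score indicates greater disability, so a decrease in score counts as improvement, and it assumes a change of exactly 10 points meets the threshold.

```python
THRESHOLD = 10.0  # clinically relevant ODI change used in this study


def classify_change(preop_odi: float, postop_odi: float,
                    threshold: float = THRESHOLD) -> str:
    """Assign a study group from two ODI scores (hypothetical helper).

    Assumption: higher ODI = more disability, so a positive
    (preop - postop) difference means the patient improved.
    """
    change = preop_odi - postop_odi
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "no change"


# Made-up scores, for illustration only
print(classify_change(46, 30))  # improved
print(classify_change(40, 36))  # no change
print(classify_change(30, 44))  # worsened
```

The same function could be applied to the preoperative score and each later postoperative score to track whether a patient changes category over the year.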

Score changes between a patient's preoperative score and each subsequent postoperative score were assessed using the same 10 point threshold to determine if they improved their score, remained the same, or worsened over the course of their 1 year recovery period. Additional variables of interest included: if a patient reached the published minimal clinically important difference (MCID) threshold of a 12.8 point change in score by 1 year post fusion, and the patient reported satisfaction with their surgical outcomes [9,14]. Patient satisfaction, collected at 6 months postoperatively, was assessed on a 5 point Likert scale ranging from extremely dissatisfied to extremely satisfied; however, for the purposes of this study patients indicating they were satisfied or extremely satisfied were considered to be satisfied with their outcomes and patients reporting neutral, dissatisfied, or extremely dissatisfied were considered dissatisfied with their outcomes.
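
As a rough sketch of the two secondary variables described above, the following Python shows one way the MCID check and the satisfaction dichotomy could be coded. The helper names `reached_mcid` and `is_satisfied`, the exact Likert label strings, and the example values are assumptions for illustration; the 12.8 point threshold and the satisfied/dissatisfied split come from the text. It again assumes that a lower ODI score means less disability.

```python
MCID = 12.8  # published ODI MCID cited in the text

# Responses counted as "satisfied" in this study; neutral and both
# dissatisfied responses are treated as dissatisfied.
SATISFIED_RESPONSES = {"satisfied", "extremely satisfied"}


def reached_mcid(preop_odi: float, one_year_odi: float,
                 mcid: float = MCID) -> bool:
    """True if the ODI improved by at least the MCID (hypothetical helper)."""
    return (preop_odi - one_year_odi) >= mcid


def is_satisfied(likert_response: str) -> bool:
    """Collapse the 5 point Likert response into satisfied / dissatisfied."""
    return likert_response.strip().lower() in SATISFIED_RESPONSES


# Made-up values, for illustration only
print(reached_mcid(50, 37))        # True  (13 point improvement)
print(reached_mcid(50, 40))        # False (10 point improvement)
print(is_satisfied("Neutral"))     # False
print(is_satisfied("Satisfied"))   # True
```

Note that a patient can meet the study's 10 point threshold without reaching the 12.8 point MCID, as the second example shows.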

Statistical analysis

Descriptive statistics were computed for all variables of interest, as well as all demographic variables (ie, age, body mass index [BMI], sex, race, ethnicity, insurance status, and work status) and surgical approach. Continuous, normally distributed variables are presented as means and standard deviations. Count-based data are presented as a percentage of the total study cohort. Differences in count data and binary data elements were assessed using Chi-square tests or Fisher Exact Tests when appropriate, while continuous variables were assessed using 1-way ANOVA. Tukey HSD post hoc testing was used to determine differences within group data, specifically assessing changes in the ODI score over time within the 3 study groups. A p-value of 0.05 or less was considered statistically significant. All statistical testing was performed using STATA 17.0 (StataCorp, College Station, TX).

Results

A total of 239 patients out of a potential 450 patients were included in this study indicating a 53% completion rate for all ODI time points from preoperative to 1 year postoperative (Table 1). There were no differences in demographic variables among the 3 study groups with the noted exception of insurance type (p=.046). There was no difference in surgical approach among the 3 study groups (p=.642) (Table 2).
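
The completion rate and the relative size of each study group follow directly from the reported counts. The short Python check below is a sketch using the cohort total from the text and the group sizes listed in Table 1; the variable names are hypothetical.

```python
eligible = 450   # potential patients reported in the text
completed = 239  # patients with ODI data at all 4 time points

# Group sizes as listed in Table 1
groups = {"improved": 151, "no change": 76, "worsened": 12}

completion_rate = completed / eligible
shares = {name: n / completed for name, n in groups.items()}

print(sum(groups.values()))      # 239
print(f"{completion_rate:.1%}")  # 53.1%
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
# improved: 63.2%
# no change: 31.8%
# worsened: 5.0%
```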

Table 1.

Demographic comparison of study groups.

| | Study cohort | Improved group | No change group | Worsened group | P-value |
|---|---|---|---|---|---|
| N | 239 | 151 | 76 | 12 | |
| Age | 62.9 ± 12.1 | 62.1 ± 11.9 | 65.1 ± 11.9 | 60.2 ± 16.2 | .177 |
| BMI | 30.7 ± 6.2 | 30.9 ± 6.5 | 30.6 ± 5.5 | 28.7 ± 5.7 | .568 |
| Sex | | | | | |
| Male | 105 (43.9%) | 65 (43.0%) | 33 (43.4%) | 7 (58.3%) | .220 |
| Female | 134 (56.1%) | 86 (57.0%) | 43 (56.6%) | 5 (41.7%) | |
| Race | | | | | |
| African American | 11 (4.6%) | 9 (5.9%) | 2 (2.6%) | 0 (0%) | .604 |
| Caucasian | 214 (89.5%) | 140 (92.7%) | 63 (82.8%) | 11 (91.7%) | |
| Other | 14 (5.9%) | 2 (1.4%) | 11 (14.6%) | 1 (8.3%) | |
| Ethnicity | | | | | |
| Not Hispanic | 225 (94.1%) | 140 (92.7%) | 73 (96.1%) | 12 (100%) | .501 |
| Hispanic | 14 (5.9%) | 11 (7.3%) | 3 (3.9%) | 0 (0%) | |
| Work Status | | | | | |
| Full Time | 75 (31.4%) | 52 (34.4%) | 20 (26.3%) | 3 (25.0%) | .534 |
| Part Time | 12 (5.0%) | 7 (4.6%) | 5 (6.6%) | 0 | |
| Retired | 120 (50.2%) | 74 (49.0%) | 40 (52.6%) | 6 (50.0%) | |
| Disabled | 12 (5.0%) | 5 (1.9%) | 4 (5.3%) | 3 (25.0%) | |
| Not Employed | 20 (8.3%) | 13 (8.6%) | 7 (9.2%) | 0 | |
| Insurance Status | | | | | |
| Commercial | 8 (3.3%) | 3 (2.0%) | 3 (3.9%) | 2 (16.7%) | .046 |
| Medicare | 224 (93.7%) | 145 (96.0%) | 70 (92.2%) | 9 (75.0%) | |
| Medicaid | 7 (3.0%) | 3 (2.0%) | 3 (3.9%) | 1 (8.3%) | |
| CCI | 2 [2] | 2 [2] | 3 [2] | 2 [2] | .143 |

CCI, Charlson Comorbidity Index – CCI data presented as median [interquartile range].

Table 2.

Comparison of surgical approaches by study group.

| | Study cohort | Improved group | No change group | Worsened group | P-value |
|---|---|---|---|---|---|
| N | 239 | 151 | 76 | 12 | .166 |
| Anterior | 42 (17.6%) | 28 (18.5%) | 12 (15.8%) | 2 (16.7%) | |
| Combined Approach | 70 (29.3%) | 51 (33.7%) | 15 (19.7%) | 4 (33.3%) | |
| Oblique | 5 (2.1%) | 5 (3.3%) | 0 (0%) | 0 (0%) | |
| Posterior/Transforaminal | 122 (51.0%) | 67 (44.3%) | 49 (64.4%) | 6 (50.0%) | |

Overall trends in the ODI scores indicated significant improvement between the preoperative and 3 month postoperative time points, and minimal change in ODI scores at both 6 months and 12 months postoperative (Fig. 1). There were statistically significant differences among the 3 study groups at each of the postoperative time points (p<.001 for each comparison); however, within study groups there were only significant differences between the preoperative time point and subsequent time points for patients in the improved group (p<.001) and the worsened group (p=.043). There was no difference in ODI scores at any time point in the no change group (p=.059) (Fig. 2).

Fig. 1.

Box and Whisker plots illustrating the variability of the Oswestry Disability Index (ODI) scores for the entire study cohort at each of the 4 collection time points. The * indicates significant differences in ODI scores compared to the preoperative ODI score.

Fig. 2.

Oswestry Disability Index (ODI) scores for each of the 3 study groups over a 1 year postoperative period. Columns represent study group averages while the lines represent 1 standard deviation around the mean. The * indicates statistically significant differences among the study groups at each of the follow up time periods.

Analysis of individual patients demonstrated different recovery paths based on their study group. Patients in the improved group were noted to improve or maintain their improvement; 83% of patients continued to show limited improvement, while only 2 patients (1.3%) worsened, and 23 patients (15.2%) reported no change in their ODI scores. In the no change group, 48% of the group improved their ODI scores at subsequent time points, 4 patients (5.2%) reported worse ODI scores, and 46.8% of patients demonstrated no further change in their scores. Finally, among patients in the worsened group, 5 patients (41.6%) continued to worsen at subsequent time points, 5 patients (41.6%) reported no ODI change, and 2 patients (16.8%) improved their score.
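
The percentages in this paragraph can be checked against the reported patient counts and the group sizes in Table 1. The Python below is a sketch of that arithmetic; the helper `pct` is hypothetical, and the count of 126 improved-group patients who continued to improve is an assumption derived by subtraction, not a figure stated in the text. Small differences from the reported values (5.2%, 41.6%, 16.8%) likely reflect rounding or truncation in the original report.

```python
def pct(count: int, total: int) -> float:
    """Percentage of a group, rounded to 1 decimal place."""
    return round(100 * count / total, 1)


# Group sizes from Table 1
improved_n, no_change_n, worsened_n = 151, 76, 12

# Improved group: 2 worsened, 23 reported no change
print(pct(2, improved_n))                    # 1.3
print(pct(23, improved_n))                   # 15.2
# Assumption: everyone else in the group continued to improve
print(pct(improved_n - 2 - 23, improved_n))  # 83.4

# No change group: 4 patients reported worse scores
print(pct(4, no_change_n))                   # 5.3

# Worsened group: 5 worsened further, 5 unchanged, 2 improved
print(pct(5, worsened_n))                    # 41.7
print(pct(2, worsened_n))                    # 16.7
```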

When evaluating whether patients reached MCID and reported satisfaction, none of the patients in the worsened group reached MCID and none indicated they were satisfied with their surgical outcomes. A total of 27% of patients in the no change group reached MCID, and 48.6% indicated they were satisfied with their surgical procedure. The largest proportion of patients reaching MCID was found in the improved group, with 82.1% reaching the MCID threshold of 12.8 points and 78.8% indicating they were satisfied with the outcomes of their surgical procedure.

Discussion

The ODI is the most commonly used disease-specific PRO for lumbar spine procedures and can provide the necessary patient level data to accurately and effectively improve the quality of care for these patients as well as understand how a patient perceives the value of their care. However, despite the ODI's relative simplicity, requiring between 5 and 10 minutes of a patient's time to complete, longitudinal data collection still remains challenging and time consuming. Therefore, there is a need to balance the benefits of PRO data with the cost to collect data, while also accounting for significant survey burden to the patient. This study was designed to determine if additional data collection using the ODI was clinically useful after the 3 month postoperative time point.

The results of this study demonstrated a significant overall improvement in a patient's ODI scores from the preoperative assessment to the first postoperative assessment conducted 3 months after single level lumbar fusion. The 6 month ODI scores as well as the 1 year postoperative scores indicated minimal variation in the average score, suggesting that there was minimal additional improvement after the first postoperative assessment in this cohort. This finding was consistent with previous literature evaluating the predictability of the 3 month postoperative score to the 1 year postoperative outcome score on the ODI, in which the study noted that population data showed little change after the initial 3 month postoperative time point [9]. However, the data presented in that study indicated substantial variation within the data, as demonstrated by the large standard deviations, indicating that variation at the patient level could be substantially greater than in the population based data [7,9].

What sets this study apart from previous studies on this topic was the subanalysis designed to understand the movement of patients from one score category to another (ie, improved, no change, or worsened). The data suggested that overall movement between the 3 study groups was limited. Overall, only 14.6% of the patients in the entire cohort moved among the 3 study groups after 3 months: 3.4% in the improved group, 7.3% in the no change group, and 3.9% in the worsened group. Therefore, these data suggest that once a patient was placed into a study group at the 3 month assessment there was a limited likelihood of change from that group at subsequent time points.

Further investigation into these score trends indicated that almost all patients in the improved group either maintained their gains or continued to show some improvement. Only 2 patients of 151 patients (1.3%) demonstrated a worse ODI score at 1 year postoperative than their 3 month scores. However, even these 2 patients improved their ODI score from the preoperative assessment. Therefore, given the limited likelihood for continued improvement, patients in the improved group may represent a subgroup of patients that would not benefit from continued serial assessment using the ODI, and thus would reduce survey burden and administrative costs associated with PRO collection for this group.

Patients in the no change group based on their 3 month postoperative assessment showed greater variation in their data at subsequent time points than the other study groups. Only 46% of these patients remained in the no change category at the 2 subsequent assessment time points: 48.8% of this group reported improved scores by 1 year postoperative, and 5.2% of the group reported worse ODI scores. Therefore, this finding suggests that patients reporting no change at 3 months may have the greatest potential to move a category, and thus additional resources and attention can be given to this population to ensure a successful outcome. Additionally, these findings suggest that patients within the no change group require further evaluation with PROs at later time points compared to patients in the improved or worsened group.

Patients in the worsened group either maintained their score or worsened by the 1 year postoperative ODI assessment. Again only 2 patients in the worsened group improved from their 3 month postoperative score, and, unlike the improved group, both patients at 1 year still indicated a worse ODI score than at their preoperative assessment. Therefore, these data suggest that patients in the worsened group may be on a different functional recovery path as assessed by the ODI and therefore may benefit from additional clinical investigation (eg advanced diagnostic imaging) to better understand why their functional recovery is limited.

Although this study did not look at the statistical predictability of 3 month ODI scores to other later time points as done in the studies by Asher et al. [9] and Kim et al. [7], this study demonstrates that the actual relationship between scores may be less clinically meaningful than movement between categorical groups. This hypothesis was further supported by the results that indicated that no patient in the worsened group reached the MCID threshold nor did any of these patients indicate they were satisfied with their overall outcomes. In contrast, nearly 80% of patients in the improved category by 3 months postop reported they were satisfied with their overall outcomes and reached MCID.

This study was not without limitations. It was a retrospective study and thus was limited by its design and the inability to demonstrate causation between scores and outcomes. Furthermore, patient-reported satisfaction is only assessed at the 6 month postoperative time point, per our institution's current standard of care, and thus may not fully reflect a patient's impression of satisfaction at earlier or later time points. The homogeneous patient population regarding sex, age, BMI, and surgical approach can be seen as both a benefit and a limitation. Similar demographics allow stronger conclusions to be drawn; however, they limit the overall generalizability of the study. Additionally, this study was designed to assess longitudinal changes in the ODI over a 1 year period, which limited the selection of patients to those who completed the ODI at all 4 time points and may have led to unintended selection bias. Furthermore, only 53% of patients with preoperative ODIs completed all of their postoperative ODI assessments, and therefore the results may be skewed by a population more likely to complete their PROs. Finally, the worsened group comprised only 12 patients; although this makes clinical sense, the small sample makes it difficult to draw strong evidentiary conclusions.

Conclusions

In short, the results of this study demonstrated that there was limited change in ODI scores when assessed as an average of the study cohort after the 3 month postoperative assessment. Although variation existed in ODI scores between patients, after assessing the score change using a 10 point threshold, individual patients showed minimal changes in group categorization. Therefore, this 10 point threshold could be used to identify patients at 3 months who may be on a different course of functional recovery as assessed by the ODI. Patients who fall into the worsened group may benefit from early diagnostic imaging and intervention, while patients in the improved group may not benefit from serial ODI collection and could potentially stop collection early in an effort to reduce survey burden and costs associated with continued PRO administration.

Funding

This study did not receive any external funding.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Footnotes

FDA device/drug status: Not applicable.

Author disclosures: HM: Nothing to disclose. MJS: Nothing to disclose.

References

1. Martin BI, Mirza SK, Spina N, Spiker WR, Lawrence B, Brodke DS. Trends in lumbar fusion procedures rates and associated hospital costs for degenerative spinal diseases in the United States, 2004 to 2015. Spine (Phila Pa 1976). 2019;44(5):369–376. doi: 10.1097/BRS.0000000000002822.
2. Reisener MJ, Hughes AP, Okano I, et al. The association of transversus abdominis plane block with length of stay, pain, and opioid consumption after anterior or lateral lumbar fusion: a retrospective study. Eur Spine J. 2021;30(12):3738–3745. doi: 10.1007/s00586-021-06855-8.
3. Druss BG, Marcus SC, Olfson M, et al. The most expensive medical conditions in America. Health Aff (Millwood). 2002;21:105–111. doi: 10.1377/hlthaff.21.4.105.
4. Lohr KN, Schroeder SA. A strategy for quality assurance in Medicare. N Engl J Med. 1990;322:707–712. doi: 10.1056/nejm199003083221031.
5. May EL. Financial planning in a value-based world. Healthc Exec. 2014;29(3):10–18.
6. Centers for Medicare and Medicaid Services (CMS), HHS. Medicare program; hospital inpatient prospective payment systems for acute care hospitals and the long-term care hospital prospective payment system and fiscal year 2015 rates; quality reporting requirements for specific providers; reasonable compensation equivalents for physician services in excluded hospitals and certain teaching hospitals; provider administrative appeals and judicial review; enforcement provisions for organ transplant centers; and electronic health record (EHR) incentive program. Final rule. Fed Regist. 2014;79:49853–50536.
7. Kim EJ, Chotai S, Archer KR, Bydon M, Asher AL, Devin CJ. Need for two-year patient-reported outcomes scores for lumbar spine surgery is procedure-specific. Spine (Phila Pa 1976). 2017;42:1331–1338. doi: 10.1097/BRS.0000000000002087.
8. McGirt MJ, Parker SL, Asher AL, Norvell D, Sherry N, Devin CJ. Role of prospective registries in defining the value and effectiveness of spine care. Spine (Phila Pa 1976). 2014;39(22):S117–S128. doi: 10.1097/BRS.0000000000000552.
9. Asher AL, Chotai S, Devin CJ, et al. Inadequacy of 3-month Oswestry disability index outcome for assessing individual longer-term patient experience after lumbar spine surgery. J Neurosurg Spine. 2016;25:170–180. doi: 10.3171/2015.11.SPINE15872.
10. Djurasovic M, Glassman SD, Dimar JR, Crawford CH, Bratcher K, Carreon LY. Changes in the Oswestry disability index that predict improvement after lumbar fusion. J Neurosurg Spine. 2012;17:486–490. doi: 10.3171/2012.8.SPINE12614.
11. Vianin M. Psychometric properties and clinical usefulness of the Oswestry Disability Index. J Chiropr Med. 2008;7:161–163. doi: 10.1016/j.jcm.2008.07.001.
12. Cook CE, Garcia AN, Wright A, Shaffrey C, Gottfried O. Measurement properties of the Oswestry disability index in recipients of lumbar spine surgery. Spine (Phila Pa 1976). 2020;46(2):E118–E125. doi: 10.1097/BRS.0000000000003732.
13. Tonosu J, Takeshita K, Hara N, et al. The normative score and the cut-off value of the Oswestry disability index (ODI). Eur Spine J. 2012;21(8):1596–1602. doi: 10.1007/s00586-012-2173-7.
14. Copay AG, Glassman SD, Subach BR, Berven S, Schuler TC, Carreon LY. Minimum clinically important difference in lumbar spine surgery patients: a choice of methods using the Oswestry disability index, medical outcomes study questionnaire short form 36. Spine J. 2008;8:968–974. doi: 10.1016/j.spinee.2007.11.006.
15. Franceschini M, Boffa A, Pignotti E, Andriolo L, Zaffagnini S, Filardo G. The minimal clinically important difference changes greatly based on different calculation methods. Am J Sports Med. 2023;51(4):1067–1073. doi: 10.1177/03635465231152484.
16. Solomito MJ, Kia C, Makanji H. The minimal clinically important difference for the Oswestry disability index substantially varies based on calculation method. Spine. 2024. doi: 10.1097/BRS.0000000000005074. Published online Jun 18, 2024.
17. Davidson M, Keating JL. Oswestry disability questionnaire (ODQ). Aust J Physiother. 2005;51(4):270. doi: 10.1016/s0004-9514(05)70016-7.
18. Davidson M, Keating JL. A comparison of five low back disability questionnaires: reliability and responsiveness. Phys Ther. 2002;82(1):8–24. doi: 10.1093/ptj/82.1.8.

Articles from North American Spine Society Journal are provided here courtesy of Elsevier
