Journal of Clinical Orthopaedics and Trauma. 2021 Jan 13;16:125–131. doi: 10.1016/j.jcot.2021.01.002

Construct validity and responsiveness of commonly used patient reported outcome instruments in decompression for lumbar spinal stenosis

Karthik Vishwanathan, Ian Braithwaite
PMCID: PMC7920003  PMID: 33717946

Abstract

Background

The validity and responsiveness of the Oswestry disability index (ODI), Roland Morris disability questionnaire (RMDQ), Short Form-12 Physical Component Score (SF-12 PCS) and Short Form-12 Mental Component Score (SF-12 MCS) in patients undergoing open decompression for lumbar canal stenosis have not been previously reported.

Methods

Outcomes were prospectively evaluated using the ODI, RMDQ, SF-12 PCS and SF-12 MCS pre-intervention and at an average follow-up of three months post-intervention. The Pearson correlation coefficient was used to evaluate the association between the changes in values of ODI, RMDQ, SF-12 PCS and SF-12 MCS. Distribution based methods (effect size [ES], standardised response mean [SRM]) and an anchor based method (area under the curve [AUC] of the receiver operating curve [ROC]) were used to determine responsiveness. An AUC value ≥ 0.70 was considered to indicate adequate responsiveness, and the outcome instrument with the largest AUC was considered the most responsive.

Results

This study included 77 participants. Responsiveness was assessed at a mean follow-up of 12 weeks postoperatively. There was a significant strong correlation between the changes in ODI and RMDQ (r = 0.65, p < 0.0001). The ES of ODI, RMDQ, SF-12 PCS and SF-12 MCS were 1.54, 1.48, 1.85 and 0.51 respectively. The SRM of RMDQ, ODI, SF-12 PCS and SF-12 MCS were 1.22, 1.17, 1.0 and 0.47 respectively. The AUC of ODI, RMDQ, SF-12 PCS and SF-12 MCS were 0.83 to 0.88, 0.82 to 0.86, 0.78 to 0.81 and 0.69 to 0.70 respectively.

Conclusion

It is recommended to use either ODI or RMDQ as region specific patient reported outcome instrument and SF-12 PCS as a health related quality of life outcome instrument to evaluate outcome after decompressive laminectomy for lumbar canal stenosis.

Keywords: Validity, Responsiveness, Oswestry disability index, Roland morris disability questionnaire, SF-12, Lumbar stenosis

Abbreviations: ODI, Oswestry Disability Index; RMDQ, Roland Morris Disability Questionnaire; SF-12 PCS, Short Form-12 Physical Component Score; SF-12 MCS, Short Form-12 Mental Component Score; HRQoL, Health Related Quality of Life; ES, Effect Size; SRM, Standardised Response Mean; AUC, Area under the curve; ROC, Receiver Operating Curve; VAS, Visual Analogue Scale; NRS, Numerical Rating Scale; MCID, Minimal Clinically Important Difference

1. Introduction

Though lumbar stenosis is primarily treated non-operatively, operative intervention might be necessary in case of failure of nonoperative treatment or in the presence or worsening of neurological deficit. Posterior decompression performed as sole procedure for lumbar spinal stenosis is an effective intervention.1

The Oswestry Disability Index (ODI),2,3 Roland Morris Disability Questionnaire (RMDQ)4,5 and the Short Form-12 (SF-12)6,7 have been used to evaluate patient reported outcomes after surgical treatment of spinal canal stenosis of lumbar spine.

Validity and responsiveness apply specifically in the context of a particular population, condition and intervention. The available psychometric evaluations of ODI,8 RMDQ8 and SF-129 have limited applicability here, because all the studies included in these systematic reviews were on non-specific backache treated non-operatively; hence the findings cannot be extrapolated to condition specific causes of backache treated surgically. Commonly used patient reported outcome instruments need to be evaluated for validity and responsiveness in the specific population of patients undergoing lumbar spine surgery.10

Construct validity refers to the strength of the association between the change in the value of two or more patient reported outcome instruments that measure similar or dissimilar outcome domains.11,12 It is expected that outcome instruments measuring similar domains shall demonstrate stronger association whereas outcome instruments measuring dissimilar domains shall demonstrate weaker association. Responsiveness is the capability of the patient reported outcome instrument to accurately identify change in the clinical condition of the patient subsequent to an intervention.11,12

There is no study which has compared the construct validity and responsiveness of the commonly utilized patient reported outcome measures such as ODI, RMDQ and SF-12 in a homogenous cohort of patients undergoing decompression surgery for lumbar stenosis. Hence the objective of our investigation was to compare and report the construct validity and responsiveness of ODI, RMDQ and SF-12.

2. Methods

Approval for the present study was obtained from the Research and clinical audit department of the Countess of Chester hospital NHS Foundation Trust (Approval number 2520). The participants gave written informed consent for participating voluntarily in the study.

Consecutive patients undergoing lumbar decompression surgery for clinically and radiologically confirmed lumbar canal stenosis were included in this prospective longitudinal observational study. Patients had lower limb pain that had not responded to conservative management. Patients had either neurogenic claudication or sciatica secondary to compression of the nerve root in the lateral recess. Patients who underwent posterior spinal instrumentation and fusion for lumbar stenosis were excluded.

The patients completed the ODI version 2.0, RMDQ and SF-12 PCS and SF-12 MCS preoperatively on the day of the surgery prior to the surgical intervention. The postoperative follow-up was at 6–12 weeks and the patients completed the ODI version 2.0, RMDQ and SF-12 responses in a separate outpatient cubicle to ensure privacy. The patients also completed a global perceived change questionnaire consisting of their assessment of the alteration in the clinical condition after the decompression surgery and rating of the success of the operative intervention.

The ODI version 2.0 13 is rated from 0 (best functional outcome) to 100 (worst functional outcome). The 24-item RMDQ is rated from 0 (best functional outcome) to 24 (worst functional outcome). The Short Form-12 version 1 (standard, 4 weeks recall) was used to evaluate the health related quality of life (HRQoL). The physical component score (SF-12 PCS) and the mental component score (SF-12 MCS) of the SF-12 were calculated using Quality Metric Health outcomes scoring software version 4.0 (QualityMetric Inc., Lincoln, RI, USA). The SF-12 MCS and SF-12 PCS are rated from 0 (worst health related quality of life) to 100 (best health related quality of life).

Patients answered a six-point global perception rating scale (external anchor-1) to evaluate their clinical condition following the decompression surgery. The options were ‘cured’, ‘much better’, ‘a bit better’, ‘the same’, ‘a bit worse’ and ‘much worse’. For ROC analysis, patients opting for the responses ‘cured’ and ‘much better’ were categorized as the ‘responder’ cohort while ‘a bit better’, ‘the same’, ‘a bit worse’ and ‘much worse’ were classed as the ‘non-responder’ cohort. For the question ‘Has the operation been a success?’ (external anchor-2) the options were ‘yes’, ‘partially’ and ‘no’. For ROC analysis, the response ‘yes’ was categorized as the ‘successful’ cohort while the responses ‘partially’ and ‘no’ were classed as the ‘unsuccessful’ cohort.
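
The dichotomization of the two external anchors can be illustrated with a short Python sketch. The per-option counts are the ones reported in Table 2; the function and variable names are hypothetical and are not taken from the study.

```python
from collections import Counter

# Per-option response counts as reported in Table 2.
ANCHOR_1_COUNTS = {
    "cured": 26, "much better": 28, "a bit better": 9,
    "the same": 4, "a bit worse": 1, "much worse": 2,
}
ANCHOR_2_COUNTS = {"yes": 47, "partially": 20, "no": 4}

# Options treated as the positive group for each anchor (rule stated in the text).
RESPONDER_OPTIONS = {"cured", "much better"}
SUCCESS_OPTIONS = {"yes"}


def dichotomize(counts, positive_options, positive_label, negative_label):
    """Collapse per-option counts into the two groups used for ROC analysis."""
    groups = Counter()
    for option, n in counts.items():
        label = positive_label if option in positive_options else negative_label
        groups[label] += n
    return dict(groups)


anchor_1 = dichotomize(ANCHOR_1_COUNTS, RESPONDER_OPTIONS,
                       "responder", "non-responder")
anchor_2 = dichotomize(ANCHOR_2_COUNTS, SUCCESS_OPTIONS,
                       "successful", "unsuccessful")
print(anchor_1)
print(anchor_2)
```

The resulting group sizes match the sub-groups listed in Table 2.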

2.1. Construct validity

Correlation between two outcome instruments has been used to determine construct validity in spine research.9,14,15 A correlation coefficient ≥0.50 suggests that the outcome instruments being compared are measuring similar domains16 and denotes convergent construct validity. A correlation coefficient <0.50 suggests that the outcome instruments are measuring dissimilar domains16 and denotes divergent construct validity. The strength of the association can also be interpreted using the value of the correlation coefficient: 0 to 0.19 (very weak correlation), 0.20 to 0.39 (weak correlation), 0.40 to 0.59 (moderate correlation), 0.60 to 0.79 (strong correlation) and 0.80 to 1.0 (very strong correlation).17
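
As a minimal sketch, the two interpretation rules above can be written as small Python helpers. Two assumptions are made that the text does not state explicitly: the absolute value of r is used (the disability scales and the SF-12 run in opposite directions, so their correlations are negative), and the bands are treated as half-open intervals so that values such as 0.195 are still classified. The function names are hypothetical.

```python
def strength_label(r):
    """Map a correlation coefficient to the descriptive bands quoted above."""
    magnitude = abs(r)  # assumption: the sign only reflects scale direction
    if magnitude < 0.20:
        return "very weak"
    if magnitude < 0.40:
        return "weak"
    if magnitude < 0.60:
        return "moderate"
    if magnitude < 0.80:
        return "strong"
    return "very strong"


def construct_validity(r, threshold=0.50):
    """Convergent if |r| >= threshold, otherwise divergent."""
    return "convergent" if abs(r) >= threshold else "divergent"


# Example coefficients taken from Table 4 (change scores).
for pair, r in [("ODI vs RMDQ", 0.650), ("ODI vs SF-12 MCS", -0.378)]:
    print(pair, strength_label(r), construct_validity(r))
```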

We formulated the following hypotheses: (1) there would be convergent construct validity between the ODI and RMDQ; (2) there would be convergent construct validity between the ODI and SF-12 PCS; (3) there would be convergent construct validity between the RMDQ and SF-12 PCS; (4) ODI would have higher correlation to SF-12 PCS and lower correlation to SF-12 MCS [divergent construct validity]; (5) RMDQ would have higher correlation to SF-12 PCS and lower correlation to SF-12 MCS [divergent construct validity]; (6) SF-12 PCS would have higher correlation to ODI and RMDQ and lower correlation to SF-12 MCS, because the two SF-12 components measure unrelated and dissimilar domains [divergent construct validity]. Though SF-12 is a HRQoL instrument, it was assumed that ODI and RMDQ would show higher correlation to the SF-12 PCS than to the SF-12 MCS.

Construct validity is considered established when the correlations between the changes in scores of the outcome instruments (instead of the absolute scores) concur with at least 75% of the formulated hypotheses.15,16

2.2. Responsiveness

Distribution based and anchor based methods were used to determine responsiveness. The two distribution based methods involved the estimation of the effect size (ES) and the standardised response mean (SRM). The difference in score was calculated by subtracting the postoperative follow-up score from the preoperative score. The ES was obtained by dividing the mean difference in score of an outcome instrument by the standard deviation of its preoperative score. The SRM was obtained by dividing the mean difference in score of the outcome instrument by the standard deviation of the difference in score.

The interpretation of ES and SRM was: value around 0.3 (small effect); value around 0.5 (medium effect) and value ≥ 0.8 (large effect; preferred value for an outcome instrument having adequate responsiveness).18 The larger the value, the greater the responsiveness of the outcome instrument.
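
The two distribution based indices can be reproduced from summary statistics. The sketch below uses the means and standard deviations reported in Table 3; the function names are hypothetical, and taking the absolute value of the mean change is an assumption made so that improvement is positive for both the disability scales (lower is better) and the SF-12 components (higher is better).

```python
def effect_size(mean_change, sd_baseline):
    """ES = mean change / SD of the preoperative score."""
    return abs(mean_change) / sd_baseline


def standardised_response_mean(mean_change, sd_change):
    """SRM = mean change / SD of the change score."""
    return abs(mean_change) / sd_change


# (mean change, SD of preoperative score, SD of change), from Table 3.
TABLE_3 = {
    "ODI": (24.0, 15.6, 20.5),
    "RMDQ": (7.1, 4.8, 5.8),
    "SF-12 PCS": (-11.1, 6.0, 11.1),
    "SF-12 MCS": (-6.1, 12.0, 13.0),
}

results = {}
for name, (change, sd_pre, sd_change) in TABLE_3.items():
    es = effect_size(change, sd_pre)
    srm = standardised_response_mean(change, sd_change)
    results[name] = (es, srm)
    # A value >= 0.8 is the "large effect" threshold quoted in the text.
    print(f"{name}: ES={es:.2f} SRM={srm:.2f} "
          f"large ES={es >= 0.8} large SRM={srm >= 0.8}")
```

Because the two indices share a numerator, any difference in how they rank the instruments comes only from the denominators: the spread of the baseline scores for the ES and the spread of the change scores for the SRM.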

Receiver operating curve (ROC) analysis evaluates the area under the curve (AUC), the 95% confidence interval of the AUC, the p-value, and the sensitivity and specificity of the patient reported outcome instrument to detect changes. An AUC ≥0.70 suggests adequate responsiveness.9,11,15,16 AUC values of 0.70 to 0.79 suggest fair accuracy, 0.80 to 0.89 good accuracy and 0.90 to 0.99 excellent accuracy of the patient reported outcome instrument to discriminate.19
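
A minimal sketch of the anchor based calculation follows. It uses the pairwise (Mann-Whitney) formulation of the AUC rather than any particular statistical package, and the patient-level change scores are hypothetical values invented for illustration, since individual data are not reported. It also assumes that a larger change score means more improvement, so SF-12 change scores would need their sign flipped first.

```python
def auc(change_scores, is_responder):
    """Probability that a randomly chosen responder improved more than a
    randomly chosen non-responder, counting ties as one half. This equals
    the area under the ROC curve."""
    pos = [s for s, r in zip(change_scores, is_responder) if r]
    neg = [s for s, r in zip(change_scores, is_responder) if not r]
    if not pos or not neg:
        raise ValueError("both groups must contain at least one patient")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_band(auc_value):
    """Descriptive bands quoted in the text; 0.70 is the adequacy threshold."""
    if auc_value >= 0.90:
        return "excellent"
    if auc_value >= 0.80:
        return "good"
    if auc_value >= 0.70:
        return "fair"
    return "below the 0.70 threshold"


# Hypothetical change scores (preoperative minus follow-up), illustration only.
responder_change = [30, 25, 18, 40, 12]
non_responder_change = [5, 14, 20, -2]
scores = responder_change + non_responder_change
labels = [True] * len(responder_change) + [False] * len(non_responder_change)

example_auc = auc(scores, labels)
print(example_auc, accuracy_band(example_auc))
```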

Disease or region specific outcome instruments are expected to demonstrate higher responsiveness than HRQoL instruments because they measure different domains.18 Since ODI, RMDQ and SF-12 PCS evaluate the physical function of the patient, our hypothesis was that ODI, RMDQ and SF-12 PCS would demonstrate large effect size (ES and SRM ≥ 0.8) and adequate responsiveness (AUC ≥ 0.70). Since SF-12 MCS evaluates the mental component of health related quality of life, it was hypothesized that SF-12 MCS might demonstrate lower than large effect size (ES and SRM < 0.8) and inadequate responsiveness (AUC < 0.70).

3. Results

There were 77 participants in this prospective longitudinal observational study. The demographic features of the participants are presented in Table 1.

Table 1.

Shows demographics and clinical features of the patients at baseline.

| Variable (total number of cases = 77) | Value |
|---|---|
| Age in years, mean ± SD (n = 77) | 66 ± 11.9 (range: 27–89 years) |
| Gender (n = 77): Male/Female | 36 (46.8%)/41 (53.2%) |
| Location of pain (n = 77) | |
| Backache | 1 (1.3%) |
| Leg pain | 17 (22.1%) |
| Backache and leg pain | 59 (76.6%) |
| Affected extremity (n = 74): Single leg pain/Pain in both legs | 34 (45.9%)/40 (54.1%) |
| Sensory symptoms | |
| Presence of paresthesias (n = 72) | 47 (65.3%) |
| Presence of numbness (n = 73) | 50 (68.5%) |
| Side of operation (n = 77): Right/Left/Bilateral | 10 (13%)/21 (27.3%)/46 (59.7%) |
| Number of affected levels (n = 77): Single level/Two levels/Three levels | 48 (62.3%)/21 (27.3%)/8 (10.4%) |
| Affected levels (n = 77) | |
| L2/L3 | 1 (1.3%) |
| L3/L4 | 1 (1.3%) |
| L4/L5 | 41 (53.2%) |
| L5/S1 | 5 (6.5%) |
| L3/L4 + L4/L5 | 11 (14.3%) |
| L4/L5 + L5/S1 | 10 (13%) |
| L2/L3 + L3/L4 + L4/L5 | 4 (5.2%) |
| L3/L4 + L4/L5 + L5/S1 | 4 (5.2%) |
| Type of surgery (n = 77) | |
| Primary decompression | 75 (97.4%) |
| Revision decompression | 2 (2.6%) |

Age is presented as mean and SD (standard deviation).

All other data are presented as number and percentage.

The mean postoperative follow-up evaluation was at 3 months (range: 5 to 33 weeks). Seventy patients (90.9%) answered the question on the change in their clinical condition (external anchor-1) and 71 patients (92.2%) answered the question on the success of the operation (external anchor-2). Table 2 shows the patient responses to the global perceived change questions. There was significant improvement in the values of ODI, RMDQ, SF-12 PCS and SF-12 MCS at postoperative final follow-up compared to their preoperative values at baseline (Table 3). Fig. 1 also indicates significant improvement in the mean values of all the patient reported outcome instruments, as the 95% confidence intervals of the preoperative and follow-up means did not overlap.

Table 2.

Global perceived change scale after decompression for lumbar spinal stenosis.

| Outcome response | Frequency (%) |
|---|---|
| Change in clinical condition from patient perspective (external anchor-1) | |
| Patient response (N = 70) | |
| Cured | 26 (37.1%) |
| Much better | 28 (40%) |
| Bit better | 9 (12.9%) |
| The same | 4 (5.7%) |
| Bit worse | 1 (1.4%) |
| Much worse | 2 (2.9%) |
| Created sub-groups (N = 70) | |
| Responder | 54 (77.1%) |
| Non-responder | 16 (22.9%) |
| Success of operation (external anchor-2) | |
| Patient response (N = 71) | |
| Yes | 47 (66.2%) |
| Partially | 20 (28.2%) |
| No | 4 (5.6%) |
| Created sub-groups (N = 71) | |
| Successful | 47 (66.2%) |
| Unsuccessful | 24 (33.8%) |

Table 3.

Depicts overall preoperative, postoperative values, difference between preoperative and postoperative values and statistical significance of the change in the value of outcome instruments (SD = standard deviation). Mean of change in score obtained by subtracting mean final follow-up score from mean preoperative score. (ODI = Oswestry disability index; RMDQ = Roland Morris Disability Questionnaire; SF-12 PCS = Physical component of Short Form-12; SF-12 MCS = Mental component of Short Form-12).

| Outcome instrument | Preoperative mean ± SD | Final follow-up mean ± SD | Mean of change in score ± SD | 95% confidence interval of change in score | P-value |
|---|---|---|---|---|---|
| ODI | 45.8 ± 15.6 | 21.1 ± 18.5 | 24.0 ± 20.5 | 19.0 to 29.1 | <0.0001 |
| RMDQ | 12.8 ± 4.8 | 5.6 ± 5.5 | 7.1 ± 5.8 | 5.7 to 8.5 | <0.0001 |
| SF-12 PCS | 30.2 ± 6.0 | 41.2 ± 11.0 | −11.1 ± 11.1 | −13.8 to −8.3 | <0.0001 |
| SF-12 MCS | 45.6 ± 12.0 | 51.4 ± 10.9 | −6.1 ± 13.0 | −9.3 to −2.9 | <0.0001 |

Fig. 1.

Shows mean values and 95% CI of patient reported outcome measures preoperatively and at final follow-up postoperatively (ODI = Oswestry disability index; RMDQ = Roland Morris Disability Questionnaire; SF-12 PCS = Physical component of Short Form-12; SF-12 MCS = Mental component of Short Form-12).

3.1. Construct validity

The strength of the association between various patient reported outcome instruments is presented in Table 4.

Table 4.

Shows correlation amongst change in score of various patient reported outcome instruments. Pearson’s correlation coefficient (r) was used to check the strength of association and ∗ indicates significant p value. (ODI = Oswestry disability index; RMDQ = Roland Morris Disability Questionnaire; SF-12 PCS = Physical component of Short Form-12; SF-12 MCS = Mental component of Short Form-12).

| Patient reported outcome instruments | ODI r (p-value) | RMDQ r (p-value) | SF-12 PCS r (p-value) | SF-12 MCS r (p-value) |
|---|---|---|---|---|
| ODI | 1 | | | |
| RMDQ | 0.650 (p < 0.0001)∗ | 1 | | |
| SF-12 PCS | −0.646 (p < 0.0001)∗ | −0.511 (p < 0.0001)∗ | 1 | |
| SF-12 MCS | −0.378 (p = 0.03)∗ | −0.478 (p < 0.0001)∗ | 0.106 (p = 0.40) | 1 |

There was strong correlation between the changes in the values of ODI and RMDQ and between those of ODI and SF-12 PCS, thereby suggesting convergent construct validity. The changes in values of RMDQ and SF-12 PCS demonstrated moderate correlation, signifying convergent construct validity. There was weak correlation between the changes in values of ODI and SF-12 MCS, hence suggesting divergent construct validity. Though the changes in values of RMDQ showed moderate correlation with both components of the SF-12, the correlation coefficient between RMDQ and SF-12 PCS was higher than that between RMDQ and SF-12 MCS, hence implying divergent construct validity. There was insignificant and very weak correlation between the changes in values of SF-12 PCS and SF-12 MCS, thereby suggesting divergent construct validity. The results concurred with all six formulated hypotheses.

The changes in ODI and RMDQ demonstrated the highest value of the correlation coefficient whereas the least value of the correlation coefficient was observed for the association between SF-12 PCS and SF-12 MCS.
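
The comparison of the observed correlations with the six a priori hypotheses can be sketched as follows. The coefficients are those in Table 4; the encoding of each hypothesis as a Boolean test, the use of absolute values, and all names are assumptions made for illustration rather than the authors' procedure.

```python
# Correlations between change scores, from Table 4.
R = {
    frozenset(["ODI", "RMDQ"]): 0.650,
    frozenset(["ODI", "PCS"]): -0.646,
    frozenset(["RMDQ", "PCS"]): -0.511,
    frozenset(["ODI", "MCS"]): -0.378,
    frozenset(["RMDQ", "MCS"]): -0.478,
    frozenset(["PCS", "MCS"]): 0.106,
}


def r_abs(a, b):
    """Magnitude of the correlation between instruments a and b."""
    return abs(R[frozenset([a, b])])


CONVERGENT = 0.50  # threshold for convergent construct validity

hypotheses = {
    "H1 ODI-RMDQ convergent": r_abs("ODI", "RMDQ") >= CONVERGENT,
    "H2 ODI-PCS convergent": r_abs("ODI", "PCS") >= CONVERGENT,
    "H3 RMDQ-PCS convergent": r_abs("RMDQ", "PCS") >= CONVERGENT,
    "H4 ODI closer to PCS than MCS": r_abs("ODI", "PCS") > r_abs("ODI", "MCS"),
    "H5 RMDQ closer to PCS than MCS": r_abs("RMDQ", "PCS") > r_abs("RMDQ", "MCS"),
    "H6 PCS closer to ODI/RMDQ than MCS":
        min(r_abs("PCS", "ODI"), r_abs("PCS", "RMDQ")) > r_abs("PCS", "MCS"),
}

proportion_supported = sum(hypotheses.values()) / len(hypotheses)
construct_valid = proportion_supported >= 0.75  # criterion stated in Methods
print(proportion_supported, construct_valid)
```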

3.2. Responsiveness

The various measures of responsiveness are presented in Table 5.

Table 5.

Shows the values of various measures of responsiveness.

| Measures of responsiveness | ODI | RMDQ | SF-12 PCS | SF-12 MCS |
|---|---|---|---|---|
| Standardised response mean (SRM) | 1.17 | 1.22 | 1.0 | 0.47 |
| Effect size (ES) | 1.54 | 1.48 | 1.85 | 0.51 |
| Receiver operating curve (change in clinical condition, external anchor-1) | | | | |
| Area under the curve (AUC) | 0.83 | 0.82 | 0.78 | 0.70 |
| 95% Confidence interval of AUC | 0.72 to 0.94 | 0.70 to 0.93 | 0.66 to 0.91 | 0.53 to 0.87 |
| p-value | 0.001 | 0.001 | 0.004 | 0.04 |
| Sensitivity | 0.73 | 0.59 | 0.69 | 0.80 |
| Specificity | 0.92 | 1 | 0.92 | 0.54 |
| Receiver operating curve (success of operative intervention, external anchor-2) | | | | |
| Area under the curve (AUC) | 0.88 | 0.86 | 0.81 | 0.69 |
| 95% Confidence interval of AUC | 0.80 to 0.97 | 0.77 to 0.95 | 0.71 to 0.92 | 0.54 to 0.84 |
| p-value | <0.0001 | <0.0001 | <0.0001 | 0.02 |
| Sensitivity | 0.82 | 0.82 | 0.66 | 0.82 |
| Specificity | 0.86 | 0.82 | 0.91 | 0.52 |

ODI, RMDQ and SF-12 PCS demonstrated large values of SRM and ES, thereby suggesting adequate responsiveness. SF-12 MCS demonstrated medium values for both SRM and ES, both lower than 0.8, thereby signifying inadequate responsiveness. Of all the instruments compared in the present study, SF-12 MCS had the lowest values of ES and SRM. It is noteworthy that the ES value of SF-12 PCS was greater than that of region and disease specific measures such as ODI and RMDQ. The SRM values of ODI and RMDQ were greater than that of SF-12 PCS.

The anchor based method demonstrated, with both external anchors, that the AUC values of ODI, RMDQ and SF-12 PCS were greater than 0.70, thereby suggesting adequate responsiveness. The AUC value of SF-12 MCS was acceptable with external anchor-1 (change in clinical condition; 0.70) but not with external anchor-2 (success of operative intervention; 0.69). The AUC values of ODI were slightly higher than those of RMDQ, and the AUC values of both ODI and RMDQ were larger than those of SF-12 PCS with both external anchors (Fig. 2, Fig. 3). ODI and RMDQ demonstrated good accuracy with both external anchors. The SF-12 PCS demonstrated good accuracy with external anchor-2 (success of operative intervention) and fair accuracy with external anchor-1 (change in clinical condition).

Fig. 2.

Shows the receiver operating curve of various patient reported outcome measures using change in clinical condition from the patient’s perspective as the external anchor (ODI = Oswestry disability index; RMDQ = Roland Morris Disability Questionnaire; SF-12 PCS = Physical component of Short Form-12; SF-12 MCS = Mental component of Short Form-12).

Fig. 3.

Shows the receiver operating curve of various patient reported outcome measures using success of the operative intervention from the patient’s perspective as the external anchor (ODI = Oswestry disability index; RMDQ = Roland Morris Disability Questionnaire; SF-12 PCS = Physical component of Short Form-12; SF-12 MCS = Mental component of Short Form-12).

4. Discussion

The rationale for studying functional outcome in a homogenous cohort (similar pathology and similar surgical intervention) is that functional outcome tends to differ between spinal pathologies and between surgical procedures. Fekete et al. noticed a significant difference in the functional outcome at three months postoperatively for various spinal pathologies such as spinal stenosis, disc herniation, spondylolisthesis, degenerative spinal deformity and degenerative disc disease.20 Several authors have emphasized the importance of reporting patient reported outcome instruments in a homogenous cohort of surgically treated patients with diagnosis specific spinal conditions.21–24 Two studies21,22 observed a significant difference in the pattern of improvement in ODI after decompression and fusion for various condition specific spinal pathologies such as stenosis, disc pathology, spondylolisthesis, instability and scoliosis. The largest improvement in the ODI was observed in spondylolisthesis whereas the least improvement was reported in non-union of spinal fusion. These studies highlighted the significance of diagnostic stratification of lumbar spine conditions.

4.1. Construct validity

Our results concurred with all the six a priori hypotheses pertaining to construct validity. The ODI and the RMDQ are indices of physical function and functional disability due to low back pain.10,18,25 They do not measure the intensity of pain which is usually measured by either the Visual Analogue Scale (VAS) or the Numerical Rating Scale (NRS). Correlation between changes in score of ODI and RMDQ has been evaluated in previous studies to establish construct validity.14 Sheahan et al. opined that RMDQ is an ideal outcome measure for evaluating construct validity.14

The physical component subscale of SF-36 has shown better correlation to instruments measuring functional disability whereas the mental component subscale of SF-36 has shown poorer correlation to instruments measuring functional disability.26 As the SF-12 has been developed from the SF-36, it can be extrapolated that SF-12 PCS would show higher correlation and association to ODI and RMDQ whereas the SF-12 MCS would demonstrate lesser correlation and association to ODI and RMDQ.

We observed that the ODI and the RMDQ showed significant correlation with SF-12 MCS whereas the SF-12 PCS did not show a significant correlation to SF-12 MCS. A probable explanation is that mental health status might be related to functional disability. Since the PCS and MCS components of the SF-12 measure dissimilar and unrelated constructs, they probably did not demonstrate significant correlation. Our results concur with the findings from a previous study27 wherein SF-12 MCS showed significant correlation to ODI and RMDQ whereas the correlation between SF-12 MCS and SF-12 PCS was insignificant.

The change in SF-12 PCS showed a higher correlation to the change in ODI than to the change in RMDQ, thereby suggesting that if the aim of a study is to determine physical function after lumbar stenosis surgery, ODI might be the more appropriate patient reported outcome instrument. The change in SF-12 MCS showed a higher correlation to the change in RMDQ than to the change in ODI, thereby suggesting that if the aim of a study is to evaluate psychological function after lumbar stenosis surgery, RMDQ might be the more appropriate patient reported outcome instrument.

4.2. Responsiveness

Distribution based methods such as ES and SRM rely on the distribution of the data before and after the surgical intervention. Both have the same numerator but the denominator differs. Large variation in the baseline scores of the outcome instruments prior to intervention can reduce the ES whereas large variation in the change in the scores can reduce the SRM. Distribution based methods have been criticized for not taking patient relevant information into consideration, and this gap is addressed by anchor based methods such as AUC. It is possible for distribution based methods to show large ES and SRM even though the patients feel that the treatment has not been effective. Studies using distribution based methods suggest that the outcome instrument demonstrating the largest ES or SRM is the instrument with the highest responsiveness.28,29 For ES and SRM, a large effect size was observed for all outcome instruments except SF-12 MCS.

An anchor based method such as AUC relies on the response of the patient to a clinically relevant global rating question, and its main advantage is that it captures clinically relevant and useful information from every individual patient of the cohort. Authors have recommended the use of anchor based methods30,31 particularly AUC15,32 to evaluate responsiveness. Hence we considered the outcome instrument with the largest AUC to be the most responsive instrument.29 An outcome instrument is considered to demonstrate adequate responsiveness if the area under the curve of the instrument is ≥ 0.70.9,15

In our study, the 95% confidence intervals (CI) of the AUC of all the outcome instruments overlapped, suggesting that the differences in AUC between the outcome instruments might not be significant. There are no global guidelines on this point, and it is open to debate whether a consensus should be developed to report the 95% CI of the AUC and to claim a significant difference only when the intervals do not overlap. It is also important to determine the 95% CI of the difference in AUC between outcome instruments instead of relying on the 95% CI of each AUC. Though most statistical software reports the 95% CI of the AUC of an outcome instrument, we are not aware of any statistical package that calculates the 95% CI of the difference of AUC between various outcome instruments. Previous studies that evaluated the responsiveness of patient reported outcome measures in spine surgery have reported only the AUC and did not report the 95% CI of the AUC.23,32–37
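
One way an interval for the difference in AUC between two instruments could be approximated is a paired percentile bootstrap, in which patients are resampled with replacement and both instruments are re-scored on the same resample. This is an illustrative technique that was not part of the study's analysis; the patient-level scores below are hypothetical, and the function names, the number of resamples and the seed are arbitrary choices.

```python
import random


def auc(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    pos = [s for s, lab in zip(scores, labels) if lab]
    neg = [s for s, lab in zip(scores, labels) if not lab]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(scores_a, scores_b, labels, n_boot=2000, seed=0):
    """Observed AUC(a) - AUC(b) with a 95% percentile bootstrap interval."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(scores_a, labels) - auc(scores_b, labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        lab = [labels[i] for i in idx]
        if all(lab) or not any(lab):
            continue  # resample lacks one group, so the AUC is undefined
        a = auc([scores_a[i] for i in idx], lab)
        b = auc([scores_b[i] for i in idx], lab)
        diffs.append(a - b)
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, lower, upper


# Hypothetical change scores for two instruments in the same 14 patients.
labels = [True] * 8 + [False] * 6
scores_a = [30, 25, 18, 40, 12, 22, 35, 28, 5, 14, 20, -2, 8, 10]
scores_b = [9, 6, 4, 11, 2, 7, 8, 5, 1, 3, 6, 0, 4, 2]

observed, lower, upper = bootstrap_auc_difference(scores_a, scores_b, labels)
print(f"difference={observed:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

If the resulting interval contains zero, the data do not support a difference in responsiveness between the two instruments. Resampling patients rather than instruments keeps the pairing between scores from the same patient.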

The minimal clinically important difference (MCID) of an outcome instrument is defined as the lowest value that is beyond the measurement error of the outcome instrument and is perceived to be beneficial by the patient after an effective intervention. The mean changes in score for the ODI, RMDQ, SF-12 PCS and SF-12 MCS were 24.0 ± 20.5, 7.1 ± 5.8, −11.1 ± 11.1 and −6.1 ± 13.0 respectively. The values of MCID in surgically treated spine conditions quoted in the literature range from 4 to 16.6 for ODI,32–37 from 1.3 to 1.5 for RMDQ,32,33 from 2.5 to 8.8 for SF-12 PCS35–37 and from 9.3 to 10.1 for SF-12 MCS.35,36 The improvements in the values of ODI, RMDQ and SF-12 PCS are both clinically and statistically significant, whereas the improvement in the SF-12 MCS is statistically significant but might not be clinically significant because the mean change score is lower than the reported MCID threshold for SF-12 MCS.
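
The comparison in the paragraph above can be made explicit with a short sketch that sets the magnitude of each mean change against the range of published MCID values. The numbers are those quoted in the text; the function name and the three-way wording of the result are hypothetical.

```python
# Published MCID ranges (low, high) and mean changes quoted in the text.
MCID_RANGE = {
    "ODI": (4.0, 16.6),
    "RMDQ": (1.3, 1.5),
    "SF-12 PCS": (2.5, 8.8),
    "SF-12 MCS": (9.3, 10.1),
}
MEAN_CHANGE = {"ODI": 24.0, "RMDQ": 7.1, "SF-12 PCS": -11.1, "SF-12 MCS": -6.1}


def compare_to_mcid(mean_change, mcid_low, mcid_high):
    """Compare the magnitude of a mean change with a range of MCID values."""
    magnitude = abs(mean_change)  # sign only reflects the scale direction
    if magnitude >= mcid_high:
        return "exceeds every reported MCID"
    if magnitude >= mcid_low:
        return "within the reported MCID range"
    return "below every reported MCID"


verdicts = {
    name: compare_to_mcid(MEAN_CHANGE[name], *MCID_RANGE[name])
    for name in MCID_RANGE
}
for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```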

Though ODI and RMDQ are not specific instruments for lumbar stenosis, they are disease [low back ache] specific outcome instruments.26 It is understood that disease specific and region specific patient reported outcome instruments will show higher responsiveness compared to HRQoL instruments because disease and region specific outcome instruments would contain more items that are pertinent to functional outcome as compared to HRQoL instruments.

A strength of the study is that it is the first to contrast the construct validity and responsiveness of ODI, RMDQ and SF-12 in a homogenous cohort. We used both anchor based and distribution based methods for evaluation of responsiveness and used two external anchors for AUC in order to enhance the robustness of the study. Our sample was larger than most of the relevant studies on surgically treated lumbar spine conditions.7,23,32,34–37 As per the COSMIN checklist our sample can be considered a good sample size (50–99 patients).38 It is preferred that a study evaluating the psychometric properties of an outcome instrument has a sample size of more than 50 participants.11,15 Consecutive cases were selected and this could have reduced the occurrence of selection bias. Previously validated outcome instruments in spine research were used in the present study and this could possibly have reduced information bias. As this is a series from a single surgeon who was involved in patient selection and treatment, confounding bias could have been reduced as well.

The results need to be interpreted after due consideration of the following limitations. The present study included a homogenous group and it is possible that the effect size might not necessarily be the same for all stenosis patients in all centres. The mean follow-up in our study was three months and is justified by previously published work.20,39 Fekete et al.20 concluded that a three month follow-up is sufficient for patients who underwent spinal decompression for pathology such as spinal stenosis because there was little change in the value of patient reported outcome measures after three months postoperatively. Clinically significant improvement in patient reported outcome instruments is observed within three months after surgical intervention for lumbar stenosis.39

5. Conclusion

Based on comparison of clinimetric properties such as construct validity and responsiveness, it is recommended to use either ODI or RMDQ as region specific patient reported outcome instrument and SF-12 PCS as a HRQoL outcome measure to evaluate outcome after decompression laminectomy for lumbar stenosis. SF-12 PCS can be used independently for evaluation of HRQoL after lumbar decompression for spinal stenosis.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of competing interest

The authors declare absence of any conflict of interest pertaining to the present study, authorship and publication of this article.

CRediT authorship contribution statement

Karthik Vishwanathan: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - original draft, Visualization, Project administration. Ian Braithwaite: Conceptualization, Methodology, Software, Data curation, Resources, Supervision, Investigation, Writing - review & editing, Project administration.

Acknowledgment

None to declare.


