Physiotherapy Canada. 2014 Nov 4;66(4):359–366. doi: 10.3138/ptc.2013-49

No Differences in Outcomes in People with Low Back Pain Who Met the Clinical Prediction Rule for Lumbar Spine Manipulation When a Pragmatic Non-thrust Manipulation Was Used as the Comparator

Kenneth Learman, Christopher Showalter, Bryan O'Halloran, Megan Donaldson, Chad Cook
PMCID: PMC4403352  PMID: 25922557

ABSTRACT

Purpose: To investigate differences in pain and disability between patients treated with thrust manipulation (TM) and those treated with non-thrust manipulation (NTM) in a group of patients with mechanical low back pain (LBP) who had a within-session response to an initial assessment and met the clinical prediction rule (CPR). Methods: Data from 71 patients who met the CPR were extracted from a database of patients in a larger randomized controlled trial comparing TM and NTM. Treatment of the first two visits involved either TM or NTM (depending on allocation) and a standardized home exercise programme. Data analysis included descriptive statistics and a two-way ANOVA examining within- and between-groups effects for pain and disability, as well as total visits, total days in care, and rate of recovery. Results: No between-group differences in pain or disability were found for NTM versus TM groups (p=0.55), but within-subjects effects were noted for both groups (p<0.001). Conclusions: This secondary analysis suggests that patients who satisfy the CPR benefit as much from NTM as from TM.

Key Words: decision support techniques; low back pain; manipulation, spinal


Recent research has focused on identifying subgroups of populations likely to benefit from specific interventions; people with low back pain (LBP) have been targeted because of the prevalence of this condition and the monetary costs (both direct medical costs and indirect costs to society) associated with managing it. A clinical prediction rule (CPR) has been developed to identify a subgroup of people with LBP who are likely to benefit from spinal manipulative therapy (SMT),1 and this derivation study has been validated with an appropriate four-arm randomized clinical trial (RCT).2 The CPR comprises the following test items: (1) pain duration <16 days; (2) pain that does not refer below the knee; (3) a Fear-Avoidance Beliefs Questionnaire work subscale (FABQ-w) score <19; (4) hypomobility in at least one lumbar segment; and (5) hip internal rotation of >35° in at least one lower extremity. An individual needed to demonstrate at least four of the five predictors outlined above to meet the CPR (CPR+).1 Further studies have challenged the generalizability of the initial findings of the validation study, as independent RCTs using various manual therapy techniques, including SMT, failed to validate them.3,4
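The four-of-five rule described above can be written out as a short sketch. This is only an illustration of the thresholds listed in the text; the class name, field names, and function names are hypothetical and are not taken from the study's data set.

```python
from dataclasses import dataclass


@dataclass
class CprFindings:
    """Baseline findings for one patient (hypothetical field names)."""
    pain_duration_days: float
    refers_below_knee: bool
    fabq_w: int
    hypomobile_lumbar_segment: bool
    max_hip_internal_rotation_deg: float  # larger value of the two hips


def cpr_predictors_met(f: CprFindings) -> int:
    """Count how many of the five CPR predictors are present."""
    checks = [
        f.pain_duration_days < 16,             # (1) pain duration <16 days
        not f.refers_below_knee,               # (2) no referral below the knee
        f.fabq_w < 19,                         # (3) FABQ-w score <19
        f.hypomobile_lumbar_segment,           # (4) at least one hypomobile segment
        f.max_hip_internal_rotation_deg > 35,  # (5) hip IR >35 deg in at least one hip
    ]
    return sum(checks)


def is_cpr_positive(f: CprFindings, threshold: int = 4) -> bool:
    """CPR+ means at least four of the five predictors are met."""
    return cpr_predictors_met(f) >= threshold
```

For example, a patient with 10 days of pain, no referral below the knee, an FABQ-w score of 12, a hypomobile segment, and 30° of hip internal rotation meets four predictors and would be classified as CPR+.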

A recently published RCT provided evidence that lumbar manipulation is not superior to non-thrust manipulation for people with mechanical LBP.5 This study did not use CPR+ as an inclusion criterion, although meeting or not meeting the CPR was included as a dichotomous variable. A further sub-analysis of the larger RCT suggested that CPR+ status had a stronger prognostic than prescriptive effect, as the CPR variable was a universal predictor for all four models of LBP recovery for patients receiving manual therapy.6 Simply put, in this study, CPR+ status was prognostic for doing well with physiotherapy, regardless of the manual therapy provided in the short term (first two sessions) or more variable interventions in the longer term (session 2 through discharge). The four models of LBP recovery—(1) pain reduction, (2) disability reduction, (3) total visits received, and (4) patient-perceived extent of recovery—were chosen to define the construct more globally. To account for the potential effect of CPR+ status as an inclusion criterion, we proposed that analyzing only the CPR+ patients from the original RCT data would more clearly delineate the impact of CPR status on between-groups differences for pain and disability when comparing thrust versus non-thrust manipulation. The purpose of the present study, therefore, was to determine whether people who were CPR+ for spinal manipulation showed better outcomes with thrust manipulation (TM) than with non-thrust manipulation (NTM). We hypothesized that those who were CPR+ and received TM would show a significantly larger treatment effect than those receiving NTM.

Methods

Participants

The study used a condition-stratified secondary analysis involving 71 people with LBP who met the CPR for success with lumbar manipulation (CPR+), extracted from a database of patients who were part of a larger RCT comparing TM and NTM. Data for the RCT were collected between January 1, 2011, and February 28, 2012. Patients in our study were from multiple distinct outpatient physical therapy practices in the United States and were selected from the RCT if they met the lumbar manipulation CPR (at least 4/5 predictors).1 The RCT was an equipoise controlled experimental study (see Figure 1). The Human Ethics Review Board of Walsh University approved this study; all patients provided informed consent, and all patients' rights were protected.

Figure 1. CONSORT flow diagram for initial study enrollment and current study secondary analysis.

To be eligible for enrolment in the RCT, patients had to be at least 18 years of age with mechanically reproducible LBP and had to demonstrate a within-session change in self-perceived symptoms or therapist-perceived signs during the physical examination. Exclusion criteria were the presence of any red flags for orthopaedic manual therapy (OMT) or conservative care, including tumours, metabolic diseases, rheumatoid arthritis, osteoporosis, or prolonged history of steroid use; signs consistent with nerve root compression (reproduction of LBP or leg pain with straight leg raise at less than 45°, diminished lower-extremity deep tendon reflexes, diminished/absent sensation to pinprick in any lower-extremity dermatome or myotome, weakness of the lower extremity); prior lumbar surgery; or current pregnancy.

Procedures

All 71 patients included in our study were treated by experienced physical therapists who had undergone extensive OMT training and had received certification as either a certified orthopaedic manual therapist (COMT) or a Fellow of the American Academy of Orthopaedic Manual Physical Therapists (FAAOMPT). The trial used a pragmatic design in which clinicians were allowed to use whichever allocated thrust or non-thrust technique they felt would best serve the patient. Treatment was designed to intentionally reflect actual clinical practice and involved an experimental element only for the first two visits; the randomization of the first two visits was similar to the design of the CPR derivation and validation studies.1,2 Treatment administered during the first two visits involved either TM or NTM (depending on allocation) and a standardized home exercise programme that involved the same movements used in the CPR study.1,2 The clinician was allowed to determine the length of time between the first two visits, reflecting actual clinical practice; the length of time ranged from 1 to 4 days.

After the second visit, clinicians were allowed to perform any treatment procedure they felt would benefit each patient, including any physical therapy-related technique, whether it involved strengthening methods, movement-based methods, or other interventions, as long as the clinician felt it fit within the patient's treatment plan. Patients were discharged once the clinician felt they had achieved maximal improvement within the current treatment programme or if they self-discharged. There was no limitation on total number of visits. This subject stratification included those patients who met the inclusion criteria for the RCT as well as those who met at least four of five of the inclusion criteria for the CPR for lumbar manipulation.

Outcomes measures

At baseline, all patients provided demographic information and completed several self-report questionnaires, followed by a standardized history and physical examination. Height, weight, age, gender, race, and duration of symptoms in weeks were captured, as well as total days under physical therapy care and total number of visits.

Self-report findings were collected at three time points—at baseline, after two visits, and at discharge—with the exception of the FABQ-w findings, which were captured at baseline and after two visits only. All outcomes measures were collected and mailed to a third-party database steward, who created and managed the data set but was not involved in patient care. A statistician, blinded to group allocation, carried out the analyses.

The self-report questionnaires included the numeric pain rating scale (NPRS), which was used to capture the patient's level of pain using an 11-point ordinal scale (0=no pain, 10=worst pain imaginable). A two-point change in score represents a meaningful change.7 The FABQ-w8 was used to quantify patients' fear of pain and beliefs about avoiding activity. FABQ-w change scores were calculated by subtracting the data collected after the second visit from the baseline scores. The Oswestry Disability Index (ODI)9 was used to measure disability; a ≥50% reduction from baseline on the ODI has been considered a clinically important outcome.10 Self-report of recovery (0%=not at all, 100%=totally recovered) was captured at discharge, using a variant of the single alphanumeric evaluation, which has been used with patients with shoulder pain11 and LBP.12
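The thresholds and change scores described above can be sketched as follows. The function names are hypothetical, and treating the two-point NPRS change as a reduction (improvement) is an assumption made for illustration; the text states only that a two-point change is meaningful.

```python
def nprs_meaningful_improvement(baseline: int, follow_up: int) -> bool:
    """True if pain on the 0-10 NPRS fell by at least two points.

    Assumption: "meaningful change" is read here as a reduction in pain.
    """
    return (baseline - follow_up) >= 2


def odi_success(baseline_pct: float, follow_up_pct: float) -> bool:
    """True if the ODI fell by at least 50% of its baseline value."""
    if baseline_pct <= 0:
        raise ValueError("baseline ODI must be greater than zero")
    return (baseline_pct - follow_up_pct) / baseline_pct >= 0.5


def fabq_w_change(baseline: int, after_visit_2: int) -> int:
    """Change score as defined in the text: baseline minus post-visit-2 score."""
    return baseline - after_visit_2
```

Under these definitions, an ODI falling from 28% to 14% counts as a successful outcome, whereas a fall from 28% to 15% does not.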

Total visits and length of episode of care were calculated for each patient. Total visits comprised the initial and subsequent visits. Length of episode of care was calculated by subtracting the date of the initial visit from the date of the last recorded visit or discharge date. At the end of the study, physical therapists were also asked what type of treatment techniques they used, which adjunctive treatment methods were used after the first two (controlled) visits, and whether any adverse events occurred.

Data analysis

Data analyses included descriptive statistics and baseline levels for pain and disability, FABQ-w scores, and rate of recovery. All variables of interest were examined for the underlying assumptions of parametric statistics. Normality and the presence of outliers were assessed with histograms, stem-and-leaf plots, boxplots, and Kolmogorov–Smirnov (K-S) statistics. Because many of the variables in the patient characteristics violated the underlying assumption of normality with many outliers, we analyzed all continuous patient characteristic variables using independent samples t-tests and confirmed the results using Mann–Whitney U non-parametric statistics. Since there were so few non-white patients, we dichotomized the “race” variable by collapsing the non-white patients into a single group. The dichotomized successful ODI outcome data, the acute condition data (symptom duration ≤16 days), irritability, and the dichotomized race data were analyzed using Fisher's exact test. Gender was analyzed using chi-square analysis.
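As an illustration of how Fisher's exact test treats a dichotomized 2×2 table, the following is a minimal sketch, not the SPSS procedure used in the study. It assumes one common definition of the two-sided p-value (the sum of the probabilities of all tables, with the same margins, that are no more likely than the observed table); statistical packages may define it differently. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell, margins fixed
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one
    p_value = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, p_value)
```

A call such as `fisher_exact_two_sided(29, 8, 22, 12)` would apply this sketch to the thrust and non-thrust counts for the 50% ODI improvement outcome reported in Table 1.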

The main outcomes of interest (pain and disability at baseline, visit 2, and discharge) demonstrated a tendency toward normality, with non-significant K-S statistics for both groups at baseline and visit 2 for the NPRS and ODI and significant K-S statistics for both groups for the NPRS and ODI at discharge. ANOVA has been shown to be robust to non-normality when sample sizes are at least 20 per group and the groups lack outliers.13 The dependent variable distributions were examined for the presence of outliers by converting each case to standardized z scores. Since none of the standardized z scores exceeded the threshold of 3.29, as recommended by Tabachnick and Fidell,13 no outliers were present. Multivariate normality and linearity were examined using bivariate scatterplots, and several pairs of dependent variables were found to violate the assumptions at a mild to moderate level; however, multivariate ANOVA (MANOVA) has been found to be robust to non-normality when sample sizes are close to equal and larger than 10 per cell.13 We tested for multivariate outliers using Mahalanobis distances and found none. Homogeneity of variance-covariance structure, assessed with Box's M test, was found to be non-significant (F(21, 16526.7)=0.90, p=0.59), which made Wilks' lambda (Λ) test appropriate for interpretation of analysis. The results of the data screening for normality and linearity outlined above suggest that, while violations of the assumptions were present, these were generally mild to moderate and within the known tolerance of MANOVA. We therefore analyzed the main outcomes of interest—pain and disability score data—using a repeated-measures MANOVA (RMMANOVA) and, to be safe, checked the results using non-parametric statistics (Mann–Whitney U for between-group analyses and Friedman's test for within-group analyses). All statistical analyses were performed using SPSS 20.0 (IBM, Chicago, IL) with significance set at α=0.05.
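The univariate outlier screen described above (standardizing each case and comparing it with the 3.29 threshold) can be sketched in a few lines. This is an illustration only; the function names are hypothetical, and the use of the sample standard deviation (n−1 denominator) is an assumption, since the text does not state which was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize each value using the sample mean and sample SD."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD (n-1 denominator)
    if s == 0:
        raise ValueError("all values are identical; z scores are undefined")
    return [(v - m) / s for v in values]


def univariate_outliers(values, threshold=3.29):
    """Return the indices of cases whose |z| exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]
```

An empty list returned for each dependent variable in each group corresponds to the finding of no univariate outliers.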

Results

A comparison of baseline patient characteristics revealed no statistically significant between-group differences (see Table 1). The RMMANOVA yielded no significant between-group differences for pain or disability (p=0.55). There also was no time×group interaction (p=0.55).

Table 1. Descriptive Characteristics of Patients Who Met the Clinical Prediction Rule for Individuals Likely to Benefit from Spinal Manipulation

Values are no. of patients unless otherwise indicated.

| Variable | All patients (n=71) | Thrust group (n=37) | Non-thrust group (n=34) | p-value |
|---|---|---|---|---|
| Mean (SD) age, y | 45.1 (12.8) | 43.8 (11.5) | 46.5 (14.1) | 0.37 |
| Sex, M / F | 34 / 37 | 21 / 16 | 13 / 21 | 0.16 |
| Race | | | | 0.10 |
| Race: White | 65 | 36 | 29 | |
| Race: Non-white | 6 | 1 | 5 | |
| Non-white: Black | 2 | 0 | 2 | |
| Non-white: Hispanic | 1 | 0 | 1 | |
| Non-white: Asian | 2 | 1 | 1 | |
| Non-white: Other | 1 | 0 | 1 | |
| Mean (SD) height, m | 1.71 (0.11) | 1.72 (0.11) | 1.71 (0.11) | 0.49 |
| Mean (SD) weight, kg | 74.1 (14.1) | 74.9 (12.6) | 73.2 (15.8) | 0.62 |
| Mean (SD) BMI, kg/m2 | 25.0 (3.1) | 25.1 (2.8) | 24.9 (3.4) | 0.80 |
| Mean (SD) duration of symptoms, wk | 26.1 (123.5) | 7.0 (13.8) | 46.9 (176.9) | 0.20 |
| Symptom duration <2.5 wk: Yes | 40 | 24 | 16 | 0.16 |
| Symptom duration <2.5 wk: No | 31 | 13 | 18 | |
| Mean (SD) baseline ODI, % | 27.9 (15.7) | 26.5 (14.6) | 29.4 (16.8) | 0.45 |
| Mean (SD) baseline NPRS | 5.0 (2.2) | 4.9 (2.4) | 5.1 (2.1) | 0.60 |
| Mean (SD) baseline FABQ-w | 9.1 (8.0) | 8.4 (7.8) | 10.0 (8.3) | 0.39 |
| Condition irritable: Yes | 15 | 10 | 5 | 0.26 |
| Condition irritable: No | 55 | 27 | 28 | |
| Mean (SD) no. of total visits | 5.6 (4.7) | 5.4 (4.1) | 5.8 (5.3) | 0.67 |
| Mean (SD) length of episode of care, d | 27.6 (24.8) | 29.4 (25.2) | 25.7 (24.6) | 0.54 |
| Mean (SD) % recovery | 85.6 (20.6) | 85.6 (21.1) | 85.5 (20.4) | 0.98 |
| ODI 50% improvement: Yes | 51 | 29 | 22 | 0.29 |
| ODI 50% improvement: No | 20 | 8 | 12 | |
| Compliance with exercise: Very compliant | 26 | 22 | 16 | 0.12 |
| Compliance with exercise: Compliant | 15 | 10 | 14 | |
| Compliance with exercise: Not compliant | 5 | 4 | 1 | |
| Compliance with exercise: Extremely non-compliant | 2 | 1 | 0 | |
| Compliance with exercise: Missing | 1 | 0 | 3 | |

ODI=Oswestry Disability Index; NPRS=numeric pain rating scale; FABQ-w=Fear Avoidance Beliefs Questionnaire work subscale.

We did find a significant within-subjects effect for time (Wilks' Λ=0.27; F(4, 65)=43.49, p<0.001; multivariate η2=0.73, meaning that 73% of the variance observed is explained by time). Further analysis suggests a violation of sphericity (the assumption that the variances of the differences between all pairs of time points in a repeated design are equal), with Mauchly's W of 0.78, χ2(2)=16.28, p<0.01 for the NPRS and Mauchly's W of 0.85, χ2(2)=10.98, p=0.004 for the ODI. Using the Greenhouse-Geisser correction for univariate analysis revealed a significant effect for time on both the NPRS (F(1.65, 292.02)=109.52, p<0.001, η2=0.62) and the ODI (F(1.74, 5527.80)=82.69, p<0.001, η2=0.55). Pairwise comparisons indicated significant differences for each time point for both the NPRS and the ODI with p<0.001 (see Figures 2 and 3).
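The relationship between Wilks' Λ, the multivariate η2, and the F statistic reported above can be checked with a short sketch. It assumes the single-effect case, in which η2 = 1 − Λ and the F conversion is exact; the function names are hypothetical and this is not the SPSS output itself.

```python
def multivariate_eta_squared(wilks_lambda: float) -> float:
    """Multivariate eta squared for a single-effect test: 1 - Lambda."""
    return 1.0 - wilks_lambda


def f_from_wilks(wilks_lambda: float, df_effect: float, df_error: float) -> float:
    """F statistic from Wilks' Lambda (exact in the single-effect case)."""
    return ((1.0 - wilks_lambda) / wilks_lambda) * (df_error / df_effect)


eta_sq = multivariate_eta_squared(0.27)  # about 0.73
f_stat = f_from_wilks(0.27, 4, 65)       # about 43.9 with Lambda rounded to 0.27
```

With Λ rounded to 0.27 the sketch gives an F of roughly 43.9. An unrounded Λ near 0.272 would give approximately 43.49, so the small gap from the reported value is consistent with rounding.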

Figure 2. Comparison of estimated marginal means for the NPRS score over time for the thrust and non-thrust manipulation groups. Error bars represent 95% CIs for the values. Within-groups analyses were significant at each time point; however, no between-groups differences were significant at any time points.

Figure 3. Comparison of estimated marginal means for the ODI score over time for the thrust and non-thrust manipulation groups. Error bars represent 95% CIs. Within-groups analyses were significant at each time point; however, no between-groups differences were significant at any time points.

No between-groups differences were noted for number of treatment sessions (p=0.56), days in physical therapy care (p=0.20), patient perception of recovery (p=0.98), or successful 50% ODI reduction rate (p=0.98).

Discussion

Our study sought to investigate differences in pain and disability between TM and NTM treatment groups among people with mechanical LBP who showed a within-session response to an initial assessment and met the prescriptive CPR for lumbar manipulation.1,2 When those receiving NTM were compared with those receiving TM, no between-groups differences were found for patients who met the prescriptive CPR for lumbar manipulation. This finding differs from those previously reported, including Cleland and colleagues' RCT,14 which compared a standardized NTM with two different types of TM and found statistically significant between-groups differences in both short-term and 6-month outcomes. We believe that there are three possible reasons for this difference in results: first, there are notable methodological differences between our study and that of Cleland and colleagues;14 second, the CPR may be prognostic as well as prescriptive; and, third, there are notable differences in the comparator groups in the two studies.

As previously identified, there were significant methodological differences between our study and others that might contribute to the differences in the results. First, our study is a secondary database analysis of a larger trial that included both patients who did and patients who did not meet the CPR at baseline; it has been noted that secondary analyses can sometimes produce different results than primary RCTs and should be treated with caution, since this form of analysis is not as strong as a primary RCT. Cleland and colleagues14 conducted an RCT that included only CPR+ patients. The primary purpose of that study was to examine two forms of TM and one form of NTM. Second, we did not limit inclusion by age or baseline ODI, which means that our data set included older people and those with a wider variety of disabilities, reflecting circumstances common in clinical practice. Third, we required a within-session change (either reduced pain or increased movement) during physical examination for inclusion in our study; this factor is important to consider because within-session changes have been identified as having positive prognostic impact15,16 and have been widely recommended to guide the choice of intervention.17,18 Finally, because our study was a secondary analysis of a larger trial, and because our sample size was marginally smaller than Cleland and colleagues',14 our statistical analysis has slightly less power and a greater chance of type I and type II errors. Although power alone is unlikely to account for the differences in our findings, it is important to recognize this as a potential influence.

Another reason for the difference in findings may be that meeting the CPR has a powerful prognostic influence on outcome; those who meet the rule are likely to have a good outcome regardless of the manual therapy intervention provided. This was reported in a previous study by our group6 that identified meeting the CPR as the universal prognostic variable for the four different low-back-related outcome measures used. Other authors19 have also suggested that the five variables of the CPR identify patients who have a greater chance of improvement regardless of the intervention provided. In addition, as noted above, a within-session change was required for inclusion in our study; together, this factor and CPR+ status may have resulted in a strong prognostic effect.

Another fundamental difference between our study and earlier studies was the application of the NTM technique. Rather than standardizing the technique to a general, prescriptive, global mobilization technique that would require a specific time and grade over two specific spinous processes (L4-L5),14 our study allowed clinicians to use their clinical reasoning to determine the type and grade of NTM technique that they thought would be most beneficial to each individual patient. This NTM application, designed to be pragmatic and to reflect common clinical practice, enhances the generalizability of our results. Although we believe that allowing clinicians to pragmatically apply the NTM technique based on their own judgment was the most appropriate way to design the initial RCT, we would be negligent not to report the evidence from studies of short-term and immediate effects that question the efficacy of a therapist-determined SMT technique compared with a general or randomly assigned technique for treatment of the lumbar spine.20,21 It is also worth noting how our comparator NTM intervention differed from that of the original CPR validation studies, which used a benign set of active exercises for the comparator group that are not advocated by LBP guidelines.22,23 A stronger effect of TM in earlier trials is likely to reflect the use of a comparator with less effect; differences in outcomes between our study and that of Cleland and colleagues14 may also be the result of differences in the application of NTM.

With respect to the current literature on the CPR for lumbar spine manipulation, our results concur with those of Hancock and colleagues3 in that the analysis does not support the widely held hypothesis that the CPR predicts a subpopulation of LBP patients likely to benefit more from TM techniques.2,14,24 Yet Hancock and colleagues3 did not differentiate patients who received TM from those who received NTM; the lack of between-groups differences noted in their results led to criticism of their research design, since the derivation and validation studies examined a specifically applied general TM technique.25 Hancock and colleagues' pragmatically designed study sought to determine whether previously reported results would generalize to a broader definition of manipulative therapy more typically applied in clinical practice as compared to a placebo control. Our study addressed the primary criticism of Hancock and colleagues'3 work by separating out the effects of NTM and TM as the initial intervention to determine whether this initial criticism had merit. Our results suggest that the criticism is unfounded, since we found no between-groups differences based on TM versus NTM intervention results.

Finally, both our study and that of Hancock and colleagues3 found a clinically meaningful reduction in pain and disability following 4 weeks of care. This suggests that, regardless of group assignment (whether to different OMT groups or to sham treatment),3 people who were CPR+ saw clinically beneficial outcomes. Meeting the CPR was prognostic for better results than not meeting the CPR.3

Our study has several limitations. First, because our study was a secondary analysis of a larger RCT, the protection against threats to validity that the randomization procedure afforded the original trial is less robust, despite an analysis demonstrating that patient characteristics at baseline were not different; previous research has shown that secondary database analyses of subgroups may produce more type I and type II errors.3,26 Since our subgroup analysis produced results consistent with those of the larger RCT from which the secondary data set was derived, however, it is more likely to have produced believable results. The sample size is relatively small for treatment groups; therefore, generalizations should be made cautiously, since small sample sizes increase the potential for type II errors.

Second, our study lacks long-term follow-up to determine potential benefits after discharge. Previous studies used longer-term follow-up (from 3 months3 to 6 months2,14). Long-term follow-up provides greater support for the use of a particular treatment approach, and, while our study included longer-term follow-up data for some patients, depending on the patient and physiotherapist, the average length of care episode was only 28 days. Certainly, we saw no temporal trend suggesting greater improvement with TM over time.

Finally, our study lacks a true control group. Although enough evidence exists to suggest that OMT is an efficacious treatment for LBP, methodological differences between studies may call into question the generalizability of those results to our patients.

Conclusions

This sub-analysis of a data set from a larger RCT suggests that people who satisfy a prescriptive CPR for lumbar spinal manipulation benefit as much from non-thrust manipulation driven by the physical therapist's clinical reasoning as from thrust manipulation. These results give clinicians further options if there are concerns about using TM on a specific patient who satisfies the CPR.

Key messages

What is already known on this topic

Patients who meet a prescriptive clinical prediction rule (CPR) for spinal manipulation respond favourably and quickly to thrust manipulation (TM). It was not known how well the CPR generalizes to forms of non-thrust manipulation (NTM).

What this study adds

The results of this secondary analysis suggest that a therapist-derived NTM regimen produces improvements in pain and disability similar to those produced by the TM protocol that the CPR was designed to predict. NTM may be as effective as TM for patients meeting the CPR.

Physiotherapy Canada 2014; 66(4):359–366; doi:10.3138/ptc.2013-49

References


Articles from Physiotherapy Canada are provided here courtesy of University of Toronto Press and the Canadian Physiotherapy Association
