Abstract
Background
Delivering efficient and effective healthcare is crucial for a condition as burdensome as low back pain (LBP). Stratified care strategies may be worthwhile, but rely on early and accurate patient screening using a valid and reliable instrument. The purpose of this study was to evaluate the performance of LBP screening instruments for determining risk of poor outcome in adults with LBP of less than 3 months duration.
Methods
Medline, Embase, CINAHL, PsycINFO, PEDro, Web of Science, SciVerse SCOPUS, and Cochrane Central Register of Controlled Trials were searched from June 2014 to March 2016. Prospective cohort studies involving patients with acute and subacute LBP were included. Studies administered a prognostic screening instrument at inception and reported outcomes at least 12 weeks after screening. Two independent reviewers extracted relevant data using a standardised spreadsheet. We defined poor outcome for pain to be ≥ 3 on an 11-point numeric rating scale and poor outcome for disability to be scores of ≥ 30% disabled (on the study authors' chosen disability outcome measure).
Results
We identified 18 eligible studies investigating seven instruments. Five studies investigated the STarT Back Tool: performance for discriminating pain outcomes at follow-up was ‘non-informative’ (pooled AUC = 0.59 (0.55–0.63), n = 1153) and ‘acceptable’ for discriminating disability outcomes (pooled AUC = 0.74 (0.66–0.82), n = 821). Seven studies investigated the Orebro Musculoskeletal Pain Screening Questionnaire: performance was ‘poor’ for discriminating pain outcomes (pooled AUC = 0.69 (0.62–0.76), n = 360), ‘acceptable’ for disability outcomes (pooled AUC = 0.75 (0.69–0.82), n = 512), and ‘excellent’ for absenteeism outcomes (pooled AUC = 0.83 (0.75–0.90), n = 243). Two studies investigated the Vermont Disability Prediction Questionnaire and four further instruments were investigated in single studies only.
Conclusions
LBP screening instruments administered in primary care perform poorly at assigning higher risk scores to individuals who develop chronic pain than to those who do not. Risks of a poor disability outcome and prolonged absenteeism are likely to be estimated with greater accuracy. It is important that clinicians who use screening tools to obtain prognostic information consider the potential for misclassification of patient risk and its consequences for care decisions based on screening. However, it needs to be acknowledged that the outcomes on which we evaluated these screening instruments in some cases had a different threshold, outcome, and time period than those they were designed to predict.
Systematic review registration
PROSPERO international prospective register of systematic reviews registration number CRD42015015778.
Electronic supplementary material
The online version of this article (doi:10.1186/s12916-016-0774-4) contains supplementary material, which is available to authorized users.
Keywords: Low back pain, Screening, Prognosis, Risk, Predictive validity
Background
A current trend in health service delivery towards the provision of stratified models of care [1–3] offers potential to optimise treatment benefits, reduce harms and maximise healthcare efficiency. Stratified approaches aim to match patients to the most appropriate care pathways on the basis of their presentation. A common approach bases stratification on patients’ prognostic profile, which requires early, accurate screening using a valid and reliable instrument. By so doing, care decisions aim to offer treatment to those who need it most and avoid over-treatment of those who need it least.
Better matching of patients to care is particularly important for a condition as burdensome as low back pain (LBP) [4, 5]. The prognosis of chronic LBP – when symptoms persist beyond 3 months – is poor [6]. This warrants a focus on the potential for intervention to be appropriately targeted prior to the development of chronic pain. Improved understanding of factors associated with chronic LBP [7–10] has led to the development of self-report questionnaires containing multiple variables known to have prognostic relevance. These prognostic screening instruments (PSIs; also referred to as predictive tools) assess certain characteristics of an individual’s pain experience (including pain intensity and functional impairment) and certain psychosocial factors (e.g. beliefs, catastrophisation, anxiety and depression). These prognostic variables have been shown to be associated with specific outcome measures and time frames [11].
PSIs are widely recommended to inform the management of LBP [12–15], with updated international guidelines encouraging the use of risk stratification to guide care decisions. A possible consequence of these broad recommendations is that PSIs are likely to be used for purposes other than the specific purpose for which they were intended and in varied clinical settings. These factors may impact instrument performance, with implications for care decisions based on screening.
As the use of PSIs to inform care delivery becomes more widely adopted, it is important to further consider the uncertainty that surrounds their accuracy [16, 17]. We investigate how PSIs perform (individually and generally) when administered for the purpose of predicting the likely course of LBP. The aim of this review was to determine how well LBP PSIs discriminate between patients who develop a poor outcome and those who do not in adults with LBP of less than 3 months duration.
Methods
This systematic review is reported in accordance with the statement for Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) [18] (see Additional file 1).
Registration
Our protocol was registered a priori on the PROSPERO International prospective register of systematic reviews (http://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42015015778)
Data sources and searches
Between June 23 and July 7, 2014, eight electronic databases (Medline (OvidSP), CINAHL (EBSCO host), EMBASE (OvidSP), PsycINFO (OvidSP), PEDro, Cochrane Central Register of Controlled Trials (CENTRAL) (OvidSP), Web of Science (ISI) and SciVerse SCOPUS) were systematically searched by a single reviewer to identify eligible studies. No time limits were applied, but studies were limited to English language publications and those involving human participants. Search terms included the following keywords and their variations: low back pain, sciatica, radiculopathy, risk, screening, questionnaire, instrument, prediction, prognosis, validity. While LBP was of principal interest, studies were not excluded if they involved participants with leg pain/sciatica or radiculopathy (conditions which involve a low-back disorder and are usually accompanied by LBP). Table 1 shows the full search strategy. The reference lists of all included articles and relevant review articles were later searched to identify any additional studies. Searching of all databases was updated on June 29 and December 22, 2015, and June 30, 2016.
Table 1.
# | Searches |
---|---|
1 | Back Pain/ |
2 | Low Back Pain/ |
3 | Sciatica/ |
4 | Radiculopathy/ |
5 | (back pain or low back pain or radiculopathy or sciatica or back?ache or lumbago).mpᵃ |
6 | (pain or ache or aching or complaint or dysfunction or disability or disorder).mpᵃ |
7 | (Back or spine or lumbar or lumbar spine or low*back).mpᵃ |
8 | 6 and 7 |
9 | 1 or 2 or 3 or 4 or 5 or 8 |
10 | (screen* or risk screen* or risk).mpᵃ |
11 | (tool or questionnaire or instrument).mpᵃ |
12 | 10 and 11 |
13 | 9 and 12 |
14 | (predict* or prognosis or prediction rule* or early identification or predictive validity or predictive factors or prognostic or prognostic indicators).mpᵃ |
15 | 13 and 14 |
16 | Limit 15 to (English language and humans) |
ᵃ mp: title, abstract, original title, name of substance word, subject heading word, keyword heading word, protocol supplementary concept word, rare disease supplementary concept word, unique identifier
Eligibility criteria
Types of participants
Studies were eligible if they involved adults (aged 18 or over) with ‘recent onset’ LBP (i.e. acute LBP (0–6 weeks) or subacute LBP (6 weeks to 3 months)), with or without leg pain. Studies involving participants with recent-onset and participants with chronic symptoms were included with the intention of requesting from study authors the data from the ‘recent onset’ participants only. Studies including participants with pain in other body regions were considered eligible if more than 75% had LBP. Cohorts of compensable and non-compensable patients presenting to primary, secondary and tertiary care settings were eligible for inclusion. It was also considered appropriate to include individuals registered on workers compensation databases, because it was assumed that this occurs in conjunction with presentation to a healthcare provider. Participants may have presented with a first episode of pain or report episodic/recurrent LBP, provided that the current painful episode was immediately preceded by a minimum of one pain-free month as suggested previously [19].
Types of studies
Prospective cohort studies meeting a Level I or Level II quality standard according to the National Health and Medical Research Council of Australia (NHMRC) evidence hierarchy for prognostic studies [20] were included. According to this standard, participants in these studies must have been recruited as a consecutive series of new presentations in any healthcare setting and been subject to longitudinal assessment. Studies classified as NHMRC Level III and IV evidence, including retrospective cohort studies, analysis of a single arm of a randomised controlled trial or case series reports, were excluded. Included studies involved the application of a previously developed PSI within the first 3 months of an episode of LBP and reported follow-up outcomes at a minimum of 12 weeks from initial screening.
We defined a PSI as an instrument that met all of the following criteria: (1) a self-report questionnaire; (2) assesses multiple factors or constructs that have predictive validity for patients with musculoskeletal pain; and (3) was developed to provide prognostic information for musculoskeletal conditions. The broad term of ‘musculoskeletal’ pain rather than LBP was selected to define the PSIs to avoid exclusion of instruments that had been developed for use with musculoskeletal conditions and subsequently validated for LBP cohorts. Studies were not excluded on the basis of how the instrument was developed, or the primary intention of the instrument (ascribed by the developers). For example, the Keele STarT Back Tool (SBT) was developed to include only ‘modifiable’ prognostic factors and was specifically intended for the purpose of matching subgroups of patients to stratified care pathways. Of primary importance to us was the inclusion of all instruments currently being widely used to offer prognostic information, or considered by the wider community of clinicians and researchers to be able to offer prognostic information. Included studies were required to report associations between the PSI scores and participant outcomes, and aimed, a priori, to evaluate the instrument for its predictive validity. Development studies were excluded to avoid including PSIs that had been insufficiently validated for clinical application [21].
Types of outcomes
To be included, studies must have reported one or more of the following outcomes:
Pain intensity as measured using a visual analogue scale, numeric rating scale (NRS), verbal rating scale or Likert scale
Disability as measured by validated self-report questionnaires
Sick leave or days absent from work or return to work status
Self-reported recovery using a global perceived effect scale or a Likert (recovery) scale
Study selection
Following removal of duplicate articles, two reviewers independently assessed the titles and abstracts of studies identified by the search for eligibility. AW assessed all the articles; EK and LG each assessed 50% of the articles. All reviewers applied a checklist of inclusion and exclusion criteria. Disagreements were discussed. The full paper was obtained for further assessment if necessary. Full texts of studies potentially fulfilling the eligibility criteria were retrieved, with subsequent independent assessment of all articles undertaken by EK and LG. Reasons for study exclusion were noted on a checklist with any disagreements resolved by discussion.
Data extraction and analysis
EK and either LG or LR independently reviewed the full text of eligible studies and extracted relevant data using a standardised spreadsheet. Extracted data included details of the healthcare setting, recruitment, study population, number of participants, loss to follow-up, symptom duration, LBP history, compensability, concomitant treatments, outcome measurement, statistical analyses, and reporting quality. Discrepancies in extracted data were identified and checked. If the required data could not be extracted, authors were emailed with the specific enquiry. If no response was received, authors were re-emailed after 2 weeks, and (finally) after a further week.
Predictive validity is conventionally assessed using receiver operating characteristic (ROC) curve analysis, with area under the curve (AUC) statistic being the most routinely reported measure of performance [22]. AUC values provide an overall measure of the discriminative ability of the instrument. Values range from 0.5 to 1.0, where 0.5 indicates that the instrument is no better than chance at discriminating those participants who will have a poor outcome, from those who will recover. AUC values of < 0.6 suggest that the instrument or screening test should be regarded as ‘uninformative’; 0.6–0.7 indicates ‘poor’ discrimination; 0.7–0.8 ‘acceptable’; 0.8–0.9 ‘excellent’; and above 0.9 ‘outstanding’ [23, 24].
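The descriptive bands above can be written as a small lookup. The following is a minimal sketch in Python; the function name `interpret_auc` is hypothetical. Because the stated ranges share their endpoints, assigning a boundary value such as 0.7 to the higher band is an assumption of this sketch, not something specified in the cited sources [23, 24].

```python
def interpret_auc(auc: float) -> str:
    """Map an AUC value to the descriptive band used in this review."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc < 0.6:
        return "uninformative"
    if auc < 0.7:
        return "poor"
    if auc < 0.8:   # assumption: exactly 0.7 is treated as 'acceptable'
        return "acceptable"
    if auc <= 0.9:  # 'outstanding' is reserved for values above 0.9
        return "excellent"
    return "outstanding"
```

Under these assumptions, a pooled AUC of 0.59 falls in the 'uninformative' band and a pooled AUC of 0.83 in the 'excellent' band.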
Where possible, we extracted AUC values with 95% confidence intervals to enable analysis and comparison of the PSIs. When AUC values were not provided, study authors were requested to either (1) calculate AUC values for the recent-onset participants or (2) provide primary data to allow calculation of AUC values. If the authors chose to calculate AUC values, we offered further instruction on how to do so. The primary outcome of this study was pain intensity at follow-up; poor outcome was pain ≥ 3 on an 11-point NRS, which was based on Grotle et al. [25] and Traeger et al. [26], and follows evidence that many people with scores of < 3 consider themselves to be recovered [27]. All study authors who reported obtaining pain NRS scores were requested to dichotomise pain outcomes according to this definition. Authors then re-analysed their results or offered outcome data and baseline screening scores to enable us to undertake ROC analysis. When authors were willing to assist with dichotomising disability outcomes, scores of ≥ 30% disabled (on their chosen disability outcome measure) were classified as ‘poor outcome’. A similar approach to revision of the ROC analyses was undertaken. No attempt was made to request re-definition of sick leave and recovery outcomes (secondary outcomes of this study).
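To make the dichotomisation and re-analysis concrete, the sketch below applies the two thresholds defined above and computes an AUC as the probability that a randomly chosen participant with a poor outcome had a higher baseline screening score than a randomly chosen participant with a good outcome (ties count as one half). It is an illustration only: the function names are hypothetical, the pairwise (Mann-Whitney) formulation is a standard equivalent of the area under the ROC curve, and it is not a description of the software that contributing authors actually used.

```python
def poor_pain_outcome(nrs_score: int) -> bool:
    """Poor pain outcome: >= 3 on an 11-point (0-10) numeric rating scale."""
    if not 0 <= nrs_score <= 10:
        raise ValueError("NRS score must be between 0 and 10")
    return nrs_score >= 3


def poor_disability_outcome(score: float, max_score: float) -> bool:
    """Poor disability outcome: >= 30% of the instrument's maximum score."""
    if max_score <= 0 or not 0 <= score <= max_score:
        raise ValueError("score must lie between 0 and max_score")
    # Compare cross-multiplied values to avoid floating-point division issues.
    return 100 * score >= 30 * max_score


def auc_from_scores(baseline_scores, poor_outcomes):
    """AUC via pairwise comparison of baseline scores (ties count 0.5)."""
    poor = [s for s, p in zip(baseline_scores, poor_outcomes) if p]
    good = [s for s, p in zip(baseline_scores, poor_outcomes) if not p]
    if not poor or not good:
        raise ValueError("need at least one poor and one good outcome")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in poor
        for b in good
    )
    return wins / (len(poor) * len(good))


# Hypothetical data: baseline screening scores and follow-up pain NRS scores.
baseline = [2, 5, 7, 3, 8, 4]
followup_pain = [0, 4, 6, 1, 2, 3]
outcomes = [poor_pain_outcome(p) for p in followup_pain]
auc = auc_from_scores(baseline, outcomes)
```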
Meta-analysis was planned considering the potential to pool data according to (1) individual PSIs and (2) specific outcomes. For data pooling to be appropriate, it was considered important that (1) outcome measures were defined consistently, (2) the clinical settings were similar (e.g. all primary care), and (3) uniform statistical analyses had been applied. Interpretation of random effects models was planned due to assumed variability in participant cohorts. Meta-analyses, including tests for statistical heterogeneity (using the I² statistic), were undertaken using MedCalc Statistical Software (version 14.12.0). A post-hoc sensitivity analysis was undertaken to explore the influence of study variation in classification of poor disability outcomes on the meta-analysis.
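For readers unfamiliar with random effects pooling, the sketch below shows one common approach, the DerSimonian-Laird estimator, applied to study-level AUC values and their standard errors. It is an assumption that this generic estimator approximates the analysis; the exact algorithm implemented in MedCalc is not described here, and the function name and input values are hypothetical.

```python
import math


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random effects pooling of study-level estimates."""
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need matching lists with at least two studies")

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sw

    # Cochran's Q and the between-study variance tau^2.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    df = len(estimates) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)

    # I^2: percentage of total variation attributable to heterogeneity.
    i2 = 100.0 * (q - df) / q if q > df else 0.0

    # Random effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * y for wi, y in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"pooled": pooled, "ci": ci, "tau2": tau2, "i2": i2}


# Hypothetical AUC values and standard errors for three studies.
result = pool_random_effects([0.62, 0.70, 0.66], [0.05, 0.04, 0.06])
```

When the studies agree closely, tau² is estimated as zero and the random effects result coincides with the fixed-effect (inverse-variance) result.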
Assessment of methodological quality
EK and either LG or LR independently undertook the risk of bias (ROB) assessment using the Quality in Prognostic Studies (QUIPS) tool [28]. This tool was developed specifically for assessing bias in studies of prognostic factors. Items across six domains (study participation, study attrition, prognostic factor measurement, outcome measurement, study confounding, and statistical analysis and reporting) were considered individually for each study. A guideline was used to classify each item as ‘high’, ‘moderate’ or ‘low’ risk of bias. If insufficient information was available to assess potential bias, that domain was rated ‘unclear’. An overall ROB was established for each individual study according to Bruls et al. [29]. The overall ROB for a study was rated as ‘low’ (indicating a high quality study) when all or most (4–6) of the six bias domains were fulfilled, with each domain rated as ‘low’ or ‘moderate’. The overall ROB was rated as ‘high’ (indicating a low quality study) when one or more of the six bias domains were rated as ‘high’ or ‘unclear’. Disagreements in ratings were discussed and, if not resolved, a third reviewer (SH) was consulted. Studies rated as having a ‘low’ risk of bias using the QUIPS tool were considered ‘high quality’.
Results
Study selection
Our initial search identified 1557 studies for potential inclusion, from which 110 full text articles were retrieved. Twenty-one studies satisfied all criteria for inclusion. Three further studies were identified through repeat searching. The authors of 13 studies were contacted to request data pertaining specifically to the recent onset participants. Unsuccessful attempts to obtain these data meant that six studies were excluded [30–35]. Eighteen studies were finally included in this review.
Details of studies accepted and rejected during the selection process are illustrated in Fig. 1. Table 2 details the studies that were excluded based on the participants’ pain duration at baseline screening. Key study characteristics and results are summarised in Table 3 (at the end of the manuscript).
Table 2.
Reference | Prognostic screening instruments | Reason for exclusion |
---|---|---|
Bergstrom et al. (2011) [62] | MPI-S | Mixed cohort;ᵇ authors did not differentiate an acute/subacute group |
Bernstein et al. (1994) [63] | SCL-90-R | Chronic pain cohort (pain > 3 months) |
Morso et al. (2011) [64] | PainDETECT questionnaire | Chronic pain cohort (pain duration 3–12 months) |
Late exclusions:ᵃ | | |
Fischer et al. (2014) [30] | HKF-R10 | Mixed cohort;ᵇ authors did not differentiate an acute/subacute group |
Hurley et al. (2001) [31] | ALBPSQ | Mixed cohortᵇ,ᶜ |
Linton et al. (2011) [32] | OMPSQ (Short Form) | Mixed cohortᵇ,ᶜ |
Morso et al. (2013) [65] | SBT | Mixed cohortᵇ,ᶜ |
Morso et al. (2014) [33] | SBT | Mixed cohortᵇ,ᶜ |
Cats-Baril et al. (1991) [35] | VDPQ | Mixed cohort;ᵇ unable to contact authors to request data from recent onset participants |
ᵃ Study authors were contacted (or contact attempts were made) prior to study exclusion
ᵇ Combination of acute/subacute/chronic pain participants
ᶜ Authors unable to provide data for ‘recent-onset’ participants
MPI-S Multidimensional Pain Inventory (Swedish version), SCL-90-R Symptom Checklist 90 Revised, HKF-R10 Heidelberg Short Early Risk Assessment Questionnaire, ALBPSQ Acute Low Back Pain Screening Questionnaire, OMPSQ Orebro Musculoskeletal Pain Screening Questionnaire, SBT STarT Back Screening Tool, VDPQ Vermont Disability Prediction Questionnaire
Table 3.
Reference | Country of investigation and clinical setting | Definition of poor outcome | N at baseline, (n at follow-up, % at follow-up) | Discrimination (AUC (95% confidence interval)) |
---|---|---|---|---|
STarT Back Screening Tool | | | | |
Beneciuk et al. 2012 [43] | USA; outpatient physiotherapy clinics | At 6 months: ᵃPain NRS score ≥ 3; ᵃDisability (ODI score ≥ 30%) | 73 (55, 75.3%) | ᵃPain 0.61 (0.45–0.76); ᵃDisability 0.75 (0.60–0.90) |
Field & Newell, 2012 [44] | UK; chiropractic clinics | At 90 days: ᵃPain NRS score ≥ 3 | 477 (151, 31.7%) | ᵃPain 0.597 (0.499–0.694) |
Hill et al. 2008 [46] | UK; general practice clinics | 6 months: RMDQ score ≥ 7; ᵃPain NRS score ≥ 3; ᵃDisability ≥ 30% RMDQ | 177 at follow-up (N at baseline not specified) | ᵃPain 0.70 (0.62–0.88); ᵃDisability 0.81 (0.75–0.88) |
Kongsted et al. 2015 [38] | Denmark; chiropractic clinics | 3 months: ᵃPain NRS score ≥ 3; ᵃDisability ≥ 30% RMDQ | 754 (604, 80.1%) | ᵃPain 0.56 (0.49–0.60); ᵃDisability 0.67 (0.62–0.73) |
Newell et al. 2014 [45] | UK; chiropractic clinics | At 90 days: ᵃPain NRS score ≥ 3 | 284 (192, 67.6%) | ᵃPain 0.59 (0.48–0.69) |
Orebro Musculoskeletal Pain Screening Questionnaire; Acute Low Back Pain Screening Questionnaire | | | | |
Gabel et al. 2011 [39] | Australia; physiotherapy outpatient clinics | At 6 months: Functional status ≥ 10%; Problem severity > 1; Absenteeism > 0 days; Long-term absenteeism > 28 days; ᵃPain NRS score ≥ 3; ᵃDisability (SFI score ≥ 30%) | 66 (58, 87.9%) (OMPSQ - Original) | Functional status 0.88 (0.78–0.99); Problem severity 0.85 (0.72–0.97); Absenteeism 0.86 (0.76–0.96); Long-term absenteeism 0.85 (0.73–0.96); ᵃPain 0.84 (0.71–0.97); ᵃDisability 0.80 (0.67–0.92) |
Grotle et al. 2006 [25] | Norway; general practitioner/chiropractor/physiotherapy clinics (27% recruited through advertisement) | At 6 & 12 months: Pain NRS score ≥ 3; Disability (RMDQ score > 4); Sick leave (> 30 days) | 123 (112, 91.1%) | Pain 0.62 (0.51–0.73); Disability 0.68 (0.56–0.80); Sick leave 0.80 (0.66–0.93) |
Heneweer et al. 2007 [66] | Netherlands; physiotherapy clinics | Not recovered at 12 weeks; ᵃPain NRS score ≥ 3; ᵃDisability (QBPDS ≥ 30%) | 66 (56, 84.8%) | Non-recovery 0.64 (0.50–0.79); ᵃPain 0.64 (0.50–0.78); ᵃDisability 0.67 (0.54–0.80) |
Jellema et al. 2007 [52] | Netherlands; general practice clinics | 12 months: score of ‘slightly improved’ or worse at two or more follow-up time points | 314 (296, 94.3%) | Non-recovery 0.61 (0.54–0.67) |
Law et al. 2013 [37] | China; physiotherapy clinics in public hospitals | 12 months post discharge: Failure to return to work; Prolonged sick leave (> 30 days) | 241 (220, 91.3%) | Return to work 0.69 (0.62–0.76); Prolonged sick leave 0.71 (0.64–0.78) |
Nonclercq et al. 2012 [42] | Belgium; emergency facility or outpatient clinic | At 6 months: Pain index score > 16; ODI ≥ 20%; Functional index < 45; Work absence > 30 days; ᵃPain NRS score ≥ 3; ᵃDisability ≥ 30% ODI | 91 (73, 80%) | Pain 0.73 (no confidence intervals); Functional index 0.79 (no confidence intervals); Absenteeism 0.83 (standard error 0.71); Disability 0.75 (no confidence intervals); ᵃPain 0.70 (standard error 0.66); ᵃDisability 0.72 (standard error 0.86) |
Schmidt et al. 2016 [48] | Germany; general practice clinics | 6 months: Disability ≥ 4/11 (dichotomised mean response to three GCPS disability items) | 181 (112, 62%) | Disability (OMPSQ scale sum score) 0.79 (0.67–0.90); Disability (OMPSQ item sum score) 0.77 (0.66–0.87) |
Vermont Disability Prediction Questionnaire | | | | |
Hazard et al. 1996 [49] | USA; Vermont Department of Labour and Industry database | Not returned to work at 3 months | 166 (163, 98%) | Return to work 0.92 (no confidence interval or standard error reported) |
Hazard et al. 1997 [50] | USA; Vermont Department of Labour and Industry database | Not returned to work at 3 months | 304 (268, 88.2%) | Return to work 0.78 (no confidence interval or standard error reported) |
Absenteeism Screening Questionnaire | | | | |
Truchon et al. 2012 [51] | Canada; Quebec Workers Compensation Board database | 12 months: Absenteeism > 182 cumulative days | 535 (310, 58%) | Absenteeism 0.73 (no confidence intervals or standard error reported) |
Chronic Pain Risk Score | | | | |
Turner et al. 2013 [61] | USA; primary care | 4 months: Pain grades 3 & 4; ᵃPain NRS ≥ 3 | 458 (425, 92.8%) | Pain grades 3 & 4 0.67 (0.59–0.72); ᵃPain 0.67 (0.59–0.72) |
Back Disability Risk Questionnaire | | | | |
Shaw et al. 2009 [40] | USA; occupational health clinics | 3 months: Pain ≥ 5; Disability ≥ 50%; ᵃPain NRS score ≥ 3; ᵃDisability ≥ 30% RMDQ | 568 (519, 91.4%) | ᵃPain 0.61 (0.56–0.66); ᵃDisability 0.66 (0.62–0.70) |
Hancock Clinical Prediction Rule | | | | |
Williams et al. 2014 [41] | Australia; general practice clinics, pharmacists or physiotherapy clinics | 3 months: No sustained recovery (0 or 1/10 on a NRS for 7 consecutive days); ᵃPain NRS ≥ 3 | 956 (937, 82%) | Sustained recovery 0.60 (0.56–0.64); ᵃPain 0.62 (0.60–0.65) |
ᵃ Unpublished data for ‘recent onset’ participants, provided on request
NRS numeric rating scale, ODI Oswestry Disability Index, RMDQ Roland Morris Disability Questionnaire, SFI Spine Functional Index, QBPDS Quebec Back Pain Disability Scale, GCPS Graded Chronic Pain Scale, OMPSQ Orebro Musculoskeletal Pain Screening Questionnaire
Study characteristics
Included studies were conducted between 1996 and 2015, in 10 different countries: USA (n = 5), UK (n = 3), Australia (n = 2), Netherlands (n = 2), and one in each of Norway, Denmark, China, Belgium, Germany, and Canada (Table 3). Seventeen studies included in this review were undertaken in primary healthcare settings, defined, according to the World Health Organization Declaration of Alma-Ata (1978), as involving the individual’s “first level of contact” with “promotive, preventive, curative and rehabilitative services” ([36] p. 2). One investigation [37] was conducted in a hospital outpatient physiotherapy setting, considered ‘secondary care’. Five studies included ‘working adult’ populations; 13 studies included ‘general adult’ participants (some of whom were employed). Of those 13 studies, three were undertaken in physiotherapy settings, three in chiropractic clinics, four in general practice settings, one in a hospital emergency/outpatient department and two in combinations of these healthcare settings (Table 3).
PSIs
Seven instruments satisfied our criteria for classification as a PSI: the SBT (five studies), the Orebro Musculoskeletal Pain Screening Questionnaire (OMPSQ; seven studies), the Vermont Disability Prediction Questionnaire (VDPQ; two studies), the Back Disability Risk Questionnaire (BDRQ; one study), the Absenteeism Screening Questionnaire (ASQ; one study), the Chronic Pain Risk Score (CPRS; one study), and the Hancock Clinical Prediction Rule (HCPR; one study). The PSIs are summarised in Table 4.
Table 4.
Instrument | Summary of instrument | Scoring method | Cut-off scores/subgrouping |
---|---|---|---|
STarT Back Tool (SBT) [46] | 9-item, self-report questionnaire; items screen for predictors of persistent disabling back pain and include radiating leg pain, pain elsewhere, disability (2 items), fear, anxiety, pessimistic patient expectations, low mood and how much the patient is bothered by their pain; all 9 items use a response format of ‘agree’ or ‘disagree’, with the exception of the bothersomeness item, which uses a Likert scale. | Two scores are produced – an overall score and a distress (psychosocial) subscale | Total scores of 3 or less = low risk If total score is 4 or more: - Those with psychosocial subscale scores of 3 or less = medium risk - Those with psychosocial subscale scores of 4 or more = high risk |
Orebro Musculoskeletal Pain Screening Questionnaire (OMPSQ) [67] and Acute Low Back Pain Screening Questionnaire (ALBPSQ) [68] | 25-item, self-report questionnaires; items screen for six factors: self-perceived function, pain experience, fear-avoidance beliefs, distress, return to work expectancy, and pain coping | Total score calculated from 21 items and can range from 2 to 210 points; higher values indicate more psychosocial problems | A cut-off of 105 proposed for indicating those ‘at risk’ of persisting problems |
OMPSQ (Short form) [32] | 10-item questionnaire covering five domains: self-perceived function, pain experience, fear-avoidance beliefs, distress, and return to work expectancy; demonstrated to have similar discriminative ability to original OMPSQ | Scores range from 0 to 100 (higher scores indicate higher risk) | A cut-off of 50 recommended to indicate those ‘at risk’ of persisting pain related disability |
Vermont Disability Prediction Questionnaire (VDPQ) [49] | 11-item self-report questionnaire; assesses perceptions of who was to blame for the injury, relationships with co-workers and employer, confidence that they will be working in 6 months, current work status, job demands, availability of job modifications, length of time employed, and job satisfaction | Hand scored (maximum score of 23) | No optimal cut-off recommended |
Back Disability Risk Questionnaire (BDRQ) [40] | 16-item self-report questionnaire; items include demographics, health ratings, workplace concerns, pain severity, mood, and expectations for recovery | Sum score calculated | No optimal cut-off recommended |
Absenteeism screening questionnaire (ASQ) [51] | 16-item, self-report questionnaire; assesses potential occupational back pain disability and risk factors including: work factors (3), physical health (2), supervisor response (1), pain (2), mood (2), wellness/job satisfaction (3), and expectations for recovery (1); mixture of nominal, ordinal and interval scale response options | ‘Flag’ related items are summed and level of risk categorised as low, medium or high | 0–1 flag items = low risk 2–3 items = medium risk 4–9 items = high risk |
Chronic Pain Risk Score (CPRS) [61] | Three graded chronic pain scale ratings of pain intensity, three ratings of activity interference, the number of activity limitation days, the number of days with pain in the past 6 months, depressive symptoms, the number of painful sites | Maximum score of 28 (higher scores indicate greater risk) | No optimal cut-off recommended |
Hancock Clinical Prediction Rule (HCPR) [69] | 3-item self-report questionnaire, items assess baseline pain (≤ 7/10), pain duration (≤ 5 days) and number of previous painful episodes (≤ 1) | Status on the prediction rule determined by calculating the number of predictors of recovery present | Risk classification based on the number of predictors of recovery present (0–3) |
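The cut-off rules in Table 4 for the SBT, ASQ and HCPR are simple enough to express directly. The sketch below is illustrative only: the function names are hypothetical, item-level scoring (which responses count as positive or as a ‘flag’) is not reproduced, and the inputs are assumed to be already-summed scores.

```python
def sbt_risk(total_score: int, psychosocial_subscale: int) -> str:
    """SBT subgroup from the overall score and the psychosocial subscale."""
    if not 0 <= psychosocial_subscale <= total_score <= 9:
        raise ValueError("expected 0 <= subscale <= total <= 9")
    if total_score <= 3:
        return "low"
    return "medium" if psychosocial_subscale <= 3 else "high"


def asq_risk(flag_items: int) -> str:
    """ASQ risk category from the number of 'flag' items (0-9)."""
    if not 0 <= flag_items <= 9:
        raise ValueError("expected between 0 and 9 flag items")
    if flag_items <= 1:
        return "low"
    return "medium" if flag_items <= 3 else "high"


def hcpr_predictors(baseline_pain: int, duration_days: int,
                    previous_episodes: int) -> int:
    """Number of HCPR predictors of recovery present (0-3)."""
    return sum([
        baseline_pain <= 7,      # baseline pain of 7/10 or less
        duration_days <= 5,      # current episode of 5 days or less
        previous_episodes <= 1,  # no more than one previous episode
    ])
```

For example, under these rules an SBT total score of 5 with a psychosocial subscale score of 3 would be classified as medium risk.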
Outcomes
Six studies assessed pain intensity (using a NRS) as a primary outcome and a further eight studies assessed pain as a secondary outcome. Measures of work absenteeism or self-reported recovery ratings were reported as primary outcomes in four studies each. Disability was assessed as a primary outcome in five studies and as a secondary outcome in a further five studies. Definitions of ‘poor outcome’ (after an episode of LBP) were highly variable. For studies identifying pain as the primary outcome, poor outcome was variably defined as NRS scores of > 0 [38], > 1 [39], > 2 [25], and > 4 [40]; one study [41] defined sustained recovery from LBP by NRS scores of 0 or 1 for 7 consecutive days; one study [42] used a composite pain index.
Meta-analysis
SBT
Discrimination of pain outcomes
The five studies [38, 43–46] investigating the SBT used pain as an outcome measure. All authors provided raw data for statistical analysis or followed guidance for analysis of their recent onset data. Consistent classification of ‘poor outcome’ allowed pooling of AUC values (pooled AUC = 0.59 (0.55–0.63); Table 5). Discriminative performance was ‘non-informative’. There was no evidence of statistical heterogeneity (I² = 0.00%, P = 0.47).
Table 5.
PSI | Outcome | Studies (Total N) | Heterogeneity I² (P) | Pooled AUC value | 95% confidence interval |
---|---|---|---|---|---|
SBT | Pain (≥ 3) | 5 studies (1153) | 0.00% (0.47) | 0.59 | 0.55–0.63 |
SBT | Disability (≥ 30%) | 3 studies (821) | 80.95% (0.01) | 0.74 | 0.66–0.82 |
OMPSQ | Pain (≥ 3) | 4 studies (360) | 40.95% (0.17) | 0.69 | 0.62–0.76 |
OMPSQ | Disability (≥ 30%) | 3 studies (512) | 0.00% (0.42) | 0.75 | 0.69–0.82 |
OMPSQ | 6 month absenteeism (> 28 days) | 3 studies (243) | 0.00% (0.86) | 0.83 | 0.75–0.90 |
OMPSQ | 12 month absenteeism (> 30 days) | 2 studies (440) | 0.00% (0.90) | 0.71 | 0.64–0.78 |
AUC Area Under the Curve, SBT STarT Back Tool, OMPSQ Orebro Musculoskeletal Pain Screening Questionnaire
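The pooling summarised in Table 5 can be illustrated with a short sketch. It assumes that each study supplies an AUC with a symmetric 95% confidence interval, recovers a standard error from the interval width, and applies a DerSimonian-Laird random-effects model. This section does not state which pooling model was used, so the model choice is an assumption, and the three input studies are hypothetical placeholders rather than data from the included studies.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Approximate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)

def pool_auc(studies, z=1.96):
    """Pool AUC values with a DerSimonian-Laird random-effects model.

    studies: list of (auc, ci_lower, ci_upper) tuples (hypothetical input format).
    Returns the pooled AUC, its 95% CI, Cochran's Q and I^2 (in percent).
    """
    aucs = [s[0] for s in studies]
    variances = [se_from_ci(s[1], s[2], z) ** 2 for s in studies]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * a for wi, a in zip(w, aucs)) / sum(w)
    q = sum(wi * (a - fixed) ** 2 for wi, a in zip(w, aucs))
    df = len(studies) - 1

    # I^2: share of total variability attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Between-study variance (tau^2), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * a for wi, a in zip(w_re, aucs)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "auc": pooled,
        "lower": pooled - z * se,
        "upper": pooled + z * se,
        "q": q,
        "i2": i2,
    }

# Hypothetical studies: (AUC, lower 95% limit, upper 95% limit).
hypothetical_studies = [(0.58, 0.50, 0.66), (0.61, 0.53, 0.69), (0.59, 0.52, 0.66)]
result = pool_auc(hypothetical_studies)
print(f"pooled AUC = {result['auc']:.2f} "
      f"({result['lower']:.2f}-{result['upper']:.2f}), I2 = {result['i2']:.2f}%")
```

When the study estimates agree closely, as in this hypothetical input, Q falls below its degrees of freedom, I² is truncated to zero, and the random-effects result coincides with the fixed-effect result.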
Discrimination of disability outcomes
Three SBT studies [38, 43, 46] included disability as an outcome measure. ‘Poor outcome’ (in disability terms) was defined consistently. The pooled AUC value of 0.74 (0.66–0.82) indicated ‘acceptable’ [23, 24] discrimination. There was substantial statistical heterogeneity (I² = 80.95%, P = 0.005). To explore the source of heterogeneity, the two studies [38, 46] whose confidence intervals did not overlap were each removed in turn. Heterogeneity was no longer significant in either analysis (P > 0.05), although the pooled AUC value changed in each case (Table 6).
Table 6.
Analysis | AUC | 95% confidence interval | I² (P) |
---|---|---|---|
All studies included | 0.74 | 0.66–0.82 | 80.95% (0.01) |
Hill et al. (2008) [46] removed | 0.68 | 0.63–0.73 | 0.00% (0.37) |
Kongsted et al. (2015) [38] removed | 0.80 | 0.74–0.86 | 0.00% (0.42) |
AUC Area Under the Curve
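The sensitivity analysis in Table 6 removes one study at a time and re-estimates the pooled AUC and heterogeneity. A minimal sketch of that leave-one-out procedure follows, assuming simple inverse-variance pooling with Cochran's Q and I². The study names, AUC values and standard errors are hypothetical and were chosen only so that the outer two studies sit apart; they are not the values from the included studies.

```python
def pooled_with_heterogeneity(studies):
    """Inverse-variance pooled estimate with Cochran's Q and I^2 (percent).

    studies: dict mapping a study name to (auc, standard_error).
    """
    weights = {name: 1 / se ** 2 for name, (_, se) in studies.items()}
    total = sum(weights.values())
    pooled = sum(weights[name] * auc for name, (auc, _) in studies.items()) / total
    q = sum(weights[name] * (auc - pooled) ** 2 for name, (auc, _) in studies.items())
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2

def leave_one_out(studies):
    """Re-run the pooling with each study removed in turn."""
    results = {}
    for removed in studies:
        subset = {name: v for name, v in studies.items() if name != removed}
        results[removed] = pooled_with_heterogeneity(subset)
    return results

# Hypothetical inputs: name -> (AUC, standard error).
hypothetical = {
    "study_A": (0.66, 0.03),
    "study_B": (0.72, 0.04),
    "study_C": (0.82, 0.03),
}

full = pooled_with_heterogeneity(hypothetical)
loo = leave_one_out(hypothetical)
print(f"all studies: AUC = {full[0]:.2f}, I2 = {full[2]:.1f}%")
for removed, (auc, q, i2) in loo.items():
    print(f"{removed} removed: AUC = {auc:.2f}, I2 = {i2:.1f}%")
```

In this hypothetical input, removing either outlying study lowers I² and shifts the pooled AUC toward the remaining studies, which is the pattern a leave-one-out analysis is designed to expose.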
OMPSQ
Discrimination of pain outcomes
Four of the seven studies [25, 39, 42, 47] investigating the OMPSQ included pain as an outcome measure. Consistent classification of ‘poor outcome’ was achieved, allowing pooling of all AUC values (pooled AUC = 0.69 (0.62–0.76); Table 5). Discriminative performance was ‘poor’. Statistical heterogeneity was moderate but not statistically significant (I² = 40.95%, P = 0.17).
Discrimination of disability outcomes
Five OMPSQ studies included disability as an outcome measure. Three studies classified ‘poor outcome’ as ≥ 30% disability [39, 42, 47], one used ≥ 20% [25] and one used ≥ 40% [48]. Despite different definitions, the results were pooled and post-hoc sensitivity analysis confirmed this to be acceptable (Table 7). Discriminative performance was ‘acceptable’ [23, 24] (pooled AUC = 0.75 (0.69–0.82)). There was no evidence of statistical heterogeneity (I² = 0.00%, P = 0.64).
Table 7.
Analysis | AUC | 95% confidence interval | I² (P) |
---|---|---|---|
All studies included | 0.75 | 0.69–0.82 | 0.00% (0.64) |
Schmidt et al. (2016) [48] removed (≥ 40%) | 0.73 | 0.65–0.81 | 0.00% (0.60) |
Grotle et al. (2006) [25] removed (≥ 20%) | 0.75 | 0.69–0.82 | 0.00% (0.50) |
Schmidt et al. (2016) [48] and Grotle et al. (2006) [25] removed | 0.74 | 0.65–0.82 | 0.00% (0.42) |
AUC Area Under the Curve
Discrimination of absenteeism outcomes
The OMPSQ offers ‘excellent’ discrimination of prolonged absenteeism at 6 months (pooled AUC from three studies [25, 39, 42] = 0.83 (0.75–0.90)) and ‘acceptable’ discrimination of prolonged absenteeism at 12 months (pooled AUC from two studies [25, 37] = 0.71 (0.64–0.78)). There was no statistical heterogeneity in either analysis (I² = 0.00% for both; P = 0.86 at 6 months and P = 0.90 at 12 months; Table 5).
All instruments
Discrimination of pain outcomes
Twelve investigations in primary care settings (using five different PSIs) reported pain outcomes at medium term follow-up. Poor outcome was consistently defined as NRS scores ≥ 3. Data were pooled for studies using the SBT and OMPSQ. Meta-analysis enabled visual comparison of the discriminative performances of all instruments (Fig. 2). The pooled performance was ‘poor’ (pooled AUC = 0.63 (0.60–0.65)). The I² of 51.16% may represent moderate statistical heterogeneity (P = 0.08).
Discrimination of disability outcomes
Nine studies (involving three PSIs) reported disability outcomes at medium term follow-up. Poor outcome was consistently defined as ≥ 30% disabled, with the exception of two of the OMPSQ studies as noted previously (Grotle et al. [25] ≥ 20% and Schmidt et al. [48] ≥ 40%).
Data were pooled for studies using the SBT and the OMPSQ. Meta-analysis enabled visual comparison of the discriminative performances of all instruments (Fig. 3). The pooled performance was ‘acceptable’ (pooled AUC = 0.71 (0.66–0.76)) and indicated substantial heterogeneity (I² = 69.89%, P = 0.04). Graphical representation suggests that the SBT and the OMPSQ out-performed the BDRQ. Heterogeneity was resolved with removal of the BDRQ study: pooled AUC = 0.75 (0.70–0.80), I² = 0.00%, P = 0.98.
Discrimination of absenteeism outcomes
Variability in follow-up time-points and outcome measures used in studies with absenteeism outcomes [25, 39, 40, 42, 49–51] did not allow comparisons to be made between instruments.
Studies not included in the meta-analysis
The following four studies were not included in the quantitative meta-analysis because they used outcome measures dissimilar to those used in the other included studies.
Jellema et al. 2007 [52] – OMPSQ
This study investigated the use of the OMPSQ in a general adult population for prediction of non-recovery at 12 months post-screening (defined as a score of slightly improved or worse on a 7-point Likert scale, at two or more follow-up time points). ‘Good’ instrument calibration was reported (i.e. agreement between predicted and observed risks); however, discriminative ability for predicting long-term global recovery was poor (AUC = 0.61 (0.54–0.67)).
Hazard et al. 1996 [49] & 1997 [50] – VDPQ
These studies of prognostic screening indicated the potential utility of the VDPQ to predict return to work at 3 months post low back injury. The initial validation study [49] revealed ‘outstanding’ discriminative performance (AUC = 0.92, no confidence intervals obtained) and the subsequent study [50] suggested it was ‘acceptable’ (AUC = 0.78; no confidence intervals obtained).
Truchon et al. (2012) [51] – ASQ
This study suggested ‘acceptable’ discrimination of long-term absenteeism (>182 cumulative days) at 12-month follow-up using the ASQ (AUC = 0.73; no confidence intervals obtained).
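The qualitative labels attached to AUC values throughout these results can be expressed as a simple lookup. The cut points below are an assumption inferred from how the labels are applied in this section (for example, 0.59 is ‘non-informative’, 0.69 is ‘poor’, 0.71 is ‘acceptable’, 0.83 is ‘excellent’ and 0.92 is ‘outstanding’); the handling of values that fall exactly on a boundary is also assumed, and the function name is hypothetical.

```python
def discrimination_label(auc):
    """Map an AUC value to a qualitative label.

    Cut points are assumed from the way labels are used in this section,
    not taken from a stated definition.
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc < 0.6:
        return "non-informative"
    if auc < 0.7:
        return "poor"
    if auc < 0.8:
        return "acceptable"
    if auc < 0.9:
        return "excellent"
    return "outstanding"

# AUC values reported in this section, paired with the instrument and outcome.
reported = {
    "SBT, pain": 0.59,
    "OMPSQ, pain": 0.69,
    "SBT, disability": 0.74,
    "OMPSQ, 6 month absenteeism": 0.83,
    "VDPQ, initial validation": 0.92,
}
for name, auc in reported.items():
    print(f"{name}: AUC = {auc:.2f} -> {discrimination_label(auc)}")
```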
Methodologic quality
Sixteen of the 18 included studies were assessed to have a low risk of bias and were thereby regarded to be of high quality. Two studies were regarded to have a high risk of bias primarily due to a high rate of loss to follow-up (> 40%). The assessment of individual study quality is reported in Table 8 (at the end of the manuscript).
Table 8.
Study | A. Study participation | B. Study attrition | C. Prognostic factor measurement | D. Outcome measurement | E. Study confounding | F. Statistical analysis and reporting | Overall assessment of risk of biasᵃ |
---|---|---|---|---|---|---|---|
Beneciuk et al. 2012 [43] | Low | Moderate | Moderate | Low | Low | Low | Low |
Field & Newell 2012 [44] | Moderate | Moderate | Low | Low | Low | Low | Low |
Gabel et al. 2011 [39] | Moderate | Low | Moderate | Low | Low | Low | Low |
Grotle et al. 2006 [25] | Moderate | Low | Moderate | Low | Low | Moderate | Low |
Hazard et al. 1996 [49] | Moderate | Low | Low | Low | Low | Moderate | Low |
Hazard et al. 1997 [50] | Moderate | Low | Low | Low | Low | Low | Low |
Heneweer et al. 2007 [66] | Moderate | Low | Low | Low | Low | Low | Low |
Hill et al. 2008 [46] | Moderate | Moderate | Low | Low | Low | Low | Low |
Jellema et al. 2007 [52] | Low | Low | Low | Moderate | Low | Low | Low |
Kongsted et al. 2015 [38] | Low | Low | Low | Low | Low | Low | Low |
Law et al. 2013 [37] | Low | Moderate | Low | Low | Moderate | Low | Low |
Newell et al. 2014 [45] | Low | High | Moderate | Low | Low | Low | High |
Nonclercq et al. 2010 [42] | Moderate | Low | Low | Low | Low | Low | Low |
Shaw et al. 2009 [40] | Low | Low | Low | Low | Low | Low | Low |
Schmidt et al. 2016 [48] | Moderate | Moderate | Low | Low | Low | Low | Low |
Truchon et al. 2012 [51] | Moderate | High | Low | Moderate | Low | Moderate | High |
Turner et al. 2013 [61] | Moderate | Low | Low | Low | Low | Low | Low |
Williams et al. 2014 [41] | Low | Low | Low | Low | Low | Low | Low |
ᵃ The overall assessment of risk of bias for a study was rated as ‘low’ when all or most (4–6) of the six bias domains were fulfilled, with each domain rated as ‘low’ or ‘moderate’. The overall risk of bias was rated as ‘high’ when one or more of the six bias domains were rated as ‘high’ or ‘unclear’. Studies with low overall risk of bias were considered high quality.
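The rating rule in the Table 8 footnote can be sketched as a small function. This is one reading of the footnote, under the assumption that a domain counts as fulfilled when it is rated ‘low’ or ‘moderate’, so that any ‘high’ or ‘unclear’ domain yields an overall rating of ‘high’ and all other studies are rated ‘low’. The function and constant names are hypothetical; the example rows are taken from Table 8.

```python
HIGH_RISK_RATINGS = {"high", "unclear"}
ACCEPTABLE_RATINGS = {"low", "moderate"}

def overall_risk_of_bias(domain_ratings):
    """Return the overall risk of bias from the six domain ratings (A to F).

    Assumed reading of the footnote: any domain rated 'high' or 'unclear'
    gives an overall 'high'; otherwise every domain is 'low' or 'moderate'
    and the overall rating is 'low'.
    """
    ratings = [r.strip().lower() for r in domain_ratings]
    if len(ratings) != 6:
        raise ValueError("expected ratings for the six bias domains A-F")
    unknown = set(ratings) - HIGH_RISK_RATINGS - ACCEPTABLE_RATINGS
    if unknown:
        raise ValueError(f"unrecognised rating(s): {sorted(unknown)}")
    if any(r in HIGH_RISK_RATINGS for r in ratings):
        return "high"
    return "low"

# Domain ratings copied from three rows of Table 8.
table_rows = {
    "Grotle et al. 2006": ["Moderate", "Low", "Moderate", "Low", "Low", "Moderate"],
    "Newell et al. 2014": ["Low", "High", "Moderate", "Low", "Low", "Low"],
    "Truchon et al. 2012": ["Moderate", "High", "Low", "Moderate", "Low", "Moderate"],
}
for study, ratings in table_rows.items():
    print(f"{study}: {overall_risk_of_bias(ratings)}")
```

Applied to these rows, the sketch reproduces the overall ratings shown in the table: only the two studies with a ‘high’ attrition rating are classed as high risk of bias.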
Discussion
Based on high quality prognostic studies, this systematic review provides evidence that LBP PSIs perform poorly at assigning higher risk scores to individuals who develop chronic pain than to those who do not. Clinicians can expect that a PSI administered within the first 3 months of an episode of LBP will correctly classify a patient as high or low risk of developing chronic pain between 60% and 70% of the time. PSIs perform somewhat better at discriminating between patients who will and will not have persisting disability (70–80% probability of correct classification) and appear most successful (> 80% probability) at discriminating between patients who will or will not return to work successfully.
This review also informs about the prognostic performance of specific instruments. The OMPSQ and VDPQ appear to perform well at predicting return to work outcomes and the SBT and the OMPSQ have modest predictive value for disability outcomes, but the included instruments demonstrate little value for informing about likely pain outcomes. Problems associated with using a screening instrument for a purpose other than intended (i.e. based on interest in a specifically defined outcome, at a specific time point) have been introduced in this paper. The instruments included in this study were designed to predict outcomes at time points varying between 3 and 6 months. Two were designed to predict work absenteeism (VDPQ, ASQ), one to predict status on a chronic pain scale (CPRS), one to predict LBP recovery (HCPR), and one to predict functional limitation (SBT). Only two instruments (BDRQ, OMPSQ) were developed to predict more than one clinical outcome. This may have played a role in the poor performance of several of the instruments when evaluated according to the uniform methods we employed.
While our classification of the SBT as a PSI may be arguable, we considered that its clinical use as a prognostic instrument warranted its inclusion in this review. The NICE guidelines [15] recommend that clinicians use tools such as the SBT to identify patients at risk of poor outcome and tailor their management accordingly. Our findings suggest, however, that there is need for caution if the SBT is administered only for the purpose of predicting the risk of poor outcome. As a ‘stratified care tool’ with matched treatment pathways, the merits of the SBT have been reported elsewhere [2, 53].
While it is ideal that stratified care tools such as the SBT have high predictive validity, this may not be realistic if the approach is to include only modifiable items during instrument development. Additionally, screening instruments designed for clinical use must be brief and simple to score. A trade-off of these factors may be reduced discriminative performance. It can be noted that the discriminative performance of the SBT is better in a UK General Practice setting than in Physiotherapy or Chiropractic settings – a finding consistent with the understanding that the usefulness of a screening instrument is highly setting-specific [44, 54] and optimal in the cohort for which it was developed [55]. In contrast, however, the ‘excellent’ performance of the OMPSQ for discriminating workers at risk of prolonged absenteeism regardless of country and across varied clinical settings suggests the wider utility of this PSI.
This study was prospectively registered with full adherence to the published protocol. We used the QUIPS methodological appraisal tool [28], a valid and reliable tool for evaluating prognostic studies. The general quality of included studies was assessed to be high with the exception of two studies that had high loss to follow-up [45, 51]. To our knowledge, this is the first quantitative synthesis and analysis of the discriminative performance of PSIs. All previous systematic reviews of PSIs have been unable to conduct meta-analyses of predictive accuracy because of clinical heterogeneity [9, 17, 56, 57]. It is also the first review to include studies testing the SBT. Additional data obtained from study authors facilitated data pooling from similar adult populations, with consistent follow-up time points and identical classifications of poor outcome. Pooling data from instruments that were designed with different purposes in mind may, however, limit the strength of the conclusions that can be drawn from this study.
ROC analyses are recommended for discriminative accuracy studies [58], but come with some limitations. A ROC analysis requires dichotomisation of outcomes, which means that the definition of ‘poor outcome’ can affect findings. In the absence of a general consensus on the definition of ‘poor outcome’, we followed previous studies and recommendations [24, 27, 59]. The selected cut-off score of ≥ 3/10 on a pain NRS was based on the understanding that many people with pain scores of < 3 consider themselves to be ‘recovered’ [1]. Boonstra et al. [60] report that people with pain NRS scores of ≤ 3 describe themselves as experiencing only ‘mild’ symptoms. We classified participants who were ‘not recovered’ at follow-up (or those experiencing more than mild symptoms) as having a ‘poor outcome’. Since the outcome classification can influence discriminative performance, it would have been interesting to evaluate alternative cut-off points for poor outcome for each of the outcomes considered; this could be considered in further research. The definitions we applied were used by several included studies [25, 39, 42, 61]. In addition, AUC values (derived from the ROC analysis) are a function of sensitivity and specificity – both of which are influenced by cohort characteristics (e.g. symptom severity and psychological profile). Variations are therefore expected for the same instrument among different populations.
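The dependence of the AUC on the chosen definition of ‘poor outcome’ can be shown with a small sketch. It dichotomises follow-up pain scores at a chosen NRS threshold and computes the AUC as the proportion of (poor outcome, good outcome) patient pairs in which the patient with the poor outcome had the higher baseline risk score, counting ties as one half. The baseline scores and follow-up pain scores are hypothetical, and the function names are illustrative only.

```python
def dichotomise(follow_up_scores, threshold):
    """Label each patient as poor outcome (True) when the score is >= threshold."""
    return [score >= threshold for score in follow_up_scores]

def auc_concordance(risk_scores, poor_outcome):
    """AUC as the probability that a randomly chosen patient with a poor outcome
    has a higher risk score than a randomly chosen patient with a good outcome."""
    poor = [r for r, y in zip(risk_scores, poor_outcome) if y]
    good = [r for r, y in zip(risk_scores, poor_outcome) if not y]
    if not poor or not good:
        raise ValueError("AUC needs at least one patient in each outcome group")
    wins = sum(
        1.0 if p > g else 0.5 if p == g else 0.0
        for p in poor
        for g in good
    )
    return wins / (len(poor) * len(good))

# Hypothetical cohort: screening score at baseline and pain NRS (0-10) at follow-up.
baseline_risk = [2, 3, 5, 6, 7, 4, 8, 1]
follow_up_pain = [0, 4, 2, 5, 3, 1, 7, 2]

auc_by_threshold = {}
for threshold in (2, 3, 4):
    labels = dichotomise(follow_up_pain, threshold)
    auc_by_threshold[threshold] = auc_concordance(baseline_risk, labels)
    print(f"poor outcome = NRS >= {threshold}: AUC = {auc_by_threshold[threshold]:.3f}")
```

The same screening scores yield a different AUC at each threshold in this hypothetical cohort, which is why the choice of cut-off needs to be fixed in advance and reported alongside the result.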
Recommendations for the management of LBP in primary care frequently include using available screening instruments to obtain information about ‘risk’ of a poor outcome. This review highlights that clinicians may need to be cautious about placing too much weight on PSIs during their clinical assessment, under the misimpression that they are able to accurately determine chronic pain risk. Using PSIs to allocate care carries the risk that patients misclassified by PSIs as low-risk are undertreated and patients misclassified as high-risk are overtreated. Estimation of risk of poor disability outcomes and prolonged absenteeism is likely to be more accurate – indicating that it is necessary to consider the clinical outcomes of interest when seeking prognostic information.
It is important to note, however, that this study investigated the predictive performance of PSIs and does not indicate whether the implementation of prognostic screening improves outcomes for adults with recent onset LBP. Alternative research approaches, namely randomised ‘impact’ trials [1], are required to address this question. Furthermore, it is relevant to consider whether the use of PSIs offers more accurate estimation of a patient’s course of LBP than clinician judgement. Previous studies comparing the discriminative performance of screening instruments (including the SBT and the OMPSQ) with primary care clinicians’ estimation of risk of poor outcome [38, 52] have failed to show superior capabilities of the questionnaires.
As highlighted in the PROGRESS recommendations [21], the validation of predictive models requires a succession of steps from development through to external validation and impact analysis – a process which has been only partially fulfilled by the PSIs in this review. Further research according to PROGRESS recommendations will allow improved confidence in the selection and application of available instruments. Less understood factors (e.g. structural pathology, sleep or social factors) should be further investigated and integrated into prognostic models to improve predictive accuracy beyond what is currently achievable. In addition, there remains a need to undertake further prospective clinical trials investigating the effectiveness of screening to direct stratified care approaches for patients with LBP. The performance of a stratified care instrument is best evaluated by an effect size derived from a randomised controlled trial.
Conclusions
LBP screening instruments administered in primary care perform poorly at assigning higher risk scores to individuals who develop chronic pain than to those who do not develop chronic pain. Risks of a poor disability outcome and prolonged absenteeism are likely to be estimated with greater accuracy. While PSIs may have useful clinical application, it is important that clinicians who use screening tools to obtain prognostic information consider the potential for misclassification of patient risk and its consequences for care decisions based on screening. However, it needs to be acknowledged that the outcomes on which we evaluated these screening instruments in some cases had a different threshold, outcome and time period than those they were designed to predict.
Acknowledgements
The authors of this review gratefully acknowledge the contributions made by authors of included studies who provided additional information and/or raw/re-analysed data for inclusion in study meta-analyses. EK acknowledges with thanks the contribution of the University of South Australia and the Central Adelaide Local Health Network Inc. for providing scholarship funding and support for this research.
Funding
LR and SH did not receive funding support from any organisation for the submitted work. EK received Royal Adelaide Hospital Allied Health Research Grant funding (2014 and 2015) and the 2015 Dawes Scholarship. JM is supported by a National Health and Medical Research project grant ID 1047827. AT is supported by a National Health and Medical Research Council PhD Scholarship APP1075670. LG is supported by the Swiss National Science Foundation. GLM is supported by a National Health and Medical Research Council research fellowship NHMRC ID 106279. AW received financial compensation for her contribution to screening of the search results (research assistant employed by SH). This study was undertaken independently from research funders. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Authors’ contributions
EK, JM and GLM conceived the idea and designed the study. EK conducted the systematic searches, was responsible for the extraction, analysis and interpretation of data, and drafted and revised the manuscript. JM made substantial contributions to study conception and design, interpretation of results and revising the manuscript critically for intellectual content. AT made substantial contributions to the study design and revision of the manuscript. SH made substantial contributions to the study design and revision of the manuscript. LG assisted with screening of the database search results and made substantial contributions to data extraction, analysis and interpretation. LR assisted with data extraction, analysis and interpretation. GLM made substantial contributions to the study conception and design, and assisted with drafting and revision of the manuscript. All authors gave approval for the final version of the manuscript and agree to be accountable for all aspects of the work.
Competing interests
GLM has received support from: Pfizer, Kaiser Permanente USA, Results Physiotherapy USA, Agile Physiotherapy USA, workers compensation boards in Australia, North America and Europe, the International Olympic Committee and the Port Adelaide Football Club. He receives royalties for books on pain and rehabilitation, and speaker fees for lectures on pain and rehabilitation. All other authors had no financial relationships with any organisations that might have an interest in the submitted work, and no other relationships or activities that could appear to have influenced the submitted work.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Ethics approval for collection of human data was obtained by the authors of the individual studies included in this review. Further ethics approval was not required for this study.
Abbreviations
- ASQ: Absenteeism Screening Questionnaire
- AUC: area under the curve
- BDRQ: Back Disability Risk Questionnaire
- CPRS: Chronic Pain Risk Score
- HCPR: Hancock Clinical Prediction Rule
- LBP: low back pain
- NHMRC: National Health and Medical Research Council of Australia
- NRS: numeric rating scale
- ODI: Oswestry Disability Index
- OMPSQ: Orebro Musculoskeletal Pain Screening Questionnaire
- PRISMA: Preferred Reporting Items for Systematic reviews and Meta-Analysis
- PSI: prognostic screening instrument
- QBPDS: Quebec Back Pain Disability Score
- QUIPS: QUality In Prognostic Studies
- ROB: risk of bias
- ROC: receiver operating characteristic
- SBT: STarT Back Tool
- VDPQ: Vermont Disability Prediction Questionnaire
Contributor Information
Emma L. Karran, Email: emma.karran@mymail.unisa.edu.au
James H. McAuley, Email: j.mcauley@neura.edu.au
Adrian C. Traeger, Email: a.traeger@neura.edu.au
Susan L. Hillier, Email: susan.hillier@unisa.edu.au
Luzia Grabherr, Email: luzia.grabherr@gmx.ch
Leslie N. Russek, Email: leslie.russek@clarkson.edu
G. Lorimer Moseley, Email: lorimer.moseley@unisa.edu.au
References
- 1. Hingorani AD, van der Windt DA, Riley RD, Abrams K, Moons KG, Steyerberg EW, Schroter S, Sauerbrei W, Altman DG, Hemingway H. Prognosis research strategy (PROGRESS) 4: stratified medicine research. BMJ. 2013;346:e5793. doi: 10.1136/bmj.e5793.
- 2. Foster NE, Mullis R, Hill JC, Lewis M, Whitehurst DG, Doyle C, Konstantinou K, Main C, Somerville S, Sowden G. Effect of stratified care for low back pain in family practice (IMPaCT Back): a prospective population-based sequential comparison. Ann Fam Med. 2014;12(2):102–11. doi: 10.1370/afm.1625.
- 3. Michel C, Ruhrmann S, Schimmelmann BG, Klosterkötter J, Schultze-Lutter F. A stratified model for psychosis prediction in clinical practice. Schizophr Bull. 2014;40(6):1533–42. doi: 10.1093/schbul/sbu025.
- 4. Gore M, Sadosky A, Stacey BR, Tai K-S, Leslie D. The burden of chronic low back pain: clinical comorbidities, treatment patterns, and health care costs in usual care settings. Spine. 2012;37(11):E668–77. doi: 10.1097/BRS.0b013e318241e5de.
- 5. Vos T, Barber RM, Bell B, Bertozzi-Villa A, Biryukov S, Bolliger I, Charlson F, Davis A, Degenhardt L, Dicker D. Global, regional, and national incidence, prevalence, and years lived with disability for 301 acute and chronic diseases and injuries in 188 countries, 1990–2013: a systematic analysis for the Global Burden of Disease Study 2013. Lancet. 2015;386(9995):743–800. doi: 10.1016/S0140-6736(15)60692-4.
- 6. da Menezes Costa CL, Maher CG, Hancock MJ, McAuley JH, Herbert RD, Costa LO. The prognosis of acute and persistent low-back pain: a meta-analysis. CMAJ. 2012;184(11):E613–24. doi: 10.1503/cmaj.111271.
- 7. Pincus T, Burton AK, Vogel S, Field AP. A systematic review of psychological factors as predictors of chronicity/disability in prospective cohorts of low back pain. Spine. 2002;27(5):E109–20. doi: 10.1097/00007632-200203010-00017.
- 8. Steenstra I, Verbeek J, Heymans M, Bongers P. Prognostic factors for duration of sick leave in patients sick listed with acute low back pain: a systematic review of the literature. J Occup Environ Med. 2005;62(12):851–60. doi: 10.1136/oem.2004.015842.
- 9. Chou R, Shekelle P. Will this patient develop persistent disabling low back pain? JAMA. 2010;303(13):1295–302. doi: 10.1001/jama.2010.344.
- 10. Melloh M, Elfering A, Presland CE, Roeder C, Barz T, Salathé CR, Tamcan O, Mueller U, Theis J. Identification of prognostic factors for chronicity in patients with low back pain: a review of screening instruments. Int Orthop. 2009;33(2):301–13. doi: 10.1007/s00264-008-0707-8.
- 11. Cook CE, Learman KE, O’halloran BJ, Showalter CR, Kabbaz VJ, Goode AP, Wright AA. Which prognostic factors for low back pain are generic predictors of outcome across a range of recovery domains? Phys Ther. 2013;93(1):32–40. doi: 10.2522/ptj.20120216.
- 12. Koes BW, van Tulder M, Lin C-WC, Macedo LG, McAuley J, Maher C. An updated overview of clinical guidelines for the management of non-specific low back pain in primary care. Eur Spine J. 2010;19(12):2075–94. doi: 10.1007/s00586-010-1502-y.
- 13. Delitto A, George SZ, Van Dillen L, Whitman JM, Sowa G, Shekelle P, Denninger TR, Godges JJ. Low back pain. Clinical practice guidelines linked to the international classification of functioning, disability, and health from the orthopaedic section of the American Physical Therapy Association. J Orthop Sports Phys Ther. 2012;42(4):A1–A57. doi: 10.2519/jospt.2012.42.4.A1.
- 14. Van Tulder M, Becker A, Bekkering T, Breen A, del Real MTG, Hutchinson A, Koes B, Laerum E, Malmivaara A. Chapter 3 European guidelines for the management of acute nonspecific low back pain in primary care. Eur Spine J. 2006;15:s169–91. doi: 10.1007/s00586-006-1071-2.
- 15. National Institute for Health and Care Excellence (NICE) 2016. Low Back Pain and Management in over 16s: Assessment and Management. https://www.nice.org.uk/guidance/ng59,chapter/Recommendations. Accessed 7 Dec 2016.
- 16. van der Windt DA, Dunn KM. Low back pain research–Future directions. Best Pract Res Clin Rheumatol. 2013;27(5):699–708. doi: 10.1016/j.berh.2013.11.001.
- 17. Hilfiker R, Bachmann LM, Heitz CA-M, Lorenz T, Joronen H, Klipstein A. Value of predictive instruments to determine persisting restriction of function in patients with subacute non-specific low back pain. Systematic review. Eur Spine J. 2007;16(11):1755–75. doi: 10.1007/s00586-007-0433-8.
- 18. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151(4):264–9. doi: 10.7326/0003-4819-151-4-200908180-00135.
- 19. de Vet HC, Heymans MW, Dunn KM, Pope DP, van der Beek AJ, Macfarlane GJ, Bouter LM, Croft PR. Episodes of low back pain: a proposal for uniform definitions to be used in research. Spine. 2002;27(21):2409–16. doi: 10.1097/00007632-200211010-00016.
- 20. Merlin T, Weston A, Tooher R. Extending an evidence hierarchy to include topics other than treatment: revising the Australian ‘levels of evidence’. BMC Med Res Methodol. 2009;9(1):34. doi: 10.1186/1471-2288-9-34.
- 21. Hemingway H, Croft P, Perel P, Hayden JA, Abrams K, Timmis A, Briggs A, Udumyan R, Moons KG, Steyerberg EW. Prognosis research strategy (PROGRESS) 1: a framework for researching clinical outcomes. BMJ. 2013;346:e5595. doi: 10.1136/bmj.e5595.
- 22. Steyerberg EW, Moons KG, van der Windt DA, Hayden JA, Perel P, Schroter S, Riley RD, Hemingway H, Altman DG, Group P. Prognosis Research Strategy (PROGRESS) 3: prognostic model research. PLoS Med. 2013;10(2):e1001381. doi: 10.1371/journal.pmed.1001381.
- 23. Hosmer DW, Jr, Lemeshow S. Applied logistic regression. Hoboken: Wiley; 2004.
- 24. Traeger A, Henschke N, Hübscher M, Williams CM, Kamper SJ, Maher CG, Moseley GL, McAuley JH. Development and validation of a screening tool to predict the risk of chronic low back pain in patients presenting with acute low back pain: a study protocol. BMJ Open. 2015;5(7):e007916. doi: 10.1136/bmjopen-2015-007916.
- 25. Grotle M, Vollestad NK, Brox JI. Screening for yellow flags in first-time acute low back pain: reliability and validity of a Norwegian version of the Acute Low Back Pain Screening Questionnaire. Clin J Pain. 2006;22(5):458–67. doi: 10.1097/01.ajp.0000208243.33498.cb.
- 26. Traeger AC, Henschke N, Hübscher M, Williams CM, Kamper SJ, Maher CG, Moseley GL, McAuley JH. Estimating the risk of chronic pain: development and validation of a prognostic model (PICKUP) for patients with acute low back pain. PLoS Med. 2016;13(5):e1002019. doi: 10.1371/journal.pmed.1002019.
- 27. Hush JM, Refshauge K, Sullivan G, De Souza L, Maher CG, McAuley JH. Recovery: what does this mean to patients with low back pain? Arthr Care Res. 2009;61(1):124–31. doi: 10.1002/art.24162.
- 28. Hayden JA, van der Windt DA, Cartwright JL, Côté P, Bombardier C. Assessing bias in studies of prognostic factors. Ann Intern Med. 2013;158(4):280–6. doi: 10.7326/0003-4819-158-4-201302190-00009.
- 29. Bruls VE, Bastiaenen CH, de Bie RA. Prognostic factors of complaints of arm, neck, and/or shoulder: a systematic review of prospective cohort studies. Pain. 2015;156(5):765–88. doi: 10.1097/j.pain.0000000000000117.
- 30. Fischer CA, Neubauer E, Adams HS, Schiltenwolf M, Wang H. Effects of multidisciplinary pain treatment can be predicted without elaborate questionnaires. Int Orthop. 2014;38(3):617–26. doi: 10.1007/s00264-013-2156-2.
- 31. Hurley DA, Dusoir TE, McDonough SM, Moore AP, Baxter GD. How effective is the acute low back pain screening questionnaire for predicting 1-year follow-up in patients with low back pain? Clin J Pain. 2001;17(3):256–63. doi: 10.1097/00002508-200109000-00012.
- 32. Linton SJ, Nicholas M, MacDonald S. Development of a short form of the Orebro Musculoskeletal Pain Screening Questionnaire. Spine. 2011;36(22):1891–5. doi: 10.1097/BRS.0b013e3181f8f775.
- 33. Morso L, Kent P, Manniche C, Albert HB. The predictive ability of the STarT Back Screening Tool in a Danish secondary care setting. Eur Spine J. 2014;23(1):120–8. doi: 10.1007/s00586-013-2861-y.
- 34. Morso L, Kongsted A, Hestbaek L, Kent P. The prognostic ability of the STarT Back Tool was affected by episode duration. Eur Spine J. 2016;25(3):936–44. doi: 10.1007/s00586-015-3915-0.
- 35. Cats-Baril WL, Frymoyer JW. Identifying patients at risk of becoming disabled because of low-back pain. The Vermont Rehabilitation Engineering Center predictive model. Spine. 1991;16(6):605–7. doi: 10.1097/00007632-199106000-00001.
- 36. World Health Organization. Declaration of Alma-Ata, 1978, Paper presented at: International Conference on Primary Health Care. Alma-Ata: USSR; 1978. http://www.euro.who.int/en/publications/policy-documents/declaration-of-alma-ata,-1978. Accessed 28 Apr 2016.
- 37. Law RK, Lee EW, Law SW, Chan BK, Chen PP, Szeto GP. The predictive validity of OMPQ on the rehabilitation outcomes for patients with acute and subacute non-specific LBP in a Chinese population. J Occup Rehabil. 2013;23(3):361–70. doi: 10.1007/s10926-012-9404-y.
- 38. Kongsted A, Andersen CH, Hansen MM, Hestbaek L. Prediction of outcome in patients with low back pain–A prospective cohort study comparing clinicians’ predictions with those of the Start Back Tool. Man Ther. 2016;21:120–7. doi: 10.1016/j.math.2015.06.008.
- 39. Gabel CP, Melloh M, Yelland M, Burkett B, Roiko A. Predictive ability of a modified Orebro Musculoskeletal Pain Questionnaire in an acute/subacute low back pain working population. Eur Spine J. 2011;20(3):449–57. doi: 10.1007/s00586-010-1509-4.
- 40. Shaw WS, Pransky G, Winters T. The Back Disability Risk Questionnaire for work-related, acute back pain: prediction of unresolved problems at 3-month follow-up. J Occup Environ Med. 2009;51(2):185–94. doi: 10.1097/JOM.0b013e318192bcf8.
- 41. Williams C, Hancock M, Maher C, McAuley J, Lin C, Latimer J. Predicting rapid recovery from acute low back pain based on the intensity, duration and history of pain: a validation study. Eur J Pain. 2014;18(8):1182–9. doi: 10.1002/j.1532-2149.2014.00467.x.
- 42. Nonclercq O, Berquin A. Predicting chronicity in acute back pain: validation of a French translation of the Orebro Musculoskeletal Pain Screening Questionnaire. Ann Phys Rehabil Med. 2012;55(4):263–78. doi: 10.1016/j.rehab.2012.03.002.
- 43. Beneciuk JM, Bishop MD, Fritz JM, Robinson ME, Asal NR, Nisenzon AN, George SZ. The STarT back screening tool and individual psychological measures: evaluation of prognostic capabilities for low back pain clinical outcomes in outpatient physical therapy settings. Phys Ther. 2013;93(3):321–33. doi: 10.2522/ptj.20120207.
- 44. Field J, Newell D. Relationship between STarT Back Screening Tool and prognosis for low back pain patients receiving spinal manipulative therapy. Chiropr Man Ther. 2012;20(1):17. doi: 10.1186/2045-709X-20-17.
- 45. Newell D, Field J, Pollard D. Using the STarT Back Tool: Does timing of stratification matter? Man Ther. 2015;20(4):533–9. doi: 10.1016/j.math.2014.08.001.
- 46. Hill JC, Dunn KM, Lewis M, Mullis R, Main CJ, Foster NE, Hay EM. A primary care back pain screening tool: identifying patient subgroups for initial treatment. Arthritis Rheum. 2008;59(5):632–41. doi: 10.1002/art.23563.
- 47.Heneweer H, van Woudenberg NJ, van Genderen F, Vanhees L, Wittink H. Measuring psychosocial variables in patients with (sub) acute low back pain complaints, at risk for chronicity: a validation study of the Acute Low Back Pain Screening Questionnaire–Dutch Language version. Spine. 2010;35(4):447–52. doi: 10.1097/BRS.0b013e3181bd9e3b. [DOI] [PubMed] [Google Scholar]
- 48.Schmidt CO, Kohlmann T, Pfingsten M, Lindena G, Marnitz U, Pfeifer K, Chenot J. Construct and predictive validity of the German Örebro questionnaire short form for psychosocial risk factor screening of patients with low back pain. Eur Spine J. 2016;25(1):325–32. doi: 10.1007/s00586-015-4196-3. [DOI] [PubMed] [Google Scholar]
- 49.Hazard RG, Haugh LD, Reid S, Preble JB, MacDonald L. Early prediction of chronic disability after occupational low back injury. Spine. 1996;21(8):945–51. doi: 10.1097/00007632-199604150-00008. [DOI] [PubMed] [Google Scholar]
- 50.Hazard RG, Haugh LD, Reid S, McFarlane G, MacDonald L. Early physician notification of patient disability risk and clinical guidelines after low back injury: a randomized, controlled trial. Spine. 1997;22(24):2951–8. doi: 10.1097/00007632-199712150-00019. [DOI] [PubMed] [Google Scholar]
- 51.Truchon M, Schmouth ME, Cote D, Fillion L, Rossignol M, Durand MJ. Absenteeism screening questionnaire (ASQ): a new tool for predicting long-term absenteeism among workers with low back pain. J Occup Rehabil. 2012;22(1):27–50. doi: 10.1007/s10926-011-9318-0. [DOI] [PubMed] [Google Scholar]
- 52.Jellema P, van der Windt DA, van der Horst HE, Stalman WA, Bouter LM. Prediction of an unfavourable course of low back pain in general practice: comparison of four instruments. Br J Gen Pract. 2007;57:15–22. [PMC free article] [PubMed] [Google Scholar]
- 53.Hill JC, Whitehurst D, Lewis M, Bryan S, Dunn KM, Foster NE, Konstantinou K, Main CJ, Mason EE, Somerville S, et al. Comparison of stratified primary care management for low back pain with current best practice (STarT Back): a randomised controlled trial. Lancet. 2011;378:1560–71. doi: 10.1016/S0140-6736(11)60937-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Moons KG, Altman DG, Vergouwe Y, Royston P. Prognosis and prognostic research: application and impact of prognostic models in clinical practice. BMJ. 2009;338:b606. doi: 10.1136/bmj.b606. [DOI] [PubMed] [Google Scholar]
- 55.Fritz JM, Beneciuk JM, George SZ. Relationship between categorization with the STarT Back Screening Tool and prognosis for people receiving physical therapy for low back pain. Phys Ther. 2011;91(5):722–32. doi: 10.2522/ptj.20100109. [DOI] [PubMed] [Google Scholar]
- 56.Gray H, Adefolarin AT, Howe TE. A systematic review of instruments for the assessment of work-related psychosocial factors (Blue Flags) in individuals with non-specific low back pain. Man Ther. 2011;16(6):531–43. doi: 10.1016/j.math.2011.04.001. [DOI] [PubMed] [Google Scholar]
- 57.Hockings RL, McAuley JH, Maher CG. A systematic review of the predictive ability of the Orebro Musculoskeletal Pain Questionnaire. Spine. 2008;33(15):E494–500. doi: 10.1097/BRS.0b013e31817ba3bb. [DOI] [PubMed] [Google Scholar]
- 58.Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, Pencina MJ, Kattan MW. Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiology. 2010;21(1):128. doi: 10.1097/EDE.0b013e3181c30fb2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Mehling WE, Gopisetty V, Acree M, Pressman A, Carey T, Goldberg H, Hecht FM, Avins AL. Acute low back pain and primary care: how to define recovery and chronification? Spine. 2011;36(26):2316–23. doi: 10.1097/BRS.0b013e31820c01a6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Boonstra AM, Preuper HRS, Balk GA, Stewart RE. Cut-off points for mild, moderate, and severe pain on the visual analogue scale for pain in patients with chronic musculoskeletal pain. Pain. 2014;155(12):2545–50. doi: 10.1016/j.pain.2014.09.014. [DOI] [PubMed] [Google Scholar]
- 61.Turner JA, Shortreed SM, Saunders KW, Leresche L, Berlin JA, Korff MV. Optimizing prediction of back pain outcomes. Pain. 2013;154(8):1391–401. doi: 10.1016/j.pain.2013.04.029. [DOI] [PubMed] [Google Scholar]
- 62.Bergstrom C, Hagberg J, Bodin L, Jensen I, Bergstrom G. Using a psychosocial subgroup assignment to predict sickness absence in a working population with neck and back pain. BMC Musculoskelet Disord. 2011;12:81. doi: 10.1186/1471-2474-12-81. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Bernstein IH, Jaremko ME, Hinkley BS. On the utility of the SCL-90-R with low-back pain patients. Spine. 1994;19(1):42–8. doi: 10.1097/00007632-199401000-00008. [DOI] [PubMed] [Google Scholar]
- 64.Morso L, Kent PM, Albert HB. Are self-reported pain characteristics, classified using the PainDETECT Questionnaire predictive of outcome in people with low back pain and associated leg pain? Clin J Pain. 2011;27(6):535–41. doi: 10.1097/AJP.0b013e318208c941. [DOI] [PubMed] [Google Scholar]
- 65.Morso L, Kent P, Albert HB, Hill JC, Kongsted A, Manniche C. The predictive and external validity of the STarT Back Tool in Danish primary care. Eur Spine J. 2013;22(8):1859–67. doi: 10.1007/s00586-013-2690-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Heneweer H, Aufdemkampe G, van Tulder MW, Kiers H, Stappaerts KH, Vanhees L. Psychosocial variables in patients with (sub)acute low back pain: an inception cohort in primary care physical therapy in The Netherlands. Spine. 2007;32(5):586–92. doi: 10.1097/01.brs.0000256447.72623.56. [DOI] [PubMed] [Google Scholar]
- 67.Linton SJ, Boersma K. Early identification of patients at risk of developing a persistent back problem: the predictive validity of the Orebro Musculoskeletal Pain Questionnaire. Clin J Pain. 2003;19(2):80–6. doi: 10.1097/00002508-200303000-00002. [DOI] [PubMed] [Google Scholar]
- 68.Linton SJ, Halldén K. Can we screen for problematic back pain? A screening questionnaire for predicting outcome in acute and subacute back pain. Clin J Pain. 1998;14(3):209–15. doi: 10.1097/00002508-199809000-00007. [DOI] [PubMed] [Google Scholar]
- 69.Hancock MJ, Maher CG, Latimer J, Herbert RD, McAuley JH. Can rate of recovery be predicted in patients with acute low back pain? Development of a clinical prediction rule. Eur J Pain. 2009;13(1):51–5. doi: 10.1016/j.ejpain.2008.03.007. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.