Notes
Editorial note
See https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD014461.pub2/full for a more recent review that covers this topic and has superseded this review.
Abstract
Background
Low‐back pain (LBP) is a common condition seen in primary care. A principal aim during a clinical examination is to identify patients with a higher likelihood of underlying serious pathology, such as vertebral fracture, who may require additional investigation and specific treatment. All 'evidence‐based' clinical practice guidelines recommend the use of red flags to screen for serious causes of back pain. However, it remains unclear if the diagnostic accuracy of red flags is sufficient to support this recommendation.
Objectives
To assess the diagnostic accuracy of red flags obtained in a clinical history or physical examination to screen for vertebral fracture in patients presenting with LBP.
Search methods
Electronic databases were searched for primary studies between the earliest date and 7 March 2012. Forward and backward citation searching of eligible studies was also conducted.
Selection criteria
Studies were considered if they compared the results of any aspect of the history or test conducted in the physical examination of patients presenting for LBP or examination of the lumbar spine, with a reference standard (diagnostic imaging). The selection criteria were independently applied by two review authors.
Data collection and analysis
Three review authors independently conducted 'Risk of bias' assessment and data extraction. Risk of bias was assessed using the 11‐item QUADAS tool. Characteristics of studies, patients, index tests and reference standards were extracted. Where available, raw data were used to calculate sensitivity and specificity with 95% confidence intervals (CI). Due to the heterogeneity of studies and tests, statistical pooling was not appropriate and the analysis for the review was descriptive only. Likelihood ratios for each test were calculated and used as an indication of clinical usefulness.
Main results
Eight studies set in primary (four), secondary (one) and tertiary care (accident and emergency = three) were included in the review. Overall, the risk of bias of studies was moderate, with high risk of selection and verification bias the predominant flaws. Reporting of index and reference tests was poor. The prevalence of vertebral fracture in accident and emergency settings ranged from 6.5% to 11% and in primary care from 0.7% to 4.5%. There were 29 groups of index tests investigated; however, only two featured in more than two studies. Descriptive analyses revealed that three red flags in primary care were potentially useful with meaningful positive likelihood ratios (LR+) but mostly imprecise estimates (significant trauma, older age, corticosteroid use; LR+ point estimate ranging 3.42 to 12.85, 3.69 to 9.39, 3.97 to 48.50 respectively). One red flag in tertiary care appeared informative (contusion/abrasion; LR+ 31.09, 95% CI 18.25 to 52.96). The results of combined tests appeared more informative than individual red flags with LR+ estimates generally greater in magnitude and precision.
Authors' conclusions
The available evidence does not support the use of many red flags to specifically screen for vertebral fracture in patients presenting for LBP. Based on evidence from single studies, few individual red flags appear informative as most have poor diagnostic accuracy as indicated by imprecise estimates of likelihood ratios. When combinations of red flags were used the performance appeared to improve. From the limited evidence, the findings give rise to a weak recommendation that a combination of a small subset of red flags may be useful to screen for vertebral fracture. It should also be noted that many red flags have high false positive rates, and if acted upon uncritically there would be consequences for the cost of management and outcomes of patients with LBP. Further research should focus on appropriate sets of red flags and adequate reporting of both index and reference tests.
Plain language summary
Physician use of red flags to screen for fractured vertebrae for patients with new back pain
This review describes the understanding of a common practice for checking for spinal injuries when patients come to a family practice doctor, back pain clinic or emergency room with new back pain. Doctors usually ask a few questions and examine the back to check for the possibility of a spinal fracture. The reason for this check for fractures is that the treatment is different for common back pain and fractures. Fractures are usually diagnosed with an x‐ray, then treated with rest, a back brace and pain relievers. Common back pain is treated with exercise, chiropractic manipulation, and pain relievers; x‐rays, computed tomography (CT) and magnetic resonance imaging scans are not useful for diagnosis. Fractures are rare, being the cause of back pain in the range of 1% to 4.5% of new back pain visits to family doctors.
Eight studies including several thousand patients described 29 different questions and physical exam tests that have been used to look for spinal fractures. Most of the 29 were not accurate. The best four questions asked about use of steroids (which can cause weak bones), the patient’s age (age above 74 increases the risk of fractures) and recent trauma such as a fall. Using a combination of the best questions appears to improve the accuracy. For example, women above age 74 are more likely to have a fracture when they come to the physician complaining of back pain. In the emergency room, the best indication of a spinal fracture was a bruise or scrape on the painful area of the back.
Fractures are rare and generally do not require emergency treatment. Even if red flags exist, clinicians and patients can watch and wait. During the waiting period, patients should avoid treatments like exercise and manipulation that are not recommended for spinal fractures.
The worst effects of low quality red flag screening are overtreatment and undertreatment. If the tests are not accurate, patients without a fracture may get an x‐ray or CT scan that they don’t need—unnecessary exposure to x‐rays, extra worry for the patient and extra cost. At the other extreme (and much less common), it might be possible to miss a real fracture, and cause the patient to have extra time without the best treatment.
Most of the studies were of low or moderate quality, so more research is needed to identify the best combination of questions and examination methods.
Summary of findings
Summary of findings 1. Results of potentially useful red flags.
Review question: What is the accuracy of red flags to screen for vertebral fracture in patients presenting with low‐back pain or for lumbar examination?
Patient population: Patients with low‐back pain or requiring examination of the lumbar spine when presenting to care in primary, secondary or tertiary settings
Index tests: All relevant features taken during a history or physical examination
Target condition: Vertebral fracture
Reference standard: Diagnostic imaging (MRI, CT, X‐ray, bone scan)
Included studies: Prospective cohorts (4), retrospective chart reviews (4)
Main limitations: Small number of studies included; large heterogeneity between studies and index tests precluded pooling of results; descriptive analysis presented; inadequate reporting of methods
| Population and index test | Reference standard | Risk of bias | Sensitivity (95% CI) or range of estimates ^ | Specificity (95% CI) or range of estimates ^ | False negative rate in a population of 1000 (95% CI)* | False positive rate in a population of 1000 (95% CI)* | LR+ (95% CI)** | LR‐ (95% CI)** |
| Results: Single red flags | | | | | | | | |
| Primary care: 1 study. Index test: Age > 70 years | Follow‐up with diagnostic imaging | Moderate (7/11) | 0.50 (0.16 to 0.84) | 0.96 (0.94 to 0.97) | 25 (8 to 42) | 38 (29 to 57) | 11.19 (5.33 ‐ 23.51) | 0.52 (0.26 ‐ 1.05) |
| Primary care: 2 studies. Index test: Age > 74 years | X‐ray, follow‐up with diagnostic imaging | Moderate (6‐7/11) | range: 0.25 to 0.59 | range: 0.84 to 0.97 | 38 (18 to 49)** | 29 (19 to 38)** | 9.39 (2.69 ‐ 32.75) | 0.77 (0.52 ‐ 1.15) |
| Primary care: 3 studies. Index test: Significant trauma | X‐ray, follow‐up with diagnostic imaging | High to moderate (4‐7/11) | range: 0.25 to 0.65 | range: 0.90 to 0.98 | 38 (18 to 49)** | 19 (19 to 38)** | 10.03 (2.87 ‐ 35.13) | 0.77 (0.52 ‐ 1.15) |
| Primary care: 2 studies. Index test: History of corticosteroid use | X‐ray, follow‐up with diagnostic imaging | Moderate (6‐7/11) | range: 0.00 to 0.25 | range: 0.99 | 38 (18 to 49)** | 10 (0 to 10)** | 48.50 (11.46 ‐ 204.98) | 0.75 (0.51 ‐ 1.13) |
| Tertiary care: 1 study. Index test: Contusion/abrasion | X‐ray | Moderate (6/11) | 0.85 (0.70 to 0.94) | 0.97 (0.95 to 0.98) | 17 (7 to 33) | 27 (18 to 45) | 31.09 (18.25 ‐ 52.96) | 0.15 (0.07 ‐ 0.32) |
| Results: Combined red flags | | | | | | | | |
| Primary care: 2 studies. Index test: Female and age > 64 years | X‐ray, follow‐up with diagnostic imaging | Moderate (6‐7/11) | range: 0.59 to 0.63 | range: 0.84 to 0.97 | 19 (5 to 38)** | 38 (29 to 57)** | 14.58 (8.0 ‐ 26.61) | 0.39 (0.16 ‐ 0.96) |
| Primary care: 2 studies. Index test: Female and age > 74 years | X‐ray, follow‐up with diagnostic imaging | Moderate (6‐7/11) | range: 0.25 to 0.45 | range: 0.89 to 0.98 | 38 (18 to 49)** | 19 (10 to 19)** | 16.17 (4.47 ‐ 58.43) | 0.76 (0.51 ‐ 1.14) |
| Primary care: 1 study. Index test: 2 of 4 red flags +ve (Henschke) | Follow‐up with diagnostic imaging | Moderate (7/11) | 0.63 (0.24 to 0.91) | 0.96 (0.95 to 0.97) | 19 (5 to 38) | 38 (29 to 48) | 15.48 (8.45 ‐ 28.36) | 0.39 (0.16 ‐ 0.96) |
| Secondary care: 1 study. Index test: 3 of 5 features +ve (Roman) | X‐ray or CT | Moderate (8/11) | 0.76 (0.60 to 0.89) | 0.69 (0.66 to 0.71) | 12 (6 to 20) | 295 (276 to 323) | 2.45 (2.02 ‐ 2.97) | 0.34 (0.19 ‐ 0.61) |
| Secondary care: 1 study. Index test: 4 of 5 features +ve (Roman) | X‐ray or CT | Moderate (8/11) | 0.37 (0.22 to 0.54) | 0.96 (0.95 to 0.97) | 32 (23 to 39) | 38 (29 to 48) | 9.62 (5.88 ‐ 15.73) | 0.66 (0.52 ‐ 0.84) |
| Tertiary care: 1 study. Index test: Trauma & neurological signs | X‐ray | High (5/11) | 0.29 (0.04 to 0.71) | 0.98 (0.93 to 1) | 79 (32 to 106) | 18 (0 to 63) | 14.43 (2.38 ‐ 87.64) | 0.73 (0.46 ‐ 1.15) |
^Sensitivity and specificity are the point estimate and 95% CI when only one study investigated an index test, and the point estimate range when multiple studies investigated the same test
* Based on maximum prevalence: 5% primary care and secondary care, 11% tertiary care
** Based on results of study with lowest risk of bias
Henschke combination: Age >70 years, significant trauma, prolonged corticosteroid use, altered sensation from the trunk down
Roman combination: Age >52 years, no leg pain present, BMI ≤ 22, does not exercise regularly, female gender
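The false negative and false positive counts per 1000 patients follow arithmetically from prevalence, sensitivity and specificity. The sketch below illustrates that arithmetic for the point estimates only; the function name `errors_per_1000` is hypothetical and is not taken from the review, and the confidence bounds in the table are not reproduced here.

```python
def errors_per_1000(sensitivity, specificity, prevalence, population=1000):
    """Expected (false negatives, false positives) among `population` patients.

    Illustrative helper (not from the review): applies the point estimates
    of sensitivity and specificity to an assumed prevalence.
    """
    with_fracture = prevalence * population
    without_fracture = population - with_fracture
    false_negatives = with_fracture * (1 - sensitivity)      # fractures the test misses
    false_positives = without_fracture * (1 - specificity)   # non-fractures flagged
    return false_negatives, false_positives


# Age > 70 years in primary care: sensitivity 0.50, specificity 0.96,
# using the 5% maximum prevalence stated in the table footnote.
fn, fp = errors_per_1000(0.50, 0.96, 0.05)
print(round(fn), round(fp))  # 25 38
```

With a low prevalence, even a specificity of 0.96 produces more false positives than missed fractures, which is the point the review makes about acting on red flags uncritically.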
Background
It is widely agreed that low‐back pain (LBP) can be seriously disabling and imposes an enormous social and economic burden on the community. A common theme presented in clinical practice guidelines is that LBP should be managed in the primary care setting because it is generally benign and the few cases of serious disease can be readily detected with a focused clinical assessment and subsequent diagnostic studies in those suspected of having serious disease (Koes 2010). In the primary care setting, between 1% and 5% of all patients who present with LBP will have a serious spinal pathology which requires further assessment and specific treatment (Chou 2007; Henschke 2009). The most common of these serious spinal pathologies, which may initially manifest as LBP, is vertebral fracture, followed by malignancy, infection, and inflammatory disease.
The identification of serious pathologies is one of the primary purposes of the clinical assessment of LBP patients. Clinical guidelines recommend the awareness of "red flags" as the ideal method to accomplish this purpose (Koes 2010). Red flags are features from the patient's medical history and physical examination which are thought to be associated with a higher risk of serious pathology. The presence of a red flag should alert clinicians to the need for further examination and in most cases, specific management (Waddell 2004). As most clinical guidelines explicitly recommend against the use of routine radiography and diagnostic imaging for patients with LBP, it is important to determine whether red flags can be used to aid a clinician's judgment to screen for serious spinal pathologies.
Target condition being diagnosed
Vertebral fractures are an important indicator of osteoporosis (Grigoryan 2003) and can cause significant pain, deformity, depressive mood, and functional impairments (Edmond 2005; Melton 2003). It is estimated that in Europe, one in eight individuals older than 50 years of age will have a vertebral fracture (Woolf 2003) and worldwide, there were 1.4 million vertebral fractures diagnosed in the year 2000 (Johnell 2006). In patients presenting to primary care with acute LBP, the prevalence of vertebral fractures has been estimated to be between 0.5% (Suarez‐Almazor 1997) and 4% (Chou 2007).
Identification of vertebral fracture is important not only because it is associated with an increased risk of mortality (Cooper 1993), but also because it is an independent predictor of future fracture at both the spine and the hip (Cauley 2007; Hasserius 2005). According to pooled estimates, the presence of a prevalent (existing or first) vertebral fracture confers a four‐ to five‐fold increased risk of further vertebral fractures (Ensrud 2000), and a three‐fold risk of hip fracture (Delmas 2001; Ensrud 1999).
Despite the burden of vertebral fracture, less than one third of individuals with vertebral fractures receive medical attention and even fewer are treated (Freedman 2008). It has been suggested that only 30% of vertebral fractures are diagnosed in clinical practice because the presentation is similar to that of acute non‐specific LBP and adequate screening tests are not available (Grigoryan 2003; Papaioannou 2002). In addition, there appears to be agreement amongst leading clinicians that spinal manipulative therapy (which is a common treatment for acute LBP endorsed in clinical practice guidelines) is contraindicated in individuals with osteoporosis or vertebral fractures (Maitland 2001; Waddell 2004). Accurate identification of vertebral fracture is clearly essential not only in clinical trials and epidemiological studies, but also for the clinical management of patients with osteoporosis or acute LBP (Ferrar 2005; Hancock 2008).
Index test(s)
As a first step to identifying vertebral fracture in patients presenting with acute LBP, clinical practice guidelines generally recommend assessing for the following red flags: a recent history of trauma; prolonged use of corticosteroids; age greater than 50 years; and obvious structural deformity (Koes 2010). The inclusion of these features in the guidelines has often been poorly justified, for example, by reference to previous guidelines (van Tulder 2004) or unpublished data (Bigos 1994), and less often to primary diagnostic studies (NHMRC 2003). Despite their inclusion in the guidelines, the usefulness of screening for red flags continues to be debated (Underwood 2009) and there remains very little information on their diagnostic accuracy and how best to use them in clinical practice.
The low prevalence of vertebral fractures and other serious spinal pathologies in patients with LBP makes it difficult to develop screening tools which are both easy to apply and accurate. While the guidelines usually suggest individual red flags and leave their interpretation up to the clinicians, arguably more effective screening would be possible if diagnostic accuracy data were available on red flags used alone or in combination with each other. Ideally, an effective series of red flag questions for vertebral fracture would involve readily assessed features from the patient’s history and physical examination, avoiding invasive and potentially harmful tests, to identify all patients who require further assessment.
In 2008, a systematic review of 12 studies evaluated a total of 51 clinical features used to screen patients with LBP for vertebral fracture (Henschke 2008). The review found that five clinical features were useful to raise or lower the probability of vertebral fracture: age > 50 years (positive likelihood ratio [LR+] = 2.2, negative likelihood ratio [LR‐] = 0.34), female gender (LR+ = 2.3, LR‐ = 0.67), major trauma (LR+ = 12.8, LR‐ = 0.37), pain and tenderness (LR+ = 6.7, LR‐ = 0.44), and a painful, distracting injury (LR+ = 1.7, LR‐ = 0.78). However, the review noted that the available studies were generally of poor methodological quality and very few studies had been carried out in the primary care setting, which is unfortunate as this is the setting for which guidelines suggest red flags should be used. Most studies evaluated these red flags as isolated diagnostic tests (i.e. not in combination with each other), and most red flags were only evaluated by one study.
Alternative test(s)
In the absence of robust information about the diagnostic accuracy of clinical red flags, clinicians are left with the prospect of routine diagnostic imaging of all patients to identify vertebral fracture. Given the substantial number of patients with LBP who seek care, this would prove to be very expensive and unnecessarily subject many patients to harmful radiation (Chou 2009).
Assessment of standard radiographs remains the "gold standard" for diagnosing vertebral fractures; however, there is no consensus in the literature regarding the definition of vertebral fractures (Grigoryan 2003). Numerous quantitative and semi‐quantitative methods have been developed to assess vertebral fracture from conventional lateral radiographs, but the results are highly dependent on the experience of the observer (Ferrar 2008). For therapeutic trials and epidemiological studies, Genant's semi‐quantitative assessment used by a trained and experienced observer is the preferred method, based on its good reproducibility and ability to differentiate fractures from other deformities. Intra‐observer agreement for this method is 97% (kappa = 0.89) for experienced observers and 93% (kappa = 0.73) for inexperienced observers (Genant 1993).
More recently, quantitative methods of vertebral morphometry have been applied to lateral spine images from dual‐energy X‐ray absorptiometry (DXA) with the intention to improve diagnosis and treatment without increasing cost and exposure to radiation (Damiano 2006; Samelson 2011). The preferred term for vertebral morphometry evaluation using DXA is now "vertebral fracture assessment" (VFA). Image acquisition requires only a few minutes and radiation exposure is only three micro‐Sieverts (μSv), compared to 600 μSv for a lateral radiograph of the thoracic and lumbar spine (Vokes 2006). Numerous sources of error still exist when using VFA (e.g. positioning the measurement points on the vertebrae, anatomic variants, and deformities related to degenerative disease), so standard radiographs are still recommended to confirm the presence of the abnormality and to determine whether it is a fracture or a deformity. VFA has shown good agreement with quantitative morphometry of digitised radiographs (94.8%, kappa = 0.70) (Rea 2000), and consensus qualitative radiograph assessment by two experts (kappa = 0.71) (Ferrar 2000). Lower levels of inter‐observer agreement have been found with VFA (kappa = 0.56 to 0.69) than with Genant’s semi‐quantitative assessment (kappa = 0.60 to 0.86) of standard radiographs (Damiano 2006; Schousboe 2006).
Overall, the preferred method to confirm the presence of vertebral fracture is Genant's semi‐quantitative assessment by a trained and experienced observer. Vertebral fracture assessment can be used to separate normal vertebrae from doubtful or fractured vertebrae, which should then be examined by an expert.
Rationale
We aimed to update a previous systematic review (Henschke 2008) in light of the recent publication of primary diagnostic studies on this topic and the recent development of methods recommended by the Cochrane Diagnostic Test Accuracy Working Group (Deeks 2009). The protocol for this review (Henschke 2010b) was based upon the first Diagnostic Test Accuracy review published with the Cochrane Back Review Group (van der Windt 2010).
The review was performed concurrently with another Cochrane review on the diagnostic test accuracy of red flags to screen for malignancy in patients with LBP (Henschke 2010a), in order to provide evidence on the diagnostic accuracy of red flags to identify serious spinal pathologies presenting as LBP. The results aim to provide researchers and clinicians with a clearer definition of which red flags are useful to screen for vertebral fracture, and identify in which situations it is appropriate to use them in the management of LBP. The potential consequences of a missed or late diagnosis of vertebral fracture necessitate accurate diagnostic information on red flag screening questions.
Objectives
This review aims to provide information on the diagnostic accuracy of tests used to screen for vertebral fracture in patients presenting with LBP or lumbar spine examination. Specifically, the review aims to assess the performance of information collected during the history taking or physical assessment (potential red flags). Such information will assist clinicians in making decisions about the appropriate management course in patients with LBP.
Investigation of sources of heterogeneity
Where feasible, the review will assess the influence of sources of heterogeneity on the diagnostic accuracy of red flags used to screen for vertebral fracture. In particular the influences of study design (e.g. consecutive series or case‐control), healthcare setting (e.g. primary or secondary care), and aspects of study methods as reflected in the items of the QUality Assessment of Diagnostic Accuracy Studies (QUADAS) checklist (Appendix 1), will be assessed.
Methods
Criteria for considering studies for this review
Types of studies
We considered primary diagnostic studies if they compared the results of history taking or physical examination, to identify vertebral fracture in patients with LBP, to those of a reference standard. Cohort and cross‐sectional studies of a consecutive series of patients, as well as case‐control studies, that present sufficient data to allow estimates of diagnostic accuracy (such as sensitivity and specificity) were considered for eligibility in this review. Studies reported in abstracts or conference proceedings and full journal publications, in all languages were considered. Appropriate translation was sought when required.
Participants
Studies evaluating adult patients presenting to care for treatment of LBP or for lumbar spine examination were eligible. Studies with a substantial proportion of recruited patients (> 10%) already diagnosed with vertebral fracture as the likely cause of LBP were excluded. Patients presenting to primary, secondary and tertiary care settings were included in this review. Results of studies carried out in different settings are clearly labelled in text and tables.
Index tests
Studies investigating any aspect of the history taking or physical examination of LBP were considered eligible for inclusion. Such information included demographic characteristics (e.g. age, gender), clinical and medical history (e.g. pain intensity, previous history of falls), and physical examination results (e.g. tenderness, reduced range of motion). Studies in which the diagnostic accuracy of individual red flags was evaluated in isolation or as part of a combination were included; however, combinations of red flags required a clear description of which tests were involved. "Clinical diagnosis" or "global clinician judgement" (involving some unknown combination of history and physical examination) were excluded, as a clinical judgement based on undefined parameters is difficult to interpret or generalise.
Target conditions
Studies that investigated patients presenting with LBP or for lumbar spine examination, and reported the prevalence of vertebral fracture were included.
Reference standards
Studies were included if red flags were compared with diagnostic imaging procedures such as plain radiographs, computed tomography (CT), magnetic resonance imaging (MRI), and bone scans to confirm the presence of vertebral fracture. Clinical follow‐up was also considered suitable if suspected fracture was confirmed with imaging.
Search methods for identification of studies
Electronic searches
The search strategy was developed in collaboration with a medical information specialist. Relevant computerised databases, including MEDLINE, OLDMEDLINE (PubMed), EMBASE (EMBASE.com), and CINAHL (Ebsco), were searched for eligible studies from the earliest year possible until 7 March 2012. The search strategies for MEDLINE, EMBASE and CINAHL are presented in Appendix 2, Appendix 3 and Appendix 4. The previous systematic review on the diagnostic performance of red flags for vertebral fracture was used as a point of reference (Henschke 2008). All publications included in that review are indexed in MEDLINE, and were identified by the proposed search strategy. The strategy used combinations of searches related to the patient population, history taking, physical examination, and the target condition.
Searching other resources
Reference lists of all included publications were checked, and all included studies were subjected to a forward citation search using Web of Science. The electronic search was also designed to identify relevant (systematic) reviews in MEDLINE and Medion (www.mediondatabase.nl), from which reference lists were also checked.
Data collection and analysis
Selection of studies
Two review authors (CW and NH) independently applied the selection criteria to all citations (titles and abstracts) identified by the search strategy and excluded citations that were clearly not relevant. Full publications were retrieved for any citation that potentially met the inclusion criteria. Final selection was based on a review of full publications. Disagreements were resolved by consensus or by consulting other review authors in cases of persisting disagreement. The selection criteria and the QUADAS criteria were piloted on selected diagnostic studies to ensure consistency of interpretation among review authors.
Data extraction and management
Two review authors (CW and NH) independently performed data extraction. This included data on the characteristics of studies and participants, types of index tests and reference standards and their descriptions, relevant aspects of the study methods and data for the assessment of diagnostic accuracy (true positive, false positive, true negative and false negative counts for all index tests).
Characteristics of studies (and participants) included details on: setting (type: primary care, secondary care or tertiary care, and country); reason for presentation; enrolment procedures (consecutive or non‐consecutive); number of participants (including number eligible for the study, number enrolled in the study, number receiving the index test and reference standard, number for whom results are reported in the 2 x 2 table, reasons for withdrawal); patient demographics (age, gender) and duration and history of LBP.
Test characteristics included the type of index test; methods of execution; experience and expertise of the assessors; type of reference standard; and where relevant, cut‐off points for diagnosing vertebral fracture (e.g. quantitative radiographic measures).
Assessment of methodological quality
Review authors CW, NH and CM independently assessed the risk of bias of each study using the QUADAS checklist (Whiting 2003). The 11‐item version of the QUADAS recommended by The Cochrane Diagnostic Test Accuracy Working Group (Deeks 2009) was used. The 11 items were considered individually for each study, without the application of weights or the use of a summary score to select studies with certain risk of bias in the analysis. A guideline (Appendix 1) was used to classify each item as "yes" (adequately addressed); "no" (inadequately addressed); or "unclear" (inadequate detail presented to allow a judgment to be made). Disagreements were resolved by consensus.
If a review author was involved in a primary study, then this author was not involved in the risk of bias rating and data extraction for that study.
Statistical analysis and data synthesis
The prevalence of vertebral fracture in each study is presented along with measures of diagnostic accuracy for each test reported. Indices of diagnostic performance were extracted or derived from data presented in each primary study for each red flag or combination of red flags. Diagnostic 2 x 2 tables were generated to calculate sensitivities and specificities with their 95% confidence intervals and are presented in forest plots. In one study (Roman 2010), 2 x 2 tables were reconstructed using information from other relevant parameters (sensitivity, specificity, and predictive values).
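As an illustration of the calculations described above, the following sketch derives sensitivity, specificity and likelihood ratios from a 2 x 2 table, and rebuilds approximate cell counts from reported summary values. All function names and the example counts are hypothetical. The Wilson score interval is used only because it needs no external library; the review does not state which interval method Review Manager applied, so this sketch should not be expected to reproduce the published confidence intervals.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (an assumed method, see lead-in)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity, specificity, LR+ and LR- from 2 x 2 cell counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec) if spec < 1 else math.inf
    lr_neg = (1 - sens) / spec if spec > 0 else math.inf
    return sens, spec, lr_pos, lr_neg


def reconstruct_2x2(n, prevalence, sensitivity, specificity):
    """Approximate (tp, fp, fn, tn) from reported summary values.

    Rounding means the rebuilt counts may differ slightly from the true ones.
    """
    diseased = round(n * prevalence)
    healthy = n - diseased
    tp = round(diseased * sensitivity)
    tn = round(healthy * specificity)
    return tp, healthy - tn, diseased - tp, tn


# Hypothetical counts, for illustration only.
sens, spec, lr_pos, lr_neg = accuracy_from_2x2(tp=30, fp=50, fn=10, tn=910)
print(sens, round(spec, 3), round(lr_pos, 1), round(lr_neg, 2))
print(wilson_ci(30, 40))                       # interval around the sensitivity
print(reconstruct_2x2(1000, 0.04, 0.75, 0.95))
```

Keeping the counts rather than the derived ratios as the primary record makes it straightforward to recompute any index later under a different interval method.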
Due to the heterogeneity of tests, study settings and methodology, pooling of results was not possible.
The magnitude and precision (95% CI) of likelihood ratios (LRs) for individual studies were inspected to provide a study‐specific judgement of the informativeness of tests. The minimum criterion for a meaningful LR+ was that the lower bound of the 95% CI was above one. It is commonplace to consider an LR+ between two and five to provide a small increase in post‐test probability, five to 10 a moderate increase and above 10 a large increase. These criteria, detailed in Additional Table 2, were used to guide interpretation of the informativeness of each test in the review.
1. Criteria for study‐specific judgement of the informativeness of tests based on LR+.
| Test result | Interpretation |
| Upper bound of confidence interval is less than 2 | Uninformative test |
| Confidence interval spans 1 but upper bound > 2 | Indeterminate result because we have an imprecise estimate of the test accuracy |
| Point estimate in the range 2‐5 and lower bound of confidence interval is above 1.0 | Small increase in likelihood of fracture |
| Point estimate in the range 5‐10 and lower bound of confidence interval is above 1.0 | Moderate increase in likelihood of fracture |
| Point estimate > 10 and lower bound of confidence interval is above 1.0 | Large increase in likelihood of fracture |
LR+: positive likelihood ratios
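The criteria in the table above can be expressed as a small decision function. This is a sketch, not the authors' procedure: the function names are hypothetical, the handling of the shared boundaries (a point estimate of exactly 5 or 10 is treated as "moderate") is an assumption, and combinations the table does not cover are returned as such. A second helper shows the standard odds form of Bayes' theorem, which is what "increase in post‐test probability" refers to.

```python
def classify_lr_positive(point, lower, upper):
    """Label an LR+ estimate using the criteria table (boundary handling assumed)."""
    if upper < 2:
        return "uninformative"
    if lower <= 1 and upper > 2:
        return "indeterminate"
    if lower > 1:
        if point > 10:
            return "large"
        if point >= 5:
            return "moderate"
        if point >= 2:
            return "small"
    return "not covered by the criteria"


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# LR+ estimates reported in the Summary of findings table.
print(classify_lr_positive(31.09, 18.25, 52.96))  # large
print(classify_lr_positive(9.62, 5.88, 15.73))    # moderate
print(classify_lr_positive(2.45, 2.02, 2.97))     # small

# Hypothetical imprecise estimate whose interval spans 1.
print(classify_lr_positive(3.0, 0.8, 11.0))       # indeterminate

# Assumed 5% pre-test probability and an LR+ of 10.
print(round(post_test_probability(0.05, 10), 3))  # 0.345
```

The last line shows why prevalence matters: even a "large" LR+ of 10 leaves roughly two in three flagged patients without a fracture when the pre-test probability is 5%.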
All statistics were calculated using Review Manager 5.1.
Investigations of heterogeneity
Factors contributing to heterogeneity could not be rigorously investigated to determine their influence on the diagnostic accuracy of index tests due to the small number of studies that reported similar tests.
Results
Results of the search
A total of 11,236 unique titles were identified by the electronic searches, with a further 221 identified by handsearching (Figure 1). Only one relevant systematic review was found (Henschke 2008, written by the current authors). After screening titles and abstracts, 60 full‐text articles were retrieved and examined in detail for eligibility. Fifteen studies were considered eligible for inclusion. Forward citation searches were performed on the reference lists of all included studies with no additional studies meeting the inclusion criteria. A further seven studies were excluded as these included patients presenting to accident and emergency (A&E) departments for blunt trauma or multi‐trauma, not low‐back pain (LBP). Overall, the exclusion of studies was mainly the result of irrelevant populations (e.g. reason for presentation was not LBP), reference standards or index tests, and inadequate methodology (e.g. not diagnostic studies or case series).
Figure 1. Study flow diagram
Description of Studies
Characteristics of the eight included studies (e.g. study design, setting, population, target condition, index test and reference standards) are detailed in the Characteristics of included studies tables.
Of the eight studies included in the review, four were conducted in primary care (Deyo 1986; Henschke 2009; Scavone 1981; van den Bosch 2004) accounting for a total of 4671 participants. The prevalence of fracture ranged from 0.7% to 4.5%. Two studies (Deyo 1986; Henschke 2009) employed prospective sampling of consecutive patients presenting with LBP. The remaining two studies were retrospective chart reviews of patients referred for lumbar imaging, of which one included all patients with imaging studies (Scavone 1981) and the other a random sample of those with imaging for LBP (van den Bosch 2004). There were 20 different red flags identified from the four studies. One study provided data on a combination of four red flags (Henschke 2009). Three studies from primary care described x‐ray as the reference standard (Deyo 1986; Scavone 1981; van den Bosch 2004). For the remaining study (Henschke 2009), the authors confirmed that imaging studies were conducted for suspected serious pathology upon specialist consultation, in the long‐term follow‐up of the study.
There was one study set in secondary care (Roman 2010) which examined the charts of all patients consulted in the spinal surgery clinic (n = 1448) with lumbar disorders. The prevalence of fracture in this study was 3% and was confirmed by standard radiograph or computed tomography (CT). Eight individual tests and five combinations of tests were investigated.
The three studies that took place in tertiary care (an A&E setting) had a combined pool of 1259 participants. The prevalence of fracture ranged from 6.5% to 11%. Only one study enrolled patients presenting with (acute) lumbar pain (Gibson 1992). The remaining two studies enrolled all patients requiring lumbar imaging. One did this prospectively (Reinus 1998) and the other by retrospective chart review (Patrick 1983). There were 12 index tests described in the three A&E studies. A history of trauma (all three studies) and tests of abnormal neurology were the most commonly investigated red flags. The reference standards used were plain radiograph or CT scan.
The specific criterion for a radiological diagnosis of fracture was not well defined in any setting.
Methodological quality of included studies
The individual results of the 'Risk of bias' assessment for included studies are displayed in Figure 2. Most studies included a representative spectrum of patients (63%). The main reason for failing this criterion was enrolling only patients with imaging studies. Since it is possible for patients who present to care to have a positive result to a red flag and not be referred for imaging, this was considered a potential selection bias. Most studies utilised an acceptable reference standard (88%), avoided incorporation of the index tests and reference standards (88%), interpreted index tests when blind to the reference standard (75%) and assessed relevant clinical tests (100%). An acceptable time frame between tests was rarely reported (38%), as were explanations for withdrawals (38%). No studies specifically reported blinding of index tests when the reference standard was assessed and 75% of studies reported vague or uninterpretable results. Partial verification bias and differential verification bias were each avoided by only 38% of studies.
Figure 2. Methodological quality summary: review authors' judgements about each methodological quality item for each included study.
Findings
There were 29 different groups of similar index tests (e.g. trauma = significant trauma, history of trauma, major trauma, direct trauma, or fall; age = age with six different cut‐offs) reported across the eight studies. Sixteen of these groups featured in more than one study and only two similar groups of red flags were reported in more than two studies (trauma = five and sensation change = three). A further 10 tests were different combinations of red flags. Few test descriptions were reported adequately for clinical use. Calculating pooled estimates of test accuracy for the current review was inappropriate given the heterogeneity of tests reported across the studies and study settings.
Results for all tests included are detailed in the Additional Tables (Additional Table 3; Additional Table 4; Additional Table 5; Additional Table 6; Additional Table 7). Table 1 describes the performance of red flags that were considered potentially useful. The findings presented are from descriptive analyses only.
2. Results of all red flags taken during a history in primary care.
| Study | Index test | Reference standard | Sample size | Prevalence | LR+ | LR‐ |
|---|---|---|---|---|---|---|
| Deyo 1986 | Age > 50 years | X‐ray | 621 | 0.045 | 2.16 (1.58 ‐ 2.95) | 0.34 (0.12 ‐ 0.92) |
| Henschke 2009 | Age > 50 years | Follow‐up/imaging | 1178 | 0.007 | 1.84 (1.07 ‐ 3.17) | 0.57 (0.23 ‐ 1.39) |
| van den Bosch 2004 | Age > 54 years | X‐ray | 2100 | 0.041 | 1.72 (1.54 ‐ 1.91) | 0.33 (0.20 ‐ 0.53) |
| Henschke 2009 | Age > 54 years | Follow‐up/imaging | 1178 | 0.007 | 2.57 (1.49 ‐ 4.44) | 0.50 (0.20 ‐ 1.21) |
| van den Bosch 2004 | Age > 64 years | X‐ray | 2100 | 0.041 | 2.46 (2.16 ‐ 2.8) | 0.32 (0.21 ‐ 0.48) |
| Henschke 2009 | Age > 64 years | Follow‐up/imaging | 1178 | 0.007 | 7.13 (4.04 ‐ 12.59) | 0.41 (0.17 ‐ 1.01) |
| Henschke 2009 | Age > 70 years | Follow‐up/imaging | 1178 | 0.007 | 11.19 (5.33 ‐ 23.51) | 0.52 (0.26 ‐ 1.05) |
| van den Bosch 2004 | Age > 74 years | X‐ray | 2100 | 0.041 | 3.69 (3.00 ‐ 4.53) | 0.49 (0.38 ‐ 0.63) |
| Henschke 2009 | Age > 74 years | Follow‐up/imaging | 1178 | 0.007 | 9.39 (2.69 ‐ 32.75) | 0.77 (0.52 ‐ 1.15) |
| van den Bosch 2004 | Female Gender | X‐ray | 2100 | 0.041 | 1.26 (1.10 ‐ 1.45) | 0.65 (0.46 ‐ 0.92) |
| van den Bosch 2004 | Female > 54 years | X‐ray | 2100 | 0.041 | 2.01 (1.68 ‐ 2.40) | 0.54 (0.41 ‐ 0.72) |
| Henschke 2009 | Female > 54 years | Follow‐up/imaging | 1178 | 0.007 | 5.39 (3.08 ‐ 9.43) | 0.42 (0.17 ‐ 1.04) |
| van den Bosch 2004 | Female > 64 years | X‐ray | 2100 | 0.041 | 2.75 (2.26 ‐ 3.35) | 0.52 (0.40 ‐ 0.68) |
| Henschke 2009 | Female > 64 years | Follow‐up/imaging | 1178 | 0.007 | 14.59 (8.00 ‐ 26.61) | 0.39 (0.16 ‐ 0.96) |
| van den Bosch 2004 | Female > 74 years | X‐ray | 2100 | 0.041 | 4.14 (3.17 ‐ 5.44) | 0.62 (0.51 ‐ 0.75) |
| Henschke 2009 | Female > 74 years | Follow‐up/imaging | 1178 | 0.007 | 16.17 (4.47 ‐ 58.43) | 0.76 (0.51 ‐ 1.14) |
| Deyo 1986 | Significant trauma | X‐ray | 621 | 0.045 | 3.42 (1.57 ‐ 7.45) | 0.72 (0.49 ‐ 1.06) |
| Henschke 2009 | Significant trauma | Follow‐up/imaging | 1178 | 0.007 | 10.03 (2.87 ‐ 35.13) | 0.77 (0.52 ‐ 1.15) |
| Scavone 1981 | Significant trauma | X‐ray | 871 | 0.030 | 12.85 (8.58 ‐ 19.24) | 0.36 (0.22 ‐ 0.62) |
| Deyo 1986 | Corticosteroid use | X‐ray | 621 | 0.045 | 3.97 (0.20 ‐ 79.15) | 0.98 (0.89 ‐ 1.07) |
| Henschke 2009 | Prolonged Corticosteroid use | Follow‐up/imaging | 1178 | 0.007 | 48.50 (11.46 ‐ 204.98) | 0.75 (0.51 ‐ 1.13) |
LR+: positive likelihood ratios; LR‐: negative likelihood ratios
3. Results of all red flags taken during a physical examination in primary care.
| Study | Index test | Reference standard | Sample size | Prevalence | LR+ | LR‐ |
|---|---|---|---|---|---|---|
| Henschke 2009 | Altered sensation (from trunk down) | Follow‐up/imaging | 1178 | 0.007 | 3.32 (0.22 ‐ 50.86) | 0.96 (0.82 ‐ 1.13) |
| Scavone 1981 | Sensation change | X‐ray | 871 | 0.030 | 2.21 (1.14 ‐ 4.27) | 0.83 (0.66 ‐ 1.05) |
| Scavone 1981 | Motor deficit | X‐ray | 871 | 0.030 | 2.19 (1.06 ‐ 4.54) | 0.86 (0.70 ‐ 1.06) |
| Scavone 1981 | DTR abnormality | X‐ray | 871 | 0.030 | 1.08 (0.37 ‐ 3.18) | 0.99 (0.86 ‐ 1.14) |
| Scavone 1981 | Tenderness | X‐ray | 871 | 0.030 | 0.70 (0.25 ‐ 1.97) | 1.11 (0.87 ‐ 1.41) |
| Scavone 1981 | Spasm | X‐ray | 871 | 0.030 | 1.25 (0.42 ‐ 3.70) | 0.98 (0.85 ‐ 1.12) |
| Scavone 1981 | Sciatica | X‐ray | 871 | 0.030 | 0.42 (0.06 ‐ 2.91) | 1.06 (0.98 ‐ 1.15) |
| Scavone 1981 | Hip/leg pain | X‐ray | 871 | 0.030 | 0.21 (0.01 ‐ 3.35) | 1.08 (1.02 ‐ 1.14) |
DTR: deep tendon reflex; LR+: positive likelihood ratios; LR‐: negative likelihood ratios
4. Results of all red flags in secondary care.
| Study | Index test | Reference standard | Sample size | Prevalence | LR+ | LR‐ |
|---|---|---|---|---|---|---|
| Roman 2010 | Age > 52 years | X‐ray or CT | 1448 | 0.026 | 1.52 (1.42 ‐ 1.68) | 0.14 (0.04 ‐ 0.53) |
| Roman 2010 | Female | X‐ray or CT | 1448 | 0.026 | 1.51 (1.35 ‐ 1.71) | 0.26 (0.10 ‐ 0.65) |
| Roman 2010 | Concomitant osteoarthritis | X‐ray or CT | 1448 | 0.026 | 1.05 (0.76 ‐ 1.45) | 0.95 (0.69 ‐ 1.32) |
| Roman 2010 | No regular exercise | X‐ray or CT | 1448 | 0.026 | 1.47 (1.25 ‐ 1.75) | 0.42 (0.21 ‐ 0.81) |
| Roman 2010 | Absence of leg or buttock pain | X‐ray or CT | 1448 | 0.026 | 2.24 (1.38 ‐ 3.64) | 0.80 (0.64 ‐ 0.99) |
| Roman 2010 | BMI < 23 | X‐ray or CT | 1448 | 0.026 | 2.22 (1.44 ‐ 3.42) | 0.76 (0.59 ‐ 0.97) |
| Roman 2010 | Decreased pain on sitting | X‐ray or CT | 1448 | 0.026 | 1.56 (0.94 ‐ 2.59) | 0.87 (0.71 ‐ 1.07) |
| Roman 2010 | No gait abnormality | X‐ray or CT | 1448 | 0.026 | 0.85 (0.68 ‐ 1.08) | 1.49 (0.95 ‐ 2.34) |
BMI: body mass index; CT: computed tomography; LR+: positive likelihood ratios; LR‐: negative likelihood ratios
5. Results of red flags in tertiary care.
| Study | Index test | Reference standard | Sample size | Prevalence | LR+ | LR‐ |
|---|---|---|---|---|---|---|
| Gibson 1992 | History of direct trauma | X‐ray | 225 | 0.06 | 1.93 (1.48 ‐ 2.52) | 0.121 (0.01 ‐ 1.79) |
| Patrick 1983 | Presenting history of trauma | X‐ray | 552 | 0.07 | 1.77 (1.48 ‐ 2.13) | 0.36 (0.20 ‐ 0.68) |
| Reinus 1998 | History of trauma (fall) | X‐ray | 482 | 0.11 | 0.18 (0.07 ‐ 0.48) | 1.54 (1.38 ‐ 1.71) |
| Patrick 1983 | Sensory deficit | X‐ray | 552 | 0.07 | 1.42 (0.19 ‐ 10.95) | 0.99 (0.94 ‐ 1.04) |
| Patrick 1983 | Motor deficit | X‐ray | 552 | 0.07 | 1.39 (0.08 ‐ 25.38) | 0.98 (0.96 ‐ 1.03) |
| Patrick 1983 | DTR abnormality | X‐ray | 552 | 0.07 | 1.54 (0.49 ‐ 4.87) | 0.97 (0.89 ‐ 1.06) |
| Gibson 1992 | Neurological signs and/or SLR < 40 degrees | X‐ray | 225 | 0.06 | 2.40 (0.67 ‐ 8.70) | 0.81 (0.51 ‐ 1.30) |
| Reinus 1998 | Any motor or sensory dysfunction* | X‐ray | 482 | 0.11 | 0.69 (0.22 ‐ 2.17) | 1.03 (0.96 ‐ 1.10) |
| Patrick 1983 | Positive SLR | X‐ray | 552 | 0.07 | 1.02 (0.51 ‐ 2.05) | 1.00 (0.86 ‐ 1.16) |
| Patrick 1983 | Tenderness | X‐ray | 552 | 0.07 | 1.76 (1.42 ‐ 2.19) | 0.47 (0.28 ‐ 0.78) |
| Patrick 1983 | Spasm | X‐ray | 552 | 0.07 | 1.47 (0.83 ‐ 2.60) | 0.90 (0.75 ‐ 1.09) |
| Patrick 1983 | Contusion/abrasion | X‐ray | 552 | 0.07 | 31.09 (18.25 ‐ 52.96) | 0.15 (0.07 ‐ 0.32) |
* Weakness, numbness, or paraesthesias present on testing or questioning.
DTR: deep tendon reflex; LR+: positive likelihood ratios; LR‐: negative likelihood ratios; SLR: straight leg raise
6. Results of combined red flags in primary, secondary and tertiary care.
| Study | Index test | Reference standard | Sample size | Prevalence | LR+ | LR‐ |
|---|---|---|---|---|---|---|
| Gibson 1992 | Trauma & neurological signs | X‐ray | 225 | 0.06 | 14.43 (2.38 ‐ 87.64) | 0.73 (0.46 ‐ 1.15) |
| Patrick 1983 | Multiple findings | X‐ray | 552 | 0.07 | 2.01 (1.35 ‐ 2.99) | 0.73 (0.56 ‐ 0.96) |
| Henschke 2009 | 1 of 4 red flags +ve | Follow‐up/imaging | 1178 | 0.007 | 1.75 (1.34 ‐ 2.29) | 0.25 (0.04 ‐ 1.57) |
| Henschke 2009 | 2 of 4 red flags +ve | Follow‐up/imaging | 1178 | 0.007 | 15.48 (8.45 ‐ 28.36) | 0.39 (0.16 ‐ 0.96) |
| Henschke 2009 | 3 of 4 red flags +ve | Follow‐up/imaging | 1178 | 0.007 | 906.11 (50.37 ‐ 16299.11) | 0.61 (0.36 ‐ 1.03) |
| Roman 2010 | 1 of 5 positive +ve | X‐ray or CT | 1448 | 0.026 | 1.04 (0.98 ‐ 1.10) | 0.43 (0.06 ‐ 2.98) |
| Roman 2010 | 2 of 5 positive +ve | X‐ray or CT | 1448 | 0.026 | 1.43 (1.32 ‐ 1.56) | 0.16 (0.04 ‐ 0.60) |
| Roman 2010 | 3 of 5 positive +ve | X‐ray or CT | 1448 | 0.026 | 2.45 (2.02 ‐ 2.97) | 0.34 (0.19 ‐ 0.61) |
| Roman 2010 | 4 of 5 positive +ve | X‐ray or CT | 1448 | 0.026 | 9.62 (5.88 ‐ 15.73) | 0.66 (0.52 ‐ 0.84) |
| Roman 2010 | 5 of 5 positive +ve | X‐ray or CT | 1448 | 0.026 | 7.63 (0.91 ‐ 63.67) | 0.98 (0.93 ‐ 1.03) |
CT: computed tomography; LR+: positive likelihood ratios; LR‐: negative likelihood ratios
Primary care
Twenty different red flags were reported in four primary care studies (Additional Table 3 and Additional Table 4). 'Significant trauma' was the most common red flag reported (Deyo 1986; Henschke 2009; Scavone 1981) in studies where the prevalence of fracture ranged from 0.7% to 4.5%. The sensitivity of ‘trauma’ as a test ranged from 0.25 (95% confidence interval (CI) 0.03 to 0.65) to 0.65 (95% CI 0.44 to 0.83), the specificity from 0.90 (95% CI 0.86 to 0.93) to 0.98 (95% CI 0.96 to 0.98) and the positive likelihood ratios (LR+) from 3.42 (95% CI 1.57 to 7.45) to 12.85 (95% CI 8.58 to 19.24).
'Older age' at five different cut‐offs was reported by three studies (Deyo 1986; Henschke 2009; van den Bosch 2004). Using raw data from Henschke 2009 allowed four age cut‐offs to be assessed in two studies. Comparing the LR+ at different cut‐offs indicated that the most informative cut‐off was 'age greater than 74 years' (9.39, 95% CI 2.69 to 32.75 and 3.69, 95% CI 3.00 to 4.53; Henschke 2009 and van den Bosch 2004, respectively). This was a small‐to‐moderate increase in the likelihood of fracture. Combining age and gender revealed more favourable results, with increases in likelihood ratios indicating a higher suspicion (moderate‐to‐large) of fracture (e.g. the LR+ for age > 64 increased two‐fold, from 7.13 (95% CI 4.04 to 12.59) to 14.59 (95% CI 8.00 to 26.61), when combined with female gender).
Two studies investigated corticosteroid use (Deyo 1986; Henschke 2009). The accuracy was similarly poor in terms of sensitivity (0.00, 95% CI 0.00 to 0.23 and 0.25, 95% CI 0.03 to 0.65); however, the test was highly specific (0.99, 95% CI 0.98 to 1.00). This test yielded very different but still imprecise LR+ between the studies (3.97, 95% CI 0.20 to 79.15 and 48.50, 95% CI 11.46 to 204.98, respectively).
One primary care study (Henschke 2009) investigated a combination of four red flags (age, gender, corticosteroid use and history of trauma). The presence of at least two positive red flags was highly specific (0.96, 95% CI 0.95 to 0.97) while maintaining reasonable sensitivity (0.63, 95% CI 0.24 to 0.91). Any two of the four features resulted in a large increase in the suspicion of fracture (LR+ 15.48, 95% CI 8.45 to 28.36). While a positive test for three red flags yielded a large LR+ (906.11), the confidence interval of this estimate (50.37 to 16,299.11) indicated very sparse data and suggests caution in interpreting this result (Additional Table 7).
Secondary care
One study (Roman 2010) from secondary care investigated eight potential red flags (Additional Table 5). The sensitivity and specificity of the eight tests investigated varied greatly; however, the largest LR+ were obtained from two tests: BMI < 23 (2.22, 95% CI 1.44 to 3.42) and the absence of leg/buttock pain (2.24, 95% CI 1.38 to 3.64). These reflected a small increase in the likelihood of fracture. The authors also reported the diagnostic accuracy of a combination of tests. Four positive tests (out of five) yielded the best LR+ (9.62, 95% CI 5.88 to 15.73), representing a moderate increase in the likelihood of fracture (Additional Table 7).
Tertiary care (A&E)
The 12 red flags investigated in the three studies set in tertiary care (A&E) revealed varying rates of accuracy (Additional Table 6). For the most common red flag (a history of trauma), sensitivity ranged from 0.07 (95% CI 0.02 to 0.18) to 1.00 (95% CI 0.59 to 1.00) and specificity from 0.51 (95% CI 0.41 to 0.62) to 0.60 (95% CI 0.56 to 0.65). The LR+ was also inconsistent, ranging from 0.18 (95% CI 0.07 to 0.48) to 1.93 (95% CI 1.48 to 2.52). One red flag from the physical examination (presence of contusion or abrasion) in an A&E study (Patrick 1983) was superior to all other tests, being reasonably sensitive (0.85, 95% CI 0.70 to 0.94) and highly specific (0.97, 95% CI 0.95 to 0.98). The LR+ was correspondingly high (31.09, 95% CI 18.25 to 52.96).
Discussion
Summary of main results
This review aimed to summarise the evidence for the diagnostic accuracy of tests to screen for vertebral fracture in patients presenting with low‐back pain (LBP). Index tests were defined as ‘red flags’ obtained during the history taking or physical examination. The included studies were set in primary, secondary and tertiary care.
Our review noted that, from only eight studies, 29 different groups of red flags to screen for vertebral fracture have been investigated. Evidence from single studies suggests that many of these red flags are uninformative; the 95% CI of the positive likelihood ratios (LR+) spans 1.0 and most have high rates of false positives. Of the five red flags most commonly recommended in guidelines for fracture (osteoporosis, history of trauma, corticosteroid use, older age and female gender), osteoporosis did not feature in any study. Three of the remaining four (trauma, older age and prolonged use of corticosteroids) appeared informative but, when used in isolation, had modest diagnostic accuracy, considering they are required to detect a condition that has a low prevalence. Combinations of red flags also seemed more informative than red flags used alone. For example, each of the primary care studies reported meaningful LR+ for ‘older age’, with the LR+ increasing from around two for someone in their 50s to around nine for someone in their 70s. Given the low prevalence of fracture, arguably these LRs are too low to move across a decision threshold. However, when female gender and older age were combined the LR+ was higher (e.g. 16.17 for age > 74 combined with female gender). Similarly, large LR+ were reported for a decision rule based upon four red flags (Henschke 2009: two of four positive, LR+ 15.48, 95% CI 8.45 to 28.36). The findings of this review support the use of a smaller set of red flags to screen for fracture and reveal that certain combinations of red flags are preferable to individual tests. However, these recommendations are limited by the strength of the available evidence: estimates are imprecise and the risk of bias is sometimes moderate.
As there is ambiguity surrounding which red flags should be used to detect certain serious pathology presenting as LBP, caution is required not to dismiss red flags that are appropriate to identify other serious pathologies (e.g. infection, cancer). Because serious pathology presenting as LBP is rare in primary care and many red flags proposed for detecting fracture have high false positive rates, the consequences of uncritical use, in terms of patient outcomes and cost of management from unnecessary ancillary testing, should be considered carefully.
Population and setting
A key objective for all clinicians managing presentations of LBP is to identify underlying serious pathology when it is present. This review included studies from primary, secondary and tertiary care settings. However, it appears that the recommendation to use most red flags to identify vertebral fracture as a cause of LBP, in any of these settings, is not well supported by the available evidence; either because they have poor diagnostic accuracy or are of unknown value. While the review criteria allowed inclusion of patient populations with varying presentations of LBP (e.g. duration of symptoms) few studies were identified. As only a smaller subset of studies reported similar tests and further heterogeneity was invoked by different study methodology or poor reporting of methods, data pooling was not appropriate.
Findings of risk of bias were consistent across setting. The exclusion of patients without imaging posed a risk of selection bias in three of eight studies. Verification bias, commonly introduced by a failure to compare index tests and reference standards in all participants, may have overestimated the accuracy of the tests investigated (Lijmer 1999).
Reference Standard
Adequate reference standards according to the methodological quality criteria (Appendix 1) were observed in most studies (seven out of eight). Plain radiographs were the most common reference standard however, due to poor reporting of how the method was used (e.g. detail about the image view, experience of the observer and definition of vertebral fracture), the influence of these characteristics could not be determined between studies or settings. Likewise, the influence of different methods of imaging on our results could not be investigated. Only one study used CT scan as the reference standard and no study used MRI. One study in primary care used a long‐term clinical follow‐up in conjunction with imaging and specialist review for suspected cases of serious disease. While this is the guideline‐recommended clinical pathway, it is possible that there were missed (undiagnosed) cases of fracture, which would affect the estimates of diagnostic accuracy in this study.
Index Tests
For adequate use in practice, red flags when present should indicate the need for further investigation of serious disease. Conversely, when absent, red flags would reduce the suspicion of serious disease and remove the need for further testing. For managing LBP, these details should improve the efficiency of diagnostic test ordering by reducing unnecessary routine imaging. Based on the small number of studies included in this review, the use of many red flags to screen for fracture is not supported by the available evidence. Because the pre‐test probability is low and most tests have modest LR+, few would raise the index of suspicion of having fracture (rule fracture in) much above the reported prevalence. The sensitivity of index tests was also generally low (below 40%), meaning when fracture is present there is a high rate of missed diagnoses. The implications of undiagnosed fracture are most apparent in populations where there appears to be a higher risk of subsequent fracture (e.g. osteoporosis; Lindsay 2001). The negative likelihood ratios (LR‐) of most tests were not meaningful, despite generally high specificity (above 70%), suggesting that using red flags to decrease the suspicion of fracture (rule fracture out) in practice is also probably not useful. Arguably, the consequences of this in a condition with such a low prevalence are minimal. Importantly, there was a high rate of false positives among the red flags investigated. This finding presents the possibility that the indiscriminate use of red flags to indicate the need for imaging to confirm fracture may contribute to high rates of imaging and costs of LBP management.
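The effect of a low pre‐test probability can be illustrated with the standard conversion between probability and odds (post‐test odds = pre‐test odds × likelihood ratio). The sketch below is illustrative only: the function name is hypothetical, and it assumes the reported study prevalence can stand in for the pre‐test probability.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability and a likelihood ratio to a post-test probability."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Prevalence and LR+ values reported for Henschke 2009 in this review.
prevalence = 0.007
for label, lr_pos in [("age > 50 years", 1.84), ("2 of 4 red flags positive", 15.48)]:
    probability = post_test_probability(prevalence, lr_pos)
    print(f"{label}: post-test probability {probability:.1%}")
```

With a prevalence of 0.7%, an LR+ of 1.84 raises the probability of fracture to only about 1.3%, whereas an LR+ of 15.48 raises it to roughly 10%.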
While the application of results from individual studies requires caution, there were three red flags that stood out as potentially useful in the primary care setting. These tests (prolonged use of corticosteroids, significant trauma and age > 74), taken from a patient’s presenting history, all showed a range of LR+ that appeared meaningful (point estimates: 48.50; 3.42 to 12.85; 3.69 to 9.39, respectively; Table 1). However, the LR+ calculated from some studies also lacked precision (e.g. corticosteroids 48.50, 95% CI 11.46 to 204.98; trauma 10.03, 95% CI 2.87 to 35.13; age > 74, 9.39, 95% CI 2.69 to 32.75), which probably reflects the low incidence of fracture observed in these studies. Regardless, caution about the application of these results is required.
In the tertiary care environment, opportunity costs of wasted resources may be more apparent than other settings. For this reason, the accuracy of tests to appropriately indicate the need for diagnostic imaging or not may have greater consequences. Unfortunately, in tertiary care, few tests were useful to implicate vertebral fracture or rule it out. 'Contusion/abrasion' was the only informative red flag (LR+ 31.09, 95% CI 18.25 to 52.96; LR‐ 0.15 95% CI 0.07 to 0.32). The effect of age on other red flags is a worthy consideration. For example, assessing the severity of trauma with respect to age may be more relevant where minor trauma in elderly patients may result in significant injury but not in younger patients (Henschke 2009).
Surprisingly, only two of the included studies thoroughly investigated combinations of red flags, even though this would most likely reflect how these tools are used in clinical practice. Overall, combinations of red flags appeared to produce more favourable results to implicate fracture. For instance, Gibson 1992 combined trauma and neurological signs, which yielded an LR+ of 14.43 (95% CI 2.38 to 87.64) compared to the best LR+ of 1.93 (95% CI 1.48 to 2.52) for trauma and 2.40 (95% CI 0.67 to 8.70) for neurological signs individually. The combination of female gender and older age improved the detection of fracture at a number of cut‐offs, in two studies. The remaining combinations were reported in only one study. While the precision of some combinations was questionable, the magnitude of many LR+ warrants consideration.
Strengths and weaknesses of the review
This review was modelled on a previous review conducted by contributing authors, and the Cochrane review by van der Windt and colleagues (van der Windt 2010). We used a diagnostic filter in the electronic search and identified all of the papers detected in the original review. Several other eligible publications were also identified and all languages were considered. van der Windt and colleagues investigated the use of a diagnostic filter and found this did not contribute to missing any relevant citations.
A limitation of the review is the paucity of diagnostic research in the back pain field. There was a large range of tests investigated by individual studies; however, the number of studies in which a given index test was investigated was small. Additionally, poor reporting of methods, differing study designs and lack of specific details of the index tests investigated meant that statistical pooling for diagnostic accuracy was not appropriate. For this reason, the analysis was only descriptive and we were not able to investigate the influences of potential sources of heterogeneity on the accuracy of red flags.
The 'Risk of bias' assessment in this review was aided by clear guidelines set a priori (Appendix 1). The majority of studies reported potential biases poorly. Consequently, many items were scored as “unclear”. Moreover, since most studies were not designed specifically for diagnostic accuracy, the quality of the methods is further brought into question.
Applicability of findings to the review question
Routine imaging is not recommended for LBP but instead is reserved for those cases where serious pathology is suspected (Koes 2010). Despite the low prevalence of serious disease presenting as LBP, in primary care the use of imaging remains high (Williams 2010). Our review potentially explains that pattern of practice as we found a large range of red flags, many with high false positive rates. If practitioners refer any patient with a positive response to any of the 29 red flags identified in this review, it is understandable that many patients are referred for further investigations. Since vertebral fracture in primary care is rare, the main implications of this are excessive costs of care, inappropriate diagnostic labelling of benign back pain (Chou 2007) and unnecessary radiation exposure (Flynn 2011), rather than missed diagnoses of fracture. Since there are also potential consequences for not using red flags, for instance increased referrals for imaging because of higher levels of uncertainty, a period of close clinical follow‐up instead of immediate referral for imaging may be useful in cases where positive red flags indicate vertebral fracture. This approach would maintain caution about interventions that are contraindicated for fracture. A similar view has been proposed by others, whereby patients with suspected vertebral fracture but no neurological symptoms would be subjected to an observation period prior to imaging (Chou 2011).
Many proposed red flags for vertebral fracture are uninformative; others are of unknown value because of imprecision in the estimates of diagnostic accuracy. Of the red flags that were informative, most yielded only small increases in the likelihood of fracture when used alone. Because fracture is uncommon, tests that provide at least moderate increases in the likelihood of fracture are required. Combinations of red flags appeared more useful, with larger likelihood ratios observed compared to individual tests. These results support the view that recommendations about imaging referral for suspected fracture in patients presenting with LBP should be based on a smaller set of red flags and that certain combinations of red flags are preferable. Considering the quality of the evidence available and the descriptive analysis of single studies, the strength of these recommendations is weak; however, they follow from the high false positive rates of many red flags, with their consequences for the cost of management and potential adverse outcomes for patients with LBP.
Authors' conclusions
Implications for practice.
The available evidence suggests that the use of many red flags to guide decisions about the need for further investigation of suspected vertebral fracture is unfounded. Most red flags neither increase the likelihood of fracture enough when present, nor decrease its likelihood when absent. The descriptive analyses of single studies revealed three red flags with promising results in primary care (older age, significant trauma and corticosteroid use) and one in tertiary care (contusion/abrasion). Combinations of red flags also appeared more informative to assist clinical decision‐making than using individual tests. Unfortunately, drawing clear implications for practice regarding which red flags should be used remains challenging given limitations in the evidence available. Since vertebral fracture is rare in patients presenting with LBP and many red flags have high false positive rates, the current set of recommendations for screening of fracture in clinical practice should be reviewed.
Implications for research.
The lack of primary diagnostic research investigating the accuracy of red flags to detect fracture in patients presenting with LBP is concerning. There is little consistency in the literature regarding which red flags are investigated. The specific details of index tests are also poorly reported. Moreover, many red flags are investigated as individual tests, yet these would rarely be applied in this way in a clinical situation. It would be more appropriate to examine multifactorial diagnostic models as considered in clinical reasoning. Consideration should also be given to which red flags are appropriate for fracture in a clinical context and should subsequently be investigated. Future studies should examine and validate well‐defined sets of index tests that have clinical relevance. As the prevalence of fracture in primary care is low, study design will need to consider large samples to produce robust estimates of diagnostic accuracy.
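To give a sense of scale for "large samples", the sketch below applies one commonly used approximation for sizing a diagnostic accuracy study around the precision of sensitivity: the number of cases needed for a given confidence interval half‐width, divided by the expected prevalence. This is an illustrative assumption, not a method used or recommended in this review; the function name, the target half‐width and the choice of sensitivity as the sizing parameter are all hypothetical.

```python
import math


def required_sample_size(expected_sensitivity: float, prevalence: float,
                         half_width: float, z: float = 1.96) -> tuple:
    """Approximate cases and total patients needed to estimate sensitivity.

    Uses the normal approximation for a proportion; returns
    (number of fracture cases, total number of patients).
    """
    variance = expected_sensitivity * (1 - expected_sensitivity)
    cases = math.ceil(z ** 2 * variance / half_width ** 2)
    total = math.ceil(cases / prevalence)
    return cases, total


# Sensitivity 0.63 and prevalence 0.7% are values reported for Henschke 2009;
# the half-width of 0.10 is an arbitrary illustrative target.
cases, total = required_sample_size(0.63, 0.007, 0.10)
print(f"Fracture cases needed: {cases}; total patients needed: {total}")
```

Under these assumptions, estimating a sensitivity of 0.63 to within ±0.10 would require about 90 fracture cases, which at a prevalence of 0.7% corresponds to roughly 12,900 patients.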
In the absence of validated screening tests, with appropriate diagnostic accuracy to detect vertebral fracture, enquiry into alternate models of management is warranted to overcome the undesirable effects of unnecessary imaging. For instance, determining the costs, benefits and consequences associated with managing patients presenting with suspicion of fracture by close clinical follow‐up rather than immediate referral may help provide a better direction for future guideline recommendations.
What's new
| Date | Event | Description |
|---|---|---|
| 5 January 2024 | Amended | Amended to add Editorial Note: review superseded. |
History
Protocol first published: Issue 8, 2010 Review first published: Issue 1, 2013
| Date | Event | Description |
|---|---|---|
| 19 January 2011 | Amended | Contact details updated. |
| 3 August 2010 | Amended | References updated. |
Acknowledgements
We would like to thank Danielle van der Windt for her assistance in the development of this protocol.
Appendices
Appendix 1. Guide to scoring QUADAS Quality Assessment items
| Item and Guide to classification |
|---|
| 1. Was the spectrum of patients representative of the patients who will receive the test in practice? Is it a selective sample of patients? Classify as ‘yes’ if a consecutive series of patients or a random sample has been selected. Information should be given about setting, inclusion and exclusion criteria, and preferably number of patients eligible and excluded. If a mixed population of primary and secondary care patients is used: the number of participants from each setting is presented. Classify as ‘no’ if healthy controls are used. Also, score ‘no’ if non‐response is high and selective, or there is clear evidence of selective sampling. Also, score ‘no’ if a population is selected that is otherwise unsuitable, for example, > 10% patients are known to have other specific causes of LBP (severe OA, malignancies, etc). Classify as ‘unclear’ if insufficient information is given on the setting, selection criteria, or selection procedure to make a judgment. |
| 2. Is the reference standard likely to classify the target condition correctly? Classify as ‘yes’ if one of: 1) plain radiography; 2) magnetic resonance imaging (MRI); 3) computed tomography (CT); or 4) other imaging tests such as bone scan; is used as a reference standard. Classify as ‘no’ if you seriously question the methods used, if consensus among observers, or an unknown combination of the clinical assessment (“clinical judgment”) is used as reference standard. Classify as ‘unclear’ if insufficient information is given on the reference standard to make an adequate assessment. |
| 3. Is the time period between the reference standard and the index test short enough to be reasonably sure that the target condition did not change between the two tests? Classify as ‘yes’ if the time period between clinical assessment and the reference standard is one week or less. Classify as ‘no’ if the time period between clinical assessment and the reference standard is longer than one week. Classify as ‘unclear’ if there is insufficient information on the time period between index tests and reference standard. |
| 4. Did the whole sample or a random selection of the sample receive verification using a reference standard of diagnosis? Classify as ‘yes’ if it is clear that all patients who received the index test went on to receive a reference standard, even if the reference standard is not the same for all patients. Classify as ‘no’ if not all patients who received the index test received verification by a reference standard. Classify as ‘unclear’ if insufficient information is provided to assess this item. |
| 5. Did patients receive the same reference standard regardless of the index test result? Classify as ‘yes’ if it is clear that all patients receiving the index test are subjected to the same reference standard. Classify as ‘no’ if different reference standards are used. Classify as ‘unclear’ if insufficient information is provided to assess this item. |
| 6. Was the reference standard independent of the index test (i.e. the index test did not form part of the reference standard)? Classify as ‘yes’ if the index test is not part of the reference standard. Classify as ‘no’ if the index test is clearly part of the reference standard. Classify as ‘unclear’ if insufficient information is provided to assess this item. |
| 7. Were the reference standard results interpreted without knowledge of the results of the index test? Classify as ‘yes’ if the results of the reference standard are interpreted blind to the results of the index tests. Also, classify as ‘yes’ if the sequence of testing is always the same (i.e. the reference standard is always performed first, followed by the index test) and consequently, the reference standard is interpreted blind of the index test. Classify as ‘no’ if the assessor is aware of the results of the index test. Classify as ‘unclear’ if insufficient information is given on independent or blind assessment of the index test. |
| 8. Were the index test results interpreted without knowledge of the results of the reference standard? Classify as ‘yes’ if the results of the index test are interpreted blind to the results of the reference test. Also, classify as ‘yes’ if the sequence of testing is always the same (i.e. the index test is always performed first, followed by the reference standard), and consequently, the index test is interpreted blind of the reference standard. Classify as ‘no’ if the assessor is aware of the results of the reference standard. Classify as ‘unclear’ if insufficient information is given on independent or blind assessment of the reference standard. |
| 9. Were the same clinical data available when the index test results were interpreted as would be available when the test is used in practice? Classify as ‘yes’ if clinical data (i.e. patient history, other physical tests) would normally be available when the test results are interpreted and similar data are available in the study. Also, classify as ‘yes’ if clinical data would normally not be available when the test results are interpreted and these data are also not available in the study. Classify as ‘no’ if this is not the case, e.g. if other test results are available that cannot be regarded as part of routine care. Classify as ‘unclear’ if the paper does not explain which clinical information was available at the time of assessment. |
| 10. Were uninterpretable / intermediate test results reported? Classify as ‘yes’ if all test results are reported for all patients, including uninterpretable, indeterminate, or intermediate results. Also, classify as ‘yes’ if the authors do not report any uninterpretable, indeterminate, or intermediate results AND the results are reported for all patients who were described as having been entered into the study. Classify as ‘no’ if you think that such results occurred, but have not been reported. Classify as ‘unclear’ if it is unclear whether all results have been reported. |
| 11. Were withdrawals from the study explained? Classify as ‘yes’ if it is clear what happens to all patients who entered the study (all patients are accounted for, preferably in a flow chart). Also, classify as ‘yes’ if the authors do not report any withdrawals AND if the results are available for all patients who were reported to have been entered in the study. Classify as ‘no’ if it is clear that not all patients who were entered completed the study (received both index test and reference standard), and not all patients are accounted for. Classify as ‘unclear’ when the paper does not clearly describe whether or not all patients completed all tests, and are included in the analysis. |
Appendix 2. MEDLINE search strategy
1. Index test: clinical red flags
"Medical History Taking"[mesh] OR history[tw] OR "red flag"[tw] OR "red flags" OR Physical examination[mesh] OR "physical examination"[tw] OR "function test"[tw] OR "physical test"[tw] OR ((clinical[tw] OR clinically[tw]) AND (diagnosis[tw] OR sign[tw] OR signs[tw] OR significance[tw] OR symptom*[tw] OR parameter*[tw] OR assessment[tw] OR finding*[tw] OR evaluat*[tw] OR indication*[tw] OR examination*[tw])) OR (ra[sh] OR ri[sh])
OR "Wounds and Injuries"[mesh] OR trauma[tw] OR injury[tw] OR "Accidental Falls"[mesh]
2. Population: low‐back pain and anatomical location
(back pain[mesh] OR sciatica[mesh] OR "back ache"[tw] OR backache[tw] OR "back pain"[tw] OR dorsalgia[tw] OR lumbago[tw] OR sciatica[tw] OR Pain[mesh] OR pain[tw] OR ache*[tw] OR aching[tw] OR complaint*[tw] OR dysfunction*[tw] OR disabil*[tw] OR neuralgia[tw]) AND (Back[mesh] OR spine[mesh] OR back[ti] OR lowback[tw] OR lumbar[tw] OR lumba*[tw] OR lumbo*[tw] OR sciatic*[tw] OR ischia*[tw] OR sacroilia*[tw] OR spine[tw] OR spinal[tw] OR radicular[tw] OR "nerve root"[tw] OR "nerve roots"[tw] OR disk[tw] OR disc[tw] OR disks[tw] OR discs[tw] OR vertebra*[tw] OR intervertebra*[tw] OR Sacroiliac‐joint[mesh] OR Lumbar vertebrae[mesh])
3. Target condition: vertebral fracture
Fractures, Bone[mesh] OR Fractures, stress[mesh] OR Fractures, Spontaneous[mesh] OR Fractures, Compression [mesh] OR Fractures, Closed [mesh] OR fracture*[tw] OR Spinal Injuries [mesh] OR Spinal Diseases [mesh] OR Lumbar vertebrae [mesh]
4. Exclusion criteria: children, case reports, animal studies
(exp Child [mesh] OR exp Infant [mesh]) NOT ((exp Child [mesh] OR exp Infant [mesh]) AND (exp Adult [mesh] OR Adolescent [mesh])) OR (Animals [mesh] NOT (Animals [mesh] AND Humans [mesh])) OR “case report”[ti]
Search combination
1 AND 2 AND 3 NOT 4
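The combination reads as: records retrieved by blocks 1, 2 and 3 together, minus any record retrieved by exclusion block 4. A minimal sketch of that logic using Python sets follows; the record identifiers and the function name are invented purely for illustration and do not correspond to real database records.

```python
# Hypothetical record identifiers returned by each numbered search block.
index_test = {101, 102, 103, 104, 105}       # block 1: clinical red flags
population = {102, 103, 104, 105, 106}       # block 2: low-back pain and location
target_condition = {103, 104, 105, 107}      # block 3: vertebral fracture
exclusions = {105, 108}                      # block 4: children, case reports, animals


def combine_search_blocks(block_1: set, block_2: set, block_3: set, block_4: set) -> set:
    """Apply '1 AND 2 AND 3 NOT 4': intersect the first three blocks, then remove block 4."""
    return (block_1 & block_2 & block_3) - block_4


eligible = combine_search_blocks(index_test, population, target_condition, exclusions)
print(sorted(eligible))
```

In this toy example, record 105 matches all three inclusion blocks but is removed because it also matches the exclusion block.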
Appendix 3. EMBASE search strategy
1. Index test: clinical red flags
'medical history taking'/exp OR 'history'/de OR history OR 'red flag' OR 'red flags' OR 'physical examination'/exp OR 'physical examination' OR 'function test'/de OR 'function test' OR 'physical test' OR (clinical OR clinically AND ('diagnosis'/de OR sign OR signs OR significance OR symptom$ OR parameter$ OR assessment OR finding$ OR evaluat$ OR indication$ OR examination$)) OR 'radiography'/exp OR 'radionuclide'/exp AND [humans]/lim
2. Population: low‐back pain and anatomical location
back AND 'pain'/exp OR 'back pain' OR 'low back' AND 'pain'/exp OR 'low back pain' OR 'sciatica'/exp OR sciatica OR backache OR coccyx OR coccydynia OR dorsalgia OR 'lumbar pain' OR spondylosis OR lumbago AND [humans]/lim
3. Target condition: vertebral fracture
'fractures, bone'/exp OR 'fractures, stress'/exp OR 'fractures, spontaneous'/exp OR 'fractures, compression'/exp OR 'fractures, closed'/exp OR fracture$ OR 'spinal injuries'/exp OR 'spinal diseases'/exp OR 'wounds and injuries'/exp OR trauma$ OR injury AND [humans]/lim
4. Exclusion criteria: case reports, animal studies
'case report' AND [humans]/lim
Search combination
1 AND 2 AND 3 NOT 4
Appendix 4. CINAHL search strategy
1. Index test: clinical red flags
MH "Patient History Taking" or TX history or TX "red flag" or MM “Physical examination” or TX "physical examination" or TX "physical test" or TX clinical* or MH "Diagnostic Tests, Routine" and (TX diagnosis or TX sign or TX signs or TX significance or TX symptom* or TX parameter* or TX assessment or TX finding* or TX evaluat* or TX indication* or TX examination*)
2. Population: low‐back pain and anatomical location
MH “Back Pain” or MH “Low back pain” or TX “back pain” or TX “low back pain” or MM Sciatica or TX sciatica or TX Backache or TX Coccyx or TX Coccydynia or TX Dorsalgia or TX lumbar pain or TX spondylosis TX lumbago
3. Target condition: vertebral fracture
MH "Fractures, Bone" or MH "Fractures, stress" or MH "Fractures, Spontaneous" or MH "Fractures, Compression" or MH "Fractures, Closed" or MH "Fractures, Vertebral Compression" or TX fracture* or MH "Spinal Injuries" or MH "Spinal Diseases" or MH "Wounds and Injuries” or TX trauma or TX injury
Search combination
1 and 2 and 3
Data
Presented below are all the data for all of the tests entered into the review.
Tests. Data tables by test.
| Test | Index test |
|---|---|
| 1 | Age > 50 years |
| 2 | Age > 54 years |
| 3 | Age > 52 years |
| 4 | Age > 64 years |
| 5 | Age > 70 years |
| 6 | Age > 74 years |
| 7 | Gender |
| 8 | Female > 54 years |
| 9 | Female > 64 years |
| 10 | Female > 74 years |
| 11 | Trauma |
| 13 | Corticosteroid use |
| 14 | OA |
| 15 | No regular exercise |
| 16 | Sensation change |
| 17 | Motor deficit |
| 18 | DTR abnormality |
| 19 | Neurological signs |
| 20 | SLR |
| 21 | Tenderness |
| 22 | Spasm |
| 23 | Contusion/abrasion |
| 24 | Sciatica |
| 25 | Hip/leg pain |
| 26 | Absence of buttock/leg pain |
| 27 | BMI < 23 |
| 28 | Decreased pain on sitting |
| 29 | No gait abnormality |
| 30 | Trauma and neurological signs |
| 31 | Multiple findings |
| 32 | Henschke rule 1 sign positive |
| 33 | Henschke rule 2 positive signs |
| 34 | Henschke rule 3 positive signs |
| 35 | 1 of 5 positive ‐ Roman Cluster |
| 36 | 2 of 5 positive ‐ Roman Cluster |
| 37 | 3 of 5 positive ‐ Roman Cluster |
| 38 | 4 of 5 positive ‐ Roman Cluster |
| 39 | 5 of 5 positive ‐ Roman Cluster |
Characteristics of studies
Characteristics of included studies [ordered by study ID]
Deyo 1986.
| Study characteristics | ||
| Clinical features and settings | Primary care, USA ‐ Sampling method unclear ‐ Only patients with main complaint of back pain considered ‐ Selective criteria used for ordering x‐ray |
|
| Participants | 621 patients presenting with LBP, of whom 311 received imaging. Exclusions: maximal pain above T12, evidence of urinary tract disease, women younger than 45 years who did not practise contraception and whose last menstrual period was more than 10 days earlier, and participants of a concurrent study. |
|
| Study design | Prospective cohort | |
| Target condition and reference standard(s) | Fracture or malignancy: X‐ray ‐ anteroposterior and lateral lumbar views |
|
| Index and comparator tests | Significant trauma, Age > 50 years, Corticosteroid use | |
| Follow‐up | All radiology reports obtained and medical records reviewed | |
| Notes | Prevalence of vertebral fracture 4.5% | |
| Methodological quality | ||
| Item | Authors' judgement | Support for judgement |
| Representative spectrum? All tests | Yes | Consecutive series of patients with LBP |
| Acceptable reference standard? All tests | Yes | X‐ray ‐ anteroposterior and lateral lumbar views |
| Acceptable delay between tests? All tests | Yes | 84% of reference test obtained on the day of the index test or within 6 days thereafter |
| Partial verification avoided? All tests | No | Only 311 of 621 received the x‐ray reference test |
| Differential verification avoided? All tests | Unclear | Unclear from text |
| Incorporation avoided? All tests | Yes | X‐ray not part of index tests |
| Reference standard results blinded? All tests | No | Radiologists were aware of the index tests |
| Index test results blinded? All tests | Yes | Index tests performed before reference standard |
| Relevant clinical information? All tests | Yes | Index tests available in usual care |
| Uninterpretable results reported? All tests | No | Possible fracture in patients not receiving the reference test |
| Withdrawals explained? All tests | Unclear | Unclear from text |
Gibson 1992.
| Study characteristics | ||
| Clinical features and settings | Accident and Emergency Department, UK ‐ Consecutive sample over 6‐month period |
|
| Participants | 225 patients presenting with pain in the lumbar region of less than 48 hours' duration; 108 (48%) had radiographs |
|
| Study design | Prospective cohort | |
| Target condition and reference standard(s) | Vertebral fracture: Plain radiograph ‐ not further defined |
|
| Index and comparator tests | Trauma, trauma and neurological signs, neurological signs | |
| Follow‐up | ||
| Notes | Prevalence of vertebral fracture was 6.5% | |
| Methodological quality | ||
| Item | Authors' judgement | Support for judgement |
| Representative spectrum? All tests | Yes | Consecutive series of patients with pain in the lumbar region ‐ number excluded not reported |
| Acceptable reference standard? All tests | Yes | Plain radiography |
| Acceptable delay between tests? All tests | Unclear | No time frame specified |
| Partial verification avoided? All tests | No | Only 48% of sample received reference standard |
| Differential verification avoided? All tests | Unclear | Unclear from text |
| Incorporation avoided? All tests | Yes | Questionnaire (index test) not part of reference standard |
| Reference standard results blinded? All tests | No | Treating doctor involved in radiological reporting |
| Index test results blinded? All tests | Yes | Index tests performed prior to reference standard |
| Relevant clinical information? All tests | Yes | Index tests available in usual care |
| Uninterpretable results reported? All tests | No | More than half of the patients did not receive imaging |
| Withdrawals explained? All tests | Unclear | Unclear from results |
Henschke 2009.
| Study characteristics | ||
| Clinical features and settings | Primary care, Australia ‐ Consecutive sample from general practice, physiotherapy or chiropractic |
|
| Participants | 1,172 participants presenting with acute LBP, mean age 43.97 years, 53.4% male, 72.6% from physiotherapy, 59.4% less than 1 week since onset | |
| Study design | Prospective cohort | |
| Target condition and reference standard(s) | Serious spinal pathology including vertebral fracture: 12‐month clinical follow‐up with suspected cases confirmed by imaging studies and specialist review |
|
| Index and comparator tests | 25 red flags ‐ 5 specific to vertebral fracture; age > 70, significant trauma, prolonged use of corticosteroids, altered sensation, clinician judgement | |
| Follow‐up | ||
| Notes | Prevalence of fracture was 0.68% | |
| Methodological quality | ||
| Item | Authors' judgement | Support for judgement |
| Representative spectrum? All tests | Yes | Consecutive sample of patients with LBP with clear inclusion criteria |
| Acceptable reference standard? All tests | No | Long‐term follow‐up of all patients with only those suspected of serious pathology referred for specialist appointment and imaging (i.e., asymptomatic fracture could go undetected) |
| Acceptable delay between tests? All tests | Unclear | Unclear from text |
| Partial verification avoided? All tests | Yes | All patients had long‐term follow‐up |
| Differential verification avoided? All tests | Yes | All patients received the reference standard |
| Incorporation avoided? All tests | Unclear | Unclear from text |
| Reference standard results blinded? All tests | Unclear | Unclear from text |
| Index test results blinded? All tests | Yes | Index test completed prior to reference standard |
| Relevant clinical information? All tests | Yes | Index tests available in usual care |
| Uninterpretable results reported? All tests | Yes | All results reported |
| Withdrawals explained? All tests | Yes | All participants completed follow‐up |
Patrick 1983.
| Study characteristics | ||
| Clinical features and settings | Accident and Emergency Department and University Hospital Medical Centre, USA ‐ Consecutive sample of patients with imaging studies ordered over three‐month period |
|
| Participants | 552 patients with lumbar spine x‐ray referral, 54% male, age range 6 to 95 years, 99% complained of LBP | |
| Study design | Retrospective chart review | |
| Target condition and reference standard(s) | Lumbar spine bony injury: X‐ray (not further defined) |
|
| Index and comparator tests | Trauma, tenderness, LBP with radiation and/or hip pain, spasm, Sensory deficit, motor deficit, tendon reflex abnormality, positive straight leg raise, contusion/abrasion, multiple findings (distinction between fracture and other bony injury unclear for some index tests) | |
| Follow‐up | ||
| Notes | Prevalence of fracture was 7.2% | |
| Methodological quality | ||
| Item | Authors' judgement | Support for judgement |
| Representative spectrum? All tests | Unclear | Consecutive sample of patients with lumbar spine imaging requested ‐ reason for presentation not described |
| Acceptable reference standard? All tests | Yes | X‐ray |
| Acceptable delay between tests? All tests | Yes | At time of admission |
| Partial verification avoided? All tests | Yes | All had reference standard |
| Differential verification avoided? All tests | Yes | X‐ray |
| Incorporation avoided? All tests | Yes | X‐ray not part of clinical examination |
| Reference standard results blinded? All tests | Unclear | Unclear from text |
| Index test results blinded? All tests | Unclear | Unclear from text |
| Relevant clinical information? All tests | Yes | Index tests available in usual care |
| Uninterpretable results reported? All tests | No | Distinction between fracture and other bony injury was unclear for some index tests |
| Withdrawals explained? All tests | Unclear | Unclear from text |
Reinus 1998.
| Study characteristics | ||
| Clinical features and settings | Accident and Emergency Department, USA ‐ Consecutive sample of all patients receiving lumbosacral radiographs in a 14‐month period |
|
| Participants | 482 patients, 35% male, age range 17‐98 years, 92% with back pain | |
| Study design | Prospective cohort | |
| Target condition and reference standard(s) | Suspected clinical diagnosis including fracture and spondylosis: Lumbosacral AP, lateral, bi‐lateral posterior oblique and coned down radiological views |
|
| Index and comparator tests | Trauma, neurological deficit | |
| Follow‐up | ||
| Notes | Prevalence of fracture was 11% (24 intermediate age, 21 chronic, 10 acute) | |
| Methodological quality | ||
| Item | Authors' judgement | Support for judgement |
| Representative spectrum? All tests | Unclear | Consecutive sample of patients with lumbosacral imaging ‐ reason for presentation not described |
| Acceptable reference standard? All tests | Yes | Lumbosacral AP, lateral, bi‐lateral posterior oblique and coned down radiological views |
| Acceptable delay between tests? All tests | Unclear | No time frame specified |
| Partial verification avoided? All tests | Unclear | Unclear from text |
| Differential verification avoided? All tests | Unclear | Unclear from text |
| Incorporation avoided? All tests | Yes | Radiological diagnosis not part of clinical examination |
| Reference standard results blinded? All tests | No | Radiologists aware of clinical history |
| Index test results blinded? All tests | Yes | Clinical examination prior to radiograph |
| Relevant clinical information? All tests | Yes | Index tests available in usual care |
| Uninterpretable results reported? All tests | No | Ambiguous reporting of index test in patients with confirmed intermediate and chronic fracture |
| Withdrawals explained? All tests | Yes | All results reported |
Roman 2010.
| Study characteristics | ||
| Clinical features and settings | University Hospital Spine Clinic, USA ‐ Consecutive sample in 5‐year period |
|
| Participants | 1448 patients with lumbar‐related disorders, 41% male | |
| Study design | Retrospective chart review | |
| Target condition and reference standard(s) | Compression fracture or wedge deformity: Standard radiograph or CT assessing sagittal alignment, vertebral body compression and spinal canal dimensions |
|
| Index and comparator tests | Leg or buttock pain, gender, age, BMI, gait abnormality, regular exercise, sitting pain, osteoarthritis, multiple signs | |
| Follow‐up | ||
| Notes | Prevalence of fracture was 3% | |
| Methodological quality | ||
| Item | Authors' judgement | Support for judgement |
| Representative spectrum? All tests | Yes | Consecutive sample of patients with lumbar disorders |
| Acceptable reference standard? All tests | Yes | Radiograph or CT |
| Acceptable delay between tests? All tests | Yes | Part of standard clinical assessment |
| Partial verification avoided? All tests | Yes | Part of standard clinical assessment |
| Differential verification avoided? All tests | No | X‐ray or CT |
| Incorporation avoided? All tests | Yes | Imaging was not part of clinical examination |
| Reference standard results blinded? All tests | Unclear | Unclear from text |
| Index test results blinded? All tests | Unclear | Unclear from text |
| Relevant clinical information? All tests | Yes | Index tests available in usual care |
| Uninterpretable results reported? All tests | Yes | Results for all participants were reported |
| Withdrawals explained? All tests | Yes | Results for all participants were reported |
Scavone 1981.
| Study characteristics | ||
| Clinical features and settings | University teaching hospital medical centre, USA ‐ Sampling: all patients with lumbar spine x‐ray |
|
| Participants | 871 patients in 12‐month period, 176 were inpatients and 695 outpatients | |
| Study design | Retrospective chart review | |
| Target condition and reference standard(s) | Abnormal radiological finding including fracture: AP and lateral x‐ray views |
|
| Index and comparator tests | Major trauma, minor trauma, tenderness, LBP with radiation, hip/leg pain, muscle spasm, neurological deficits, sciatica, abnormal physical examination | |
| Follow‐up | ||
| Notes | Prevalence of fracture was 3% | |
| Methodological quality | ||
| Item | Authors' judgement | Support for judgement |
| Representative spectrum? All tests | Unclear | Selection criteria not clear |
| Acceptable reference standard? All tests | Yes | X‐ray: AP and lateral views |
| Acceptable delay between tests? All tests | Unclear | No time frame specified |
| Partial verification avoided? All tests | Unclear | Unclear from text |
| Differential verification avoided? All tests | Unclear | Unclear from text |
| Incorporation avoided? All tests | Yes | X‐ray not part of clinical examination |
| Reference standard results blinded? All tests | Unclear | Unclear from text |
| Index test results blinded? All tests | Yes | Radiologists not aware of clinical symptoms |
| Relevant clinical information? All tests | Yes | Index tests available in usual care |
| Uninterpretable results reported? All tests | No | Reporting of fracture prevalence was ambiguous |
| Withdrawals explained? All tests | No | History for 57 persons could not be obtained |
van den Bosch 2004.
| Study characteristics | ||
| Clinical features and settings | University hospital medical centre, UK ‐ Sampling: random sample of 2100 lumbar spine radiology reports from 6269 patients referred for imaging by general practitioners |
|
| Participants | 2007 patients with full radiographic and demographic details, 42% were men, mean age was 49.9 for men and 56.7 for women | |
| Study design | Retrospective chart review | |
| Target condition and reference standard(s) | Serious radiological findings including fracture: X‐ray (not further defined) |
|
| Index and comparator tests | Age, gender | |
| Follow‐up | ||
| Notes | Prevalence of vertebral fracture was 4.1% | |
| Methodological quality | ||
| Item | Authors' judgement | Support for judgement |
| Representative spectrum? All tests | Yes | Random sample of patients with LBP referred for imaging by general practitioners |
| Acceptable reference standard? All tests | Yes | X‐ray |
| Acceptable delay between tests? All tests | Unclear | Unclear from text |
| Partial verification avoided? All tests | No | X‐ray reports of 93 patients not available |
| Differential verification avoided? All tests | Yes | X‐ray only |
| Incorporation avoided? All tests | Yes | Index test not part of reference test |
| Reference standard results blinded? All tests | Unclear | Unclear from text |
| Index test results blinded? All tests | Yes | Index test performed prior to reference standard |
| Relevant clinical information? All tests | Yes | Index tests available in usual care |
| Uninterpretable results reported? All tests | No | 93 patients with no records available |
| Withdrawals explained? All tests | No | 93 patients with no records available |
AP: anterior‐posterior; BMI: body mass index; CT: computed tomography; LBP: low‐back pain
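The judgements recorded in the methodological quality tables above can be tallied per QUADAS item to see where the included studies are weakest. The sketch below transcribes two items from those tables; the data structure and the function name are assumptions made for illustration.

```python
from collections import Counter

# Authors' judgements transcribed from the methodological quality tables above.
judgements = {
    "Partial verification avoided?": {
        "Deyo 1986": "No",
        "Gibson 1992": "No",
        "Henschke 2009": "Yes",
        "Patrick 1983": "Yes",
        "Reinus 1998": "Unclear",
        "Roman 2010": "Yes",
        "Scavone 1981": "Unclear",
        "van den Bosch 2004": "No",
    },
    "Reference standard results blinded?": {
        "Deyo 1986": "No",
        "Gibson 1992": "No",
        "Henschke 2009": "Unclear",
        "Patrick 1983": "Unclear",
        "Reinus 1998": "No",
        "Roman 2010": "Unclear",
        "Scavone 1981": "Unclear",
        "van den Bosch 2004": "Unclear",
    },
}


def tally(item: str) -> Counter:
    """Count how many studies received each judgement for one QUADAS item."""
    return Counter(judgements[item].values())


for item in judgements:
    counts = tally(item)
    print(item, {label: counts[label] for label in ("Yes", "No", "Unclear")})
```

For these two items, only three of the eight studies clearly avoided partial verification, and none clearly reported blinded interpretation of the reference standard.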
Characteristics of excluded studies [ordered by study ID]
| Study | Reason for exclusion |
|---|---|
| Durham 1995 | Patients presented with blunt trauma not LBP. |
| Frankel 1994 | Patients presented with blunt trauma not LBP. |
| Gestring 2002 | Patients presented with blunt trauma not LBP. |
| Holmes 2003 | Patients presented with blunt trauma not LBP. |
| Hsu 2003 | Patients presented with blunt and multi‐trauma not LBP. |
| Samuels 1993 | Patients presented with blunt trauma not LBP. |
| Terregino 1995 | Patients presented with blunt trauma not LBP. |
LBP: low‐back pain
Differences between protocol and review
Due to the limited number of index tests evaluated in studies and the heterogeneity in study setting, meta‐analyses were not performed.
Contributions of authors
All review authors contributed to discussions regarding the conception and design of the study. All review authors read and approved the final manuscript.
Sources of support
Internal sources
Vrije Universiteit, EMGO+ Institute for Health and Care Research, Netherlands
University of Sydney, Australia
ErasmusMC, University Medical Centre, Netherlands
The George Institute for Global Health, Australia
External sources
Dutch Health Care Insurance Board, Netherlands
National Health and Medical Research Council, Australia
Declarations of interest
No conflicts of interest are declared.
Edited (no change to conclusions)
References
References to studies included in this review
Deyo 1986 {published data only}
- Deyo RA, Diehl AK. Lumbar spine films in primary care: current use and effects of selective ordering criteria. Journal of General Internal Medicine 1986;1:20-5.
Gibson 1992 {published data only}
- Gibson M, Zoltie N. Radiography for back pain presenting to accident and emergency departments. Archives of Emergency Medicine 1992;9(1):28-31.
Henschke 2009 {published data only}
- Henschke N, Maher CG, Refshauge KM, Herbert RD, Cumming RG, Bleasel J, et al. Prevalence of and screening for serious spinal pathology in patients presenting to primary care settings with acute low back pain. Arthritis & Rheumatism 2009;60(10):3072-80.
Patrick 1983 {published data only}
- Patrick JD, Doris PE, Mills ML, Friedman J, Johnston C. Lumbar spine x-rays: a multihospital study. Annals of Emergency Medicine 1983;12(2):84-7.
Reinus 1998 {published data only}
- Reinus WR, Strome G, Zwemer Jr FL. Use of lumbosacral spine radiographs in a level II emergency department. American Journal of Roentgenology 1998;170(2):443-7.
Roman 2010 {published data only}
- Roman M, Brown C, Richardson W, Isaacs R, Howes C, Cook C. The development of a clinical decision making algorithm for detection of osteoporotic vertebral compression fracture or wedge deformity. Journal of Manual & Manipulative Therapy 2010;18(1):44-9.
Scavone 1981 {published data only}
- Scavone JG, Latshaw RF, Rohrer GV. Use of lumbar spine films. Statistical evaluation at a university teaching hospital. JAMA 1981;246:1105-8.
van den Bosch 2004 {published data only}
- van den Bosch MA, Hollingworth W, Kinmonth AL, Dixon AK. Evidence against the use of lumbar spine radiography for low back pain. Clinical Radiology 2004;59(1):69-76.
References to studies excluded from this review
Durham 1995 {published data only}
- Durham RM, Luchtefeld WB, Wibbenmeyer L, Maxwell P, Shapiro MJ, Mazuski JE. Evaluation of the thoracic and lumbar spine after blunt trauma. American Journal of Surgery 1995;170(6):681-4. [MEDLINE: ] [DOI] [PubMed] [Google Scholar]
Frankel 1994 {published data only}
- Frankel H, LRozycki GS, Ochsner MG, Harviel JD, Champion HR. Indications for obtaining surveillance thoracic and lumbar spine radiographs. Journal of Trauma 1994;37(4):673-6. [MEDLINE: ] [DOI] [PubMed] [Google Scholar]
Gestring 2002 {published data only}
- Gestring ML, Gracias VH, Feliciano MA, Reilly PM, Shapiro MB, Johnson JW, et al. Evaluation of the lower spine after blunt trauma using abdominal computed tomographic scanning supplemented with lateral scanograms. Journal of Trauma 2002;53(1):9-14. [DOI] [PubMed] [Google Scholar]
Holmes 2003 {published data only}
- Holmes JF, Panacek EA, Miller PQ, Lapidis AD, Mower WR. Prospective evaluation of criteria for obtaining thoracolumbar radiographs in trauma patients. Journal of Emergency Medicine 2003;24(1):1-7. [DOI] [PubMed] [Google Scholar]
Hsu 2003 {published data only}
- Hsu JM, Joseph T, Ellis AM. Thoracolumbar fracture in blunt trauma patients: guidelines for diagnosis and imaging. Injury 2003;34(6):426-33. [DOI] [PubMed] [Google Scholar]
Samuels 1993 {published data only}
- Samuels LE. Routine radiologic evaluation of the thoracolumbar spine in blunt trauma patients: a reappraisal. Journal of Trauma 1993;34(1):85-9. [DOI] [PubMed] [Google Scholar]
Terregino 1995 {published data only}
- Terregino CA, Ross SE, Lipinski MF, Foreman J, Hughes R. Selective indications for thoracic and lumbar radiography in blunt trauma. Annals of Emergency Medicine 1995;26(2):126-9. [DOI] [PubMed] [Google Scholar]
Additional references
Bigos 1994
- Bigos SJ, Braen GR, Deyo RA, Hart J, Keller R, Liang M, et al. Acute low back problems in adults. Clinical practice guideline no. 14. Rockville, MD: Agency for Health Care Policy and Research, Public Health Service, U.S. Department of Health and Human Services, 1994. [Google Scholar]
Cauley 2007
- Cauley JA, Hochberg MC, Lui LY, Palermo L, Ensrud KE, Hillier TA, et al. Long-term risk of incident vertebral fractures. JAMA 2007;298:2761-7. [DOI] [PubMed] [Google Scholar]
Chou 2007
- Chou R, Qaseem A, Snow V, Casey D, Cross JT Jr, Shekelle P, et al. Diagnosis and treatment of low back pain: a joint clinical practice guideline from the American College of Physicians and the American Pain Society. Annals of Internal Medicine 2007;147(7):478-91. [DOI] [PubMed] [Google Scholar]
Chou 2009
- Chou R, Fu R, Carrino JA, Deyo RA. Imaging strategies for low-back pain: systematic review and meta-analysis. Lancet 2009;373:463-72. [DOI] [PubMed] [Google Scholar]
Chou 2011
- Chou R, Qaseem A, Owens DK, Shekelle P, Clinical Guidelines Committee of the American College of Physicians. Diagnostic imaging for low back pain: advice for high-value health care from the American College of Physicians. Annals of Internal Medicine 2011;154(3):181-9. [DOI] [PubMed] [Google Scholar]
Cooper 1993
- Cooper C, O'Neill T, Silman A. The epidemiology of vertebral fractures. European Vertebral Osteoporosis Study Group. Bone 1993;14(Suppl 1):S89-97. [DOI] [PubMed] [Google Scholar]
Damiano 2006
- Damiano J, Kolta S, Porcher R, Tournoux C, Dougados M, Roux C. Diagnosis of vertebral fractures by vertebral fracture assessment. Journal of Clinical Densitometry 2006;9:66-71. [DOI] [PubMed] [Google Scholar]
Deeks 2009
- Deeks JJ, Bossuyt PM, Gatsonis C (editors). Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 1.0.1 [updated March 2009]. The Cochrane Collaboration, 2009. Available from: http://srdta.cochrane.org/.
Delmas 2001
- Delmas PD, Watts N, Eastell R, von Ingersleben G, van de Langerijt L, Cahall DL. Underdiagnosis of vertebral fractures is a worldwide problem: the IMPACT Study. Journal of Bone and Mineral Research 2001;16(Suppl 1):S139. [DOI] [PubMed] [Google Scholar]
Edmond 2005
- Edmond SL, Kiel DP, Samelson EJ, Kelly-Hayes M, Felson DT. Vertebral deformity, back symptoms, and functional limitations among older women: the Framingham Study. Osteoporosis International 2005;16:1086-95. [DOI] [PubMed] [Google Scholar]
Ensrud 1999
- Ensrud KE, Nevitt MC, Palermo L, Cauley JA, Griffith JM, Genant HK, et al. What proportion of incident morphometric vertebral fractures are clinically diagnosed and vice versa? Journal of Bone and Mineral Research 1999;14:S138. [DOI] [PubMed] [Google Scholar]
Ensrud 2000
- Ensrud KE, Thompson DE, Cauley JA, Nevitt MC, Kado DM, Hochberg MC, et al. Prevalent vertebral deformities predict mortality and hospitalization in older women with low bone mass. Fracture Intervention Trial Research Group. Journal of the American Geriatrics Society 2000;48:241-9. [DOI] [PubMed] [Google Scholar]
Ferrar 2000
- Ferrar L, Jiang G, Barrington NA. Identification of vertebral deformities in women: comparison of radiological assessment and quantitative morphometry using morphometric radiography and morphometric X-ray absorptiometry. Journal of Bone and Mineral Research 2000;15:575-85. [DOI] [PubMed] [Google Scholar]
Ferrar 2005
- Ferrar L, Jiang G, Adams J, Eastell R. Identification of vertebral fractures: An update. Osteoporosis International 2005;16:717-28. [DOI] [PubMed] [Google Scholar]
Ferrar 2008
- Ferrar L, Jiang G, Schousboe JT, DeBold CR, Eastell R. Algorithm-based qualitative and semi-quantitative identification of prevalent vertebral fracture: agreement between different readers, imaging modalities, and diagnostic approaches. Journal of Bone and Mineral Research 2008;23:417-24. [DOI] [PubMed] [Google Scholar]
Flynn 2011
- Flynn TW, Smith B, Chou R. Appropriate use of diagnostic imaging in low back pain: a reminder that unnecessary imaging may do as much harm as good. Journal of Orthopaedic and Sports Physical Therapy 2011;41(11):838-46. [DOI: 10.2519/jospt.2011.3618] [DOI] [PubMed] [Google Scholar]
Freedman 2008
- Freedman BA, Potter BK, Nesti LJ, Giuliani JR, Hampton C, Kuklo TR. Osteoporosis and vertebral compression fractures - continued missed opportunities. Spine Journal 2008;8:756-62. [DOI] [PubMed] [Google Scholar]
Genant 1993
- Genant HK, Wu CY, van Kuijk C, Nevitt MC. Vertebral fracture assessment using a semi-quantitative technique. Journal of Bone and Mineral Research 1993;8:1137-48. [DOI] [PubMed] [Google Scholar]
Grigoryan 2003
- Grigoryan M, Guermazi A, Roemer FW, Delmas PD, Genant HK. Recognizing and reporting osteoporotic vertebral fractures. European Spine Journal 2003;12(Suppl 2):S104-12. [DOI] [PMC free article] [PubMed] [Google Scholar]
Hancock 2008
- Hancock MJ, Maher CG, Latimer J. Spinal manipulative therapy for acute low back pain: a clinical perspective. Journal of Manual & Manipulative Therapy 2008;16(4):198-203. [DOI] [PMC free article] [PubMed] [Google Scholar]
Hasserius 2005
- Hasserius R, Karlsson MK, Jonsson B, Redlund-Johnell I, Johnell O. Long-term morbidity and mortality after a clinically diagnosed vertebral fracture in the elderly - a 12- and 22-year follow-up of 257 patients. Calcified Tissue International 2005;76:235-42. [DOI] [PubMed] [Google Scholar]
Henschke 2008
- Henschke N, Maher CG, Refshauge KM. A systematic review identifies five "red flags" to screen for vertebral fracture in patients with low back pain. Journal of Clinical Epidemiology 2008;61(2):110-8. [DOI] [PubMed] [Google Scholar]
Henschke 2010a
- Henschke N, Maher CG, Ostelo RWJG, de Vet HCW, Macaskill P, Irwig L. Red flags to screen for malignancy in patients with low-back pain. Cochrane Database of Systematic Reviews 2010, Issue 9. Art. No: CD008686. [DOI: 10.1002/14651858.CD008686] [DOI] [PMC free article] [PubMed] [Google Scholar]
Johnell 2006
- Johnell O, Kanis JA. An estimate of the worldwide prevalence and disability associated with osteoporotic fractures. Osteoporosis International 2006;17:1726-33. [DOI] [PubMed] [Google Scholar]
Koes 2010
- Koes B, van Tulder MW, Lin C, Macedo L, McAuley JH, Maher CG. An updated overview of clinical guidelines for the management of non-specific low back pain in primary care. European Spine Journal 2010;19:2075-94. [DOI] [PMC free article] [PubMed] [Google Scholar]
Lijmer 1999
- Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282(11):1061-6. [DOI] [PubMed] [Google Scholar]
Lindsay 2001
- Lindsay R, Silverman SL, Cooper C, Hanley DA, Barton I, Broy S, et al. Risk of new vertebral fracture in the year following a fracture. JAMA 2001;285(3):320-3. [DOI] [PubMed] [Google Scholar]
Maitland 2001
- Maitland GD, Banks K, English K, Hengeveld E. Maitland’s Vertebral Manipulation. 6th edition. Boston: Butterworth-Heinemann, 2001. [Google Scholar]
Melton 2003
- Melton 3rd LJ. Adverse outcomes of osteoporotic fractures in the general population. Journal of Bone and Mineral Research 2003;18:1139-41. [DOI] [PubMed] [Google Scholar]
NHMRC 2003
- National Health and Medical Research Council Australia (NHMRC). Evidence-based management of acute musculoskeletal pain. Bowen Hills, Queensland: Australian Academic Press, 2003. [Google Scholar]
Papaioannou 2002
- Papaioannou A, Watts NB, Kendler DL, Chui KY, Adachi JD, Ferko N. Diagnosis and management of vertebral fractures in elderly adults. American Journal of Medicine 2002;113:220-8. [DOI] [PubMed] [Google Scholar]
Rea 2000
- Rea JA, Chen MB, Li J. Morphometric X-ray absorptiometry and morphometric radiography of the spine: a comparison of prevalent vertebral deformity identification. Journal of Bone and Mineral Research 2000;15:564-74. [DOI] [PubMed] [Google Scholar]
Samelson 2011
- Samelson EJ, Christiansen BA, Demissie S, Broe KE, Zhou Y, Meng CA, et al. Reliability of vertebral fracture assessment using multidetector CT lateral scout views: the Framingham Osteoporosis Study. Osteoporosis International 2011;22(4):1123-31. [DOI: 10.1007/s00198-010-1290-6] [DOI] [PMC free article] [PubMed] [Google Scholar]
Schousboe 2006
- Schousboe JT, Ensrud KE, Nyman JA. Cost-effectiveness of vertebral fracture assessment to detect prevalent vertebral deformity and select postmenopausal women with a femoral neck T-score > -2.5 for alendronate therapy: a modeling study. Journal of Clinical Densitometry 2006;9:133-43. [DOI] [PubMed] [Google Scholar]
Suarez‐Almazor 1997
- Suarez-Almazor ME, Belseck E, Russell AS, Mackel JV. Use of lumbar radiographs for the early diagnosis of low back pain. Proposed guidelines would increase utilization. JAMA 1997;277:1782-6. [PubMed] [Google Scholar]
Underwood 2009
- Underwood M. Diagnosing acute nonspecific low back pain: time to lower the red flags? Arthritis and Rheumatism 2009;60(10):2855-7. [DOI] [PubMed] [Google Scholar]
van der Windt 2010
- van der Windt DAWM, Simons E, Riphagen II, Ammendolia C, Verhagen AP, Laslett M, et al. Physical examination for lumbar radiculopathy due to disc herniation in patients with low-back pain. Cochrane Database of Systematic Reviews 2010, Issue 2. Art. No: CD007431. [DOI: 10.1002/14651858.CD007431.pub2] [DOI] [PubMed] [Google Scholar]
van Tulder 2004
- van Tulder M, Becker A, Bekkering T, Breen A, Gil del Real MT, Hutchinson A, et al. European guidelines for the management of acute nonspecific low back pain in primary care. European Commission COST B13; Available at www.backpaineurope.org (accessed Dec 8, 2009), 2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
Vokes 2006
- Vokes T, Bachman D, Baim S. Vertebral fracture assessment: the 2005 ISCD official positions. Journal of Clinical Densitometry 2006;9:37-46. [DOI] [PubMed] [Google Scholar]
Waddell 2004
- Waddell G. The Back Pain Revolution. 2nd edition. London: Churchill Livingstone, 2004. [Google Scholar]
Whiting 2003
- Whiting P, Rutjes A, Reitsma J, Bossuyt P, Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Medical Research Methodology 2003;3:25; available from http://www.biomedcentral.com/1471-2288/3/25. [DOI: 10.1186/1471-2288-3-25] [DOI] [PMC free article] [PubMed] [Google Scholar]
Williams 2010
- Williams CM, Maher CG, Hancock MJ, McAuley JH, McLachlan AJ, Britt H, et al. Low back pain and best practice care: a survey of general practice physicians. Archives of Internal Medicine 2010;170(12):1088. [DOI] [PubMed] [Google Scholar]
Woolf 2003
- Woolf AD, Pfleger B. Burden of major musculoskeletal conditions. Bulletin of the World Health Organization 2003;81:646-56. [PMC free article] [PubMed] [Google Scholar]
References to other published versions of this review
Henschke 2010b
- Henschke N, Williams CM, Maher CG, van Tulder MW, Koes BW, Macaskill P, Irwig L. Red flags to screen for vertebral fracture in patients presenting with low-back pain. Cochrane Database of Systematic Reviews 2010, Issue 8. Art. No: CD008643. [DOI: 10.1002/14651858.CD008643] [DOI] [PubMed] [Google Scholar]
