Author manuscript; available in PMC: 2023 Nov 1.
Published in final edited form as: PM R. 2021 Oct 19;14(11):1325–1332. doi: 10.1002/pmrj.12705

Reliability and validity of subjective radiologist reporting of temporal changes in lumbar spine MRI findings

Mark Jonathan Hancock 1, Chris Maher 2, Jeffrey G Jarvik 3, Michele C Battié 4, James Elliott 5, Tue Jensen 6,7,8, John Panagopoulos 9, Hazel Jenkins 10, Marg Pardey 11, Jeffery McIntosh 12, John Magnussen 13
PMCID: PMC8917240  NIHMSID: NIHMS1739857  PMID: 34510774

Abstract

Background

The importance of lumbar findings on magnetic resonance imaging (MRI) remains controversial. Temporal changes in lumbar MRI findings may provide important insights into the causes of low back pain. However, the reliability and validity of reporting temporal changes are unknown.

Objective:

The purpose of this study was to 1) investigate the inter-rater reliability of subjective radiologist reporting of temporal changes in lumbar spine MRI findings, and 2) determine how commonly temporal changes are reported when two scans are conducted 30 minutes apart (considered false positives).

Design:

Cross-sectional study.

Setting:

Radiology clinic.

Participants:

Forty volunteers (mean age 40; 53% female) with current (n=31) or previous (n=9) low back pain underwent an initial lumbar MRI scan on a single 3T scanner. Participants then lay on a bed for 30 minutes before undergoing an identical MRI. In addition, we purposely selected 5 participants from a previous study with repeat lumbar MRI scans where temporal changes were reported in at least one MRI finding (1–12 weeks after initial scan) and another 5 participants where no temporal change was reported. These 10 participants were included in analyses for aim 1 only.

Interventions:

Not applicable.

Main Outcome measures:

Two blinded radiologists reported on temporal changes between the baseline and repeat scan for 12 different MRI findings (e.g. disc herniation, annular fissure) at 5 levels.

Results:

The inter-rater reliability of subjective reporting of temporal changes was poor for all MRI findings based on Kappa values (≤ 0.24), but agreement was relatively high (≥ 90.8%). This is explained by the low prevalence of temporal changes as demonstrated by high values for Prevalence and Bias Adjusted Kappa (≥ 0.82). “False positive” temporal changes were reported by at least one radiologist for most MRI findings, but the rate was generally low.

Conclusions:

Caution is required when interpreting temporal changes in lumbar MRI findings due to low reliability and some false positive reporting.

Introduction:

The importance of lumbar spine findings reported on MRI (e.g. disc height loss or disc herniation) remains controversial1,2 and current guidelines suggest that imaging has little clinical value for most patients with low back pain (LBP).3,4 There is strong evidence that people without LBP commonly have these structural findings on MRI, but there is also evidence that these findings are more common in people with LBP.5–7 In addition, the presence of lumbar findings on MRI may increase the risk of future LBP.8

The cross-sectional nature of most previous imaging studies investigating lumbar findings on MRI is a drawback. Longitudinal studies with repeat MRI scans have the potential to provide stronger evidence on the clinical importance of lumbar MRI findings by investigating the relationship between temporal changes (differences between imaging conducted at 2 or more time points) in MRI and clinical outcomes. A systematic review investigating temporal changes in MRI9 found only one study that investigated the relationship between lumbar temporal MRI change (over a period of ≤ 1 year) and changes in clinical symptoms in patients with low back pain or sciatica. In patients with sciatica this study found no association between temporal changes in disc herniation and a good or bad outcome at 1 year.10

While studies using repeat imaging have the potential to lead to better understanding of the clinical importance of MRI findings, the results are dependent on the reliability and validity of reporting temporal changes in MRI findings. The reliability and validity of reporting temporal changes may be impacted by a number of factors including, but not limited to, patient positioning and motion within the scanner, variability in the defined field of view, slice selection or position, and radiologist threshold for identification/reporting of perceived temporal changes.

Our previous pilot work found that even in people without current LBP, temporal changes in MRI findings were reported to occur between MRI scans conducted up to 12 weeks apart.11 One possible explanation for the unexpected temporal changes in MRI findings in pain-free participants is that they represent “true” temporal changes, despite no current clinical symptoms. Another possibility is that the temporal changes do not represent true change in the tissues, but rather are false positive findings due to some/all of the aforementioned technical factors. Finally, the result could be due to poor reliability of reporting perceived temporal changes.

Before substantial research effort is invested into investigating the relationship between temporal changes on MRI and clinical symptoms, it is important to determine the reliability of reporting temporal changes in MRI findings and to determine if identified temporal changes are true changes or are errors resulting from a repeated imaging process. Therefore, the specific aims of the study were to 1) investigate the inter-rater reliability of reporting temporal changes in MRI findings of the lumbar spine on repeat lumbar imaging, and 2) determine how commonly temporal changes are reported when two scans are conducted approximately 30 minutes apart. Changes reported under aim 2 were considered false positives because of the very short interval between scans.

Methods:

Participants

Forty participants were recruited from the community using posters on public notice boards and social media. To be included, volunteers were required to be 18 or older with either current LBP or a history of LBP. Participants were excluded if they were claustrophobic, had any contraindications to undergoing MRI (e.g. metal implants, pacemakers), had a presentation suggestive of LBP due to a serious pathology (e.g. cancer or fracture) or were pregnant. In addition, we included 10 participants from our previous LILAC study11 which involved longitudinal imaging of participants (20 with acute LBP and 10 controls without current LBP) at baseline, 1, 2, 6 and 12 weeks. The participants from the LILAC study were recruited from primary care physiotherapy and chiropractic clinics. We purposely selected 5 cases from LILAC where the reporting radiologist had noted temporal changes in at least one MRI finding between the baseline and follow-up scan and another 5 cases where no temporal change was reported. For the 10 LILAC participants, scans from the original LILAC study were used. These were included so that at least some of the repeat scans came from participants with reported temporal changes, which helped maintain blinding of the radiologists in the current study. The 10 participants from the LILAC study were included in the reliability analyses (aim 1) but excluded from the validity analyses investigating false positives (aim 2). Before participating, all participants read and signed a consent form approved by the Macquarie University Human Ethics Committee.

Procedures

Each volunteer attended a single radiology clinic (Macquarie Medical Imaging) on one occasion for approximately 1 hour and 15 minutes. They completed a short questionnaire including demographic details and information about current and previous LBP. Participants then underwent a lumbar MRI scan on a single 3T GE Discovery MR750W scanner (Milwaukee, WI, USA) using a standardised protocol. Participants were positioned by a radiographer, feet first in the bore as per standard clinical protocol for a lumbar spine assessment. After the MRI, participants were removed from the scanner and asked to walk approximately 10 metres and then lie supine and remain still on a nearby bed for approximately 30 minutes. A pillow was placed under the knees to keep the lumbar spine in a relatively neutral position. After this, the participant underwent an identical MRI scan on the same scanner. The second MRI scan was conducted by the same radiographer, who attempted to position the participant in an identical manner to the previous scan. Each participant was posted a CD of their MRI scans and a standard MRI report. This was accompanied by a letter stating “Your MRI has been reviewed by a specialist radiologist and there is no evidence of any serious condition such as a fracture or cancer. We have included a full report; however, it is important you realise these findings discussed are common in people with and without back pain and it is best to consider them normal age-related findings.”

MRI evaluation

All imaging was conducted on a single 3.0 T MRI scanner (GE Discovery MR750W; GE Healthcare, Milwaukee, WI, USA) with the embedded high-density posterior array. Participants were positioned in a supine position, with a bolster placed under their knees. The lower costal margin was used as the centring point prior to entering the bore. Three orthogonal localisers were performed once the participants reached isocentre. The sagittal field of view (FOV; 300 mm) was centred at approximately the midpoint of L3 to acquire the sagittal T2 fast spin echo (FSE) and sagittal T1 images. Sagittal T2 FSE images were acquired with an echo train length of 22, 24 slices, repetition time (TR) of approximately 5940 ms and echo time (TE) of 100.4 ms. Slice thickness was 3.4 mm with no gap. The mid-slice was aligned around the midpoint of the spinal column. Sagittal T1 images replicated this FOV and slice alignment, with scanning parameters of echo train length 3, TR 684 ms and TE 8.54 ms. Planning of the two (upper and lower) axial T2 FSE sequences was based on the sagittal acquisition, using the posterior longitudinal ligament as a reference point and a FOV of 180 mm. Axial T2 FSE images were acquired with echo train length 21, 30 slices, TR 3775 ms and TE 87.52 ms. Slice thickness was 4 mm with no interslice gap. The upper T2 FSE axial scan covered from the conus down, from approximately T12, aligned perpendicular to the posterior longitudinal ligament at the level of T12. The lower T2 FSE axial scan was performed with at least a 3-slice overlap, aligned perpendicular to the posterior longitudinal ligament at the level of L5/S1.

MRI reporting:

All scans were initially reviewed for any serious pathology by an experienced radiologist, but none was identified. All MRI scans were subsequently reported on independently by two experienced radiologists blinded to the patient identity and timing of repeat scans. Radiologists were told we were interested in the reliability of the reporting of changes in MRI findings over time. Radiologists were blinded to the fact that some participants had both images taken on the same day and to our secondary aim of investigating the rate of false positive reports of change over time. The radiologists underwent training by an experienced radiologist involved in the study (JM). Firstly, the radiologists were provided with the standardised reporting form and accompanying definitions and asked to review these. The radiologists were instructed to view the 2 scans at the same time and report temporal changes from the first to the second scan on the basis of the imaging alone, without regard to judgement as to its likely clinical significance. Then the radiologists met with JM to discuss any questions about the definitions and rating form. The radiologists then independently practised rating two pilot cases, before discussing agreement/disagreement moderated by JM. Finally, the radiologists conducted a second read on 2 new pilot cases and the explanation/definitions document was updated as required.

MRI findings investigated

The different MRI findings investigated included disc degeneration (Pfirrmann score), disc height loss, disc signal intensity, high intensity zone, annular fissure, Modic changes, disc contour/herniation, facet joint arthropathy, central canal stenosis, spondylolisthesis, bone marrow oedema of pars or pedicle and nerve root compromise. Radiologists initially reported on each of these MRI findings at each spinal level (L1/L2 to L5/S1) on the “baseline” (initial) scan using a standardised reporting form. They then reported on whether any change occurred between the 2 images for each MRI finding, by directly comparing the second scan with the “baseline” (initial) scan. The reporting of change between the 2 scans (temporal changes) was the focus of this study. Subjective change could be reported by the radiologist based on the presence or absence of an MRI finding at the 2 time points, or a change in the size or degree of an imaging finding between the 2 scans. Change was reported as “no change”, “probable change” or “definite change”. If temporal changes were reported for any MRI finding, the radiologist then recorded if the change indicated a worsening, appearance (i.e. MRI finding not present on baseline scan but present on second scan), improvement or disappearance (i.e. MRI finding present on baseline scan but not on second scan).

Analysis/Outcomes

In our analyses we considered both probable and definite temporal change as a change. Inter-rater reliability (aim 1) for reporting a change for each MRI finding was assessed using the Kappa statistic, Prevalence and Bias Adjusted Kappa (PABAK) and percent agreement. Where both radiologists reported change for the same MRI finding on the same patient, we descriptively reported the consistency between the type of change reported (i.e. worsened, appeared, improved, disappeared).
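As an illustration only, the three agreement measures named above can be computed from the four cells of a 2×2 table (both radiologists report change, only A, only B, neither). The following is a minimal Python sketch and not the software used in the study. The function name `agreement_stats` and its argument names are hypothetical, and returning no Kappa value when a radiologist reports no change at all is an assumption chosen to mirror the N/A entries in Table 2.

```python
from typing import Dict, Optional


def agreement_stats(both: int, a_only: int, b_only: int, neither: int) -> Dict[str, Optional[float]]:
    """Cohen's Kappa, PABAK and percent agreement for two raters and a binary rating.

    both    = A change, B change
    a_only  = A change, B no change
    b_only  = A no change, B change
    neither = A no change, B no change
    """
    n = both + a_only + b_only + neither
    p_observed = (both + neither) / n          # observed agreement
    a_change = (both + a_only) / n             # proportion rated "change" by A
    b_change = (both + b_only) / n             # proportion rated "change" by B
    # Agreement expected by chance, from each rater's marginal proportions.
    p_expected = a_change * b_change + (1 - a_change) * (1 - b_change)

    # Assumption: Kappa is treated as not estimable when either rater
    # never reports a change (shown as N/A in Table 2).
    if a_change == 0 or b_change == 0 or p_expected == 1:
        kappa = None
    else:
        kappa = (p_observed - p_expected) / (1 - p_expected)

    # For two categories, PABAK reduces to 2 * observed agreement - 1.
    pabak = 2 * p_observed - 1
    return {"kappa": kappa, "pabak": pabak, "percent_agreement": 100 * p_observed}


if __name__ == "__main__":
    # Disc contour/herniation counts as listed in Table 2.
    print(agreement_stats(both=3, a_only=6, b_only=10, neither=231))
```

Applied to the disc contour/herniation counts in Table 2 (3, 6, 10, 231), this sketch gives a Kappa of approximately 0.240, a PABAK of 0.872 and 93.6% agreement, which matches the tabulated values.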

The number of “false positive” findings (aim 2) for each MRI finding as reported by each radiologist was reported descriptively based on only the 40 participants who underwent repeat imaging on the same occasion. The 10 patients from the previous LILAC study were excluded from this analysis as repeat images were not conducted on the same day and therefore any changes between the two time points may have represented “real change” and not “false positives”.

Results:

Participants:

A description of the participants included in the study is presented in Table 1. Participants were generally middle aged (mean age 40 years) and roughly evenly split by sex (48% female), and approximately three quarters had current LBP. The 40 participants in the current study were similar in baseline characteristics to the 10 participants included from the LILAC study.

Table 1:

Characteristics of study participants

| Characteristic | Participants with repeat MRI scan on same day (n=40) | Participants from previous LILAC study with repeat MRI on different days (n=10) | All participants (n=50) |
|---|---|---|---|
| Age (years), mean (SD) | 39.8 (13.5) | 39.0 (9.2) | 39.7 (12.7) |
| Sex (female), n (%) | 21 (53) | 3 (30) | 24 (48) |
| Current pain, n (%) | 31 (78) | 7 (70) | 38 (76) |
| BMI, mean (SD) | 27.2 (7.0) | 24.9 (5.0) | 26.8 (6.7) |

SD = standard deviation; BMI = body mass index

Reliability of reporting change between two scans (all 50 participants)

The inter-rater reliability of reporting temporal changes in MRI findings between the two images was poor for all MRI findings, with Kappa values all ≤ 0.24 (Table 2). For some MRI findings (disc height, Modic changes, facet joint arthropathy, spondylolisthesis and bone marrow oedema) it was not possible to calculate a Kappa value because at least one radiologist reported no cases of temporal change between the two MRIs. However, when we used Prevalence and Bias Adjusted Kappa (PABAK) the values were high (all ≥ 0.816). Percent agreement was also high for all imaging findings (≥ 90.8%). The high overall agreement was driven by agreement when no temporal changes were present, but there was very little agreement on when temporal changes were present. For example, of the 115 temporal changes reported across all imaging findings and spinal levels by either radiologist, there were only 4 where both radiologists reported a temporal change for the same finding at the same location (Table 2). Of these 4, there was one where both radiologists reported worsening, one where both reported improvement, and two where one radiologist reported worsening while the other reported improvement.

Table 2:

Reporting of temporal change between two MRI scans and inter-rater reliability of reporting temporal change for radiologists A and B (n=50)

| MRI finding | “A” change, “B” change | “A” change, “B” no change | “A” no change, “B” change | “A” no change, “B” no change | Kappa (95% CI) | Prevalence and Bias Adjusted Kappa (PABAK) | Percent agreement |
|---|---|---|---|---|---|---|---|
| Disc degeneration (Pfirrmann grade) | 0 | 6 | 17 | 227 | −0.037 (−0.060 to −0.014) | 0.816 | 90.8 |
| Disc signal intensity | 0 | 2 | 19 | 229 | −0.015 (−0.033 to 0.004) | 0.832 | 91.6 |
| Disc height | 0 | 0 | 17 | 233 | N/A | 0.864 | 93.2 |
| High intensity zone | 1 | 5 | 3 | 241 | 0.184 (−0.15 to 0.520) | 0.936 | 96.8 |
| Annular fissure | 0 | 6 | 2 | 242 | −0.012 (−0.025 to 0.001) | 0.936 | 96.8 |
| Modic changes* | 0 | 0 | 2 | 498 | N/A | 0.992 | 99.6 |
| Disc contour/herniation | 3 | 6 | 10 | 231 | 0.240 (−0.010 to 0.491) | 0.872 | 93.6 |
| Facet joint arthropathy* | 0 | 0 | 5 | 495 | N/A | 0.980 | 99.0 |
| Central canal stenosis | 0 | 1 | 5 | 244 | −0.007 (−0.018 to 0.004) | 0.952 | 97.6 |
| Spondylolisthesis | 0 | 0 | 1 | 249 | N/A | 0.992 | 99.6 |
| Bone marrow oedema | 0 | 0 | 0 | 250 | N/A | N/A | 100 |
| Nerve root compromise* | 0 | 2 | 2 | 496 | −0.004 (−0.008 to −0.000) | 0.984 | 99.2 |

A and B refer to the 2 reporting radiologists. “A” change indicates radiologist A reported a change between the 2 MRIs. “B” change indicates radiologist B reported a change between the 2 MRIs.

50 patients were included in these analyses and all MRI findings reported at 5 lumbar levels. Therefore, there are 250 data points for most MRI findings apart from those indicated with an * where each MRI finding produced 2 data points per spinal level resulting in a total of 500 data points for Modic changes (upper and lower endplates), facet joint arthropathy (right and left) and nerve root compromise (right and left).
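The totals quoted in the text for Table 2 (115 data points flagged as changed by either radiologist, of which 4 were flagged by both) can be checked by summing the cell counts. The sketch below is illustrative only; the dictionary `TABLE2` is transcribed from the table above, and the name `tally_changes` is hypothetical.

```python
# Cell counts per MRI finding, transcribed from Table 2 as
# (A change & B change, A change only, B change only, neither).
TABLE2 = {
    "Disc degeneration (Pfirrmann grade)": (0, 6, 17, 227),
    "Disc signal intensity": (0, 2, 19, 229),
    "Disc height": (0, 0, 17, 233),
    "High intensity zone": (1, 5, 3, 241),
    "Annular fissure": (0, 6, 2, 242),
    "Modic changes": (0, 0, 2, 498),
    "Disc contour/herniation": (3, 6, 10, 231),
    "Facet joint arthropathy": (0, 0, 5, 495),
    "Central canal stenosis": (0, 1, 5, 244),
    "Spondylolisthesis": (0, 0, 1, 249),
    "Bone marrow oedema": (0, 0, 0, 250),
    "Nerve root compromise": (0, 2, 2, 496),
}


def tally_changes(table):
    """Sum, across findings, the data points flagged by both, only A, or only B."""
    both = sum(row[0] for row in table.values())
    a_only = sum(row[1] for row in table.values())
    b_only = sum(row[2] for row in table.values())
    return {
        "both": both,
        "a_only": a_only,
        "b_only": b_only,
        "either": both + a_only + b_only,  # flagged by at least one radiologist
    }


if __name__ == "__main__":
    totals = tally_changes(TABLE2)
    print(totals)
    # Share of flagged data points on which both radiologists reported a change.
    print(round(100 * totals["both"] / totals["either"], 1), "%")
```

On these counts, radiologist A alone flagged 28 data points, radiologist B alone flagged 83, and both flagged 4, so the two radiologists coincided on roughly 3.5% of the 115 flagged data points.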

Frequency of reporting changes between the two MRIs in participants imaged twice on one day (“false positives”; 40 CHROME patients only)

“False positive” temporal changes were reported by at least one radiologist for all MRI findings other than bone marrow oedema, but the rate was generally very low (Table 3). Temporal changes in disc contour/herniation were reported in 3% of all discs by both radiologists. Changes in disc degeneration were reported in 6% of discs by one radiologist but only 1% of discs by the other radiologist.

Table 3:

False positive reporting of temporal changes in participants undergoing repeat MRI scans on the same day (n=40)

| MRI finding | Probable or definite change, radiologist A, n (%) (n=200 unless indicated) | Probable or definite change, radiologist B, n (%) (n=200 unless indicated) |
|---|---|---|
| Disc degeneration (Pfirrmann grade) | 2 (1%) | 12 (6%) |
| Disc signal intensity | 2 (1%) | 15 (7.5%) |
| Disc height | 0 (0%) | 12 (6%) |
| High intensity zone | 3 (1.5%) | 2 (1%) |
| Annular fissure | 4 (2%) | 1 (0.5%) |
| Modic changes* | 0 (0%) | 2 (0.5%) |
| Disc contour/herniation | 6 (3%) | 6 (3%) |
| Facet joint arthropathy* | 0 (0%) | 5 (1.25%) |
| Central canal stenosis | 1 (0.5%) | 3 (0.75%) |
| Spondylolisthesis | 0 (0%) | 1 (0.25%) |
| Bone marrow oedema | 0 (0%) | 0 (0%) |
| Nerve root compromise* | 1 (0.5%) | 2 (0.5%) |

40 patients were included in these analyses and all findings reported at 5 lumbar levels. Therefore, there are 200 data points for most MRI findings apart from those indicated with an * where each MRI finding produced 2 data points per spinal level resulting in a total of 400 data points for Modic changes (upper and lower endplates), facet joint arthropathy (right and left) and nerve root compromise (right and left).

Discussion

Primary findings

The current study has two main findings. Firstly, we found that the inter-rater reliability of reporting temporal changes in MRI findings was poor based on Kappa values, but agreement was relatively high. This can be explained by the low prevalence of temporal changes as demonstrated by high values for Prevalence and Bias Adjusted Kappa. Agreement on reporting absence of temporal changes was high but agreement when reporting the presence of temporal changes was low. The second key finding was that “false positive” temporal changes were reported for most MRI findings, but the rate was very low.

Interpretation of findings: strengths/weaknesses

There are several aspects of the design of our study that need considering. We considered temporal change to have occurred if the radiologist reported either “probable” or “definite” change. While the radiologists underwent standardised training, it is possible that they used different thresholds for considering change to have occurred. There is no guidance in the literature that we know of for standardising what should be considered temporal change or what would be considered clinically important change for the imaging findings reported in this study. Therefore, we do not know how the results would vary if different radiologists were involved. Our study did not specifically investigate some temporal changes that are relatively extreme and/or likely to be clinically important such as change from no fracture to fracture or from no herniation to large herniation. The reliability of reporting on these more extreme changes needs further investigation.

We used both Kappa and Prevalence and Bias Adjusted Kappa to assess reliability due to limitations of Kappa when the prevalence rate is low.12 Inspection of the data in Table 2 shows that it was far more common for raters to agree on “no change” than to agree on “change”. As an example, the radiologists agreed that there was no temporal change in spondylolisthesis in 249 of 250 spinal levels and agreed on change for 0 of 250 spinal levels. Considering the decision in this manner seems a more appropriate and useful interpretation of our findings than simply reporting low or high reliability. A novel aspect of our study was performing repeat imaging for 40 of the participants on the same day. This allowed us to consider reported temporal changes in these participants to be false positives (aim 2); however, it contributed to the low prevalence rate for temporal changes, which made assessment of reliability (aim 1), especially the interpretation of Kappa, difficult. We included 10 participants who had their repeat images done between 1 and 12 weeks apart in our reliability analyses to address this, but we recommend future research should investigate reliability of reporting temporal changes in a sample with longer periods between the 2 images, where the prevalence of temporal changes would be expected to be higher. In addition, we use the term “temporal changes” to describe radiologists’ subjective reporting of differences between the 2 scans. For 40 of the participants the 2 MRIs occurred on the same day, so differences may well result from technical aspects of the imaging rather than actual changes in the lumbar spine.
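The dependence of Kappa on prevalence can be illustrated with two hypothetical 2×2 tables that have the same percent agreement but very different proportions of “change” ratings. The counts below are invented for illustration and are not study data; the function names `cohen_kappa` and `pabak` are likewise hypothetical.

```python
def cohen_kappa(both: int, a_only: int, b_only: int, neither: int) -> float:
    """Cohen's Kappa for two raters and a binary rating (2x2 cell counts)."""
    n = both + a_only + b_only + neither
    p_observed = (both + neither) / n
    a_change = (both + a_only) / n
    b_change = (both + b_only) / n
    p_expected = a_change * b_change + (1 - a_change) * (1 - b_change)
    return (p_observed - p_expected) / (1 - p_expected)


def pabak(both: int, a_only: int, b_only: int, neither: int) -> float:
    """Prevalence and Bias Adjusted Kappa for a binary rating."""
    n = both + a_only + b_only + neither
    return 2 * (both + neither) / n - 1


# Hypothetical counts (not study data): both tables have 90% agreement.
balanced = (45, 5, 5, 45)        # "change" rated in about half of cases
low_prevalence = (1, 5, 5, 89)   # "change" rated in about 6% of cases

if __name__ == "__main__":
    for label, cells in (("balanced", balanced), ("low prevalence", low_prevalence)):
        print(label, round(cohen_kappa(*cells), 3), round(pabak(*cells), 3))
```

In this sketch both tables have a PABAK of 0.8, yet Kappa is 0.8 for the balanced table and only about 0.11 for the low-prevalence table, because chance-expected agreement is already close to the observed agreement when almost every rating is “no change”.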

Our finding of “false positive” temporal changes for most MRI findings in the 40 participants who underwent repeat scans separated by approximately 30 minutes (even though the rates were low) is important for research and clinical practice. We previously identified similar temporal changes in people without current LBP imaged on two occasions 1–12 weeks apart.11 Given that we have now found similar changes when imaging was conducted only 30 minutes apart, we suspect these are false positive temporal changes and do not represent true anatomic changes. Technical factors including motion of the participant and slice selection likely contribute to these false positive findings, and future research should investigate MRI protocols that can reduce this problem. While we considered subjective reporting of any change between the 2 MRIs to be “false positives”, we cannot rule out that for some MRI findings a real change could occur within the 30 minutes or in association with positioning. One example would be the degree of spondylolisthesis, but this seems very unlikely for most of the investigated findings. It should also be considered that the radiologists reported on a large number of imaging findings, given they reported on 12 MRI findings at 5 spinal levels and sometimes on the right and left side.

Implications of findings

While temporal changes in MRI findings may provide important insights into the causes of LBP and the mechanisms of action of specific interventions, their use is currently questionable given our findings of poor reliability and questionable validity. We are unaware of other studies that have investigated the reliability of reporting temporal changes in lumbar MRI findings. Reliability of reporting temporal changes in routine clinical practice may be even lower than in our research setting where we attempted to carefully standardise the imaging and reporting process. Research to develop more reliable methods to identify temporal changes in MRI findings should be a priority before further investigations of the relationship between temporal changes and clinical outcomes are undertaken.

There is very limited research investigating if temporal changes in imaging are associated with clinical outcomes and the available studies do not suggest an important relationship.

el Barzouhi et al10 reported minimal relationship between temporal changes in disc herniation or nerve root compression and clinical outcome at one year in patients with sciatica; however, this seemingly important finding may be influenced by poor reliability of the measures of temporal change. No measures of the reliability of reporting change in disc herniation or nerve root compression were reported in that study. Evidence of reliability of reporting temporal changes is critical to being able to confidently interpret studies investigating the clinical importance of temporal changes. Given our findings of poor reliability and the lack of current evidence demonstrating a relationship with clinical outcomes, clinicians should largely avoid making clinical decisions based on temporal changes for the imaging findings investigated in this study.

Future directions:

Future research should investigate methods to increase the reliability of reporting temporal changes in lumbar MRI findings and investigate if reliability of more extreme changes is higher than we reported. Attempting to define a threshold for an important temporal change is one starting point. Using quantitative measures is another approach that may lead to more reliable measures of temporal change. However, while some quantitative measures have been developed for MRI findings like disc height and signal intensity, little has been published on quantitative measures for most of the other MRI findings we investigated.

Conclusion

In conclusion, our study found mixed results regarding the reliability for reporting temporal changes in lumbar spine MRI findings. The radiologists rarely agreed on reporting the presence of temporal changes. “False positive” temporal changes were reported for most MRI findings, but the rate was very low. Caution is required when interpreting temporal changes in lumbar MRI findings due to low reliability and some false positive reporting.

Acknowledgements

Dr. Jarvik’s time was supported, in part, by NIH/NIAMS grant # P30AR072572. Prof Maher’s time was supported, in part, by an NHMRC Fellowship APP1194283.

Funding Source: This study was funded by an internal Macquarie University Safety Net grant.

List of Abbreviations

LBP: low back pain

Footnotes

Disclosures: Dr. Jarvik reports grants from NIH/NIAMS, during the conduct of the study; and Springer Publishing: Royalties as a book co-editor; GE-Association of University Radiologists Radiology Research Academic Fellowship (GERRAF): Travel reimbursement for Faculty Board of Review; Wolters Kluwer/UpToDate: Royalties as a chapter author.

Contributor Information

Mark Jonathan Hancock, Faculty of Medicine, Health, and Human Sciences, Macquarie University, Australia.

Chris Maher, Sydney School of Public Health, The University of Sydney; Director, Institute for Musculoskeletal Health, Sydney.

Jeffrey G. Jarvik, University of Washington Clinical Learning, Evidence, And Research (CLEAR) Center, University of Washington.

Michele C Battié, Faculty of Health Sciences and Western’s Bone and Joint Institute, Western University, London, Ontario, Canada.

James Elliott, Faculty of Medicine and Health, The University of Sydney; Principal Investigator, Neuromuscular Imaging Research Laboratory, Northern Sydney Local Health District, Australia; Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.

Tue Jensen, Nordic Institute of Chiropractic and Clinical Biomechanics (NIKKB), Odense, Denmark; Diagnostic Centre, University Research Clinic for Innovative Patient Pathways, Silkeborg Regional Hospital, Silkeborg, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark.

John Panagopoulos, Active Physiotherapy Newtown, Australia.

Hazel Jenkins, Faculty of Medicine, Health, and Human Sciences, Macquarie University, Australia.

Marg Pardey, Faculty of Medicine, Health, and Human Sciences, Macquarie University, Australia.

Jeffery McIntosh, Macquarie Medical Imaging and Faculty of Medicine, Health, and Human Sciences, Macquarie University, Australia.

John Magnussen, Faculty of Medicine, Health, and Human Sciences, Macquarie University, Australia.

References:

1. Hancock MJ, Maher CG, Laslett M, et al. Discussion paper: what happened to the ‘bio’ in the bio-psycho-social model of low back pain? Eur Spine J. 2011;20(12):2105–2110. DOI: 10.1007/s00586-011-1886-3
2. Steffens D, Hancock MJ, Pereira LS, et al. Do MRI findings identify patients with low back pain or sciatica who respond better to particular interventions? A systematic review. Eur Spine J. 2016;25(4):1170–1187. DOI: 10.1007/s00586-015-4195-4
3. van Tulder M, Becker A, Bekkering T, et al. Chapter 3. European guidelines for the management of acute nonspecific low back pain in primary care. Eur Spine J. 2006;15 Suppl 2:S169–191. DOI: 10.1007/s00586-006-1071-2
4. Hartvigsen J, Hancock MJ, Kongsted A, et al. What low back pain is and why we need to pay attention. Lancet. 2018;391(10137):2356–2367. DOI: 10.1016/S0140-6736(18)30480-X
5. Brinjikji W, Diehn F, Jarvik J, et al. MRI Findings of Disc Degeneration are More Prevalent in Adults with Low Back Pain than in Asymptomatic Controls: A Systematic Review and Meta-Analysis. AJNR Am J Neuroradiol. 2015:2394–2399. DOI: 10.3174/ajnr.A4498
6. Brinjikji W, Luetmer PH, Comstock B, et al. Systematic literature review of imaging features of spinal degeneration in asymptomatic populations. AJNR Am J Neuroradiol. 2015;36(4):811–816. DOI: 10.3174/ajnr.A4173
7. Hancock M, Maher C, Macaskill P, et al. MRI findings are more common in selected patients with acute low back pain than controls? Eur Spine J. 2012;21(2):240–246. DOI: 10.1007/s00586-011-1955-7
8. Hancock MJ, Maher CM, Petocz P, et al. Risk factors for a recurrence of low back pain. Spine J. 2015;15(11):2360–2368. DOI: 10.1016/j.spinee.2015.07.007
9. Panagopoulos J, Hush J, Steffens D, et al. Do MRI Findings Change Over a Period of Up to 1 Year in Patients With Low Back Pain and/or Sciatica?: A Systematic Review. Spine (Phila Pa 1976). 2017;42(7):504–512. DOI: 10.1097/brs.0000000000001790
10. el Barzouhi A, Vleggeert-Lankamp CL, Lycklama a Nijeholt GJ, et al. Magnetic resonance imaging in follow-up assessment of sciatica. N Engl J Med. 2013;368(11):999–1007. DOI: 10.1056/NEJMoa1209250
11. Panagopoulos J, Magnussen JS, Hush J, et al. Prospective Comparison of Changes in Lumbar Spine MRI Findings over Time between Individuals with Acute Low Back Pain and Controls: An Exploratory Study. AJNR Am J Neuroradiol. 2017;38(9):1826–1832. DOI: 10.3174/ajnr.A5357
12. Mandrekar JN. Measures of interrater agreement. J Thorac Oncol. 2011;6(1):6–7. DOI: 10.1097/JTO.0b013e318200f983
