Author manuscript; available in PMC: 2016 Nov 1.
Published in final edited form as: Contemp Clin Trials. 2015 Oct 19;45(0 0):157–163. doi: 10.1016/j.cct.2015.10.003

Lumbar Imaging with Reporting of Epidemiology (LIRE)- Protocol for a Pragmatic Cluster Randomized Trial

Jeffrey G Jarvik 1, Bryan A Comstock 1, Kathryn T James 1, Andrew L Avins 1, Brian W Bresnahan 1, Richard A Deyo 1, Patrick H Luetmer 1, Janna L Friedly 1, Eric N Meier 1, Daniel C Cherkin 1, Laura S Gold 1, Sean D Rundell 1, Safwan S Halabi 1, David F Kallmes 1, Katherine W Tan 1, Judith A Turner 1, Larry G Kessler 1, Danielle C Lavallee 1, Kari A Stephens 1, Patrick J Heagerty 1
PMCID: PMC4674321  NIHMSID: NIHMS736644  PMID: 26493088

Abstract

Background

Diagnostic imaging is often the first step in evaluating patients with back pain and likely functions as a “gateway” to a subsequent cascade of interventions. However, lumbar spine imaging frequently reveals incidental findings among normal, pain-free individuals suggesting that treatment of these “abnormalities” may not be warranted. Our prior work suggested that inserting the prevalence of imaging findings in patients without back pain into spine imaging reports may reduce subsequent interventions. We are now conducting a pragmatic cluster randomized clinical trial to test the hypothesis that inserting this prevalence data into lumbar spine imaging reports for studies ordered by primary care providers will reduce subsequent spine-related interventions.

Methods/Design

We are using a stepped wedge design that sequentially randomizes 100 primary care clinics at four health systems to receive either standard lumbar spine imaging reports, or reports containing prevalence data for common imaging findings in patients without back pain. We capture all outcomes passively through the electronic medical record. Our primary outcome is spine-related intervention intensity based on Relative Value Units (RVUs) during the following year. Secondary outcomes include subsequent prescriptions for opioid analgesics and cross-sectional lumbar spine re-imaging.

Discussion

If our study shows that adding prevalence data to spine imaging reports decreases subsequent back-related RVUs, this intervention could be easily generalized and applied to other kinds of testing, as well as other conditions where incidental findings may be common. Our study also serves as a model for cluster randomized trials that are minimal risk and highly pragmatic.

Keywords: Pragmatic randomized trial, cluster randomized trial, back pain, spine imaging, lumbar imaging, stepped wedge design

INTRODUCTION

Diagnostic imaging is often an early step in the work-up of back pain and a likely gateway to subsequent interventions. Unfortunately, these studies reveal incidental anatomic spine findings among many normal, pain-free individuals.[1-3] Such findings can be alarming to clinicians and patients alike, and may prompt unnecessary additional tests and treatments.[4-6]

Because many findings seen on spine radiography may be incidental and not responsible for the patient’s symptoms, Roland and van Tulder proposed adding statements to spine imaging reports describing the prevalence of various degenerative findings among patients without back pain.[7] In a prior pilot study, our group found that primary care patients undergoing lumbar spine imaging were less likely to receive a variety of subsequent diagnostic and therapeutic interventions if their imaging reports contained information describing the prevalence of common imaging findings among patients without back pain. However, this initial study was small and retrospective. More definitive conclusions regarding the impact of adding benchmark prevalence information to radiology reports require a large, prospective randomized controlled trial, which led us to design the Lumbar Imaging with Reporting of Epidemiology (LIRE) trial.

Our main hypothesis is that for patients of primary care providers, inserting age- and imaging modality-appropriate benchmark prevalence data into lumbar spine imaging reports will reduce overall spine-related healthcare utilization and testing as measured by our primary outcome, spine-related relative value units (RVUs), including magnetic resonance imaging (MRI) and computed tomography (CT), as well as subsequent therapeutic interventions, such as spinal injections, and spine surgeries. An important secondary hypothesis is that the intervention will reduce subsequent opioid prescriptions, as suggested by our pilot work.[8] Because advanced imaging modalities (MRI and CT) are more sensitive than radiographs for detecting a variety of incidental findings,[9, 10] a third hypothesis is that there will be a greater reduction in subsequent spine-related diagnostic and therapeutic interventions for MRI and CT compared with radiography. Our fourth hypothesis is that there will be a greater impact for findings such as disc bulges that are likely less clinically important than other findings, such as disc extrusions.

MATERIALS AND METHODS

Overview

We aim to conduct a pragmatic, cluster randomized trial, assigning primary care clinics at 4 large health systems to receive either standard lumbar spine imaging reports or reports containing age- and modality-appropriate epidemiological benchmarks for common imaging findings. We will use a stepped wedge design that randomizes participating clinics to initiate the intervention sequentially on one of five pre-specified calendar times: April 1, 2014, October 1, 2014, April 1, 2015, October 1, 2015, and April 1, 2016. The stepped wedge design allows for both between-clinic cross-sectional and within-clinic before/after longitudinal comparisons of the intervention, while assuring that all clinics eventually receive the intervention. Our primary outcome is a summary measure of back-related intervention intensity, spine-related relative value units (RVUs). We based our measure of spine-related RVUs on work done by Martin and colleagues and have devoted substantial effort towards refining the measure for use with data that we can retrieve from the participating health systems.[4, 11]

Participating Centers

We are enrolling patients at four integrated health care systems: Kaiser Permanente, Northern California (KPNC) Oakland, CA; Henry Ford Health System (HFHS) Detroit, MI; Group Health Cooperative (GHC), Seattle, WA; and Mayo Clinic Health System (MCHS), Rochester, MN. These sites contribute both geographic and demographic diversity and have comprehensive electronic medical record (EMR) systems, allowing capture of health care utilization data.

The University of Washington’s Comparative Effectiveness, Cost and Outcomes Research Center (CECORC) and Center for Biomedical Statistics (CBS) serve as the Data Coordinating Center (DCC) for LIRE. A collaborator at Oregon Health and Science University (RAD) is also part of the DCC.

Study Design

Eligibility Criteria- Clinics

We define a “clinic” as primary care if a majority of its practitioners provide adult primary care (general internal medicine, family practice, and associated mid-level providers). The criteria for clinic eligibility are that the health care providers are a distinct, readily identifiable group that has at least a subgroup of primary care providers who do not practice at another clinic that will also be part of the trial. The requirement that providers be based primarily at one clinic is to minimize cross-contamination (having the epidemiological benchmarks at one clinic influence another clinic that is not yet receiving the epidemiological benchmarks).

Eligibility Criteria- Patients

Patients are eligible for the LIRE study if a primary care provider orders a diagnostic imaging test of the lumbar spine between October 1, 2013, and September 30, 2016. We include all adult patients (>18 years old as of the date of the imaging test) who undergo plain film, CT, or MR imaging of the lumbar spine ordered by a primary care provider. We identify eligible studies from Current Procedural Terminology (CPT) codes [12] (see Table 1).

Table 1.

Eligible Lumbar Spine Imaging Procedures by CPT Codes

CPT Code | Description
72080 | Spine, thoracolumbar, 2 views
72100 | Spine, lumbosacral; 2 or 3 views
72110 | Spine, lumbosacral; minimum of 4 views
72114 | Spine, lumbosacral; complete, with bending views
72131 | CT, lumbar spine without contrast
72132 | CT, lumbar spine with contrast
72133 | CT, lumbar spine without and with contrast
72148 | MRI, lumbar spine without contrast
72149 | MRI, lumbar spine with contrast
72158 | MRI, lumbar spine without and with contrast

All patients receiving eligible lumbar spine imaging studies at participating clinics are part of the trial, unless the patient has explicitly signed a declaration opting out of all research studies.
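The eligibility rules above can be expressed as a simple filter. The following is a minimal sketch, not the study's extraction code: the function name and arguments are hypothetical, the CPT codes are those listed in Table 1, and reading "adult" as 18 years or older is an assumption about how the age criterion would be applied.

```python
from datetime import date

# CPT codes copied from Table 1.
ELIGIBLE_CPT = {
    "72080", "72100", "72110", "72114",  # radiographs
    "72131", "72132", "72133",           # CT
    "72148", "72149", "72158",           # MRI
}
WINDOW_START = date(2013, 10, 1)
WINDOW_END = date(2016, 9, 30)
MIN_AGE = 18  # assumption: "adult" read as 18 years or older


def is_eligible(cpt_code, imaging_date, age_years, ordered_by_pcp, opted_out):
    """Return True if one imaging order meets the stated inclusion criteria."""
    return (
        cpt_code in ELIGIBLE_CPT
        and WINDOW_START <= imaging_date <= WINDOW_END
        and age_years >= MIN_AGE
        and ordered_by_pcp
        and not opted_out  # patient signed a declaration opting out of research
    )
```

For example, a lumbar MRI without contrast (72148) ordered by a PCP in May 2014 for a 45-year-old who has not opted out would pass this filter, while the same order placed in October 2016 would not.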

Patient Identification

The health information system automatically identifies when a practitioner from a participating clinic orders an eligible lumbar spine imaging study. We accomplish this either through the radiology information system (RIS) or EMR, depending on the site.

Consent

The Institutional Review Boards (IRBs) at KPNC and UW ceded authority to the Group Health Research Institute (GHRI) IRB, which is the IRB of record for the overall study. The IRBs at HFHS and Mayo retained review of the protocol. Each reviewing IRB approved the trial procedures. All participating IRBs agreed that our study is minimal risk and granted waivers of both consent and Health Insurance Portability and Accountability Act (HIPAA) authorization. One site required the investigators to inform primary care providers at the end of the study that we conducted a randomized evaluation at their site.

Randomization

We randomly assigned all clinics at each site to receive the intervention at one of five fixed calendar times: April 2014, October 2014, April 2015, October 2015, and April 2016 (Figure 1).

Figure 1. Randomization Scheme

Within each recruitment site, we sorted clinics by number of primary care providers (PCPs) into tertiles (small, medium, large clinics). From each tertile we randomly selected clinics using urn-based randomization (without replacement) stratified by site and clinic size such that clinics of small, medium, and large size are equally represented in each randomization wave. We use site-specific definitions for the size of the clinic with the goal of having balance of clinic size across randomization waves within each site. Because clinic size varied substantially both within and between health systems, balancing randomization on clinic size ensures comparable time in the control and intervention periods for all clinic size strata. Table 2 displays the site-specific strata definitions and size.

Table 2.

Within-site stratified randomization schedule of clinics by number of PCPs.

Recruitment Site | Clinics/Units of Randomization (# of PCPs) | PCP strata size boundaries (# clinics): Small | Medium | Large
Site #1 | 19 (245) | 5 to 10 (7) | 11 to 15 (7) | 16 to 41 (5)
Site #2 | 26 (230) | 3 to 6 (9) | 7 to 9 (9) | 10 to 24 (8)
Site #3 | 21 (814) | 7 to 29 (7) | 33 to 39 (7) | 43 to 106 (7)
Site #4 | 34 (338) | 2 to 4 (12) | 5 to 9 (10) | 11 to 34 (12)
Total | 100 (1,627) | | |
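To make the stratified allocation concrete, the sketch below assigns the clinics at one site to the five waves after sorting them into size tertiles. It is an illustration under stated assumptions rather than the randomization program used in the trial: the function name, the wave labels, the equal-count tertile split, and the rule for carrying the wave position from one stratum to the next are all hypothetical choices.

```python
import random

WAVES = ["2014-04", "2014-10", "2015-04", "2015-10", "2016-04"]


def assign_waves(clinic_sizes, seed=0):
    """Assign the clinics at one site to waves, stratified by PCP-count tertile.

    clinic_sizes maps a clinic identifier to its number of PCPs.
    Returns a dict mapping each clinic identifier to a wave label.
    """
    rng = random.Random(seed)
    ordered = sorted(clinic_sizes, key=clinic_sizes.get)
    n = len(ordered)
    # Split the size-ordered clinics into three roughly equal strata.
    cuts = [0, n // 3, (2 * n) // 3, n]
    strata = [ordered[cuts[i]:cuts[i + 1]] for i in range(3)]

    assignment = {}
    offset = 0
    for stratum in strata:
        urn = stratum[:]
        rng.shuffle(urn)  # drawing in shuffled order is sampling without replacement
        for position, clinic in enumerate(urn):
            assignment[clinic] = WAVES[(offset + position) % len(WAVES)]
        offset += len(urn)  # the next stratum continues at the following wave
    return assignment
```

With 19 clinics, as at Site #1, this scheme places three or four clinics in each wave, and each size stratum contributes one or two clinics to every wave.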

Due to providers practicing at multiple clinics, we combined some clinics into single randomization units to reduce cross-contamination, resulting in a total of 100 randomization units. We initially identified 1,627 PCPs as units of observation within those randomization units, realizing that the number of providers is in constant flux due to new hires, retirement, etc. Additionally, because of uncertainty in classifying providers as PCPs at one of our sites (KPNC) we expect that the total number of participating PCPs will rise to between 2,400 and 4,400.

Intervention

Using an automated approach through either the RIS or EMR, we insert age- and modality-appropriate epidemiological benchmark information[3] automatically into the lumbar spine imaging reports of PCPs who work in intervention clinics. The exact method of insertion varies by site.

At GHC, we use the EMR (Epic) to insert the intervention text after the radiologist has finalized the radiology report. Similarly at Mayo, we insert the intervention text after the radiologist has finalized the report, using a system that interfaces with the RIS and the EMR (Cerner). At KPNC and HFHS, the RIS inserts the intervention at the time that the radiologist dictates the report. At KPNC, this allows the radiologist to review the text and potentially modify or remove it. Radiologists who do either must insert a text string that allows us to track these modifications before they can finalize the report.

We based the intervention text (see Appendix A) on text that our group had previously used [7, 8], with modifications that included updates from a systematic review of the prevalence of findings in patients without back pain as well as input from patients to make the text more easily understood by non-medical readers. Primary care providers who work in control clinics receive their usual imaging report.

Data Collection Methods

We collect all baseline and follow-up data from the electronic information systems which, depending on the site, include EMR and administrative data systems. We use the PopMedNet software application that enables simple creation, operation, and governance of distributed health data networks.[13, 14] This system provides a secure, mutually governed and auditable model for transporting data between the sites and the DCC. Each LIRE partner site is also a participant in the LIRE PopMedNet network and has a named and authenticated set of users.

For EMR queries, we perform two types of data extraction: 1) index files consisting of all patients receiving an eligible lumbar imaging study at participating clinics, with data limited to date and type of imaging procedure, clinic identifier, patient demographics, imaging report text; and 2) comprehensive data extraction that includes outcome and safety variables (described below). We perform an index data file query 2-3 weeks after the start of each randomization wave to verify that the intervention is being deployed appropriately. For all of the imaging text reports included in each index data set, we use text matching to verify that the intervention text was properly inserted into the reports based on clinic, imaging modality, and patient age.
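The text-matching check on index files could look something like the following sketch. The marker phrase is a placeholder (the actual intervention text is in Appendix A), the function names are hypothetical, and a real check would select the expected text by imaging modality and patient age rather than use a single phrase.

```python
# Placeholder phrase standing in for the benchmark text (the real text is in Appendix A).
MARKER = "placeholder benchmark statement"


def normalize(text):
    """Lowercase and collapse whitespace so line breaks do not defeat matching."""
    return " ".join(text.lower().split())


def check_report(report_text, in_intervention_period):
    """Classify one report as 'ok', 'missing', or 'unexpected'."""
    has_marker = normalize(MARKER) in normalize(report_text)
    if in_intervention_period and not has_marker:
        return "missing"      # intervention clinic, but the text was not inserted
    if has_marker and not in_intervention_period:
        return "unexpected"   # control clinic, but the text appears in the report
    return "ok"
```

Reports flagged as "missing" or "unexpected" would then be reviewed with the site's programming staff.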

One year following the first randomization wave and then every six months thereafter, we perform a comprehensive data query for both safety monitoring and outcomes assessment. These files are compatible with the Virtual Data Warehouse schema, and we will implement them at each of the HMO Research Network (HMORN) sites (KPNC, HFHS, and GHC) using common SAS software code. The Virtual Data Warehouse is a protocol for obtaining common data elements from health systems that are part of the HMORN.[15] This facilitates data extraction as well as comparisons between these health systems. MCHS, which is not part of the HMORN, developed a local mapping of the common data dictionary to provide semantically identical data index files and summary data files on the same schedule. Each site retains full lists of eligible patients and will be responsible for addressing patient withdrawals, additions, and deaths. Sites submit all index and comprehensive outcome files as Limited Data Sets (de-identified except for dates of service) to the DCC via PopMedNet. Sites provide unique, coded study identifiers for each patient so that index image data can be linked to comprehensive outcomes and so that patients with more than one lumbar spine image during the study period are only included in the study once (images subsequent to the index image become part of the outcome data). The DCC maintains the data dictionary for the duration of the project.

Pre-Intervention Data Collection

To determine baseline pre-intervention RVUs and outcomes prior to randomization, we included all patients at participating clinics who received eligible lumbar spine imaging studies between October 1, 2013 and March 31, 2014. These data comprised our baseline period. Additionally, we are collecting diagnosis and utilization data for patients 12 months prior to baseline (potentially as early as October 1, 2012 for a patient who received index imaging on October 1, 2013) to characterize the assembled cohort at a patient-level.

Data Collection Schedule

We will capture EMR data on patients for a minimum of one year after the index imaging examination. Two-thirds of patients will have two years of EMR-based follow-up data due to the staggered implementation of the intervention. Biannually, we will collect comprehensive EMR data for patients who reached their one and two-year follow-up date in the previous six months.

Primary and Secondary Outcomes

We will collect all outcomes passively through the EMR (Table 3). The primary outcome is the one-year summary of lumbar spine-specific RVUs, a composite measure of spine interventions that combines the overall intensity of resource utilization for back pain care into a single metric.

Table 3.

Outcomes data from EMR

Domain: Specific Elements

Hospitalizations (all inpatient stays)
  • Duration (days)
  • Diagnosis related group (DRG)
  • Associated Current Procedural Terminology (CPT) and International Classification of Diseases-9 (ICD-9) codes

Outpatient visits
  • Clinic type
  • Associated ICD-9 codes

Pharmacy data (prescribed medications)
  • National Drug Code (NDC)
  • Drug name, dose, quantity, days’ supply

Procedures (inpatient and outpatient, including subsequent imaging)
  • Procedure type and indication
  • Associated CPT and ICD-9 codes

Safety Data
  • ER visits (within 90 days of index image)
  • Death (within 6 months of index image)

We developed a summary spine-specific RVU metric using EMR data from a cohort of 5,239 patients with back pain as part of the Back pain Outcomes using Longitudinal Data (BOLD) project.[16] We developed a data dictionary and code to abstract EMR data across three health systems, two of which are LIRE sites. For each BOLD participant, we obtained a comprehensive list of procedures (CPT codes), diagnoses and provider visits (International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes),[17] and inpatient hospitalizations from the EMR. Using the Medicare Physician Fee Schedule (http://www.cms.gov/), we generated and tested a mapping algorithm to assign RVUs to over 10,000 unique CPT codes. A sample of RVUs from the 2012 CMS file is shown in Table 4. We will not capture care provided to patients outside the partner health systems that is not included in their EMRs.

Table 4.

Selected spine-related CPT codes and associated RVUs.

CPT Code | Description | RVU
72100 | X-ray exam of lower spine – 2 or 3 views | 1.07
97001 | Physical Therapy Evaluation | 2.18
99214 | Detailed office visit | 2.26
99284 | Emergency department visit – high intensity | 3.37
64483 | Epidural injection for lumbar spinal stenosis | 3.37
72131 | CT lumbar spine without contrast | 6.27
72148 | MRI Lumbar Spine without contrast | 11.31
63047 | Removal of spinal lamina (laminectomy) | 32.89
22804 | Fusion of the spine | 71.60

To obtain a spine-related summary RVU from CPT and ICD-9-CM codes, we used an existing validated algorithm.[11, 18, 19] Aggregating across CPT codes identified by this algorithm through one year after the index diagnostic imaging test yields the total spine-specific RVUs. Data extraction from the EMRs will likely be affected by the conversion to ICD-10. Because ICD-10 is more granular than ICD-9, it is possible to develop a crosswalk algorithm that recovers from an ICD-10 dataset all of the data elements that we identify using ICD-9. Since each health system EMR has idiosyncratic elements, we plan to develop this crosswalk for each health system.
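The aggregation step can be illustrated with a short sketch. It is not the validated algorithm cited above: it assumes the spine-related events have already been identified, uses only the handful of RVU values shown in Table 4, and treats events on the index date as falling inside the window, which is an assumption. The function and variable names are hypothetical.

```python
from datetime import date, timedelta

# RVU values copied from Table 4; a real mapping covers thousands of CPT codes.
RVU_BY_CPT = {
    "72100": 1.07, "97001": 2.18, "99214": 2.26,
    "99284": 3.37, "64483": 3.37, "72131": 6.27,
    "72148": 11.31, "63047": 32.89, "22804": 71.60,
}


def spine_rvus(index_date, events, window_days=365):
    """Sum RVUs for spine-related CPT events within the follow-up window.

    events is an iterable of (service_date, cpt_code) pairs that an upstream
    step has already flagged as spine-related.
    """
    end = index_date + timedelta(days=window_days)
    total = 0.0
    for service_date, cpt_code in events:
        if index_date <= service_date <= end and cpt_code in RVU_BY_CPT:
            total += RVU_BY_CPT[cpt_code]
    return round(total, 2)
```

A patient with an index radiograph (72100) and a physical therapy evaluation (97001) one month later would accumulate 1.07 + 2.18 = 3.25 RVUs; an MRI performed 14 months after the index date would fall outside the one-year window.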

We will also derive important secondary outcomes from EMR and administrative data. These include: a longer-term back-specific RVU summary metric reflecting back-related care up to two years following the index imaging study date; an indicator of opioid prescribing after the index imaging study; subsequent advanced imaging (number of MRI or CT studies) within 90 days and 12 months after the index imaging study; spine injections, spine surgery and other back-related medical costs over the 2 years.

Imaging Finding Abstraction

We will use an approach that combines statistical machine learning with rule-based natural language processing (NLP) to abstract the necessary data from anonymized radiology report text. The health systems will export the imaging reports from their EMR and we will identify common imaging findings that are likely less clinically important (e.g., disc bulge, disc space narrowing, annular fissure) versus more important (central canal stenosis, nerve root compression, disc extrusion) (Table 5).[1, 20] We will also identify less common but potentially clinically important findings such as possible tumor or infection.

Table 5.

Imaging Findings Likely Clinically Important vs. Likely Clinically Unimportant

Likely Clinically Important
  • Moderate or severe stenosis*
  • Disc extrusion
  • Nerve root displacement or compression
  • Endplate edema (Type 1 endplate change)
  • Grade 2 or higher listhesis

Likely Clinically Unimportant
  • Annular fissure
  • Disc height loss
  • Mild stenosis (central, lateral recess or foraminal)
  • Nerve root contact without displacement/compression
  • Grade 1 listhesis
  • Disc desiccation
  • Disc bulge
  • Disc protrusion
  • Facet degeneration (any severity)

* central, lateral recess or foraminal

For the machine learning approach, we plan to use the NLTK library in Python. [21] We then plan to create a “feature set” of variables derived from report text that optimizes prediction. The features will likely include basic N-grams (unigrams, bigrams, trigrams), sections of the report, and type of imaging examination.

For the rule-based portion, we plan to code regular expressions (REGEX) based on a list of synonyms for imaging findings developed by our study content experts (JGJ, SDR, RAD). We also plan to account for negation by adapting the ConText algorithm. [22] We will evaluate our prediction accuracies using a random subset of annotated reports. Based on a pilot investigation, agreement among the three annotators was substantial (kappa ranging from 0.65-1) for various imaging findings.
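A rule-based pass of this kind might be sketched as follows. The synonym patterns and negation cues are hypothetical stand-ins for the expert-developed lists, and the negation rule (a cue anywhere earlier in the same sentence) is a deliberately simplified illustration, not the ConText algorithm itself.

```python
import re

# Hypothetical synonym patterns; the study's expert-built list is not reproduced here.
FINDING_PATTERNS = {
    "disc_bulge": re.compile(r"\b(?:disc|disk)\s+bulg(?:e|es|ing)\b", re.I),
    "disc_extrusion": re.compile(r"\b(?:disc|disk)\s+extrusions?\b", re.I),
}
# Hypothetical negation cues, checked only in the text preceding a mention.
NEGATION = re.compile(r"\b(?:no|without|negative for)\b", re.I)


def find_findings(report):
    """Map each mentioned finding to True (affirmed) or False (only negated)."""
    results = {}
    # Split into sentences at terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        for name, pattern in FINDING_PATTERNS.items():
            match = pattern.search(sentence)
            if match is None:
                continue
            negated = bool(NEGATION.search(sentence[:match.start()]))
            # One affirmed mention anywhere in the report marks the finding present.
            results[name] = results.get(name, False) or not negated
    return results
```

Applied to "There is a broad-based disc bulge at L4-L5. No evidence of disc extrusion.", this sketch marks the disc bulge as present and the disc extrusion as negated.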

We will use these data to examine whether there is any differential effect of the intervention according to the results in the imaging report.

Data quality control (QC)

As biannual EMR data accumulate, we will use established algorithms to ensure data quality and validation. We will focus evaluation of EMR data quality primarily on patterns of completeness and substitution (use of one code for another) in CPT codes (inpatient and outpatient), ICD-9-CM diagnosis codes, and NDC pharmacy records. For QC of patient-level EMR data, we will check that the data are of the appropriate type and format specified by the data dictionary. For system-to-system and wave-to-wave QC evaluation of EMR data, we will generate diagnostic plots comparing rates of CPT code endorsement to look for unusual patterns of utilization or use of specific CPT codes.[23] Similarly, we will use site-by-site cross-tabulation and diagnostic plots to examine patterns in prescribed medications by NDC code. In an additional QC step, we will cross-tabulate the top 100 CPT codes by frequency and RVUs for each site in order to detect unusual patterns of missing codes or code substitution. We will document and resolve any issues that we discover through direct discussion with programming staff and site principal investigators.

Analytic Approach

To evaluate the impact of inserting prevalence data into an imaging report, we will use longitudinal regression methods such as linear mixed effects models (LMMs) or generalized linear mixed models (GLMMs) with robust standard errors for all primary and secondary outcome measures. Mixed models provide an efficient method for analysis of longitudinal or multilevel data and will be the basis of our primary analysis approach. In secondary analyses we will use generalized estimating equations (GEE), adopting simple exchangeable correlation models at the clinic level to determine whether conclusions appear sensitive to model specification.

We will use a time-varying intervention status indicator Statuskt (0 = control, 1 = intervention, for clinic k at time t) for the primary longitudinal model for back pain specific RVUs. We will adopt the functional form given below for the specific regression model, with fixed effects for time (linear), age (18-39, 40-59, 60+, using 2 dummy variables), imaging modality type (plain film or MRI, using dummy variables), clinic size (small, medium, large, using 2 dummy variables), and site (GHC, HFHS, KPNC, MCHS, using 3 dummy variables) in addition to random effects for provider, clinic, and intervention status:

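$$
Y_{ijk} = \underbrace{\beta_0 + \beta_1 \mathrm{Time}_t + \beta_2^{T} \mathrm{Age}_{ijk} + \beta_3^{T} \mathrm{Modality}_{ijk} + \beta_4^{T} \mathrm{Size}_k + \beta_5^{T} \mathrm{Site}_k + \lambda_0 \mathrm{Status}_{kt}}_{\text{mean model}} + \underbrace{b_{k,0} + b_{k,1} \mathrm{Status}_{kt}}_{\text{clinic random effects}} + \underbrace{a_{jk,0} + e_{ijk}}_{\text{provider random effects and subject errors}}
$$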

We will collect the outcome measure Yijk on patient i (i= 1,2,…,nj) under primary care provider j (j = 1,2,…,nk) enrolled in time period t (t = 0,1,2,… ,5) in order to evaluate the overall effect of the intervention at the level of the clinic k (k = 1,2,…,100). The primary parameter of interest, λ0, represents the average effect of the intervention adjusting for temporal trends (Timet), clinic characteristics (Sitek, Sizek), and individual covariates (Ageijk, Modalityijk).
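The time-varying indicator Statuskt follows directly from each clinic's randomization wave. The sketch below assumes that period t = 0 is the pre-intervention baseline and that periods 1 through 5 begin at the five crossover dates; the function names and clinic labels are hypothetical.

```python
def status(wave_k, t):
    """Return 1 if a clinic randomized to wave_k has crossed over by period t."""
    return 1 if t >= wave_k else 0


def status_schedule(waves, periods=range(6)):
    """Build the 0/1 intervention schedule for each clinic across periods.

    waves maps a clinic identifier to its wave number (1 to 5).
    """
    return {clinic: [status(w, t) for t in periods] for clinic, w in waves.items()}
```

A clinic in wave 1 has the schedule [0, 1, 1, 1, 1, 1] and a clinic in wave 5 has [0, 0, 0, 0, 0, 1], which is the staircase pattern that gives the stepped wedge design its name.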

Our second hypothesis is that the intervention will reduce subsequent opioid prescriptions. Our pilot data suggested this effect and it could be mediated by the intervention reassuring providers and patients regarding the benignity of imaging findings. Each clinical site data system utilizes the national drug code or similar classification[24] that allows us to determine the morphine equivalent dose (MED) for each opioid prescription. [25] We will then calculate the total MED prescribed per patient within 90 days and 12 months of the index image. We will then use the same approach for the analysis of MED as we used for back-specific RVUs, using a time-varying intervention status indicator Statuskt (0 = control, 1 = intervention, for clinic k at time t) for the primary longitudinal model for MED prescriptions.
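As a rough illustration of the MED calculation, the sketch below sums strength × quantity × conversion factor over prescriptions filled within a window after the index image. The conversion factors are commonly cited illustrative values and are an assumption here, not the study's conversion table; the function name and the tuple layout are also hypothetical.

```python
from datetime import date, timedelta

# Illustrative oral morphine-equivalent factors (assumed, not the study's table).
ILLUSTRATIVE_FACTORS = {
    "morphine": 1.0,
    "hydrocodone": 1.0,
    "oxycodone": 1.5,
    "codeine": 0.15,
}


def total_med(index_date, prescriptions, window_days):
    """Total morphine-equivalent milligrams prescribed within the window.

    prescriptions is an iterable of (fill_date, drug, strength_mg, quantity).
    """
    end = index_date + timedelta(days=window_days)
    total = 0.0
    for fill_date, drug, strength_mg, quantity in prescriptions:
        if index_date <= fill_date <= end:
            total += strength_mg * quantity * ILLUSTRATIVE_FACTORS[drug]
    return total
```

Under these assumed factors, thirty 5 mg oxycodone tablets filled nine days after the index image contribute 5 × 30 × 1.5 = 225 mg MED to both the 90-day and 12-month totals.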

Our third hypothesis is that there will be a differential effect of the intervention according to imaging modality: there will be a greater reduction in subsequent spine-related diagnostic and therapeutic interventions for MRI compared with radiography. In order to test this hypothesis, we will analyze patient-level data according to the appropriate LMM or GLMM given above, but including the interactions between Modalityijk indicators (modeled using an indicator variable coding MRI, with plain film as the reference) and Statuskt. A test of the interaction terms (2 degree-of-freedom Wald test) will be used to test the null hypothesis that the effect of the intervention does not vary according to the imaging modality.

Our fourth hypothesis is that there will be a greater impact for findings that are likely less clinically important than findings that are likely more clinically important. We will use an additional variable, ImageFindingijk, that takes the value 1 if an important image finding is present, and 0 otherwise (see Table 5). We will test the null hypothesis that the interaction between ImageFindingijk and Statuskt is zero using a Wald test.

Power for Primary Outcome

We randomized 100 clinics with approximately 1,700 PCPs. We calculated statistical power for the primary outcome measure, spine-specific RVUs, using data from the BOLD study to inform key design parameter estimates. From BOLD, we identified 639 patients at KPNC and HFHS who had a qualifying lumbar image within six weeks of a PCP visit, the majority (74%) of which occurred within seven days. As expected, patient-level RVUs were positively skewed and we will therefore use an approximately normalizing transformation of log (RVU + 1) but make interpretations regarding effect size on the original RVU scale.
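The transformation and its back-transformation can be sketched in a few lines. The helper names are hypothetical; the point is only that an additive effect on the log (RVU + 1) scale corresponds to a multiplicative change in RVU + 1 on the original scale, which is why effects are reported as percentage changes.

```python
import math


def to_log_scale(rvu):
    """Apply the approximately normalizing transformation log(RVU + 1)."""
    return math.log1p(rvu)


def from_log_scale(value):
    """Invert the transformation to return to the original RVU scale."""
    return math.expm1(value)


def shifted_rvu(rvu, log_effect):
    """RVU implied by adding log_effect on the transformed scale."""
    return from_log_scale(to_log_scale(rvu) + log_effect)
```

For instance, a log-scale effect of log(0.95) moves a patient with 9.0 RVUs to 0.95 × (9.0 + 1) − 1 = 8.5 RVUs.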

With log-transformed BOLD Registry RVU data, we fit a linear mixed effects model adjusting for image type (advanced vs. plain film) and study recruitment site and estimated the variance components for clinic (0.026) and the residual error term (1.230). The observed intra-class correlation coefficient (ICC) across clinics was 0.013 (95% CI: 0.000 to 0.046). In this subset of BOLD data, the number of PCPs with multiple patients was too few to inform the PCP-level variance component and it is therefore conservatively included in the error term variance for power simulations.

We considered fixed numbers of clinics and PCPs for each simulation and we assumed that each provider would provide data for all study time periods. For a range of potential RVU effect sizes, we generated 1,000 simulated data sets and performed mixed model estimation with each data set. For the primary outcome measure of back-specific RVUs, the study has 89% power to detect reductions of 5.0% or larger in the median back-specific RVU. For a patient receiving a lumbar image, a 5% average reduction in spine-specific RVU translates into approximately one fewer lumbar CT scan (for example) on average compared to a patient unexposed to the LIRE intervention.

Data Safety Monitoring Plan

Two external safety officers will monitor safety endpoints (Emergency Room (ER) visits within 90 days and deaths within six months of index imaging) every six months for the duration of the study. Safety officers will utilize absolute relative risk ratio monitoring thresholds of 1.15 and 1.10 for comparing 90-day ER visit and death rates by intervention group, with adjustment for pre-intervention rates at each site and patient-specific Quan comorbidity index.[26]

DISCUSSION

The imaging of patients with back pain has the potential for both benefit and harm. Imaging can identify conditions such as central spinal stenosis that in the appropriate clinical setting have a good evidence base for the benefits of certain treatments such as surgery.[27] However, because of the high prevalence of a variety of imaging findings in patients without back pain, imaging may lead to a cascade of needless interventions that can be both harmful and expensive. We describe a study that tests a strategy to mitigate the problematic consequences of potentially misleading lumbar spine imaging findings by including benchmark prevalence data in the routine imaging report.

We adopted the concept of adding epidemiologic benchmarks from Roland and van Tulder, who proposed adding statements to plain film imaging reports describing the prevalence of different degenerative findings in patients without back pain.[7] A small, retrospective, pilot study published by our group suggested that patients who had the benchmark epidemiological information included in their radiology report were about three and a half times less likely to get a narcotic prescription compared with patients who did not get this information in their imaging report.[8] This was statistically significant after controlling for a variety of factors (age, gender, severity of condition) (OR=0.29, P=0.01). Subsequent high-cost imaging examinations (CT and MRI) (OR=0.22) and physical therapy (PT) referrals (OR=0.55, P=0.06) were also reduced in the “statement” group, although these differences were not statistically significant.

Why might this simple, inexpensive approach be effective? We surmised that the benchmark information would reassure both patients and physicians and result in fewer downstream interventions, including both diagnostic (cross-sectional imaging) and treatment (PT, opiates, surgeries) interventions.

LIRE is a pragmatic cluster randomized trial of a minimal-risk intervention that we believe can serve as a model for future pragmatic trials. We designed LIRE to be as pragmatic as possible. Thorpe et al.[28] described a tool that they named the pragmatic–explanatory continuum indicator summary (PRECIS) to determine where on the spectrum of explanatory to pragmatic a trial falls with respect to 10 domains, and it has recently been modified to the PRECIS-2 tool by Loudon et al.[29] Figure 2 is the PRECIS-2 diagram of LIRE, demonstrating the highly pragmatic nature of the study design.[30]

Figure 2. Pragmatic-Explanatory Continuum Indicator Summary (PRECIS)

LIRE is 1 of 7 pilot pragmatic randomized trials that are part of the initial round of funding by the NIH Health Care Systems Collaboratory.[31] LIRE and the other pilot pragmatic trials are a first step in the Collaboratory's work.

One potential limitation of LIRE is that the intervention may lead PCPs to devalue the importance of all imaging findings and thus under-treat conditions that could have benefitted from treatment. However, given the climate of over-diagnosis and over-treatment of back pain in the U.S., this is less likely to occur than in other settings. Another limitation is that, since we are not collecting patient-reported outcomes (PROs), we cannot directly comment on the effect of the intervention on potentially important patient outcomes such as functional status and pain. We are also not able to collect information on work loss or disability. Nor can we comment on the effectiveness of the intervention in potentially important subgroups such as patients with acute vs. chronic back pain or those who satisfy the National Quality Forum’s criteria for imaging. We deliberately chose not to collect patient-reported data, aware that collecting PROs would make the trial less pragmatic. Given our projected sample size of over 160,000 patients, collecting PROs would not have been practical.

Because of the large size of our project, abstraction of the radiology reports is a challenge. We plan to use natural language processing (NLP) as a tool to help us with this abstraction task. NLP is reasonably accurate at identifying clinical findings from radiology reports. Chapman et al. reported sensitivities of 86-92%, specificities of 78-91%, and positive predictive values (PPV) of 72-85% for NLP in X-ray reports.[32] Despite this reported success as well as our promising pilot experience, once we start applying NLP to the larger dataset we may discover obstacles that will prompt us to revise our approach.
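
For reference, accuracy measures of the kind quoted above are computed by comparing NLP output against a manually abstracted reference standard. The sketch below uses hypothetical counts (not data from Chapman et al. or from LIRE), and the function name is ours.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # findings present that NLP detected
        "specificity": tn / (tn + fp),  # findings absent that NLP ruled out
        "ppv": tp / (tp + fp),          # NLP positives that were truly present
    }


# Hypothetical validation sample of 200 reports.
measures = accuracy_measures(tp=45, fp=15, fn=5, tn=135)
```

In this made-up sample, sensitivity and specificity are both 0.90 and PPV is 0.75. PPV depends on how common the finding is in the sample, so it would be expected to vary across findings with different prevalence.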

If the current trial shows that adding prevalence data to spine imaging reports decreases subsequent back-related RVUs, the method is likely to be generalizable to other conditions and to other kinds of testing (e.g., other imaging procedures such as chest CT, abdominal CT, etc.) where incidental findings may be common.[33] Because the cost of the intervention itself is minimal, the potential cost-effectiveness of this intervention, if successful, is enormous. If it reduces unnecessary treatment, the intervention could prevent avoidable complications, decreasing costs and improving patient outcomes.

Our project, the Lumbar Imaging with Reporting of Epidemiology (LIRE) study, also serves as a model for a cluster randomized trial that is minimal risk and highly pragmatic. If successful, our use of the stepped wedge design on a large scale could become a desirable model for future studies and represent an important strategy for health services researchers.

CONCLUSION

The long-term public health significance of the LIRE study is that, if effective, our simple, inexpensive intervention has the potential to reduce unnecessary, inappropriate, and costly care not only for back pain, but also for a wide range of other conditions, since it could easily be applied to other diagnostic tests (e.g., other imaging tests, laboratory tests, genetic testing). If our study demonstrates that the intervention decreases subsequent health-resource use, adding epidemiologic benchmarks to diagnostic test reporting could become the dominant paradigm for communicating all diagnostic information, resulting in substantially more appropriate clinical care for minimal implementation costs.

Supplementary Material

Acknowledgements

The study is supported by the National Institutes of Health 4UH3AR066795-02

Funding Source: NIH 1UH2AT007766-01 and 4UH3AR066795-02

LIST OF ABBREVIATIONS

  • ANCOVA: analysis of covariance
  • BOLD: Back pain Outcomes using Longitudinal Data
  • CPT: Current Procedural Terminology
  • DCC: data coordinating center
  • EMR: electronic medical record
  • ER: emergency room
  • GEEs: generalized estimating equations
  • GLMMs: generalized linear mixed models
  • HIPAA: Health Insurance Portability and Accountability Act
  • ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification
  • LMMs: linear mixed effects models
  • NDC: National Drug Codes
  • NLP: natural language processing
  • QCC: quality control
  • RIS: radiology information system

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Trial Registration: Clinicaltrials.gov NCT02015455

Competing Interests

Dr. Jarvik has the following potential conflicts of interest. Although they do not relate directly to the subject of this manuscript, he lists them in the spirit of full disclosure. He served on the Comparative Effectiveness Advisory Board for GE Healthcare through October 2012. He is a co-founder and stockholder of PhysioSonics, a high intensity focused ultrasound company, and receives royalties for intellectual property. He serves as a consulting medical editor for Google as well as a consultant for HealthHelp, a radiology benefits management company. Finally, he is a co-Editor of Evidence-based Neuroradiology published by Springer Publishing.

Authors’ contributions

JGJ, BAC, RAD, PJH developed the original concept of the study and JGJ, BAC, ALA, BWB, RAD, JLF, SSH, KTJ, DFK, JAT, PJH developed the design of the LIRE study. KTJ is the project director. All authors have read and approved the final version of the article. JGJ drafted the manuscript and all co-authors made contributions to revisions.

References

  • 1. Jarvik JJ, Hollingworth W, Heagerty P, Haynor DR, Deyo RA. The Longitudinal Assessment of Imaging and Disability of the Back (LAIDBack) Study: baseline data. Spine. 2001;26(10):1158–1166. doi: 10.1097/00007632-200105150-00014.
  • 2. Jensen MC, Brant-Zawadzki MN, Obuchowski N, Modic MT, Malkasian D, Ross JS. Magnetic resonance imaging of the lumbar spine in people without back pain [see comments] N Engl J Med. 1994;331(2):69–73. doi: 10.1056/NEJM199407143310201.
  • 3. Brinjikji W, Luetmer PH, Comstock B, Bresnahan BW, Chen LE, Deyo RA, Halabi S, Turner JA, Avins AL, James K, et al. Systematic Literature Review of Imaging Features of Spinal Degeneration in Asymptomatic Populations. AJNR Am J Neuroradiol. 2014 doi: 10.3174/ajnr.A4173.
  • 4. Jarvik JG, Gold LS, Comstock BA, Heagerty PJ, Rundell SD, Turner JA, Avins AL, Bauer Z, Bresnahan BW, Friedly JL, et al. Association of early imaging for back pain with clinical outcomes in older adults. JAMA. 2015;313(11):1143–1153. doi: 10.1001/jama.2015.1871.
  • 5. Graves JM, Fulton-Kehoe D, Jarvik JG, Franklin GM. Health care utilization and costs associated with adherence to clinical practice guidelines for early magnetic resonance imaging among workers with acute occupational low back pain. Health Serv Res. 2014;49(2):645–665. doi: 10.1111/1475-6773.12098.
  • 6. Graves JM, Fulton-Kehoe D, Jarvik JG, Franklin GM. Early Imaging for Acute Low Back Pain: One-year Health and Disability Outcomes Among Washington State Workers. Spine (Phila Pa 1976) 2012;37(18):1617–1627. doi: 10.1097/BRS.0b013e318251887b.
  • 7. Roland M, van Tulder M. Should radiologists change the way they report plain radiography of the spine? Lancet. 1998;352(9123):229–230. doi: 10.1016/S0140-6736(97)11499-4.
  • 8. McCullough BJ, Johnson GR, Martin BI, Jarvik JG. Lumbar MR imaging and reporting epidemiology: do epidemiologic data in reports affect clinical management? Radiology. 2012;262(3):941–946. doi: 10.1148/radiol.11110618.
  • 9. Jarvik JG, Deyo RA. Diagnostic evaluation of low back pain with emphasis on imaging. Ann Intern Med. 2002;137(7):586–597. doi: 10.7326/0003-4819-137-7-200210010-00010.
  • 10. Jarvik JG, Hollingworth W, Martin B, Emerson SS, Gray DT, Overman S, Robinson D, Staiger T, Wessbecher F, Sullivan SD, et al. Rapid magnetic resonance imaging vs radiographs for patients with low back pain: a randomized controlled trial. JAMA. 2003;289(21):2810–2818. doi: 10.1001/jama.289.21.2810.
  • 11. Martin B, Mirza SK, Lurie JD, Tosteson ANA, Deyo RA. International Society for the Study of the Lumbar Spine (ISSLS) Scottsdale, AZ: May 14, 2013. Validation of an administrative coding algorithm to identify back-related degenerative diagnoses. 2013.
  • 12. AMA - About CPT® [http://www.ama-assn.org/ama/pub/physician-resources/solutions-managing-your-practice/coding-billing-insurance/cpt/about-cpt.page]
  • 13. Vogel J, Brown JS, Land T, Platt R, Klompas M. MDPHnet: secure, distributed sharing of electronic health record data for public health surveillance, evaluation, and planning. Am J Public Health. 2014;104(12):2265–2270. doi: 10.2105/AJPH.2014.302103.
  • 14. PopMedNet™ Distributed Research Network Technologies for Population Medicine. [http://www.popmednet.org]
  • 15. Ross TR, Ng D, Brown JS, Pardee R, Hornbrook MC, Hart G, Steiner JF. The HMO Research Network Virtual Data Warehouse: A Public Data Model to Support Collaboration. EGEMS (Wash DC) 2014;2(1):1049. doi: 10.13063/2327-9214.1049.
  • 16. Jarvik JG, Comstock BA, Bresnahan BW, Nedeljkovic SS, Nerenz DR, Bauer Z, Avins AL, James K, Turner JA, Heagerty P, et al. Study Protocol: The Back pain Outcomes using Longitudinal Data (BOLD) Registry. BMC Musculoskelet Disord. 2012;13(1):64. doi: 10.1186/1471-2474-13-64.
  • 17. International Classification of Diseases, Ninth Revision (ICD-9) doi: 10.7326/0003-4819-88-3-424. [http://www.cdc.gov/nchs/icd/icd9.htm]
  • 18. Martin BI, Gerkovich MM, Deyo RA, Sherman KJ, Cherkin DC, Lind BK, Goertz CM, Lafferty WE. The association of complementary and alternative medicine use and health care expenditures for back and neck problems. Med Care. 2012;50(12):1029–1036. doi: 10.1097/MLR.0b013e318269e0b2.
  • 19. Martin BI, Mirza SK, Franklin GM, Lurie JD, MacKenzie TA, Deyo RA. Hospital and surgeon variation in complications and repeat surgery following incident lumbar fusion for common degenerative diagnoses. Health Serv Res. 2013;48(1):1–25. doi: 10.1111/j.1475-6773.2012.01434.x.
  • 20. Jarvik JG, Hollingworth W, Heagerty PJ, Haynor DR, Boyko EJ, Deyo RA. Three-year incidence of low back pain in an initially asymptomatic cohort: clinical and imaging risk factors. Spine. 2005;30(13):1541–1548. doi: 10.1097/01.brs.0000167536.60002.87. discussion 1549.
  • 21. Bird S. NLTK: the natural language toolkit. Proceedings of the COLING/ACL on Interactive presentation sessions. 2006.
  • 22. Harkema H, Dowling JN, Thornblade T, Chapman WW. ConText: an algorithm for determining negation, experiencer, and temporal status from clinical reports. Journal of biomedical informatics. 2009;42(5):839–851. doi: 10.1016/j.jbi.2009.05.002.
  • 23. Gibson G. Hints of hidden heritability in GWAS. Nat Genet. 2010;42(7):558–560. doi: 10.1038/ng0710-558.
  • 24. NDC: http://www.fda.gov/Drugs/InformationOnDrugs/UCM142438.htm.
  • 25. MED: http://www.agencymeddirectors.wa.gov/opioiddosing.asp.
  • 26. Quan H, Li B, Couris CM, Fushimi K, Graham P, Hider P, Januel JM, Sundararajan V. Updating and validating the Charlson comorbidity index and score for risk adjustment in hospital discharge abstracts using data from 6 countries. Am J Epidemiol. 2011;173(6):676–682. doi: 10.1093/aje/kwq433.
  • 27. Weinstein JN, Tosteson TD, Lurie JD, Tosteson AN, Blood E, Hanscom B, Herkowitz H, Cammisa F, Albert T, Boden SD, et al. Surgical versus nonsurgical therapy for lumbar spinal stenosis. N Engl J Med. 2008;358(8):794–810. doi: 10.1056/NEJMoa0707136.
  • 28. Thorpe K, Zwarenstein M, Oxman A, Treweek S, Furburg C, Altman D, Tunis S, Bergel E, Harvey I, Magid D, et al. A pragmatic–explanatory continuum indicator summary (PRECIS): a tool to help trial designers. CMAJ. 2009 doi: 10.1503/cmaj.090523.
  • 29. Loudon K, Treweek S, Sullivan F, Donnan P, Thorpe KE, Zwarenstein M. The PRECIS-2 tool: designing trials that are fit for purpose. BMJ. 2015;350:h2147. doi: 10.1136/bmj.h2147.
  • 30. PRECIS toolkit: https://crs.dundee.ac.uk/precis/Help/Documentation/Toolkit.
  • 31. Johnson KE, Tachibana C, Coronado GD, Dember LM, Glasgow RE, Huang SS, Martin PJ, Richards J, Rosenthal G, Septimus E, et al. A guide to research partnerships for pragmatic clinical trials. BMJ. 2014;349:g6826. doi: 10.1136/bmj.g6826.
  • 32. Chapman WW, Fizman M, Chapman BE, Haug PJ. A comparison of classification algorithms to automatically identify chest X-ray reports that support pneumonia. Journal of biomedical informatics. 2001;34(1):4–14. doi: 10.1006/jbin.2001.1000.
  • 33. Jacobs PC, Mali WP, Grobbee DE, van der Graaf Y. Prevalence of incidental findings in computed tomographic screening of the chest: a systematic review. J Comput Assist Tomogr. 2008;32(2):214–221. doi: 10.1097/RCT.0b013e3181585ff2.
