Author manuscript; available in PMC: 2014 Jan 1.
Published in final edited form as: Osteoarthritis Cartilage. 2012 Oct 23;21(1):110–116. doi: 10.1016/j.joca.2012.10.011

The Osteoarthritis Initiative (OAI) magnetic resonance imaging quality assurance update

E Schneider †,‡,*, M NessAiver §
PMCID: PMC3629918  NIHMSID: NIHMS423324  PMID: 23092792

Abstract

Objective

Longitudinal quantitative evaluation of cartilage disease requires reproducible measurements over time. We report 8 years of quality assurance (QA) metrics for quantitative magnetic resonance (MR) knee analyses from the Osteoarthritis Initiative (OAI) and show the impact of MR system, phantom, and acquisition protocol changes.

Method

Key 3 T MR QA metrics, including signal-to-noise, signal uniformity, T2 relaxation times, and geometric distortion, were quantified monthly on two different phantoms using an automated program.

Results

Over 8 years, phantom measurements showed root-mean-square coefficient-of-variation reproducibility of <0.25% (190.0 mm diameter) and <0.20% (148.0 mm length), resulting in spherical volume reproducibility of <0.35%. T2 relaxation time reproducibility varied from 1.5% to 5.3%; seasonal fluctuations were observed at two sites. All other QA goals were met except: slice thicknesses were consistently larger than nominal on turbo spin echo images; knee coil signal uniformity and signal level varied significantly over time.

Conclusions

The longitudinal variations for a spherical volume should have minimal impact on the accuracy and reproducibility of cartilage volume and thickness measurements, as they are an order of magnitude smaller than those reported for either unpaired or paired (repositioning and reanalysis) precision errors. This stability should enable direct comparison of baseline and follow-up images. Cross-comparison of the geometric results from all four OAI sites reveals that the MR systems do not statistically differ, which enables results to be pooled. MR QA results identified technical issues similar to those previously published. Geometric accuracy stability should have the greatest impact on quantitative analysis of longitudinal change in cartilage volume and thickness precision.

Keywords: Magnetic resonance, Quality assurance (QA), Clinical trial, Osteoarthritis Initiative

Introduction

The Osteoarthritis Initiative (OAI), a public-private partnership jointly sponsored by the National Institutes of Health (NIH) and the pharmaceutical industry, is targeted at discovering promising biomarkers for identifying development and progression of symptomatic knee osteoarthritis (OA)1. The OAI enrolled a total of 4,796 men and women ages 45–79, who either have, or are at increased risk of developing, knee OA. This longitudinal natural history study will evaluate these subjects for up to 8 years follow-up with radiography and magnetic resonance (MR) imaging. The OAI MR protocol2 was designed to allow thorough clinical and research evaluations of the femorotibial and patellofemoral joints of both knees and the study utilized matched, dedicated 3 Tesla (T) (Trio, Siemens Medical Solutions, Erlangen, Germany) MR systems at four clinical sites.

Longitudinal quantitative evaluation of cartilage disease including cartilage volume and thickness requires reproducible MR measurements over time. One component of measurement reproducibility is MR system stability and consistency. Standardized quality assurance (QA) methods and centralized automated image analysis were used to identify and to correct slowly developing problems, such as gradient field changes, eddy current increases or magnetic field decreases, prior to their impacting image quality or quantitative analysis results. The OAI MR QA process3 was designed to achieve consistency across the four sites enabling longitudinal quantitative analysis and pooling results to increase the statistical power. This report presents automated QA analysis results from the first 8 years (January 2004 through January 2012) and quantifies the variations in the QA metrics required for longitudinal, quantitative analysis of knee MR images. Cross-calibration of the OAI MR systems was presented in a prior work3. The timing and impact of changes in the MR systems, phantoms, and acquisition protocols over the course of the study are identified.

Methods

The four OAI MR facilities, located in Columbus, OH, Pittsburgh, PA, Pawtucket, RI, and Baltimore, MD, were outfitted with matched 3 T Siemens Trio MR systems (Siemens Medical Solutions, Erlangen, Germany), one quadrature transmit-receive head coil (USA Instruments, Aurora, OH) and three quadrature transmit-receive knee coils (USA Instruments, Aurora, OH). Service agreements with monthly preventative maintenance visits by the manufacturer’s service engineers were key components of the QA process in addition to daily, weekly, monthly, and annual QA acquisitions by the MR technologists3. Two aqueous phantoms were used: the American College of Radiology (ACR) MR accreditation phantom4–7 was measured in the head coil with a phantom holder (Chamco, Inc., Cocoa, FL); and a custom phantom (OAI)3 was measured in the knee coil. Monthly and annual QA analyses were centrally performed using automated image analysis software (SimplyPhysics, Baltimore, MD) with performance specifications that were more restrictive than, or equivalent to, variations allowed by the manufacturer or the ACR4–8.

The larger phantom (ACR; 148 mm length, 190 mm diameter) was previously shown3 to be more sensitive to small MR system changes than the smaller, study-specific OAI phantom (120 mm length, 115 mm diameter) due to sampling of a larger gradient field. For this reason, the majority of results focus on longitudinal measurements of the ACR phantom. Furthermore, monthly QA with the ACR phantom was used to identify and initiate an additional service call to correct drift or any other performance deficits in the MR system. Measurements included signal-to-noise ratio (SNR), image uniformity, spatial accuracy, eddy current and gradient calibration. The monthly QA acquisition with the smaller OAI knee phantom was used to quantify the effects of MR system calibration on a knee image and included assessments of length, diameter and volume changes for cartilage quantification. The ACR and OAI phantom QA acquisitions were performed 2 weeks apart and thus the MR system performance was assessed twice each month. All QA acquisition protocols reflected the contrast and spatial resolution of the knee acquisitions; QA acquisitions using the knee coil were performed positioned 60 mm offset from magnet isocenter along the right–left (RL) axis to replicate the same physical locations used for right (R60) and left (L60) knee MR exams.

In November 2005, the spatial resolution of both the monthly and annual ACR acquisitions was improved to better reflect the spatial resolution of the OAI knee acquisition (changed to: 555 × 704 matrix with 0.355 mm × 0.45 mm pixel dimension). No other QA acquisition protocol changes occurred during the study. MR system hardware and software changes were minimized; however, changes were inevitably required to maintain or replace broken hardware, including phantoms. The knee RF coils were the same brand and design throughout this period; while one facility (site 1) was able to use the same knee coil for the entire duration, the other three sites had failures that required repair and/or replacement of the knee coil. During this 8-year period, all the MR systems underwent one hardware upgrade to enable continued maintenance (TIM Trio, Siemens Medical Solutions, Erlangen, Germany) that included a change in the quadrature head coil (QED, Quality Electrodynamics, LLC, Mayfield Village, OH) but retained the identical quadrature knee coils. In September 2005, the building where the MR system was located at site 4 was demolished, resulting in relocation of all the equipment. At this time, all software and hardware, including the gradient, head and body RF coils but excluding the magnet and knee coils, were replaced to create a TIM Trio level MR system. The other three sites were upgraded to the TIM Trio level in Spring 2010. The OAI phantoms at sites 1 and 3 were replaced due to damage in September 2005 and November 2007, respectively. Because replacement phantoms have slightly different physical dimensions, the data cannot be combined; hence separate dimensional metrics are reported for the two periods for these sites.

Precision for each metric was determined by first calculating the mean and variance of all measurements for each site individually. These calculations were performed before and after any MR system change as well as pooled for overall reproducibility. The SD (the square root of the variance) was divided by the mean to determine the root-mean-square (RMS) coefficient-of-variation (CV%). All outliers were included in the calculation to provide a realistic representation of the MR system variation. The metric mean for each system and time period was used as a surrogate for the true system value. Systematic differences in metric values were evaluated using a two-sided paired Student’s t-test for each study period. Pooled analysis compared the measurements from all four sites for each study period.
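The precision calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis software: the function names, the example readings, and the use of the sample SD (n − 1 denominator) are assumptions, as is the pooling step, which takes the root mean square of the per-site CV% values. The per-site CV% is computed as the SD divided by the mean, which agrees with the tabulated values (for example, an SD of 0.29 mm on a mean of 146.8 mm gives 0.20%).

```python
from math import sqrt
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def cv_from_summary(sd, mean_value):
    """CV% from an already-tabulated SD and mean."""
    return 100.0 * sd / mean_value


def rms_cv_percent(cvs):
    """Root mean square of several CV% values (assumed pooling rule)."""
    return sqrt(sum(c * c for c in cvs) / len(cvs))


# Hypothetical monthly phantom length readings in mm (not study data).
readings = {
    "site A": [146.9, 147.1, 146.7, 146.8, 147.0],
    "site B": [147.2, 146.8, 147.0, 147.1, 146.9],
}

per_site = {site: cv_percent(vals) for site, vals in readings.items()}
for site, cv in per_site.items():
    print(f"{site}: CV = {cv:.3f}%")
print(f"pooled RMS CV = {rms_cv_percent(list(per_site.values())):.3f}%")
```

The paired t-test used for systematic differences is not reproduced here; it would need a statistics library and the raw paired measurements.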

Results

QA measurements were obtained for a minimum of 92 months and a maximum of 97 months at each of the four sites. A combined total of 398 monthly OAI and 363 monthly ACR phantom measurements were included in this analysis. The number of distinct phantom QA MR exams is identified per site in Tables IA and IIA for the ACR and OAI phantom, respectively. During this 8-year measurement period, the majority of QA performance measures were within documented target specifications3. Previously, issues of poor knee coil signal uniformity and systematically larger than nominal (3.0 mm and 5.0 mm) slice thicknesses had been reported;3 during this reporting period, the same problems were identified.

Table I.

Longitudinal measurements of the ACR phantom (A) length and (B) diameter (at slice 5, 25 mm off-isocenter). No phantom changes occurred during the measurement period; however, all four systems were upgraded from the TRIO to the TIM TRIO level (site 4 in September 2005, the others in spring 2010). All system components including gradient and body RF coils as well as software were changed; only the magnet and knee RF coils remained identical. This change is more apparent in the length measurement (A). Actual phantom length is 148.0 mm and diameter 190.0 mm.

Site | Overall: N, RMS CV%, Mean (mm), SD (mm), Min (mm), Max (mm) | Pre-upgrade: N, RMS CV%, Mean (mm), SD (mm), Min (mm), Max (mm) | Post-upgrade: N, RMS CV%, Mean (mm), SD (mm), Min (mm), Max (mm) | Pre- vs post-upgrade P-value
(A) ACR length
1 | 95 0.20% 146.8 0.29 146.1 147.4 | 78 0.14% 146.9 0.20 146.5 147.4 | 17 0.08% 146.4 0.12 146.1 146.6 | <0.0001
2 | 93 0.22% 147.0 0.33 145.9 147.5 | 76 0.20% 147.1 0.30 145.9 147.5 | 17 0.13% 146.7 0.19 146.3 146.9 | <0.0001
3 | 89 0.27% 147.0 0.39 146.2 147.6 | 70 0.19% 147.1 0.28 146.3 147.6 | 19 0.17% 146.4 0.25 146.2 147.1 | <0.0001
4 | 82 0.27% 146.2 0.40 145.5 147.1 | 27 0.11% 146.7 0.17 146.5 147.1 | 55 0.17% 146.0 0.24 145.5 146.4 | <0.0001
(B) ACR diameter (slice 5, 25 mm off-isocenter)
1 | 291 0.15% 190.0 0.29 189.3 190.5 | 240 0.15% 189.9 0.29 189.3 190.5 | 51 0.09% 190.2 0.18 189.8 190.4 | <0.0001
2 | 279 0.13% 189.9 0.25 189.4 191.1 | 228 0.08% 189.8 0.16 189.5 190.3 | 51 0.22% 190.1 0.42 189.4 191.1 | <0.0001
3 | 266 0.09% 189.7 0.18 189.3 190.3 | 209 0.08% 189.7 0.16 189.3 190.0 | 57 0.10% 189.8 0.19 189.5 190.3 | <0.0001
4 | 246 0.13% 189.9 0.25 189.2 190.6 | 81 0.20% 189.8 0.38 189.2 190.6 | 165 0.05% 190.0 0.10 189.7 190.2 | <0.0001

N is the number of measurements (one length and four diameters were measured at each time point; A/P measurements were excluded from this table due to air bubbles). Overall (pooled data) significance for pre- vs post-upgrade P = 0.0046 (length) and 0.018 (diameter).

Table II.

Longitudinal measurements of the OAI phantom (A) 3D spherical volume and (B) outer compartment T2 value. Both site 3 (November 2007) and site 1 (September 2005) replaced phantoms during the measurement period; the initial phantom measurements (site 3 = 44; site 1 = 18) were eliminated from this report. This phantom change is immediately apparent in the 3D volume measurement (Fig. 2A). Approximate (original) phantom spherical volume is 210 mm3 (new 222 mm3) and the outer compartment target T2 value is 50 ms.

Site | Overall: N, RMS CV%, Mean, SD, Min, Max | Pre-upgrade: N, RMS CV%, Mean, SD, Min, Max | Post-upgrade: N, RMS CV%, Mean, SD, Min, Max | Pre- vs post-upgrade P-value
(A) 3D spherical volume (mm3)
1 | 82 0.42% 212.8 0.90 208.1 214.5 | 60 0.39% 213.1 0.83 208.1 214.5 | 22 0.18% 211.9 0.39 210.8 212.4 | <0.001
2 | 95 0.52% 211.1 1.10 208.5 214.1 | 76 0.33% 211.5 0.69 209.8 214.1 | 19 0.26% 209.4 0.55 208.5 210.4 | <0.001
3 | 66 0.49% 221.5 1.09 219.6 223.6 | 34 0.30% 222.4 0.66 221.1 223.6 | 32 0.30% 220.6 0.67 219.6 221.8 | <0.001
4 | 83 0.33% 209.4 0.69 208.1 211.5 | 16 0.35% 210.1 0.74 208.9 211.5 | 67 0.27% 209.2 0.57 208.1 210.7 | 0.002
(B) T2 value (outer compartment; ms)
1 | 328 1.67% 52.1 0.87 50.1 54.6 | 240 1.53% 52.2 0.80 50.5 54.4 | 88 1.85% 51.7 0.96 50.1 54.6 | 0.080
2 | 380 4.58% 50.1 2.29 43.8 55.4 | 304 4.39% 49.9 2.19 43.8 54.2 | 76 5.15% 50.6 2.61 44.1 55.4 | 0.32
3 | 280 3.84% 52.0 2.00 45.6 58.9 | 148 4.57% 51.1 2.33 45.6 58.9 | 132 1.60% 52.9 0.84 49.7 54.1 | <0.001
4 | 336 5.43% 52.0 2.82 45.3 58.7 | 64 4.52% 50.5 2.28 45.8 56.5 | 272 5.42% 52.3 2.83 45.3 58.7 | 0.020

N is the number of measurements (one volume and four T2 values were measured at each time point). Overall significance (pooled data) for pre- vs post-upgrade P = 0.012 (3D volume) and 0.18 (T2).

Geometric measurements

The longitudinal variation of the inside end-to-end length of the ACR phantom was <0.20% RMS CV% and had overall standard deviations (SDs) well below ±0.5 mm (±0.5 pixel) (Table IA, Fig. 1(A)). The length was consistently 1–1.5 mm shorter than the nominal value of 148.0 mm; one site was almost 2 mm shorter using the pre-upgrade TRIO MR system. After the TIM Trio upgrade, a further length decrease (range: 0.4–0.7 mm, factor of 2–3 times larger than the SD; statistically significant, overall P = 0.005) was observed in addition to a decrease in the length RMS CV%. Based on phantom length, the measurements for all sites within each period (pre- and post-upgrade) are equivalent (P = 0.8 and P = 0.5, respectively).

Fig. 1.

Example longitudinal measurements of the ACR phantom (A) length (site 1) and (B) diameter (site 4) from slice 5 (25 mm off-isocenter). In both (A) and (B), the solid green line represents the nominal value (148.0 mm length; 190.0 mm diameter). In (B), the four measurement directions are individually identified: A/P (dark blue diamond), left/right (L/R; pink square), left upper to right lower diagonal (diag up; yellow triangle), and left lower to right upper diagonal (diag down; blue circle). No ACR phantom changes occurred during the measurement period; however, the MR system upgrade (black arrow) occurred in April 2010 for site 1 and in September 2005 for site 4. This change is more apparent in the length measurement (A). Actual ACR phantom length is 148.0 mm and diameter 190.0 mm.

The longitudinal variation of the inside diameter of the ACR phantom was <0.25% RMS CV% (Table IB, Fig. 1(B)) for all sites. The SD was again below ±0.5 mm (±0.5 pixel) for all sites and was usually less than half this value. The measured phantom diameter was equivalent to the nominal value of 190.0 mm for all sites during each period (pre- and post-upgrade; P = 0.5 and P = 0.38, respectively). Thus for the diameter measurement, all four sites are equivalent. Consistent diameter measurements were achieved using the RL and two diagonal axis measurements; the anterior–posterior (AP) axis measurements were excluded because of greater variability due to the intermittent presence of an air bubble. The TIM Trio upgrade generally resulted in a decrease in the RMS CV% as well as a significant increase in measured diameter (range: 0.07–0.27 mm, same order of magnitude as the SD; statistically significant, overall P = 0.02).

Longitudinal variation of the inside end-to-end length and diameter of the OAI phantom (data not shown) was ≤0.25% and ≤0.20% RMS CV%, respectively, and these measurements were less sensitive to MR system calibration than the larger ACR phantom. The inner length and diameter measurements of the OAI phantom at R60 and L60 were also highly correlated.

The longitudinal variation of the three-dimensional spherical volume of the OAI phantom was <0.4% RMS CV% (Table IIA, Fig. 2(A)). The impact of a phantom change (~5% volume increase) and the MR system upgrade (~1% volume decrease) can be seen in Fig. 2(A) (November 2007 and May 2010, respectively); a facility with large, easily identified changes was selected to demonstrate this worst case example. The TIM Trio upgrade resulted in a decrease in the RMS CV% for all sites except one (site 3), which had identical variance pre- and post-upgrade. In addition, an overall volume decrease due to the upgrade was identified (range: 0.84 mm3–2.1 mm3, a factor of 1.5–4 more than the SD; statistically significant, overall P = 0.01). This significantly decreased volume is presumably caused by the same gradient changes that caused the significant length decrease and diameter increase observed in the ACR phantom. No systematic drifts were identified.

Fig. 2.

Example longitudinal measurements of the OAI phantom (A) 3D spherical volume (site 3) and (B) T2 value (site 1). In (A), the volume for the right (+60 mm off-isocenter; blue square) and left knee (−60 mm off-isocenter; pink square) coil positions are tracked separately to understand the impact of gradient non-linearity. In (B), in addition to tracking the right and left knee coil positions separately, the outer phantom is mathematically divided into four quadrants to understand the impact of RF non-uniformity as well as the interaction between the RF and gradient fields. Thus four separate measurements are available for each of the right and left knee coil positions. The top two quadrants are shown for the left knee coil position (upper left quadrant pink square; upper right quadrant pink triangle) and the right knee coil position (upper left quadrant blue square; upper right quadrant blue triangle). Both site 3 (November 2007) and site 1 (September 2005) replaced OAI phantoms (green arrow) during the measurement period. The MR system upgrade (black arrow) for both sites occurred in May 2010. The 5% volume change in (A) was caused by the different phantom. Approximate (original) phantom spherical volume is 210 mm3 (new phantom: 222 mm3) and the outer compartment target T2 value is 50 ms.

The longitudinal variation of the outer compartment T2 values (target 50 ms) of the OAI phantom was <5.5% RMS CV% (Table IIB, Fig. 2(B)). The longitudinal variation of the inner compartment T2 values (target 18 ms) ranged from 2.1% to 3.9% RMS CV% (data not shown). Two sites had large seasonal fluctuations. One site had a systematic downward drift over the measurement period (trend to lower T2 values over study period). These seasonal fluctuations and the slow downward drift are demonstrated in Fig. 2(B) (again the worst case example). Any impact of the MR system upgrade (April 2010) is masked by the impact of environmental variability (overall P = 0.18). T2 values were comparable between sites before the TIM Trio upgrade (except site 1), as well as between the right and left knee. The TIM Trio upgrade resulted in no effective change in the RMS CV%. After the upgrade, the outer compartment T2 values for only two sites (sites 1 and 2) were identical (P = 0.96) and for the other two sites (sites 3 and 4) these values were significantly higher (P = 0.02) by 1.75 ms.

Discussion

We compare 8 years of longitudinal QA on four MR systems, located at environmentally diverse and physically distant facilities. We focused primarily on geometric measurements as they have the largest influence on the accuracy and reproducibility of quantitative measurements of anatomic structures, although the impact of phantom variability on quantitative cartilage analysis has not been determined. Subjective or semi-quantitative assessments will not be altered by the very small changes in the QA measurements. The stability of phantom T2 measurements as well as the impact of MR system changes and phantom changes are also reported. Because of the much smaller size of the phantom, its T2 measurements are sensitive to seasonal environmental fluctuations, such as temperature, which may not only impact T2 values but may also impact detection electronics or overall power levels into the various system components. We find greater longitudinal variability in phantom T2 values than in those made in human cartilage9–14. The first 3 years of longitudinal geometric measurement on the four 3 T OAI MR systems resulted in 0.04% and 0.6% RMS CV% for a 190.0 mm diameter and 148.0 mm length object (ACR phantom)3. Over the inclusive 8-year period, this variation increased to <0.25% for diameter and decreased to <0.20% for length due in part to the impact of the MR system upgrade. Keevil et al.15 measured inner diameter reproducibility of 0.6% ± 0.2% over a 160.0 mm diameter object over 3.5 years on a 1.5 T MR system. This QA evaluation as well as several other studies15–17 found the largest impact on measurement accuracy and reproducibility to be in-plane spatial resolution, SNR for edge detection, phantom positioning, and air bubbles in the test object.

Due to sampling over a larger gradient field, the larger test object (ACR phantom) was again found to be more sensitive than the smaller (OAI phantom) to changes in spatial dimension3. Length measurements were consistently 1–2 mm shorter than the nominal value for the ACR phantom and were within the spatial error found by other authors using only 2D gradient distortion calibration18–20. Mulkern et al.20 found 0.87% RMS CV% for ACR phantom length (mean 146.9 mm) and 0.55% RMS CV% for diameter (mean 189.9 mm), which are comparable to the changes measured in the OAI MR QA program. Over a 95 mm radius sphere, Wang et al.18 found the maximal geometric error to be 2.0–2.5 mm. For a similar sized test object, this error is larger than found in the OAI MR QA program and by Mulkern20. Conversely, in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) trial, 3 years of MR QA in the head coil with a 200.0 mm diameter phantom found approximately 1 mm accumulated stretching in all dimensions following 3D gradient distortion correction10.

Central cross-calibration of the ACR phantoms was reported in the prior study3 and found the phantom inner diameters and length were identical. However, when these phantoms were imaged at their local sites, small differences in geometric measurements were observed due to different gradient calibrations. In the prior study the four OAI MR facilities were found to have equivalent spatial metric values over the 3-year period and the study images were able to be pooled for analysis3. In this manuscript, all four MR systems during both the pre-upgrade and post-upgrade periods were found to have equivalent ACR phantom geometric values, thereby enabling pooling of the study images for analysis. Differences between pre- and post-upgrade ACR length (0.3–0.5%) and diameter (0.1–0.2%), however, are significant and were found to result in significantly decreased OAI spherical volume measurements post-upgrade (0.4–1.0%, P = 0.01).
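As a rough check on the pre- vs post-upgrade differences quoted above, the percentage change can be recomputed from the rounded site means in Tables IA, IB and IIA. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and because the tabulated means are rounded to one decimal place, the recomputed ranges can differ slightly from the quoted ones.

```python
# Pre- and post-upgrade site means transcribed from Tables IA, IB and IIA.
# Keys are site numbers; values are (pre-upgrade mean, post-upgrade mean).
MEANS = {
    "ACR length": {1: (146.9, 146.4), 2: (147.1, 146.7),
                   3: (147.1, 146.4), 4: (146.7, 146.0)},
    "ACR diameter": {1: (189.9, 190.2), 2: (189.8, 190.1),
                     3: (189.7, 189.8), 4: (189.8, 190.0)},
    "OAI spherical volume": {1: (213.1, 211.9), 2: (211.5, 209.4),
                             3: (222.4, 220.6), 4: (210.1, 209.2)},
}


def percent_change(pre, post):
    """Signed change from pre to post, as a percentage of the pre value."""
    return 100.0 * (post - pre) / pre


def change_range(metric):
    """Smallest and largest signed percentage change across the four sites."""
    changes = [percent_change(pre, post) for pre, post in MEANS[metric].values()]
    return min(changes), max(changes)


for metric in MEANS:
    low, high = change_range(metric)
    print(f"{metric}: {low:+.2f}% to {high:+.2f}%")
```

Negative values indicate a decrease after the upgrade (length and volume), and positive values an increase (diameter).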

The first 3 years of longitudinal spherical volume measurements found 0.46% RMS CV%3. These measurements over 8 years, inclusive, decreased to <0.35% RMS CV% in part due to the MR system upgrade. These variations and even the 0.4–1.0% decrease in spherical volume due to the upgrade are smaller than the unpaired (repositioning and reanalysis) precision error found in the 3 T OAI pilot studies for cartilage volume and thickness in the weight-bearing femorotibial compartment with coronal fast low-angle shot (FLASH) 3.0–6.4%, coronal multi-planar reformat (MPR) dual-echo steady state (DESS) 2.4–6.2%, and sagittal DESS 2.3–8.2% RMS CV%21. The longitudinal variation in 3D volume and the decrease in volume due to the system upgrade are also smaller than the variation of annualized percentage change using paired analysis in the femorotibial joint (N = 150)16. In this study, Hunter et al.22 found the SD of the cartilage volume metric to vary from 2.97% to 12.29%. Even cartilage thickness metrics in the weight-bearing femorotibial joint measured using either FLASH (N = 239) or DESS (N = 107) images were found to have greater variations over both 1 and 2-year periods23.

The most significant challenge with reproducibility of geometric measurements and MR system changes could be alleviated if gradient calibration allowed manual overrides for the digital setting. This would result in standardized gradient amplitudes and standardized distance measurements, which in turn would guarantee the ability to pool data across study sites and guarantee the ability to perform system upgrades and service recalibrations on an as-needed basis without risking quantitative measurements. Similar to our findings, Gunter et al.10 in the ADNI study also documented that service recalibrations and MR system upgrades introduced geometric changes larger than the longitudinal drift. With manual gradient overrides, it would be possible to improve site-to-site accuracy and to ensure hardware and software upgrades as well as service visits do not impact geometric measurement accuracy and reproducibility. Some MR manufacturers have this capability; others do not. This finding is similar to that of Moorhead et al.9 where between-scanner variation had 0.8–4.0% and within-scanner variation had 0.00–0.02% gray matter volume difference in 14 volunteers across three sites (with two scans at each site). To enable pooling data between sites for the CaliBrain project9, tissue classification software was developed to cross-calibrate the MR systems. Similar to Moorhead9, the ADNI study10 utilized phantom-based scaling correction to reduce observed longitudinal geometric variation in human images by a factor of one-third or more. In contrast, the OAI had more consistent between-scanner accuracy and less longitudinal variability and should enable pooling of the data without per-scanner correction even with the MR system upgrade. This difference is most likely because only one vendor MR system was utilized in the OAI and, although the knee MR exams were positioned ±60 mm off-isocenter, the smaller imaging field-of-view may have contributed to the smaller longitudinal variation. As demonstrated above, the systematic, scheduled collection of standardized QA exams from MR systems used in longitudinal, multi-center studies is essential for the direct comparison of images as well as for longitudinal data analysis.

The test–retest reproducibility of the T2 values (<5.5% outer and <4.0% inner compartment RMS CV%) was similar to that of other single-site11 as well as multi-site longitudinal phantom studies24. Our re-measurement precision was also within that of multi-site test–retest human studies14 and encompassed the 1.9%–4.7% RMS CV% reanalysis errors12,13. The stability of our phantom T2 measurements in part reflects the seasonal environmental fluctuations present in the magnet screen room, any external variations of the power supply (even though two layers of independent power conditioning were utilized), the uniformity of knee coil refocusing pulses, as well as any evaporation that may have occurred (resulting in shorter T2 values). A one-to-one adjustment of the human T2 data should not be made based on the measured fluctuations in phantom T2 value, but rather the phantom data should be used to minimize the environmental fluctuations.

In conclusion, independent centralized QA analyses over the first 8 years of the OAI assessed the longitudinal consistency of the MR image geometric distortion and found <0.35% RMS CV% spatial variability of 3D spherical volume. These findings are consistent with prior results3. The MR system upgrade resulted in improved stability, but slightly smaller length, diameter and spherical volume measurements, all of which were statistically significant. Spatial reproducibility measurements indicated that longitudinal MR system variations should have minimal impact on the accuracy and reproducibility of cartilage morphometry, including thickness and volume metrics during either the pre-upgrade or post-upgrade periods. Comparison between systems indicates that pooling of results is supported for this 8-year measurement period. Measurements on the larger ACR phantom were more sensitive to spatial dimension changes compared to those made on the smaller OAI phantom. This was expected and we recommend use of an even larger rigid test object for geometric accuracy measurements in any future longitudinal study.

Acknowledgments

Role of the funding source

The NIH provided funding for this work. The study sponsors were not involved in the study design, collection, analysis and interpretation of data; in the writing of the manuscript; and in the decision to submit the manuscript for publication.

The data reported in this article was supported in part by contracts N01-AR-2-2258, N01-AR-2-2259, N01-AR-2-2260, N01-AR-2-2261 and N01-AR-2-2262. We are grateful to all the OAI MR Technologists for their participation in the acquisition of the QA data and excellent attention to detail throughout the study.

Footnotes

Author contributions

Both authors (ES and MN) have made substantial contributions to all three of the sections below:

  1. the conception and design of the study, or acquisition of data, or analysis and interpretation of data;
  2. drafting the article or revising it critically for important intellectual content;
  3. final approval of the version to be submitted.

ES (schneie1@ccf.org) takes responsibility for the integrity of the work as a whole, from inception to finished article.

Conflict of interest statement

Both authors had fee for service contracts with the OAI. In particular:

  • ES is the principal of SciTrials, LLC, is the OAI Technical Advisor and is under contract to NIH for this purpose.
  • MN is the principal of SimplyPhysics, LLC, is responsible for OAI MR QA central analysis and is under contract with the UCSF Coordinating Center (N01-AR-2-2258) for this purpose.

References

  • 1. The OAI Images, Clinical Data and Biospecimens are Public Access Resources. www.oai.ucsf.edu and https://niams-imaging.nci.nih.gov/ncia/login.jsf.
  • 2. Peterfy CG, Schneider E, Nevitt M. The Osteoarthritis Initiative: protocol design considerations for magnetic resonance imaging of the knee. Osteoarthritis Cartilage. 2008;16:1433–41. doi: 10.1016/j.joca.2008.06.016.
  • 3. Schneider E, NessAiver M, White D, Purdy D, Martin L, Fanella L, et al. The Osteoarthritis Initiative (OAI) magnetic resonance imaging quality assurance methods and results. Osteoarthritis Cartilage. 2008;16:994–1004. doi: 10.1016/j.joca.2008.02.010.
  • 4. Price RR, Axel L, Morgan T, Newman R, Perman W, Schneiders S, et al. Quality assurance methods and phantoms for magnetic resonance imaging: report of AAPM nuclear magnetic resonance task group no. 1a. Med Phys. 1990;17:287–95. doi: 10.1118/1.596566.
  • 5. Och JG, Clarke GD, Sobol WT, Rosen CW, Mun SK. Acceptance testing of magnetic resonance imaging systems: report of AAPM nuclear magnetic resonance task group no. 6. Med Phys. 1992;19:217–29. doi: 10.1118/1.596903.
  • 6. American College of Radiology (ACR) MRI Accreditation Program Requirements. http://www.acr.org/accreditation/mri/mri.html.
  • 7. American College of Radiology (ACR) Phantom Test Guidance for the ACR MRI Accreditation Program. Reston, VA: 1998.
  • 8. NEMA. Document MS 2. Washington, DC: National Electrical Manufacturer’s Association; 1989. Determination of Two-dimensional Geometric Distortion in Diagnostic Magnetic Resonance Images.
  • 9. Moorhead TW, Gountouna V-E, Job DE, McIntosh AM, Romaniuk L, Lymer GKS, et al. Prospective multi-centre voxel based morphometry study employing scanner specific segmentations: procedure development using CaliBrain structural MRI data. BMC Med Imaging. 2009;9:8–20. doi: 10.1186/1471-2342-9-8.
  • 10. Gunter JL, Bernstein MA, Borowski BJ, Ward CP, Britson PJ, Felmlee JP, et al. Measurement of MRI scanner performance with the ADNI phantom. Med Phys. 2009;36:2193–205. doi: 10.1118/1.3116776.
  • 11. Kjos BO, Ehman RL, Brant-Zawadzki M. Reproducibility of T1 and T2 relaxation times calculated from routine MR imaging sequences: phantom study. Am J Roentgenology. 1985;144:1157–63. doi: 10.2214/ajr.144.6.1157.
  • 12. Koff MF, Parratte S, Amrami KK, Kaufman KR. Examiner repeatability of patellar cartilage T2 values. Magn Reson Imaging. 2009;27:131–6. doi: 10.1016/j.mri.2008.05.004.
  • 13. Welsch GH, Mamisch TC, Weber M, Horger W, Bohndorf K, Trattnig S. High-resolution morphological and biochemical imaging of articular cartilage of the ankle joint at 3.0 T using a new dedicated phased array coil: in vivo reproducibility study. Skeletal Radiol. 2008;37:519–26. doi: 10.1007/s00256-008-0474-z.
  • 14. Mosher TJ, Zhang Z, Reddy R, Boudhar S, Milestone BN, Morrison WB, et al. Knee articular cartilage damage in osteoarthritis: analysis of MR image biomarker reproducibility in ACRIN-PA 4001 multicenter trial. Radiology. 2011;258(3):832–42. doi: 10.1148/radiol.10101174.
  • 15. Keevil SF, Barbiroli B, Collins DJ, Danielsen ER, Hennig J, Henriksen O, et al. Quality assessment in in vivo NMR spectroscopy: IV. A multi-center trial of test objects and protocols for performance assessment in clinical NMR spectroscopy. Magn Reson Imaging. 1995;13(1):139–57. doi: 10.1016/0730-725x(94)00090-p.
  • 16. DeWilde J, Price D, Curran J, Williams J, Kitney R. Standardization of performance evaluation in MRI: 13 years’ experience of intersystem comparison. Concepts Magn Reson. 2002;15(1):111–6.
  • 17. Hyde RJ, Ellis JH, Gardner EA, Zhang Y, Carson PL. MRI scanner variability studies using a semi-automated analysis system. Magn Reson Imaging. 1994;12(7):1089–97. doi: 10.1016/0730-725x(94)91241-n.
  • 18. Wang D, Strugnell W, Cowin G, Doddrell DM, Slaughter R. Geometric distortion in clinical MRI systems. Part I: evaluation using a 3D phantom. Magn Reson Imaging. 2004;22(9):1211–21. doi: 10.1016/j.mri.2004.08.012.
  • 19. Wang D, Doddrell DM. Geometric distortion in structural magnetic resonance imaging. Curr Med Imag Rev. 2005;1:49–60.
  • 20. Mulkern RV, Forbes P, Dewey K, Osganian S, Clark M, Wong S, et al. Establishment and results of a magnetic resonance quality assurance program for the pediatric brain tumor consortium. Acad Radiol. 2008;15:1099–110. doi: 10.1016/j.acra.2008.04.004.
  • 21. Eckstein F, Hudelmaier M, Wirth W, Kiefer B, Jackson R, Yu J, et al. Double Echo Steady State (DESS) magnetic resonance imaging of knee articular cartilage at 3 tesla – a pilot study for the Osteoarthritis Initiative. Ann Rheum Dis. 2006;65:433–41. doi: 10.1136/ard.2005.039370.
  • 22. Hunter DJ, Niu J, Zhang Y, Totterman S, Tamez J, Dabrowski C, et al. Change in cartilage morphometry: a sample of the progression cohort of the Osteoarthritis Initiative. Ann Rheum Dis. 2009;68:349–56. doi: 10.1136/ard.2007.082107.
  • 23. Wirth W, Larroque S, Davies RY, Nevitt M, Gimona A, Baribaud F, et al. Comparison of 1-year vs 2-year change in regional cartilage thickness in osteoarthritis results from 346 participants from the Osteoarthritis Initiative. Osteoarthritis Cartilage. 2011;19:74–83. doi: 10.1016/j.joca.2010.10.022.
  • 24. Lerski RA, McRobbie DW, Straughan K, Walker PM, de Certaines JD, Bernard AM. Multi-center trial with protocols and prototype test objects for the assessment of MRI equipment. Magn Reson Imaging. 1988;6(2):201–14. doi: 10.1016/0730-725x(88)90451-1.
